October 13, 2017
After your upgrade to OpenShift v3.6, did the deployment of cluster metrics wind up with empty graphs? Check whether the heapster pod failed to start because of a missing secret called heapster-certs in the openshift-infra namespace.
Problem
Heapster pod is failing to start
$ oc get pods
NAME                         READY     STATUS              RESTARTS   AGE
hawkular-cassandra-1-l1f3s   1/1       Running             0          9m
hawkular-metrics-rdl07       1/1       Running             0          9m
heapster-cfpcj               0/1       ContainerCreating   0          3m
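If the namespace holds more pods than shown here, a small filter can pick out the ones that are not running. The following is only a sketch: the function name stuck_pods is made up for illustration, and it assumes the default column layout of oc get pods (status in the third column).

```shell
# Hypothetical helper (not part of oc): reads `oc get pods` output on stdin
# and prints the name and status of every pod whose STATUS is not "Running".
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
stuck_pods() {
  # NR > 1 skips the header row; $3 is the STATUS column.
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Example usage (requires a logged-in oc client):
#   oc get pods -n openshift-infra | stuck_pods
```

Note that this would also list pods in states such as Completed, so treat the result as a starting point rather than a list of failures.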
Check what volumes it is attempting to mount
$ oc volume rc/heapster
replicationcontrollers/heapster
  secret/heapster-secrets as heapster-secrets
    mounted at /secrets
  secret/hawkular-metrics-account as hawkular-metrics-account
    mounted at /hawkular-account
  secret/hawkular-metrics-certs as hawkular-metrics-certs
    mounted at /hawkular-metrics-certs
  secret/heapster-certs as heapster-certs
    mounted at /heapster-certs
Check for the existence of the heapster-certs secret
$ oc get secrets heapster-certs
Error from server (NotFound): secrets "heapster-certs" not found
Solution
Maybe you, like me, overlooked a v3.3 tech preview feature called service serving certificates, and missed that it became mandatory in v3.6 because that is not yet in the release notes. See also this bug.
However, even if you have /etc/origin/master/service-signer.crt, that may not be enough. In my case it was not visible, because this commit to the v3.3 upgrade playbook had a typo that placed servicesServingCert instead of serviceServingCert in /etc/origin/master/master-config.yaml. The correct stanza looks like this:
controllerConfig:
  serviceServingCert:
    signer:
      certFile: service-signer.crt
      keyFile: service-signer.key
The typo has since been fixed in PR 5765.
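For a master that was upgraded before that fix, the misspelled key may still be sitting in the config file. Here is a rough sketch for detecting and correcting it. The function names are hypothetical, the path is the one mentioned above, and the master services presumably need a restart afterwards for the change to take effect (the service names vary by install type, so that step is left out).

```shell
# Hypothetical helper: reports which spelling of the key a config file uses.
# Prints "typo" for servicesServingCert, "ok" for serviceServingCert,
# and "missing" if neither key is present.
check_serving_cert_key() {
  config="$1"
  if grep -q '^[[:space:]]*servicesServingCert:' "$config"; then
    echo "typo"
  elif grep -q '^[[:space:]]*serviceServingCert:' "$config"; then
    echo "ok"
  else
    echo "missing"
  fi
}

# Hypothetical helper: rewrites the misspelled key in place,
# keeping a backup copy next to the file with a .bak suffix.
fix_serving_cert_key() {
  config="$1"
  sed -i.bak 's/servicesServingCert:/serviceServingCert:/' "$config"
}

# Example usage on a master (run as root):
#   check_serving_cert_key /etc/origin/master/master-config.yaml
#   fix_serving_cert_key /etc/origin/master/master-config.yaml
```

Running the check first makes it easy to see whether the edit is needed at all, and the .bak copy lets you roll back if the file ends up in an unexpected state.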