Linkerd can't be installed when updating K8s to 1.19 #5586
Unanswered
yuzsun
asked this question in
Show and tell
Replies: 1 comment 1 reply
-
@yuzsun This doesn't give us much information to go on. I suggest opening an issue that describes the exact steps you took when you encountered this problem. It would also be useful to include the output of |
-
After we upgraded AKS from 1.18 to 1.19, we re-installed Linkerd in HA mode, and several Linkerd pods were in "CrashLoopBackOff" status. However, after we re-ran the Helm chart, the pods turned to "Running".
++ added the detailed story:
We were upgrading AKS from 1.17 to 1.19. First we upgraded AKS from 1.17 to 1.18, and everything was fine. Then we upgraded from 1.18 to 1.19, but afterwards we found that several pods had disappeared.
We confirmed there were “node failed to drain” errors during the AKS upgrade, which we resolved by draining the node manually; as a result, several pods disappeared. We tried to get those pods back by re-deploying the deployment, but that failed.
By checking the backend logs, we found there were issues when the “linkerd-destination” pod was connecting to the service, with the error message: “sync "linkerd/linkerd-destination-67db6c788f" failed with Internal error occurred: failed calling webhook "linkerd-proxy-injector.linkerd.io": Post "https://linkerd-proxy-injector.linkerd.svc:443/?timeout=30s": dial tcp 10.100.10.251:443: connect: connection refused”.
However, we checked the service and confirmed that there were no pods behind it, and that there was no webhook “linkerd-proxy-injector.linkerd.io”. When running the command “linkerd check”, we saw the same error: “FailedCreate: Internal error occurred: failed calling webhook "linkerd-proxy-injector.linkerd.io": Post "https://linkerd-proxy-injector.linkerd.svc:443/?timeout=30s": dial tcp 10.100.10.251:443: connect: connection refused”.
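For anyone triaging a similar message, here is a minimal sketch (not part of Linkerd) that pulls the webhook name and target Service out of the error text quoted above and lists a few standard kubectl commands to check them. The names `parse_webhook_error` and `suggested_checks` are made up for illustration, and the regular expression assumes the message has the same shape as the one in this post.

```python
import re
from urllib.parse import urlsplit

# Assumption: the error text follows the shape quoted in this thread.
WEBHOOK_ERROR = re.compile(
    r'failed calling webhook "(?P<webhook>[^"]+)": '
    r'Post "(?P<url>[^"]+)": '
    r'dial tcp (?P<addr>[\d.]+:\d+): (?P<reason>.+)$'
)


def parse_webhook_error(message):
    """Return webhook name, target Service, namespace and dial failure, or None."""
    match = WEBHOOK_ERROR.search(message.strip())
    if match is None:
        return None
    # In-cluster webhook URLs use <service>.<namespace>.svc as the host.
    host = urlsplit(match["url"]).hostname or ""
    parts = host.split(".")
    if len(parts) < 2:
        return None
    return {
        "webhook": match["webhook"],
        "service": parts[0],
        "namespace": parts[1],
        "address": match["addr"],
        "reason": match["reason"],
    }


def suggested_checks(info):
    """Build kubectl commands to inspect the webhook and the Service behind it."""
    ns, svc = info["namespace"], info["service"]
    return [
        # Webhook configurations are cluster-scoped, so no namespace flag here.
        "kubectl get mutatingwebhookconfigurations",
        # An empty endpoints list means no ready pod backs the Service.
        f"kubectl -n {ns} get endpoints {svc}",
        f"kubectl -n {ns} get pods",
    ]


if __name__ == "__main__":
    sample = (
        'FailedCreate: Internal error occurred: failed calling webhook '
        '"linkerd-proxy-injector.linkerd.io": Post '
        '"https://linkerd-proxy-injector.linkerd.svc:443/?timeout=30s": '
        'dial tcp 10.100.10.251:443: connect: connection refused'
    )
    for command in suggested_checks(parse_webhook_error(sample)):
        print(command)
```

The commands are only printed, not executed, so they can be reviewed before being run against a cluster.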
So then we removed Linkerd, and all the missing pods came back. However, when we then tried to re-install Linkerd following the documentation: https://linkerd.io/2/tasks/install-helm/#setting-high-availability, all the Linkerd pods were in “ImagePullBackOff” status. After we deleted the namespace and re-ran the Helm chart, the pods turned to “Running” status.