notes on cluster problems when taking my beta exam tonight :-(


Caio Begotti
 

Hi, I finally took my beta exam. At some point, while I was working my way to question 29 with about 1h remaining, the cluster stopped responding to all kubectl commands. kubectl cluster-info showed only the master was up, with no sign of etcd, so I suppose the whole thing was not going to play ball anymore.

I tried to take a screenshot of that to show you guys here, but the proctor freaked out immediately and threatened to terminate my exam. I tried to explain to him what was going on, but it seems proctors have zero context on the exam, its content, or its environment... bummer.

So, bottom line is that I think I did well, with about 21 questions answered. As for the remaining ones, question 29 was answered on disk in /home/student/q29.conf but not applied to the cluster due to the problem above. I left 2 other pretty easy questions (one to sort by CPU usage and another one I can't recall now) for last, but I guess I was too dumb in doing so; I could have got 2 more answered fine. All the others I had left for last (basically the troubleshooting ones) were left behind for good...

I don't know how this will be graded, nor if anything done to the cluster was my fault. I tried to look around for some Juju and Snap fu tricks to restore etcd from a working unit in the cluster, but I guess the one we log in to is set up differently than the Juju units via CDK. I think CDK protects "local" snaps like etcd so there is no way to bork the cluster, but I think people will quite easily bork the outer env where /home/student is if it stays the way it is today when the exam goes live!

— Caio Begotti
