
We have hit similar issues with GKE. GKE has a soon-to-be-deprecated feature called "metadata concealment"[1], which runs a proxy[2] that intercepts GCE metadata calls. Some of Google's own libraries made metadata requests at such a high rate that the proxy would lock up and stop servicing requests. New pods couldn't start on nodes with locked-up metadata proxies, because the same libraries that overloaded the proxy would hang when metadata wasn't available.

That was compounded by the metadata requests using DNS as well as the metadata IP. Until recently Kubernetes didn't have a built-in local DNS cache[3] (GKE still doesn't), so those lookups in turn overloaded kube-dns and made other DNS requests fail.

We worked around the issues by disabling metadata concealment and adding the metadata hostnames to /etc/hosts using pod hostAliases:

      hostAliases:
      - ip: ""
        hostnames:
          - "metadata.google.internal"
          - "metadata"
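
For anyone wanting to see where that fragment sits, here is a rough sketch of a complete pod manifest. The pod name, container, and image are placeholders, and the IP is an assumption: 169.254.169.254 is the usual GCE metadata server address, but it is not given in the snippet above, so check it against your own environment.

```yaml
# Sketch only, not our actual manifest.
apiVersion: v1
kind: Pod
metadata:
  name: metadata-hosts-example      # hypothetical name
spec:
  hostAliases:
    - ip: "169.254.169.254"         # assumed metadata server IP
      hostnames:
        - "metadata.google.internal"
        - "metadata"
  containers:
    - name: app                     # placeholder container
      image: busybox:1.36           # placeholder image
      # Print the generated hosts file so the extra entries can be checked.
      command: ["cat", "/etc/hosts"]
```

With entries like these in /etc/hosts, lookups for the metadata hostnames are normally answered locally in the pod and never reach kube-dns.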

[1] https://cloud.google.com/kubernetes-engine/docs/how-to/prote...

[2] https://github.com/GoogleCloudPlatform/k8s-metadata-proxy

[3] https://kubernetes.io/docs/tasks/administer-cluster/nodeloca...
