Cilium has this fancy feature where it can filter outgoing DNS traffic. Basically, you write a policy that lists the names a pod is allowed to resolve.
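For illustration, a policy of this kind could look roughly like the sketch below. The policy name, namespace, pod label and allowed domain are placeholders I made up; the kube-dns labels assume a stock cluster DNS deployment in kube-system.

```yaml
# Hypothetical example: name, namespace, app label and domain are placeholders.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-google-only
  namespace: default
spec:
  endpointSelector:
    matchLabels:
      app: myapp
  egress:
    # Route DNS queries to kube-dns through the cilium DNS proxy and
    # only permit lookups for the listed names.
    - toEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": kube-system
            "k8s:k8s-app": kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchName: "google.com"
    # Allow traffic to the IPs the permitted names resolved to.
    - toFQDNs:
        - matchName: "google.com"
```

Note that nothing under cluster.local is on the allow list here, which is exactly what sets up the problem described next.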
It will proxy your DNS requests through cilium and reject everything that’s not allowed. This way, the pod can’t even resolve the external domains it’s not supposed to talk to (unless it’s using DoH, DoT or some other resolution strategy that bypasses the proxy).
This comes with a (slightly annoying) problem: by default, kubernetes sets ndots to 5 and adds a search path, so that you can use myservice and hit myservice.namespace.svc.cluster.local, etc. This means that nslookup google.com will try:
- google.com.default.svc.cluster.local.
- google.com.svc.cluster.local.
- google.com.cluster.local.
- google.com.silly-fish.ts.net. (oh look, if you run tailscale, it will sneak in here too!)
Only after all of those fail will it finally resolve google.com. itself.
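However, with cilium in the mix and a policy that only allows google.com, cilium will answer those search-path lookups with the response code REFUSED.
And at this point the resolver gives up: no further lookups are made, so google.com itself is never tried.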
While cilium says it’s an alpine problem, I saw this on a glibc-based debian too. Luckily, both workarounds are suitable there, too.
You can either lower ndots to 1 in the pod’s DNS config.
This way a single dot is enough for the name to be tried as-is first: google.com will be treated as an FQDN, while localservice will still use the search path.
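As a minimal sketch, the ndots workaround can be expressed through the pod’s dnsConfig. The pod name, container name, image and command are placeholders; in practice you would put the dnsConfig block into your Deployment’s pod template.

```yaml
# Hypothetical pod: only the dnsConfig block matters here.
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  dnsConfig:
    options:
      # Names with at least one dot are tried as absolute names first.
      - name: ndots
        value: "1"
  containers:
    - name: myapp
      image: alpine:3.20
      command: ["sleep", "3600"]
```

The options are merged into the resolv.conf that kubernetes generates for the pod, so the search path itself stays in place.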
The other option is to ask cilium to return NXDOMAIN through its helm chart config.
Now the failing lookups will return NXDOMAIN instead of REFUSED, and the resolver will continue trying.
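A sketch of the corresponding helm values, assuming your chart version exposes the dnsProxy.dnsRejectResponseCode setting (check the chart’s values.yaml to be sure):

```yaml
dnsProxy:
  # The chart default is "refused"; "nameError" makes the DNS proxy
  # answer rejected lookups with NXDOMAIN instead.
  dnsRejectResponseCode: nameError
```

This changes the response for every rejected lookup cluster-wide, whereas the ndots option only affects the pods you add it to.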
Of course, there’s an option to properly qualify your domains (e.g. by using google.com. with the trailing dot), but it might be trickier for services resolved implicitly in your dependencies.