I’ve been running a small K3s setup to toy with various things, and recently I added a deny-all NetworkPolicy to prevent different namespaces from talking to each other:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: wp-deny-ingress
  namespace: wp
spec:
  podSelector:
    matchLabels: {}
  ingress:
  - from:
    - podSelector: {}

It’s one of those example policies and it does exactly what it says: it selects every pod in the namespace and allows ingress only from pods in the same namespace, thus isolating them from other namespaces.
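
Applying it and double-checking what ended up in the cluster is the usual routine (the file name here is just a placeholder):

# apply the policy and inspect the selectors/rules it defines
kubectl -n wp apply -f wp-deny-ingress.yaml
kubectl -n wp describe networkpolicy wp-deny-ingress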

Now, I run WP in one of those namespaces, and I have a simple CronJob to make WP tick. Basically, it runs a container with curl:

curl -v -H 'X-Forwarded-Proto: https' -H 'Host: example.net' http://wordpress-nginx/wp-cron.php?doing_wp_cron >/dev/null
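
The CronJob itself boils down to something like the sketch below; the job name, schedule and the curlimages/curl image are placeholders, while wordpress-nginx and example.net come from the command above:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: wp-cron            # placeholder name
  namespace: wp
spec:
  schedule: "*/5 * * * *"  # placeholder schedule
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: cron
            image: curlimages/curl  # any image with curl and a shell would do
            command:
            - sh
            - -c
            - >-
              curl -v -H 'X-Forwarded-Proto: https' -H 'Host: example.net'
              'http://wordpress-nginx/wp-cron.php?doing_wp_cron' >/dev/null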

To my surprise, the aforementioned NetworkPolicy broke the cron:

curl: (7) Failed to connect to wordpress-nginx port 80: Connection refused

Now, it’s a single-machine “cluster”, so debugging is somewhat simplified: dump the iptables state, then investigate the rules. It’s easy to find what we need, as the rules are heavily commented, and I quickly discover this:

# iptables -vnL | grep -C30 wp-deny-ingress
Chain KUBE-NWPLCY-N3UYLMSM6TD3OPID (14 references)
 pkts bytes target     prot opt in     out     source               destination
    0   0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* rule to ACCEPT traffic from source pods to dest pods selected by policy name wp-deny-ingress namespace wp */ match-set KUBE-SRC-VPMOAVRHXRG3DQAX src match-set KUBE-DST-5CRKWDN5P3PSUNBY dst

wp-deny-ingress is my NetworkPolicy. From here we can see that the governing ipset is KUBE-SRC-VPMOAVRHXRG3DQAX. Let’s look it up:

# ipset list | grep -A30 KUBE-SRC-VPMOAVRHXRG3DQAX
Name: KUBE-SRC-VPMOAVRHXRG3DQAX
Type: hash:ip
Revision: 4
Header: family inet hashsize 1024 maxelem 65536 timeout 0
Size in memory: 1624
References: 1
Number of entries: 17
Members:
10.42.0.222 timeout 0
10.42.0.130 timeout 0
10.42.0.132 timeout 0
10.42.0.140 timeout 0
10.42.0.126 timeout 0
10.42.0.182 timeout 0
10.42.0.229 timeout 0
10.42.0.151 timeout 0
10.42.0.157 timeout 0
10.42.0.187 timeout 0
10.42.0.152 timeout 0
10.42.0.221 timeout 0
10.42.0.216 timeout 0
10.42.0.153 timeout 0
10.42.0.150 timeout 0
10.42.0.186 timeout 0
10.42.0.128 timeout 0

It all looks reasonable, apart from one thing: a freshly started pod that exhibits the problem has the IP 10.42.0.230, which is not in the set, hence the REJECT behaviour. A couple of seconds later, though, the IP is added to the ipset and the network traffic flows as intended!
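
One way to watch that window is to keep re-listing the set while a fresh pod starts, substituting whatever IP the new pod actually got:

# print the set once a second and note when the new pod's IP appears
watch -n1 'ipset list KUBE-SRC-VPMOAVRHXRG3DQAX | grep 10.42.0.230'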

Apparently there’s a race condition between pod startup and the firewall updates, and a simple container that only runs curl can launch (and terminate) faster than the rules are propagated. Unfortunately, that’s just how this CNI stack works. In a better world the networking would have reported that it’s good to go, or the container would have retried a couple of times. In fact, even having something like Istio makes this failure less likely, as Istio proxies the traffic and is tolerant to transient failures like this.
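
A retry-based variant of the cron command would look roughly like this, assuming the curl in the image is new enough to have --retry-connrefused (plain --retry doesn’t consider “connection refused” retryable):

# retry up to 5 times with a 5-second pause between attempts,
# treating "connection refused" as a retryable error
curl -v --retry 5 --retry-delay 5 --retry-connrefused \
  -H 'X-Forwarded-Proto: https' -H 'Host: example.net' \
  'http://wordpress-nginx/wp-cron.php?doing_wp_cron' >/dev/null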

My case? It’s a straightforward hack:

sleep 20; curl -v -H 'X-Forwarded-Proto: https' -H 'Host: example.net' http://wordpress-nginx/wp-cron.php?doing_wp_cron >/dev/null

With this, the pod waits 20 seconds before running curl; the firewall rules seem to propagate in about 10 seconds, so the failures went away.

You’d think that Kubernetes is overkill for a simple WP blog…