SELinux is preventing K3s components from communicating with each other on your nodes.
Here’s why it’s breaking and how to fix it:
Common Causes and Fixes
- SELinux Policy Mismatch: The most frequent culprit is that the base SELinux policy has no rules for K3s. K3s needs to bind to specific ports, create files in certain locations, and communicate between its agent and server components, and a restrictive policy can block any of these.
  - Diagnosis: Check the audit logs for SELinux denials related to K3s processes (e.g., k3s, containerd, runc).
      sudo ausearch -m avc -ts recent
    Look for lines containing comm="k3s" or comm="containerd".
  - Fix: Install the K3s SELinux policy module.
      sudo dnf install k3s-selinux
    The package ships as an RPM from the Rancher repository, so that repository must be configured first. Debian and Ubuntu normally use AppArmor rather than SELinux, so this step does not apply there. The package provides pre-compiled SELinux policy rules specifically for K3s. Installing it normally loads the module; if it does not appear in the output of semodule -l, rebuild the policy.
      sudo semodule -B
    This command rebuilds the SELinux policy store and reloads it, incorporating the new rules.
  - Why it works: The k3s-selinux package contains a .pp file (policy package) that defines the security contexts and permissions required for K3s components. Installing it makes these rules active, so K3s processes can perform their necessary operations without being denied by SELinux.
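The raw ausearch output from the diagnosis step is verbose. Here is a minimal sketch for condensing it, assuming the usual audit record layout in which each AVC line carries a comm="..." field. The helper name summarize_avc is invented for this example and is not part of any SELinux or K3s tool.

```shell
# summarize_avc: hypothetical helper (not part of audit or K3s).
# Reads audit records on stdin and prints "<count> <comm>" for each
# process name that appears in an AVC denial, most frequent first.
summarize_avc() {
  awk '
    /avc:/ && /denied/ {
      if (match($0, /comm="[^"]*"/)) {
        # Drop the 6-character prefix and the closing quote of the match.
        name = substr($0, RSTART + 6, RLENGTH - 7)
        count[name]++
      }
    }
    END { for (n in count) printf "%d %s\n", count[n], n }
  ' | sort -k1,1nr -k2,2
}
```

A typical invocation would be sudo ausearch -m avc -ts recent | summarize_avc. If k3s, containerd, or runc dominate the list, the policy module is the first thing to check.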
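- Incorrect File Contexts: Even with the policy module installed, SELinux will block access if K3s files (such as certificates or configuration directories) were created or moved with the wrong SELinux context.
  - Diagnosis: Verify the SELinux context of K3s data directories and critical files.
      ls -Zd /var/lib/rancher/k3s/
      ls -Zd /etc/rancher/k3s/
      ls -Zd /var/lib/containerd/
    Compare the output with what the loaded policy expects, which you can list with semanage fcontext -l | grep rancher. Data directories typically carry a container-related type (for example container_file_t or container_var_lib_t), and configuration usually carries etc_t or a K3s-specific type. Exact type names depend on the installed k3s-selinux and container-selinux versions. Labels such as unlabeled_t or default_t indicate a problem.
  - Fix: Reset the files to the contexts the loaded policy defines.
      sudo restorecon -Rv /var/lib/rancher/k3s/
      sudo restorecon -Rv /etc/rancher/k3s/
    restorecon applies the policy's default contexts to existing files. semanage fcontext defines those defaults, and you only need it for paths the policy does not already cover, such as a custom data directory. Avoid a blanket local rule on /var/lib/rancher/k3s, because local rules override the finer-grained labels that k3s-selinux ships, and use only types that exist in your loaded policy.
      # /srv/k3s-data is a hypothetical custom data directory
      # sudo semanage fcontext -a -e /var/lib/rancher/k3s /srv/k3s-data
      # sudo restorecon -Rv /srv/k3s-data
  - Why it works: SELinux uses file contexts (labels) to enforce policies. By ensuring K3s-related files and directories have the correct labels, you align them with the rules defined in the SELinux policy, allowing authorized access.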
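Reading ls -Z output by eye gets tedious across many paths. The sketch below flags labels that usually mean a file was created or moved outside the policy's expectations. It assumes the common ls -Zd layout of "context path" per line, and the list of suspect types is a heuristic chosen for this example, not something the K3s policy defines. Both function names are invented here.

```shell
# context_type: print the type field of an SELinux context string.
# A context has the form user:role:type:level, so the type is field 3.
context_type() {
  printf '%s\n' "$1" | cut -d: -f3
}

# flag_suspect_labels: hypothetical helper. Reads "context path" lines
# (as printed by ls -Zd) and reports paths whose type is a generic one
# that K3s files should not normally carry. The type list is a heuristic.
flag_suspect_labels() {
  while read -r context path; do
    case "$(context_type "$context")" in
      unlabeled_t|default_t|user_home_t|user_tmp_t)
        printf 'suspect: %s (%s)\n' "$path" "$context"
        ;;
    esac
  done
}
```

For example: ls -Zd /var/lib/rancher/k3s/ /etc/rancher/k3s/ | flag_suspect_labels. For an authoritative comparison against the loaded policy, sudo restorecon -Rnv /var/lib/rancher/k3s/ performs a dry run that lists every label restorecon would change without changing it.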
- Containerd Runtime Issues: K3s uses containerd as its container runtime. SELinux can interfere with containerd's ability to manage containers, pull images, or interact with the kernel.
  - Diagnosis: Examine audit logs for denials related to containerd or runc.
      sudo ausearch -m avc -ts recent | grep -E 'containerd|runc'
  - Fix: Ensure the container runtime SELinux policy is correctly applied. This is usually covered by the k3s-selinux package, which builds on container-selinux, but sometimes requires specific adjustments. If you installed containerd separately or have a custom setup, you might need to manage its policy yourself.
      # This might be needed if k3s-selinux isn't enough for containerd
      # sudo dnf install container-selinux
      # sudo semodule -B
  - Why it works: Just like K3s, containerd needs specific SELinux permissions to operate its low-level functions, such as creating namespaces or managing storage. The correct policy allows these actions.
- K3s Server/Agent Network Binding: The K3s server needs to listen on specific ports (e.g., TCP 6443 for the API, UDP 8472 for Flannel VXLAN). SELinux might prevent K3s from binding to these ports if the policy doesn't allow it.
  - Diagnosis: Check logs for errors related to network binding.
      sudo journalctl -u k3s -f
    Look for messages like "bind: permission denied". On agent nodes the unit is k3s-agent.
  - Fix: The k3s-selinux package should handle this. If not, you might need to use semanage port to assign the port to a type that the K3s domain is allowed to bind.
      # Example for port 6443 if it were an issue, though k3s-selinux should cover it.
      # k3s_server_port_t is a placeholder: pick a type from semanage port -l that exists in your policy.
      # sudo semanage port -a -t k3s_server_port_t -p tcp 6443
    Changes made with semanage port take effect immediately, so no semodule -B is needed afterwards.
  - Why it works: semanage port associates network ports with SELinux port types. A process can bind only to ports whose type its domain is permitted to use.
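To see which ports are affected without scrolling through the journal, a small filter can help. This sketch assumes the error text follows Go's usual form, for example "listen tcp :6443: bind: permission denied"; if your K3s version words it differently the pattern needs adjusting. bind_denied_ports is a name made up for this example.

```shell
# bind_denied_ports: hypothetical helper. Reads log lines on stdin and
# prints each port number that failed to bind with "permission denied",
# one per line, sorted numerically with duplicates removed.
bind_denied_ports() {
  sed -n 's/.*listen [a-z0-9]* [^ ]*:\([0-9][0-9]*\): bind: permission denied.*/\1/p' |
    sort -un
}
```

It could be fed from the journal with sudo journalctl -u k3s --no-pager | bind_denied_ports. An empty result suggests the failure is not a port-binding denial and the audit log is the better place to look.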
- Custom Network Plugins or CNI: If you're using a CNI plugin other than the default Flannel, or have custom network configurations, SELinux might block the CNI daemon or Kubelet from setting up network interfaces or routing rules.
  - Diagnosis: Look for audit log denials for processes like kubelet, cni, or specific CNI binaries.
      sudo ausearch -m avc -ts recent | grep -E 'kubelet|cni|calico|cilium'
  - Fix: Ensure your CNI plugin has SELinux support or create custom policy modules for it. Some CNIs provide their own SELinux policy packages.
      # Hypothetical example: calico-selinux is an illustrative package name, check your CNI's documentation.
      # sudo dnf install calico-selinux
      # sudo semodule -B
    For custom CNI configurations, you'd need to write and load your own SELinux policy.
  - Why it works: Network plugins require extensive permissions to manipulate network interfaces, routing tables, and firewall rules. SELinux needs explicit rules to permit these actions for the CNI components.
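Writing your own policy usually starts from the denials themselves. As a rough sketch, the audit2allow tool can turn AVC records into a local module with its -M option, which writes NAME.te (source) and NAME.pp (compiled package) into the current directory. The wrapper propose_module and the module name local_cni are invented for this example. The wrapper deliberately stops before loading anything, because generated rules can be broader than you want and should be reviewed first.

```shell
# propose_module: hypothetical wrapper around audit2allow.
# Reads audit records on stdin, keeps only AVC denials, and asks
# audit2allow to generate NAME.te and NAME.pp in the current directory.
# It does not load the module: review the .te file first.
propose_module() {
  name=$1
  denials=$(grep 'avc:.*denied') || {
    echo "no AVC denials on stdin; nothing to generate" >&2
    return 1
  }
  printf '%s\n' "$denials" | audit2allow -M "$name" || return 1
  echo "review ${name}.te, then load it with: semodule -i ${name}.pp"
}
```

A possible invocation is sudo ausearch -m avc -ts recent | grep -E 'cni|calico|cilium' | propose_module local_cni. Filtering before generating keeps unrelated denials out of the module.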
- Temporary Permissive Mode (for quick debugging): While not a permanent fix, temporarily setting SELinux to permissive mode helps confirm whether SELinux is the sole cause of the problem.
  - Diagnosis: Switch to permissive mode and retry. If K3s starts working, SELinux is the issue.
      sudo setenforce 0
    Crucially, remember to re-enable enforcing mode immediately after testing.
      sudo setenforce 1
  - Fix: This is not a fix, but a diagnostic step. The actual fix involves one of the above methods.
      # To make permissive permanent (not recommended for production):
      # sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
  - Why it works: In permissive mode, SELinux logs denials but does not enforce them, allowing operations that would otherwise be blocked. This confirms whether SELinux is the source of the problem and points you to the correct policy adjustment.
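Because forgetting the second setenforce is the main risk of the permissive-mode test, it can help to wrap both calls around the command under test. The function below is a minimal sketch, and the name with_permissive is made up for this example. It assumes it runs in a root shell, and it does not restore enforcing mode if the shell is killed midway, so treat it as a convenience rather than a safeguard.

```shell
# with_permissive: hypothetical helper. Runs the given command with
# SELinux in permissive mode, then switches back to enforcing and
# returns the command's exit status. Must run as root.
with_permissive() {
  previous=$(getenforce) || return 1
  if [ "$previous" != "Enforcing" ]; then
    # Already Permissive or Disabled: nothing to toggle.
    "$@"
    return $?
  fi
  setenforce 0 || return 1
  "$@"
  status=$?
  # Restore enforcing mode regardless of how the command exited.
  setenforce 1
  return "$status"
}
```

For instance, with_permissive systemctl restart k3s restarts the service once under permissive mode. Denials are still logged during that window, so the audit log afterwards shows what would have been blocked.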
After installing the k3s-selinux package and ensuring correct file contexts, you should see K3s components start and communicate successfully. If you encounter further issues, check the audit log (/var/log/audit/audit.log) for new SELinux denials.
Once SELinux is out of the way, the next error you are likely to hit is a Kubernetes API server timeout, which occurs when the K3s agent cannot reach the server.