Namespaces are how you carve up a Kubernetes cluster into logical chunks. ResourceQuotas are how you make sure those chunks don’t hog all the cluster’s resources.
Let’s see what a ResourceQuota actually does.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: dev
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
    pods: "10"
This ResourceQuota object, applied to the dev namespace, is saying:
- "The CPU requests of all pods in this namespace, added together, can’t exceed 1 CPU core, and their memory requests can’t exceed 1Gi."
- "The limits of all pods in this namespace, added together, can’t exceed 2 CPU cores or 2Gi of memory."
- "You can’t have more than 10 pods in this namespace at once."
The key here is "requests" vs. "limits."
- Requests are what Kubernetes uses to schedule your pods. If a pod requests 500m CPU and 1Gi of memory, Kubernetes will only place that pod on a node that has at least that much available CPU and memory. This is a guarantee.
- Limits are the absolute ceiling. A pod can never exceed its defined limits. If a pod tries to use more CPU than its limit, it will be throttled. If it tries to use more memory than its limit, it will be OOMKilled (Out Of Memory Killed).
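If a pod doesn’t specify requests or limits in its own definition, what happens depends on the namespace. Once a ResourceQuota tracks a compute resource such as requests.cpu or limits.memory, every new pod must declare a value for it, or the API server rejects the pod. The exception is a namespace with a LimitRange that supplies defaults. Either way, it’s best practice to define them explicitly for predictable behavior.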
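As a minimal sketch of explicit requests and limits, here is a pod that would fit inside the compute-quota above, assuming the dev namespace is otherwise empty. The pod name, container name, and image are placeholders chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web              # hypothetical pod name
  namespace: dev
spec:
  containers:
    - name: web          # hypothetical container name
      image: nginx:stable   # placeholder image
      resources:
        requests:        # counted against requests.cpu / requests.memory
          cpu: 250m
          memory: 256Mi
        limits:          # counted against limits.cpu / limits.memory
          cpu: 500m
          memory: 512Mi
```

Four pods like this would use up the whole requests.cpu budget of 1 core, well before the pods: "10" count is reached.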
When you create a new pod in the dev namespace, Kubernetes checks the compute-quota. If creating this pod would violate any of the hard constraints (e.g., exceeding the total requested CPU or the pod count), the pod creation will be rejected with an error.
Consider a scenario where you have a node with 4 CPU cores and 8Gi of memory.
- If you have a ResourceQuota for requests.cpu: "2" and requests.memory: "4Gi", you could potentially schedule two pods, each requesting 1 CPU and 2Gi of memory, or four pods, each requesting 500m CPU and 1Gi of memory.
- The limits are checked too. If a pod is created with requests.cpu: "500m" and limits.cpu: "3", but your ResourceQuota has limits.cpu: "2", that pod creation will fail. This prevents a single pod from potentially destabilizing the node even if its request was small.
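To make the limits case concrete, here is a sketch of such a pod, assuming it targets the dev namespace with the compute-quota from the first example (limits.cpu: "2"). The names and image are hypothetical. The memory values are included so that the CPU limit is the only thing that breaks the quota.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-hungry       # hypothetical pod name
  namespace: dev
spec:
  containers:
    - name: worker       # hypothetical container name
      image: nginx:stable   # placeholder image
      resources:
        requests:
          cpu: 500m      # small request, fits under requests.cpu: "1"
          memory: 256Mi
        limits:
          cpu: "3"       # exceeds the namespace-wide limits.cpu: "2"
          memory: 512Mi
```

Lowering limits.cpu to "2" or less would let this pod through, provided no other pods in the namespace are already consuming the limit budget.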
What about storage? You can also quota PersistentVolumeClaims.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: staging
spec:
  hard:
    requests.storage: 10Gi
    persistentvolumeclaims: "5"
This ResourceQuota limits the staging namespace to a total of 10Gi of requested storage across all PersistentVolumeClaims and allows a maximum of 5 PVCs. Note that this is for requested storage, not the actual provisioned storage. If one PVC requests 5Gi and another then requests 6Gi, the second one will be rejected, because the total would reach 11Gi.
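For illustration, a claim like the following would be charged against that quota. This is a sketch: the claim name is hypothetical, and it assumes the cluster has a default StorageClass, since none is named here.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-a           # hypothetical claim name
  namespace: staging
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi       # adds 5Gi to requests.storage and 1 to persistentvolumeclaims
```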
You can also set quotas on the number of objects. The pods: "10" in the first example is a count quota. You can also limit services, replicationcontrollers, secrets, configmaps, and persistentvolumeclaims.
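A count-only quota might look like the sketch below. The quota name and all of the numbers are made up for illustration. The last entry uses the generic count/<resource>.<group> form, which is not mentioned above but is how namespaced resources outside the core group, such as Deployments, are counted.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-counts    # hypothetical quota name
  namespace: dev
spec:
  hard:
    services: "5"
    replicationcontrollers: "5"
    secrets: "20"
    configmaps: "20"
    persistentvolumeclaims: "5"
    count/deployments.apps: "10"   # generic count syntax for the apps API group
```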
The most surprising thing about ResourceQuota is how it interacts with LimitRange. A LimitRange object defines default resource requests and limits for pods that don’t specify them, and also enforces minimums and maximums for individual pods. If a pod does specify its own requests/limits, it must still satisfy the ResourceQuota and the LimitRange’s min/max constraints. A ResourceQuota enforces aggregate consumption for the namespace, while a LimitRange enforces individual pod constraints and defaults. They work together, but they solve slightly different problems.
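A LimitRange that pairs with the compute-quota from the first example could be sketched like this. The object name and every value are assumptions picked for illustration; the only hard requirement is that min, defaultRequest, default, and max are ordered consistently.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults   # hypothetical name
  namespace: dev
spec:
  limits:
    - type: Container
      defaultRequest:    # applied when a container omits requests
        cpu: 100m
        memory: 128Mi
      default:           # applied when a container omits limits
        cpu: 250m
        memory: 256Mi
      min:               # smallest values a container may declare
        cpu: 50m
        memory: 64Mi
      max:               # largest values a container may declare
        cpu: "1"
        memory: 1Gi
```

With this in place, a pod that declares nothing still ends up with requests and limits, so it can be counted against the ResourceQuota instead of being rejected for missing values.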
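The ResourceQuota is enforced by the Kubernetes API server, through an admission controller. When a request creates or updates a resource that the quota tracks, such as a pod or a PersistentVolumeClaim, the API server checks it against the namespace’s active ResourceQuota. If the proposed change would violate the quota, the API server rejects the request before it even gets to the kube-scheduler or kubelet. This is why you see errors like exceeded quota immediately upon trying to create a pod directly. With a Deployment or other controller, the kubectl apply itself usually succeeds, because compute quota is charged when the pods are created; the rejection then shows up as a failed-create event on the ReplicaSet rather than in your terminal.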
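Here is a sketch of how this plays out for a controller-managed workload, assuming an otherwise empty dev namespace with the compute-quota from the first example. The Deployment name, labels, and image are hypothetical. Compute quota is charged per pod as the ReplicaSet creates them, so three replicas at 500m each ask for 1.5 CPU against a requests.cpu budget of 1.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api              # hypothetical deployment name
  namespace: dev
spec:
  replicas: 3            # 3 x 500m = 1.5 CPU requested in total
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: nginx:stable   # placeholder image
          resources:
            requests:
              cpu: 500m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
```

Under those assumptions, the first two pods should be admitted and the third rejected, leaving the Deployment short of its desired replica count. Running kubectl describe replicaset -n dev is one way to see the quota error in the events.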
This means that if you’re trying to create a pod and it’s being rejected, the first thing you should check is the ResourceQuota for that namespace. You can see the current usage against the quota with kubectl describe resourcequota <quota-name> -n <namespace>. This will show you the hard limits and the current used values.
If you’re hitting a quota, you have a few options:
- Increase the quota: Edit the ResourceQuota object and increase the hard values.
- Reduce consumption: Delete unnecessary pods, reduce the requests/limits of existing pods, or delete unused PVCs.
- Move to another namespace: If another namespace has available quota, you might be able to move your workload there.
The next thing you’ll likely run into after managing resource consumption is how to ensure specific workloads get priority when resources are scarce, which leads into the concept of PriorityClass.