Node Pressure Eviction in Kubernetes: When Your Node Says "Enough is Enough"
What Is Node-Pressure Eviction?
Node-pressure eviction is the process where the kubelet (the Kubernetes agent on each node) proactively terminates Pods to reclaim resources on that node. The goal is to prevent the node from becoming unstable or completely unresponsive when key resources are running low.
Key points:
- It is automatic and driven by the kubelet on the node.
- It is different from API-initiated eviction (like kubectl drain) and does not honor PodDisruptionBudgets.
- It focuses on protecting node health first, individual Pods second.
When eviction happens, the kubelet marks the selected Pods as Failed and terminates them so that the scheduler can place new replicas on healthier nodes.
What Triggers Node-Pressure Eviction?
The kubelet continuously monitors a set of resource signals for each node. When those signals cross certain thresholds, the node is considered under pressure, and eviction can begin.
Common pressure types:
Memory pressure
- Signal: memory.available
- Node condition: MemoryPressure
- Example: Available memory on the node falls below a configured limit, like memory.available<500Mi.
Disk pressure
- Signals: nodefs.available, nodefs.inodesFree, imagefs.available, imagefs.inodesFree
- Node condition: DiskPressure
- Example: Root filesystem free space drops below a threshold (e.g., nodefs.available<10%).
PID pressure
- Signal: pid.available
- Node condition: PIDPressure
- Example: The node is running so many processes that there are too few PIDs left for new workloads.
These thresholds are configured as eviction thresholds. When a signal crosses its threshold, the kubelet knows it has to reclaim resources by evicting Pods.
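All three signal families can be given hard thresholds in a single kubelet configuration. A sketch combining memory, disk, and PID signals (the values are illustrative, not recommendations):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "100Mi"    # memory pressure
  nodefs.available: "10%"      # disk pressure (node filesystem)
  nodefs.inodesFree: "5%"      # disk pressure (inodes)
  imagefs.available: "15%"     # disk pressure (image filesystem)
  pid.available: "10%"         # PID pressure
```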
Soft vs Hard Eviction Thresholds
Kubernetes supports two kinds of eviction thresholds: soft and hard. Understanding them helps you tune how aggressive evictions should be.
Soft eviction thresholds
- Example: memory.available<500Mi with a grace period like 1m30s.
- The kubelet waits for a configured grace period before evicting Pods.
- Evicted Pods receive a non-zero termination grace period, capped by the kubelet's evictionMaxPodGracePeriod setting.
Hard eviction thresholds
- Example: memory.available<100Mi or nodefs.available<10%.
- Trigger immediate eviction once crossed, with a 0s grace period by default.
- Designed as a last-resort safety net to save the node from crashing.
You configure these thresholds in the kubelet's configuration. Here is a typical kubelet config snippet:
```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
evictionSoft:
  memory.available: "500Mi"
evictionSoftGracePeriod:
  memory.available: "1m30s"
```
How Kubernetes Chooses Which Pods to Evict
Once a node is under pressure, the kubelet must pick which Pods to remove. The selection is not random: it relies on Pod priority, QoS class, and a few other factors.
1. QoS Classes
Pods are assigned a QoS (Quality of Service) class based on their resource requests and limits:
- Guaranteed - All containers have memory and CPU requests equal to their limits. Evicted last under memory pressure.
- Burstable - At least one container has requests set, but not all are equal to limits. Evicted after BestEffort, before Guaranteed.
- BestEffort - No resource requests or limits specified. Evicted first when memory runs low.
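For example, a Pod whose containers set requests equal to limits lands in the Guaranteed class (the names here are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-demo        # illustrative name
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:
          memory: "256Mi"
          cpu: "250m"
        limits:                # equal to requests => Guaranteed QoS
          memory: "256Mi"
          cpu: "250m"
```

Setting the limits higher than the requests would make this Pod Burstable; omitting the resources block entirely would make it BestEffort.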
2. Priority
Kubernetes Pod Priority and Preemption also influence eviction order. Higher-priority Pods are protected; lower-priority Pods are more likely to be evicted first.
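A PriorityClass and a Pod that references it might look like this (names and values are illustrative):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: business-critical      # illustrative name
value: 100000                  # higher value = protected longer
globalDefault: false
description: "For revenue-critical services."
---
apiVersion: v1
kind: Pod
metadata:
  name: payments-api           # illustrative name
spec:
  priorityClassName: business-critical
  containers:
    - name: app
      image: nginx:1.25
```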
3. Age and Usage
The kubelet also considers how far each Pod's actual usage exceeds its requests: Pods consuming the most beyond what they requested are ranked for eviction first. System-critical Pods (those using the system-node-critical or system-cluster-critical priority classes) are normally protected from eviction.
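The ranking under memory pressure boils down to three sort keys: whether usage exceeds requests, Pod priority, and how far usage exceeds the request. A rough sketch of that ordering in Python (a simplified model for intuition, not the kubelet's actual code; the Pod structure and names are illustrative):

```python
from dataclasses import dataclass

@dataclass
class Pod:
    name: str
    priority: int        # from the Pod's PriorityClass (0 if none)
    memory_usage: int    # current working set, in MiB
    memory_request: int  # sum of container memory requests, in MiB

def eviction_rank(pod: Pod):
    """Sort key approximating kubelet ordering under memory pressure.

    Pods whose usage exceeds their requests go first; among those,
    lower priority first, then by how far usage exceeds the request.
    """
    exceeds = pod.memory_usage > pod.memory_request
    overage = pod.memory_usage - pod.memory_request
    # Sorted ascending: "not exceeds" puts over-request Pods first,
    # then lower priority, then larger overage first.
    return (not exceeds, pod.priority, -overage)

pods = [
    Pod("besteffort-batch", priority=0, memory_usage=300, memory_request=0),
    Pod("burstable-api", priority=1000, memory_usage=450, memory_request=400),
    Pod("guaranteed-db", priority=1000, memory_usage=380, memory_request=400),
]
victims = sorted(pods, key=eviction_rank)
print([p.name for p in victims])
# → ['besteffort-batch', 'burstable-api', 'guaranteed-db']
```

Note how this matches the QoS intuition above: the BestEffort Pod (no requests, so any usage exceeds them) ranks first, and the Pod staying within its requests ranks last.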
Node Conditions and Oscillation
When a threshold is crossed, the kubelet sets the corresponding node condition (like MemoryPressure: True). The control plane and scheduler can then see that the node is unhealthy and may avoid placing new Pods there.
However, nodes can oscillate around soft thresholds: they repeatedly go above and below the limit in a short time. That can cause node conditions to flap and lead to poor eviction decisions. To prevent this, the kubelet provides the evictionPressureTransitionPeriod setting, which forces the kubelet to wait a minimum time (default 5 minutes) before it changes a node condition again.
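In the kubelet configuration file, that setting looks like this (5m shown is the default):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionPressureTransitionPeriod: "5m"
```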
Example: Memory Pressure in Action
Consider a node with 10 GiB of memory. You configure:
- evictionSoft: memory.available<500Mi with a 90s grace period.
- evictionHard: memory.available<100Mi.
If your workloads spike and free memory drops to 400 MiB:
- The soft threshold is crossed, so the kubelet marks the node under memory pressure and starts the soft eviction countdown.
- If memory stays below 500 MiB for 90 seconds, the kubelet begins evicting low-priority, BestEffort Pods first.
- If memory plunges below 100 MiB, hard eviction kicks in, and Pods can be terminated almost immediately to save the node.
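The walkthrough above can be captured in a few lines of Python (a toy state model of the configured thresholds, not kubelet code):

```python
SOFT_MIB = 500   # evictionSoft: memory.available < 500Mi
HARD_MIB = 100   # evictionHard: memory.available < 100Mi
GRACE_S = 90     # evictionSoftGracePeriod for memory.available

def pressure_state(available_mib: int, seconds_below_soft: int) -> str:
    """Classify the node's state for the thresholds configured above."""
    if available_mib < HARD_MIB:
        return "hard-evict"          # immediate eviction, 0s grace
    if available_mib < SOFT_MIB:
        if seconds_below_soft >= GRACE_S:
            return "soft-evict"      # grace period expired, start evicting
        return "pressure-countdown"  # soft threshold crossed, waiting
    return "ok"

print(pressure_state(400, 10))   # → pressure-countdown
print(pressure_state(400, 120))  # → soft-evict
print(pressure_state(80, 0))     # → hard-evict
```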
How to Reduce Unexpected Evictions
To survive node-pressure evictions in production, you need both good configuration and good application design. Here are practical tips:
- Set realistic resource requests and limits - Avoid running everything as BestEffort; give critical workloads proper requests and limits so they get at least Burstable, preferably Guaranteed QoS.
- Use Pod Priority wisely - Assign higher priority to business-critical services, lower to batch or test workloads.
- Tune kubelet eviction thresholds - Align soft and hard thresholds with your node size and typical usage patterns, not just defaults.
- Monitor node conditions and evictions - Watch the MemoryPressure, DiskPressure, and PIDPressure conditions, and track Pods with reason: Evicted in your monitoring stack.
- Clean up disk usage - Use log rotation and image garbage collection to avoid disk pressure evictions.
Quick Reference Table
| Aspect | What It Means | Example / Notes |
|---|---|---|
| Node-pressure eviction | Kubelet evicts Pods to protect node when resources are low. | Automatically triggered on resource pressure. |
| Key signals | memory.available, nodefs.available, imagefs.available, pid.available | Mapped to Memory/Disk/PID pressure conditions. |
| Soft threshold | Eviction after grace period if pressure continues. | Example: memory.available<500Mi for 1m30s. |
| Hard threshold | Immediate eviction with 0s grace by default. | Example: memory.available<100Mi. |
| QoS eviction order | BestEffort -> Burstable -> Guaranteed. | BestEffort gets evicted first under memory pressure. |
| Priority impact | Lower priority Pods evicted before higher priority. | Protects critical workloads. |
| Node condition flapping | Rapid toggling of pressure conditions. | Controlled with eviction-pressure-transition-period. |
| Typical eviction reason | Pod status shows Evicted with an explanation in events. | Check kubectl describe pod. |
Wrapping Up
Node-pressure eviction is Kubernetes' way of saying enough is enough. By understanding the triggers, thresholds, and eviction priorities, you can design your workloads and cluster configurations to handle pressure gracefully. Set your QoS classes wisely, monitor your node conditions, and tune those eviction thresholds to match your environment. Your Pods (and your on-call pager) will thank you.
Have you ever had a Pod mysteriously evicted in production? Drop a comment below and share your war stories!