Pod Topology Spread Now Honors Taints and Affinity
Kubernetes has long offered Pod Topology Spread Constraints to distribute pods evenly across failure domains. However, a notable gap persisted: when calculating skew, the scheduler considered all nodes, including nodes whose taints the pod did not tolerate. This often led to unexpected `Pending` states even when nodes the pod could actually run on existed.
With KEP-3094, Kubernetes v1.33 addresses this with two new optional fields:
- `nodeAffinityPolicy`
- `nodeTaintsPolicy`
These allow fine-grained control over which nodes are considered during pod distribution.
What Changed?
The `TopologySpreadConstraint` API now includes both fields.
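A minimal sketch of how they appear in a pod spec (the selector, label values, and topology key here are illustrative placeholders):

```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: my-app             # illustrative label
    nodeAffinityPolicy: Honor   # Honor | Ignore; defaults to Honor when unset
    nodeTaintsPolicy: Honor     # Honor | Ignore; defaults to Ignore when unset
```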
When set to `Honor`:
- `nodeAffinityPolicy`: only nodes satisfying the pod's `nodeAffinity` or `nodeSelector` are considered.
- `nodeTaintsPolicy`: only nodes that are untainted, or whose taints the pod tolerates, are considered.
This behavior is behind the feature gate `NodeInclusionPolicyInPodTopologySpread` (enabled by default from v1.33).
Real-World Impact
Consider a deployment like the following.
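This is a minimal illustrative manifest; the name, image, labels, and topology key are placeholders, not a prescribed configuration:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 6
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27   # placeholder image
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: web
          nodeTaintsPolicy: Honor   # exclude untolerated tainted nodes from skew
```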
Without `nodeTaintsPolicy: Honor`, the skew calculation can count a tainted node the pod cannot tolerate as a valid target, leaving the pod stuck in a `Pending` state. With it, tainted nodes are excluded from the skew calculation unless the pod tolerates them, making scheduling behavior intuitive and predictable.
Why It Matters
- Predictability: No more surprise `Pending` pods caused by tainted nodes the workload cannot tolerate.
- Explicit Control: Node inclusion policies now mirror the flexibility of other scheduling primitives like affinity.
- Gradual Adoption: Defaults maintain backward compatibility. If unset, the scheduler behaves as before.
Observability & Stability
- Metrics such as `plugin_execution_duration_seconds{plugin="PodTopologySpread"}` help monitor performance; see the sample query after this list.
- Scheduling decisions are observable through logs and pod status.
- The feature has passed integration and upgrade tests, with no impact on existing APIs or cluster resources.
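As an illustration, a Prometheus query along these lines can surface the plugin's p99 latency, assuming kube-scheduler metrics are scraped and exposed under the usual `scheduler_` subsystem prefix:

```promql
histogram_quantile(
  0.99,
  sum(rate(scheduler_plugin_execution_duration_seconds_bucket{plugin="PodTopologySpread"}[5m])) by (le)
)
```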
Final Thoughts
This enhancement may seem minor, but it's a key step toward making Kubernetes scheduling more accurate, transparent, and user-driven. For platforms relying on strict node segregation (e.g., GPU pools, burstable zones), it eliminates a major source of scheduling surprises.
For cluster operators, setting `nodeTaintsPolicy` and `nodeAffinityPolicy` gives finer control with no risk to existing workloads: an opt-in that brings measurable value.
FAQs
What issue does KEP-3094 solve in Kubernetes scheduling?
KEP-3094 addresses the problem where the scheduler previously considered all nodes, including tainted or affinity-mismatched ones, when calculating Pod Topology Spread. This often led to Pending pods even if tolerable nodes were available.
What are nodeAffinityPolicy and nodeTaintsPolicy in topologySpreadConstraints?
These new optional fields in `topologySpreadConstraints` allow the scheduler to honor pod-level node affinity and taint toleration rules during skew calculations:
- `nodeAffinityPolicy: Honor` considers only nodes that match the pod's node affinity.
- `nodeTaintsPolicy: Honor` excludes tainted nodes unless the pod tolerates them.
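For quick reference, a hypothetical constraint using both policies (the selector and label are placeholders):

```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: example   # placeholder label
    nodeAffinityPolicy: Honor
    nodeTaintsPolicy: Honor
```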
How does this change affect scheduling behavior?
When these policies are set to `Honor`, the scheduler excludes nodes the pod cannot land on from skew calculations. This ensures pods aren't marked `Pending` due to unreachable nodes, resulting in more predictable and accurate scheduling.
Is this feature enabled by default in Kubernetes v1.33?
Yes. The feature gate `NodeInclusionPolicyInPodTopologySpread` is enabled by default in Kubernetes v1.33. If the new fields are unset, behavior remains unchanged for backward compatibility.
Why is this important for production clusters?
It provides predictable distribution, especially in environments with node taints (e.g., GPU pools) or affinity rules. It ensures spread constraints reflect actual node eligibility, eliminating scheduling inconsistencies and reducing manual debugging.