Kubernetes v1.36: In-Place Vertical Scaling for Pod-Level Resources Graduates to Beta

Following the graduation of Pod-Level Resources to Beta in v1.34 and the General Availability (GA) of In-Place Pod Vertical Scaling in v1.35, the Kubernetes community is thrilled to announce the next major milestone in resource management: In-Place Pod-Level Resources Vertical Scaling has graduated to Beta in Kubernetes v1.36!

This feature is now enabled by default via the InPlacePodLevelResourcesVerticalScaling feature gate. It represents the final piece of the puzzle for dynamic resource management, combining the aggregate budget of Pod-level resources with the non-disruptive scaling capabilities of in-place resize.

Why Pod-Level In-Place Resize?

Until now, resizing a Pod's resource budget required updating individual containers. While powerful, this added complexity for Pods with many containers (like those with service mesh sidecars or logging agents). Pod-level resizing allows you to adjust the "aggregate envelope" of the Pod in a single operation.

Key benefits include:

  • Simplified Operations: Scale the entire Pod's capacity without calculating individual container shares.
  • Sidecar Resilience: Allow sidecars to consume spare CPU/memory from the main application during spikes, and increase that total pool on-the-fly.
  • Zero Downtime: Like container-level in-place resize, this feature avoids Pod restarts when the underlying runtime and resource limits allow.

The Concept: The Shared Envelope

When you specify spec.resources at the Pod level, you define a hard ceiling for the entire Pod. Containers within the Pod that do not have their own explicit limits "float" within this envelope, sharing the aggregate budget.

graph TD
  subgraph PodEnvelope ["Pod Envelope (spec.resources)"]
    direction TB
    C1["Main App (Explicit: 1 CPU)"]
    C2["Sidecar (No Limit - Inherits Pod Limit)"]
    C3["Helper (No Limit - Inherits Pod Limit)"]
  end
  C1 ---|Capped at 1 CPU| Aggregate_Enforcement
  C2 ---|Shares remaining pool| Aggregate_Enforcement
  C3 ---|Shares remaining pool| Aggregate_Enforcement
  Aggregate_Enforcement -->|In-Place Resize| New_Limit["New Pod Limit"]

In v1.36, you can now PATCH this spec.resources field on a running Pod, and the Kubelet will dynamically update the Pod-level cgroups.

Example: A Resizable Pod

Here is how you define a Pod that uses Pod-level resources and is ready for in-place resizing:

apiVersion: v1
kind: Pod
metadata:
  name: resizable-app
spec:
  # Pod-level resource budget
  resources:
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"
  containers:
  - name: main
    image: my-app:v1
    resources:
      requests:
        cpu: "500m"
    # To ensure no restart on resize, set resizePolicy
    resizePolicy:
    - resourceName: "cpu"
      restartPolicy: "NotRequired"
    - resourceName: "memory"
      restartPolicy: "NotRequired"
  - name: sidecar
    image: sidecar:v1
    # This container inherits the Pod-level limit

Technical Nuance: Interaction with resizePolicy

One of the most critical aspects for engineers to understand is how Pod-level resizes interact with container-level restart policies. While you are resizing the Pod, the decision to restart a container is still governed by the resizePolicy defined within each individual container.

How Inheritance Triggers Restarts

If a container does not have an explicit resource limit, it effectively inherits the Pod-level limit as its boundary. When you resize spec.resources at the Pod level, the Kubelet recognizes this as a change to the effective resources of every container that lacks its own limit.

Crucially, the Kubelet will then consult the resizePolicy for that specific container and resource:

  • Per-Container: Each container in the Pod can have a different policy.
  • Per-Resource: A container can have a NotRequired policy for CPU but a RestartContainer policy for memory.

For example, if you increase the Pod's memory limit, any container that inherits that limit and has resourceName: memory set to restartPolicy: RestartContainer will be restarted to safely apply the new memory boundary.
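As a sketch, the Pod below (container names and images are hypothetical) mixes both policies: the main app resizes memory in place, while the sidecar, which inherits the Pod envelope, declares that a memory change must restart it:

```yaml
# Illustrative only: names and images are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: mixed-policy-app
spec:
  resources:
    limits:
      memory: "4Gi"
  containers:
  - name: main
    image: my-app:v1
    resizePolicy:
    - resourceName: "memory"
      restartPolicy: "NotRequired"       # memory limit changes apply in place
  - name: jvm-sidecar
    image: jvm-agent:v1                  # no explicit limit: inherits the Pod envelope
    resizePolicy:
    - resourceName: "memory"
      restartPolicy: "RestartContainer"  # restarts when the inherited memory limit changes
```

Resizing spec.resources.limits.memory on this Pod would leave main running but restart jvm-sidecar, because its effective memory boundary changed.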

Best Practice for Zero-Downtime

To ensure a truly in-place, restart-free resize of your Pod's aggregate budget, set the resizePolicy for your containers to NotRequired (this is also the default when no policy is specified):

    resizePolicy:
    - resourceName: "cpu"
      restartPolicy: "NotRequired"
    - resourceName: "memory"
      restartPolicy: "NotRequired"

Observability: Tracking the Convergence

The Pod status API has been extended to provide transparency into the resize process. When a resize is requested, three fields tell the story:

  1. spec.resources: Your desired target.
  2. status.allocatedResources: What the Node has currently reserved (updated as soon as the Kubelet admits the resize).
  3. status.resources: What is actually enforced in the cgroups (updated after the runtime confirms the change).

status:
  # 'allocatedResources' represents the node's reservation.
  allocatedResources:
    cpu: "2"
    memory: "4Gi"
  # 'resources' shows what the Kubelet has actually applied.
  resources:
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"
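When a resize cannot be actuated immediately, the Kubelet also reports progress through Pod conditions such as PodResizePending and PodResizeInProgress. A sketch of what a deferred resize might look like (the message text below is illustrative):

```yaml
status:
  conditions:
  - type: PodResizePending
    status: "True"
    reason: Deferred    # or Infeasible if the node can never satisfy the request
    message: "Node didn't have enough capacity: cpu, requested: 2000, capacity: 1930"
```

Watching these conditions is the quickest way to distinguish a resize that is merely queued from one that will never fit on the current node.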

How to use it in v1.36

Resizing is performed using the resize subresource. This ensures that only resource fields are modified and provides a cleaner RBAC model.

Example Command:

kubectl patch pod resizable-app --subresource resize --patch \
  '{"spec":{"resources":{"requests":{"cpu":"2"}, "limits":{"cpu":"4"}}}}'

Requirement: You must use kubectl v1.32+ to access the --subresource resize flag.
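Because resizes go through a dedicated subresource, RBAC can grant resize permission without granting full Pod update rights. A minimal sketch of such a Role (the Role name and namespace are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-resizer
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods/resize"]   # the resize subresource only
  verbs: ["patch"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]       # read access to observe the result
```

A subject bound to this Role can adjust resource budgets but cannot change images, env vars, or any other Pod field.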

Important Constraints & Caveats

  • cgroup v2 Only: This feature relies on the unified hierarchy of cgroup v2 for accurate aggregate enforcement. It is not supported on cgroup v1 nodes.
  • No Windows Support: Currently, Pod-level resource enforcement and in-place resize are exclusive to Linux-based nodes.
  • Feature Dependencies: This feature requires the following feature gates to be enabled: PodLevelResources, InPlacePodVerticalScaling, InPlacePodLevelResourcesVerticalScaling, and NodeDeclaredFeatures.
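Beta gates are enabled by default, but if your cluster has disabled them you can turn them back on explicitly, for example in the KubeletConfiguration (a sketch; the same gates must also be enabled on the API server and scheduler where applicable):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  PodLevelResources: true
  InPlacePodVerticalScaling: true
  InPlacePodLevelResourcesVerticalScaling: true
  NodeDeclaredFeatures: true
```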

What's Next?

As we move toward General Availability (GA), the community is focusing on:

  • Vertical Pod Autoscaler (VPA) Integration: Native support for triggering Pod-level resizes automatically based on observed usage.
  • Feature Hardening: Refining metrics and observability to help operators detect and resolve "stuck" resizes across diverse environments.

Get Involved

We need your feedback! Try out the feature in your dev clusters and let us know your experience in the #sig-node channel on the Kubernetes Slack.

Your input helps us ensure that Kubernetes remains the most flexible platform for running diverse workloads at scale.