Skip to content

fix(kubelet): remove streamingConnectionIdleTimeout for k8s >= 1.34#8812

Open
abigailliang-aks-sig-node wants to merge 12 commits into
scriptless-kubelet-config-file-syncfrom
remove-streaming-connection-idle-timeout
Open

fix(kubelet): remove streamingConnectionIdleTimeout for k8s >= 1.34#8812
abigailliang-aks-sig-node wants to merge 12 commits into
scriptless-kubelet-config-file-syncfrom
remove-streaming-connection-idle-timeout

Conversation

@abigailliang-aks-sig-node

@abigailliang-aks-sig-node abigailliang-aks-sig-node commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Summary

Remove streamingConnectionIdleTimeout from both the kubelet command line and the kubelet config file for Kubernetes >= 1.34, where this field was removed from KubeletConfiguration.

Context: RP-side flow (scriptless path)

Step 1: defaults_kubelet.go — removeKubeletFlags()
        └── k8s >= 1.34: delete(kubeletConfig, "--streaming-connection-idle-timeout")
                ↓
        kubeletConfig map (flag is already gone)
                ↓
Step 2: scriptless_direct.go — processKubeletConfigFileContent()
        └── cfg.StreamingConnectionIdleTimeout = kubeletFlags["--streaming-connection-idle-timeout"]
            → empty string (key was deleted in step 1)
                ↓
Step 3: scriptless_direct.go — processKubeletFlags()
        └── filterKubeletFlags() strips translated flags when enableKCF=true
            → flag not on command line either way
                ↓
Step 4: Proto sent to aks-node-controller
        └── KubeletConfigFileConfig.StreamingConnectionIdleTimeout = "" (omitted by protojson)

RP already handles this in removeKubeletFlags(). However, the field may still arrive in KubeletConfigFileConfig from:

  • Older RP versions that haven't rolled out yet
  • The GetAKSNodeConfig (NBC conversion) path where the removal may not be applied

Changes (defense-in-depth)

File Change
aks-node-controller/helpers/utils.go ValidateAndSetLinuxKubeletFlags(): delete --streaming-connection-idle-timeout for >= 1.34
aks-node-controller/parser/parser.go getCSEEnv(): clear KubeletConfigFileConfig.StreamingConnectionIdleTimeout for >= 1.34 before marshaling
pkg/agent/baker.go ValidateAndSetLinuxNodeBootstrappingConfiguration(): same flag deletion for the old NBC path

Why both flag deletion AND config file clearing?

  • Flag deletion (helpers/utils.go, baker.go): covers enableKCF=false case where translated flags stay on the command line
  • Config file clearing (parser.go): covers enableKCF=true case where the field would appear in kubeletconfig.json if RP sets it

Test plan

  • unit tests
  • local e2e tests

Copilot AI review requested due to automatic review settings July 1, 2026 22:18

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to prevent streamingConnectionIdleTimeout from being propagated to kubelet for Kubernetes >= 1.34, where the corresponding KubeletConfiguration field/flag has been removed, by stripping it from kubelet CLI flags and clearing it from the kubelet config-file content.

Changes:

  • Delete --streaming-connection-idle-timeout from kubelet flags for Kubernetes >= 1.34 in both the legacy NodeBootstrappingConfiguration path and the aks-node-controller helper path.
  • Clear KubeletConfigFileConfig.StreamingConnectionIdleTimeout for Kubernetes >= 1.34 before marshaling kubelet config-file JSON in aks-node-controller.
  • Add version gating comments explaining why the field must not appear for 1.34+.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
pkg/agent/baker.go Removes --streaming-connection-idle-timeout from kubelet flags for k8s >= 1.34 in the NodeBootstrappingConfiguration validation path.
aks-node-controller/parser/parser.go Clears StreamingConnectionIdleTimeout before kubelet config-file JSON marshaling for k8s >= 1.34.
aks-node-controller/helpers/utils.go Removes --streaming-connection-idle-timeout from kubelet flags for k8s >= 1.34 in helper validation.

Comment thread aks-node-controller/parser/parser.go
Comment thread aks-node-controller/helpers/utils.go
Copilot AI review requested due to automatic review settings July 1, 2026 22:56
@abigailliang-aks-sig-node abigailliang-aks-sig-node changed the base branch from main to scriptless-kubelet-config-file-sync July 1, 2026 22:58

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Comment thread aks-node-controller/parser/parser.go
ImageGcHighThresholdPercent: to.Ptr(int32(90)),
ImageGcLowThresholdPercent: to.Ptr(int32(70)),
CgroupsPerQos: to.Ptr(true),
CpuManagerPolicy: "static",

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change block due to gofmt/goimports column alignment.

When a struct literal contains multiple fields, gofmt attempts to align the values ​​following the colons into the same column. The width of the aligned column is determined by the longest field name in the group:

// nested
"--client-ca-file": "/etc/kubernetes/certs/ca.crt",
"--authentication-token-webhook": "true",
"--authorization-mode": "Webhook",

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same gofmt behave

@github-actions

github-actions Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

The latest Buf updates on your PR. Results from workflow Buf CI / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedJul 3, 2026, 12:14 AM

@abigailliang-aks-sig-node abigailliang-aks-sig-node force-pushed the scriptless-kubelet-config-file-sync branch from 917e68a to 7ad9f47 Compare July 2, 2026 22:25
@abigailliang-aks-sig-node abigailliang-aks-sig-node force-pushed the remove-streaming-connection-idle-timeout branch from aaa9706 to 2075fc1 Compare July 2, 2026 22:48
streamingConnectionIdleTimeout was removed from KubeletConfiguration
in Kubernetes 1.34. This field must not appear on the command line or
in the kubelet config file for those versions.

RP already strips the flag from the incoming payload via
removeKubeletFlags() for >= 1.34. This change adds defense-in-depth
checks in AgentBaker to ensure the field is also cleared from
KubeletConfigFileConfig before marshaling.
…version gate

Add unit tests asserting that --streaming-connection-idle-timeout is:
- Kept for k8s 1.33 (below threshold)
- Removed for k8s 1.34.0 (exact boundary)
- Removed for k8s 1.35.0 (above threshold)

Tests cover all three paths:
- aks-node-controller/helpers: ValidateAndSetLinuxKubeletFlags
- aks-node-controller/parser: getCSEEnv config file marshaling
- pkg/agent: ValidateAndSetLinuxNodeBootstrappingConfiguration
Remove the deprecated streamingConnectionIdleTimeout field from
helper_test.go fixtures (both proto input and expected JSON/flags
output). The field is deprecated and should not be part of the
default test baseline.
…lpers

The helpers/ValidateAndSetLinuxKubeletFlags test is redundant — it tests
a function that only exists as a test helper (mirrors RP logic). The same
version-gated removal is already covered by:
- parser_test.go: tests the aks-node-controller production path
- pkg/agent/utils_test.go: tests the RP library production path
…for k8s >= 1.34

Add version-gated removal in pruneKubeletConfig() so that e2e tests
running on k8s >= 1.34 clusters don't include the deprecated flag
in the NodeBootstrappingConfiguration.
…d ensure it is not reintroduced in AgentBaker for k8s >= 1.34
…gs and KubeletConfigFileConfig for k8s >= 1.34
@abigailliang-aks-sig-node abigailliang-aks-sig-node force-pushed the remove-streaming-connection-idle-timeout branch from 2075fc1 to c2dd0bb Compare July 3, 2026 00:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants