KEP-5869: Wildcard Toleration Keys #5880
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by: ajaysundark
- Running pods will continue running.
- New pods if they use `*` in tolerations, validation will fail (if disabled at
  `kube-apiserver`). If only disabled in `kube-scheduler`, they will
  schedule but the wildcard will be treated literally matching nothing.
Hi @ajaysundark,
The rollback plan here is too optimistic. If the feature is disabled, we face two major 'broken' states:
- Mass Evictions: Pods using wildcards to tolerate `NoExecute` taints will no longer be protected and will be evicted immediately.
- Stalled Controllers: Existing Deployments or DaemonSets with wildcards in their templates will fail to create new Pods because the API server will reject the `*` during validation.
How are you planning to solve this issue?
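To make the scheduler-side half of this concern concrete, here is a minimal Go sketch of what happens to one stored toleration when the gate flips. The `keyMatches` helper, its `wildcardEnabled` parameter, the taint key `readiness.k8s.io/network`, and the trailing-`*` prefix semantics are all assumptions for illustration; none of them are taken from the KEP or from the Kubernetes code base.

```go
package main

import (
	"fmt"
	"strings"
)

// keyMatches reports whether a toleration key covers a taint key.
// Hypothetical helper: wildcardEnabled stands in for the feature gate, and
// a trailing "*" is assumed to mean "any key with this prefix".
func keyMatches(tolerationKey, taintKey string, wildcardEnabled bool) bool {
	if wildcardEnabled && strings.HasSuffix(tolerationKey, "*") {
		return strings.HasPrefix(taintKey, strings.TrimSuffix(tolerationKey, "*"))
	}
	// Gate off: the key is compared literally, so "*" is just a character.
	return tolerationKey == taintKey
}

func main() {
	const toleration = "readiness.k8s.io/*"
	const taint = "readiness.k8s.io/network" // hypothetical taint key

	fmt.Println(keyMatches(toleration, taint, true))  // gate on
	fmt.Println(keyMatches(toleration, taint, false)) // gate off
}
```

Under these assumptions the same stored toleration matches with the gate on and stops matching with it off, which is the moment a `NoExecute` taint would no longer be tolerated.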
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
Due to regex parsing and validation overhead, it could add to the time taken for
The Design section describes glob matching, but this section mentions regex. Which one is the intended implementation?
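The distinction matters because the same key string means different things under the two readings. The sketch below uses Go's standard `path.Match` as a stand-in for glob semantics and `regexp` for regex semantics; `globMatch`, `regexMatch`, and both sample taint keys are hypothetical, and the KEP may intend neither of these exact behaviors.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
)

// globMatch treats pattern as a shell-style glob (path.Match semantics):
// "*" matches any run of non-"/" characters and "." is a literal dot.
func globMatch(pattern, key string) bool {
	ok, err := path.Match(pattern, key)
	return err == nil && ok
}

// regexMatch treats the same string as an anchored regular expression:
// "." matches any character and "/*" means "zero or more slashes".
func regexMatch(pattern, key string) bool {
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	return err == nil && re.MatchString(key)
}

func main() {
	const pattern = "readiness.k8s.io/*"
	// Both keys are made up for the comparison.
	for _, key := range []string{"readiness.k8s.io/network", "readinessXk8sXio//"} {
		fmt.Printf("%-26q glob=%-5v regex=%v\n",
			key, globMatch(pattern, key), regexMatch(pattern, key))
	}
}
```

Read as a glob, the pattern accepts `readiness.k8s.io/network` and rejects `readinessXk8sXio//`; read as an anchored regex, it does the opposite.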
###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
No
I disagree with a flat 'No' here. Extra string matching in the scheduler's hot path will inevitably cost CPU cycles. While the cost may be 'negligible' for small clusters, we need to define what happens at scale. Can we add a metric such as `scheduler_wildcard_match_duration_seconds` to verify this during Alpha?
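As a rough illustration of what such a metric would observe, here is a standard-library-only Go sketch that times each wildcard scan. The `matchTimer` type and `toleratesAny` function are hypothetical, and the prefix-match semantics are an assumption; a real implementation would presumably register a histogram through the scheduler's existing metrics machinery instead of a hand-rolled counter.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// matchTimer is a hypothetical stand-in for a histogram such as the
// scheduler_wildcard_match_duration_seconds metric suggested above.
type matchTimer struct {
	count int
	total time.Duration
}

// observe records the duration of one wildcard scan.
func (m *matchTimer) observe(d time.Duration) {
	m.count++
	m.total += d
}

// toleratesAny reports whether prefix (the toleration key without its
// trailing "*", an assumed semantics) covers any taint key, and records
// how long the scan took.
func toleratesAny(prefix string, taintKeys []string, m *matchTimer) bool {
	start := time.Now()
	defer func() { m.observe(time.Since(start)) }()
	for _, k := range taintKeys {
		if strings.HasPrefix(k, prefix) {
			return true
		}
	}
	return false
}

func main() {
	var m matchTimer
	// The second taint key is made up for the example.
	taints := []string{"node.kubernetes.io/unreachable", "readiness.k8s.io/network"}
	fmt.Println(toleratesAny("readiness.k8s.io/", taints, &m))
	fmt.Printf("observations=%d total=%s\n", m.count, m.total)
}
```

One observation per scan keeps the instrumentation itself cheap, and the accumulated distribution is what would show whether the cost stays negligible as the number of taints and pods grows.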
```yaml
tolerations:
- key: "readiness.k8s.io/*"
  operator: "Exists"
  effect: "NoSchedule"
```