@x13n (Member) commented Jan 30, 2026

This change optimizes the EmptySorting scale down candidate processor by implementing a caching mechanism for the isNodeEmpty check.

Summary of changes:

  1. Extended CandidatesComparer interface with ResetState() method to allow clearing caches before each sorting pass.
  2. Updated NodeSorter.Sort() to invoke ResetState() on all processors.
  3. Implemented caching in EmptySorting, ensuring simulator.GetPodsToMove is called at most once per node during a sorting cycle.
  4. Added no-op ResetState() to other CandidatesComparer implementations.
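The four changes above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Node` type, the `expensiveIsEmpty` helper (standing in for the `simulator.GetPodsToMove` call), the `calls` counter, and the free-standing `Sort` function are all hypothetical, and the method signatures on `CandidatesComparer` are assumed.

```go
package main

import (
	"fmt"
	"sort"
)

// Node is a hypothetical stand-in for the real node object.
type Node struct {
	Name string
	Pods int
}

// CandidatesComparer follows the shape described in the summary: a comparison
// method plus ResetState(). The exact upstream signatures are assumed.
type CandidatesComparer interface {
	ScaleDownEarlierThan(a, b *Node) bool
	ResetState()
}

// EmptySorting caches the result of the emptiness check per node name.
type EmptySorting struct {
	cache map[string]bool
	calls int // counts expensive checks, for illustration only
}

func NewEmptySorting() *EmptySorting {
	return &EmptySorting{cache: map[string]bool{}}
}

// ResetState drops all cached results so the next sorting pass starts fresh.
func (e *EmptySorting) ResetState() {
	e.cache = map[string]bool{}
}

// expensiveIsEmpty is a placeholder for the costly simulator call.
func (e *EmptySorting) expensiveIsEmpty(n *Node) bool {
	e.calls++
	return n.Pods == 0
}

// isNodeEmpty runs the expensive check at most once per node between resets.
func (e *EmptySorting) isNodeEmpty(n *Node) bool {
	if v, ok := e.cache[n.Name]; ok {
		return v
	}
	v := e.expensiveIsEmpty(n)
	e.cache[n.Name] = v
	return v
}

// ScaleDownEarlierThan orders empty nodes before non-empty ones.
func (e *EmptySorting) ScaleDownEarlierThan(a, b *Node) bool {
	return e.isNodeEmpty(a) && !e.isNodeEmpty(b)
}

// Sort resets every processor, then sorts using the first processor that
// expresses a preference between two nodes.
func Sort(nodes []*Node, processors []CandidatesComparer) []*Node {
	for _, p := range processors {
		p.ResetState()
	}
	sort.SliceStable(nodes, func(i, j int) bool {
		for _, p := range processors {
			if p.ScaleDownEarlierThan(nodes[i], nodes[j]) {
				return true
			}
			if p.ScaleDownEarlierThan(nodes[j], nodes[i]) {
				return false
			}
		}
		return false
	})
	return nodes
}

func main() {
	es := NewEmptySorting()
	nodes := []*Node{
		{Name: "a", Pods: 2}, {Name: "b", Pods: 0},
		{Name: "c", Pods: 1}, {Name: "d", Pods: 0},
	}
	for _, n := range Sort(nodes, []CandidatesComparer{es}) {
		fmt.Println(n.Name, n.Pods)
	}
	fmt.Println("expensive checks:", es.calls)
}
```

Because the cache is cleared at the start of each `Sort` call, a node whose pod count changes between passes is re-evaluated on the next pass instead of being served a stale result.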

Benchmark results for sorting 5000 nodes with EmptySorting:

  • Before: 26,403,629 ns/op (~26.4 ms)
  • After: 5,120,497 ns/op (~5.1 ms)

Performance improvement: ~5x.

Testing: Added micro benchmark BenchmarkEmptySorting and updated existing unit tests.
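To see where the saving comes from, the sketch below counts how often a stand-in emptiness check runs while sorting 5000 synthetic nodes, with and without a per-pass cache. It is not the `BenchmarkEmptySorting` benchmark from this PR: `countChecks`, the pod-count pattern, and the integer node IDs are all made up for illustration, and it measures call counts rather than time.

```go
package main

import (
	"fmt"
	"sort"
)

// countChecks sorts n synthetic nodes (empty ones first) and returns how many
// times the stand-in emptiness check ran. The pod counts are invented.
func countChecks(n int, useCache bool) int {
	pods := make([]int, n)
	for i := range pods {
		pods[i] = i % 3 // every third node is "empty"
	}
	idx := make([]int, n)
	for i := range idx {
		idx[i] = i
	}

	calls := 0
	cache := make(map[int]bool, n)
	isEmpty := func(node int) bool {
		if useCache {
			if v, ok := cache[node]; ok {
				return v
			}
		}
		calls++ // stands in for the expensive simulator call
		v := pods[node] == 0
		if useCache {
			cache[node] = v
		}
		return v
	}

	// A comparison sort evaluates the comparator many more times than there
	// are elements, so an uncached check is repeated for the same node.
	sort.SliceStable(idx, func(i, j int) bool {
		return isEmpty(idx[i]) && !isEmpty(idx[j])
	})
	return calls
}

func main() {
	const n = 5000
	fmt.Println("checks without cache:", countChecks(n, false))
	fmt.Println("checks with cache:   ", countChecks(n, true))
}
```

With the cache enabled, the number of checks is bounded by the number of nodes; without it, the count grows with the number of comparisons the sort performs.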

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Minor performance improvement reducing loop time in large clusters.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added the kind/cleanup, release-note, cncf-cla: yes, and do-not-merge/needs-area labels Jan 30, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: x13n

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved, area/cluster-autoscaler, and size/M labels and removed the do-not-merge/needs-area label Jan 30, 2026
		return n.nodes
	}
	for _, p := range n.processors {
		p.ResetState()
Contributor

The optimization makes sense; however, it's not clear how the Sort() method actuates the reset in a sane way (perhaps these optimizations are consumed downstream somewhere and are not obvious in the OSS code?)

Member Author

Yeah, we do have another comparer downstream that will also benefit from this, but it is not clear to me why the current approach is not sane: state is reset before every sorting pass, so any local optimizations can rely on that. Can you clarify what you think the problem is?


Labels

approved, area/cluster-autoscaler, cncf-cla: yes, kind/cleanup, release-note, size/M
