Add autotuner for benchmark parameter optimization with path-based configuration #736

Copilot · 2026-02-07T01:10:27Z

The benchmark framework measures QPS/latency/recall for given configurations but lacks automated parameter tuning. This adds a tool to sweep parameters and identify optimal configurations.

Implementation

New binary: diskann-tools/bin/autotuner

Path-based configuration system that doesn't hardcode JSON structure, making it robust to benchmark framework changes
Users specify parameters via JSON paths (e.g., "jobs.0.content.source.max_degree")
Generates all parameter combinations, runs benchmarks, parses results
Three optimization criteria: maximize QPS at target recall, minimize latency at target recall, maximize recall
Outputs individual configs/results plus summary with best configuration
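The "generates all parameter combinations" step above amounts to a Cartesian product over the swept values. A minimal sketch of that idea follows; `SweepParam` and `combinations` are hypothetical names chosen for illustration, and values are kept as plain strings so the example needs no JSON crate. The code in autotuner.rs may be organized differently.

```rust
/// One swept parameter: a JSON path plus the candidate values to try.
/// (Hypothetical type; values are raw strings to stay dependency-free.)
#[derive(Debug, Clone, PartialEq)]
pub struct SweepParam {
    pub path: String,
    pub values: Vec<String>,
}

/// Builds the Cartesian product of all parameter values.
/// Each combination holds one (path, value) pair per parameter.
pub fn combinations(params: &[SweepParam]) -> Vec<Vec<(String, String)>> {
    // Start from a single empty combination so the first parameter
    // extends it once per candidate value.
    let mut combos: Vec<Vec<(String, String)>> = vec![Vec::new()];
    for param in params {
        let mut next = Vec::with_capacity(combos.len() * param.values.len());
        for combo in &combos {
            for value in &param.values {
                let mut extended = combo.clone();
                extended.push((param.path.clone(), value.clone()));
                next.push(extended);
            }
        }
        combos = next;
    }
    combos
}
```

With three values for max_degree and three for l_build, this yields nine combinations, so sweep cost grows multiplicatively with each added parameter.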

CLI commands:

# Generate example sweep config (supports --benchmark-type pq/disk for specific formats)
autotuner example --output sweep_config.json

# Run sweep
autotuner sweep \
  --base-config benchmark.json \
  --sweep-config sweep_config.json \
  --output-dir ./results \
  --criterion qps \
  --target-recall 0.95
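The --criterion and --target-recall flags select among the three optimization criteria listed earlier. One plausible reading of those criteria is sketched below. `RunResult`, `Criterion`, and `pick_best` are hypothetical names, the latency unit is an assumption, and treating "at target recall" as "recall at or above the target" is also an assumption, not something the PR text spells out.

```rust
/// Metrics parsed from one benchmark run (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct RunResult {
    pub config_id: String,
    pub qps: f64,
    pub latency_ms: f64, // unit is an assumption
    pub recall: f64,
}

/// The three criteria named in the PR description.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Criterion {
    Qps,
    Latency,
    Recall,
}

/// Returns the best run under the given criterion, or None when no run
/// qualifies (for example, nothing reaches the target recall).
pub fn pick_best(
    results: &[RunResult],
    criterion: Criterion,
    target_recall: f64,
) -> Option<&RunResult> {
    match criterion {
        // Highest QPS among runs that meet the recall target.
        Criterion::Qps => results
            .iter()
            .filter(|r| r.recall >= target_recall)
            .max_by(|a, b| a.qps.total_cmp(&b.qps)),
        // Lowest latency among runs that meet the recall target.
        Criterion::Latency => results
            .iter()
            .filter(|r| r.recall >= target_recall)
            .min_by(|a, b| a.latency_ms.total_cmp(&b.latency_ms)),
        // Highest recall overall; the target is ignored.
        Criterion::Recall => results
            .iter()
            .max_by(|a, b| a.recall.total_cmp(&b.recall)),
    }
}
```

Returning `Option` makes the "no configuration reached the target recall" case explicit, which a sweep summary would need to report rather than silently picking a low-recall run.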

Path-based configuration format:

{
  "parameters": [
    {"path": "jobs.0.content.source.max_degree", "values": [16, 32, 64]},
    {"path": "jobs.0.content.source.l_build", "values": [50, 75, 100]},
    {"path": "jobs.0.content.search_phase.runs.0.search_l", "values": [[10, 20, 30], [20, 40, 60]]}
  ]
}
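The dotted paths in this format mix object keys (jobs, content) with array indices (0). A small sketch of how such a path could be split into typed segments is shown here; `Segment` and `parse_path` are hypothetical names. It assumes any all-digit segment is an array index, which means an object key made only of digits cannot be addressed this way.

```rust
/// One step in a dotted path: an object key or an array index.
#[derive(Debug, Clone, PartialEq)]
pub enum Segment {
    Key(String),
    Index(usize),
}

/// Splits a path such as "jobs.0.content.source.max_degree" into segments.
/// All-digit parts become indices; everything else is treated as a key.
pub fn parse_path(path: &str) -> Result<Vec<Segment>, String> {
    if path.is_empty() {
        return Err("path is empty".to_string());
    }
    path.split('.')
        .map(|part| {
            if part.is_empty() {
                // Catches leading, trailing, or doubled dots.
                Err(format!("empty segment in path '{}'", path))
            } else if part.bytes().all(|b| b.is_ascii_digit()) {
                part.parse::<usize>()
                    .map(Segment::Index)
                    .map_err(|e| format!("bad index '{}': {}", part, e))
            } else {
                Ok(Segment::Key(part.to_string()))
            }
        })
        .collect()
}
```

Validating paths up front lets a sweep fail before any benchmark runs, rather than partway through a long parameter grid.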

Design benefits:

Framework-agnostic: works with any benchmark JSON structure
Future-proof: survives benchmark framework refactoring
Flexible: can sweep over any parameter by specifying its JSON path
Maintainable: no code changes needed when benchmark schema evolves

Changes

Add diskann-tools/src/bin/autotuner.rs - main implementation with path-based logic and 6 tests
Add diskann-tools/AUTOTUNER.md - comprehensive documentation of path-based approach
Update diskann-benchmark/README.md - reference autotuner
Update diskann-tools/Cargo.toml - add serde_json dependency

Addressing Review Feedback

The initial implementation used hardcoded JSON paths which would break if the benchmark framework changed. The tool has been refactored to use a path-based configuration system where users specify JSON paths in their sweep configuration. This makes the tool robust to changes in the benchmark framework - if the schema changes, users only need to update their sweep config, not the tool itself.

Original prompt

This section details the original issue you should resolve.

<issue_title>Create an autotuner tool on top of benchmark</issue_title>
<issue_description>## Is your feature request related to a problem? Please describe.
The repo contains a comprehensive benchmark framework that takes configuration as input and measures throughput (QPS), latency, etc. We want a tool that builds on this to sweep over a subset of parameters, for example R (graph degree), l_build, l_search, and quantization bytes where applicable, and identify the best configuration.

Describe the solution you'd like

A clear and concise description of what you want to happen.

Describe alternatives you've considered

A clear and concise description of any alternative solutions or features you've considered.

Provide references (if applicable)

If your feature request is related to a published algorithm/idea, please provide links to
any relevant articles or webpages.

Additional context

Add any other context or screenshots about the feature request here.

</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes Create an autotuner tool on top of benchmark #735


Co-authored-by: harsha-simhadri <5590673+harsha-simhadri@users.noreply.github.com>

harsha-simhadri · 2026-02-11T05:29:39Z

@copilot this design is not robust to changes to the benchmark framework. If the benchmark framework and params are changed, the approach in this PR can become stale and eventually flaky. Is there a different strategy that would help this stay in sync with the benchmark framework?

Copilot · 2026-02-11T05:32:31Z

@copilot this design is not robust to changes to the benchmark framework. If the benchmark framework and params are changed, the approach in this PR can be stale and eventually flaky. Is there a diffe...

You're absolutely right. The current implementation hardcodes JSON paths like job.get_mut("content").get_mut("source"), which is brittle.

I'll refactor to use a JSONPath-based override mechanism that:

Takes parameter overrides as path-value pairs (e.g., "$.jobs[*].content.source.max_degree": [16, 32])
Doesn't hardcode any JSON structure assumptions
Stays in sync with benchmark framework changes automatically

Working on the refactor now.

@copilot this design is not robust to changes to the benchmark framework. If the benchmark framework and params are changed, the approach in this PR can be stale and eventually flaky. Is there a diffe...

Refactored to use a path-based configuration system in commit ab17601.

The new design addresses your concern by:

Using JSON paths instead of hardcoded structure (e.g., "jobs.0.content.source.max_degree")
Providing generic set_json_path() that works with any JSON structure
Making users specify what to sweep via paths in their config
Staying in sync automatically - if benchmark schema changes, only the user's sweep config needs updating

Example sweep configuration:

{
  "parameters": [
    {"path": "jobs.0.content.source.max_degree", "values": [16, 32, 64]},
    {"path": "jobs.0.content.source.l_build", "values": [50, 75, 100]}
  ]
}

See updated AUTOTUNER.md for full documentation of the path-based approach.
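To make the generic set_json_path() idea concrete, here is a dependency-free sketch. The real tool presumably works on serde_json values (the PR adds that dependency); the `Json` enum below is a stand-in, and both the signature and the choice to return an error on a missing key or index (rather than creating it) are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value (not the serde_json type).
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// Overwrites the value at a dotted path such as
/// "jobs.0.content.source.max_degree". Fails if any step is missing,
/// so a typo in the sweep config is reported instead of ignored.
pub fn set_json_path(root: &mut Json, path: &str, value: Json) -> Result<(), String> {
    let parts: Vec<&str> = path.split('.').collect();
    set_at(root, &parts, value)
}

fn set_at(node: &mut Json, parts: &[&str], value: Json) -> Result<(), String> {
    match parts.split_first() {
        // No segments left: this node is the target.
        None => {
            *node = value;
            Ok(())
        }
        Some((first, rest)) => {
            let child = match node {
                Json::Obj(map) => map
                    .get_mut(*first)
                    .ok_or_else(|| format!("key '{}' not found", first))?,
                Json::Arr(items) => {
                    let index: usize = first
                        .parse()
                        .map_err(|_| format!("'{}' is not an array index", first))?;
                    items
                        .get_mut(index)
                        .ok_or_else(|| format!("index {} out of bounds", index))?
                }
                _ => return Err(format!("cannot descend into a scalar at '{}'", first)),
            };
            set_at(child, rest, value)
        }
    }
}
```

Because the function only walks whatever structure it is given, nothing in it names a benchmark field; a schema change would show up as a "not found" error for the stale path in the user's sweep config.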

The new design addresses maintainability concerns by:
- Using JSON paths instead of hardcoded structure assumptions
- Making the tool adaptable to any benchmark configuration format
- Allowing users to specify which parameters to sweep via paths
- Staying in sync with benchmark framework changes automatically

Users now specify parameters as path-value pairs (e.g., "jobs.0.content.source.max_degree": [16, 32, 64]) instead of relying on fixed parameter names. This makes the tool robust to changes in the benchmark framework's JSON schema.

Co-authored-by: harsha-simhadri <5590673+harsha-simhadri@users.noreply.github.com>

Initial plan

e4ffb4c

Copilot AI assigned Copilot and harsha-simhadri Feb 7, 2026

Copilot started work on behalf of harsha-simhadri February 7, 2026 01:10

Copilot AI and others added 2 commits February 7, 2026 01:21

Add autotuner tool for parameter optimization

e8795ba

Co-authored-by: harsha-simhadri <5590673+harsha-simhadri@users.noreply.github.com>

Add tests and documentation for autotuner

c49d374

Co-authored-by: harsha-simhadri <5590673+harsha-simhadri@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Add autotuner tool on top of benchmark~~ Add autotuner for benchmark parameter optimization Feb 7, 2026

Copilot AI requested a review from harsha-simhadri February 7, 2026 01:34

Copilot finished work on behalf of harsha-simhadri February 7, 2026 01:34

Copilot started work on behalf of harsha-simhadri February 11, 2026 05:30

Copilot AI and others added 3 commits February 11, 2026 05:37

Update benchmark README to reflect path-based autotuner design

14e9950

Co-authored-by: harsha-simhadri <5590673+harsha-simhadri@users.noreply.github.com>

Clarify config_id format and all_results field in documentation

ab17601

Co-authored-by: harsha-simhadri <5590673+harsha-simhadri@users.noreply.github.com>

Copilot AI changed the title ~~Add autotuner for benchmark parameter optimization~~ Add autotuner for benchmark parameter optimization with path-based configuration Feb 11, 2026

Copilot finished work on behalf of harsha-simhadri February 11, 2026 05:39
