Add RFC Process and RFC 0001 — Multi-Vector Distance Functions #731

suri-kumkaran · 2026-02-06T16:31:30Z

Summary

Introduces the rfcs/ directory with the RFC process and the first RFC proposing high-performance Chamfer distance for multi-vector representations in DiskANN.

Changes

New Files

File	Description
`rfcs/README.md`	RFC process guide: when to write one, lifecycle, submission steps, index
`rfcs/0000-template.md`	RFC template with recommended structure
`rfcs/0001-multi-vector-distance-functions.md`	RFC proposing Chamfer distance for ColBERT-style multi-vector search

RFC 0001 — Multi-Vector Distance Functions

Proposes a query-transposed tiling approach for Chamfer distance computation:

Transposes queries into a block layout, keeps documents in row-major format (no index migration)
Processes pairs of document vectors together, amortizing query memory loads (~50% bandwidth reduction)
Reuses a MaxSim scratch buffer across distance calls (zero hot-path allocation)
Achieves up to 2.67x speedup over SIMD baseline (avg 1.93x across 10 configurations)

Builds on existing types from diskann-quantization (Mat, MatRef, MaxSim, Chamfer) and proposes MaxSim implementing DistanceFunctionMut from diskann-vector for ecosystem integration.

Experimental benchmarks available in PR #730.

Review Focus

This PR is the RFC itself — please review the proposal and leave comments directly on the RFC file:

Is query-transposed tiling the right default over document-transposed approaches?
API shape: MaxSim implementing DistanceFunctionMut directly
TransposedMultiVector type design and Repr trait integration
Trade-offs between proposed and alternative approaches

Copilot

Pull request overview

Adds an RFC framework to the DiskANN repository and introduces the first RFC describing a proposed high-performance Chamfer distance implementation for ColBERT-style multi-vector representations.

Changes:

Introduces rfcs/ with an RFC process guide and an index.
Adds an RFC template (0000-template.md) to standardize future proposals.
Adds RFC 0001 proposing “query-transposed tiling” for faster multi-vector Chamfer distance.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
`rfcs/README.md`	Documents the RFC process, lifecycle, and index of RFCs.
`rfcs/0000-template.md`	Provides a reusable RFC template and metadata format.
`rfcs/0001-multi-vector-distance-functions.md`	Proposes the multi-vector Chamfer distance approach, API shape, and benchmarks.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rfcs/0001-multi-vector-distance-functions.md

codecov-commenter · 2026-02-06T16:44:40Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.00%. Comparing base (f3fcaae) to head (b418f39).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #731      +/-   ##
==========================================
- Coverage   89.01%   89.00%   -0.01%     
==========================================
  Files         428      428              
  Lines       78294    78294              
==========================================
- Hits        69692    69686       -6     
- Misses       8602     8608       +6

Flag	Coverage Δ
miri	`89.00% <ø> (?)`
unittests	`89.00% <ø> (?)`

Flags with carried forward coverage won't be shown. Click here to find out more.
see 7 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

suri-kumkaran · 2026-02-06T16:51:40Z

@microsoft-github-policy-service agree company="Microsoft"

rfcs/README.md

hildebrandmw · 2026-02-06T18:19:53Z

rfcs/README.md

+
+## How to Submit an RFC
+
+1. Copy [0000-template.md](0000-template.md) to `NNNN-short-title.md` (use the next available number).


Perhaps we can use PR numbers instead of total ordering? I'm worried that ensuring all RFCs are sequentially ordered otherwise would be a giant pain.

I think there are two concerns with using PR numbers:

We would need to update file names after opening PRs, which is not a very smooth process.

Having a total ordering improves readability and overall cleanliness in my opinion, and we only need to ensure the ordering is correct at the time the PR is merged.

I buy the argument on a sequential ordering, though to be fair it's just a number. But, is ensuring the ordering is correct at merge time just as much (if not more) of a process headache than updating the name once with the PR number at the beginning and then never having to worry about it again?

I think ensuring the ordering is correct at merge time may be more of a headache than updating the PR number at the beginning, but the benefit of maintaining sequential order makes it worthwhile.

Also, since RFCs merge will not happen very frequently, we are unlikely to need to update the sequential ordering second time before merge.

hildebrandmw · 2026-02-06T18:20:32Z

rfcs/README.md

+2. Fill in all sections. Remove instructional comments.
+3. Open a pull request with the RFC file. The PR description should summarize the proposal.
+4. Discuss in the PR. Update the RFC based on feedback.
+5. Once accepted, the RFC is merged and the status is updated to **Accepted**.


Merging an RFC should be considered acceptance. This way, there's no need for additional book-keeping.

Are we not planning to merge rejected RFCs as well, so that we can document proposals that were discussed and ultimately rejected?

We can use an RFC label to tag RFC pull requests. Then rejected RFC's will still be available as closed PRs, the final text will not bloat the repo, and the full discussion will be available. The latter is arguable more valuable than just the text.

rfcs/README.md

hildebrandmw · 2026-02-06T18:32:11Z

rfcs/0001-multi-vector-distance-functions.md

+2. Achieve 2x+ speedup over baseline SIMD through memory layout optimization
+3. Maintain compatibility with DiskANN's `DistanceFunctionMut` trait
+4. Provide a clean API that enables standalone distance function usage without full index integration
+5. Achieve performance within 10–20% of `faer` SGEMM-based Chamfer computation, when both our implementation and `faer` are restricted to AVX2 (no AVX-512 on either side)


Why not use AVX-512 if available? We have support, and it could likely really benefit dense operations like this.

I think I would like to park this for now. We can pick it up in a follow up with a more thorough design for dynamic dispatching that also takes the machine architecture into account.

Deferring is fine, but I don't think there need to be a super thorough design planning/review - we already have the infrastructure for dynamic dispatch and micro-architecture specific code generation.

Got it, so what's your suggestion? Should we include the brief design around dynamic dispatch in this RFC itself?

I was thinking we can work on it in separate PRs and RFCs if required in future.

hildebrandmw · 2026-02-06T18:36:02Z

rfcs/0001-multi-vector-distance-functions.md

+
+### Alternative Approaches
+
+The experimental crate explored six approaches total. The query-transposed tiling approach was selected as the proposal, but the alternatives remain available and may be better for specific workloads.


One thing not covered here is how this behaves under multi-threading. it's possible that single threaded tests may look fine, but if effort is not applied to restrict working set sizes to the size of the L1/L2 as much as possible, then multiple-threads could stomp on one-another's usage of L3 when running multiple computations in parallel.

That is a great point. While the distance kernel itself is single-threaded, I agree that minimizing the working set is crucial to prevent cache thrashing when multiple threads are running parallel searches (e.g., in a Rayon thread pool).

I feel Query-Transposed Tiling approach tackles this problem fairly well:

Reduced Bandwidth Pressure: By holding a query block in vector registers (or L1) and iterating over multiple document vectors, we maximize arithmetic intensity. This reduces the frequency of fetches from L3/Memory compared to the baseline, which theoretically leaves more bandwidth available for other threads.

Small Working Set: The scratch buffer (MaxSim) and the query tiles are designed to stay hot in L1/L2. The only data streaming through L3 is the document vectors themselves, which is unavoidable.

That said, theoretical scaling often differs from practice. I can add a 'Throughput Benchmark' to the RFC (running the kernel on all available cores simultaneously) to verify that we don't see sub-linear scaling due to resource contention. Does that sound like a sufficient verification?

hildebrandmw · 2026-02-06T18:36:51Z

rfcs/0001-multi-vector-distance-functions.md

+| dim=384, doc=32, query=8 | 8,412 | 1.41x | **1.65x** | 1.30x | 1.24x |
+| dim=384, doc=64, query=16 | 38,162 | 1.30x | 1.47x | **1.70x** | 1.66x |
+| dim=384, doc=128, query=32 | 171,431 | 1.53x | 1.94x | 2.04x | **2.16x** |
+| **Average** | — | **1.38x** | **1.79x** | **1.93x** | **1.59x** |


Can we also see situations where there are, say, 900+ documents?

I think this is a bit far from a real use case because the Chamfer distance is very costly, but I can still benchmark it if you would like. Let me know if you want me to do that.

hildebrandmw · 2026-02-06T18:37:57Z

rfcs/0001-multi-vector-distance-functions.md

+### Speedup vs SIMD Baseline (Median, Lower Latency = Better)
+
+| Configuration | SIMD (µs) | transposed_simd | transposed_tiling | query_transposed_tiling | sgemm |
+|--------------|-----------|-----------------|-------------------|------------------------|-------|


Small nit - can this table be adjusted to it's readable without rendering the markdown?

Done, yet to push.

harsha-simhadri · 2026-02-07T19:14:53Z

Let us post the RFCs we already have that have shaped code.

Multi Vector Distance Function RFC

4b43589

suri-kumkaran requested review from a team and Copilot February 6, 2026 16:31

suri-kumkaran self-assigned this Feb 6, 2026

Copilot started reviewing on behalf of suri-kumkaran February 6, 2026 16:31 View session

suri-kumkaran requested review from arkrishn94 and hildebrandmw February 6, 2026 16:34

Copilot AI reviewed Feb 6, 2026

View reviewed changes

rfcs/0001-multi-vector-distance-functions.md Show resolved Hide resolved

rfcs/0001-multi-vector-distance-functions.md Show resolved Hide resolved

rfcs/0001-multi-vector-distance-functions.md Show resolved Hide resolved

hildebrandmw reviewed Feb 6, 2026

View reviewed changes

Address review comments

b418f39


		## How to Submit an RFC

		1. Copy [0000-template.md](0000-template.md) to `NNNN-short-title.md` (use the next available number).


		### Alternative Approaches

		The experimental crate explored six approaches total. The query-transposed tiling approach was selected as the proposal, but the alternatives remain available and may be better for specific workloads.

Add RFC Process and RFC 0001 — Multi-Vector Distance Functions #731

Are you sure you want to change the base?

Add RFC Process and RFC 0001 — Multi-Vector Distance Functions #731

Conversation

suri-kumkaran commented Feb 6, 2026

Summary

Changes

New Files

RFC 0001 — Multi-Vector Distance Functions

Review Focus

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov-commenter commented Feb 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

suri-kumkaran commented Feb 6, 2026

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

suri-kumkaran Feb 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

harsha-simhadri commented Feb 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov-commenter commented Feb 6, 2026 •

edited

Loading

suri-kumkaran Feb 9, 2026 •

edited

Loading