bb25 is a fast, self-contained BM25 + Bayesian calibration implementation with a minimal Python API. It also includes a small reference corpus and experiment suite so you can validate the expected numerical properties.
```bash
pip install bb25
```

```python
import bb25 as bb

# Quick start: score the first default query against the first default document.
corpus = bb.build_default_corpus()
docs = corpus.documents()
queries = bb.build_default_queries()

bm25 = bb.BM25Scorer(corpus, 1.2, 0.75)
score = bm25.score(queries[0].terms, docs[0])
print("score0", score)
```
```python
import bb25 as bb

# Build a corpus from your own documents.
corpus = bb.Corpus()
corpus.add_document("d1", "neural networks for ranking", [0.1] * 8)
corpus.add_document("d2", "bm25 is a strong baseline", [0.2] * 8)
corpus.build_index()  # must be called before creating scorers

bm25 = bb.BM25Scorer(corpus, 1.2, 0.75)
print(bm25.idf("bm25"))
```
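For intuition about what the `idf` and `score` calls compute, here is a minimal pure-Python sketch of textbook BM25. It is not bb25's implementation: the helper names `bm25_idf` and `bm25_score`, the whitespace tokenization, and the `+1`-smoothed IDF variant are assumptions made for illustration, so the numbers may differ from what bb25 prints.

```python
import math
from collections import Counter


def bm25_idf(term, docs):
    """IDF with +1 smoothing inside the log (one common variant; assumed here)."""
    n = len(docs)
    df = sum(1 for doc in docs if term in doc)
    return math.log(1.0 + (n - df + 0.5) / (df + 0.5))


def bm25_score(query_terms, doc, docs, k1=1.2, b=0.75):
    """Sum of IDF-weighted, length-normalized term frequencies."""
    avgdl = sum(len(d) for d in docs) / len(docs)
    tf = Counter(doc)
    score = 0.0
    for term in query_terms:
        f = tf[term]
        if f == 0:
            continue  # terms absent from the document contribute nothing
        norm = f * (k1 + 1.0) / (f + k1 * (1.0 - b + b * len(doc) / avgdl))
        score += bm25_idf(term, docs) * norm
    return score


# Hypothetical toy corpus, tokenized by whitespace.
docs = [
    "neural networks for ranking".split(),
    "bm25 is a strong baseline".split(),
]
print(bm25_idf("bm25", docs))
print(bm25_score(["bm25", "baseline"], docs[1], docs))
```

The defaults `k1=1.2` and `b=0.75` are the conventional BM25 settings and match the two numeric arguments passed to the scorer in the examples.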
```python
import bb25 as bb

# Bayesian calibration and hybrid (text + vector) scoring.
corpus = bb.build_default_corpus()
docs = corpus.documents()
queries = bb.build_default_queries()

bm25 = bb.BM25Scorer(corpus, 1.2, 0.75)
bayes = bb.BayesianBM25Scorer(bm25, 1.0, 0.5)
vector = bb.VectorScorer()
hybrid = bb.HybridScorer(bayes, vector)

q = queries[0]
prob_or = hybrid.score_or(q.terms, q.embedding, docs[0])
prob_and = hybrid.score_and(q.terms, q.embedding, docs[0])
print("OR", prob_or, "AND", prob_and)
```
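The idea behind calibrated scores and the OR/AND combinations can be sketched in a few lines. Everything here is an assumption for illustration rather than bb25's actual code: that calibration is a sigmoid with slope `alpha` and midpoint `beta`, that a cosine similarity is mapped linearly to [0, 1], and that the text and vector signals are treated as independent events. The function names are hypothetical.

```python
import math


def calibrate(score, alpha=1.0, beta=0.5):
    """Map an unbounded score to (0, 1) with a sigmoid (assumed form)."""
    return 1.0 / (1.0 + math.exp(-alpha * (score - beta)))


def cosine_to_prob(cos):
    """Map a cosine similarity in [-1, 1] linearly to [0, 1] (assumed mapping)."""
    return (1.0 + cos) / 2.0


def prob_or(p_text, p_vec):
    """Probability that at least one signal indicates relevance, under independence."""
    return 1.0 - (1.0 - p_text) * (1.0 - p_vec)


def prob_and(p_text, p_vec):
    """Probability that both signals indicate relevance, under independence."""
    return p_text * p_vec


p_text = calibrate(2.3)       # hypothetical raw BM25 score
p_vec = cosine_to_prob(0.4)   # hypothetical cosine similarity
print("OR", prob_or(p_text, p_vec), "AND", prob_and(p_text, p_vec))
```

Under these assumptions the AND score never exceeds either input and the OR score never falls below either input, which is a useful sanity check on any hybrid scorer's output.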
```python
import bb25 as bb

# Run the bundled experiment suite and check that every experiment passed.
results = bb.run_experiments()
print(all(r.passed for r in results))
```
See docs/sample_usage.py for an end-to-end example using BM25, Bayesian calibration, and hybrid fusion.
See benchmarks/README.md for a lightweight runner that compares BM25 and Bayesian BM25 on your own corpora.
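Hybrid search is where bb25 helps most: in the comparison below, weighted-sum (WS) fusion of bb25 and dense scores outperforms the same fusion with classic BM25 scores, and both outperform reciprocal rank fusion (RRF).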
| Method | NDCG@10 | MRR@10 | Notes |
|---|---|---|---|
| WS (BB25+Dense) | 0.9149 | 0.8850 | Best of the three methods |
| WS (BM25+Dense) | 0.9051 | 0.8717 | |
| RRF (BM25+Dense) | 0.8874 | 0.8483 | RRF underperforms weighted sum |
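For readers unfamiliar with the two metrics in the table, the following is a minimal per-query sketch using binary relevance. The benchmark's exact definitions are not stated here, so treat the binary-gain NDCG and the function names as assumptions; reported figures are normally the mean of these values over all queries.

```python
import math


def mrr_at_k(ranked_ids, relevant_ids, k=10):
    """Reciprocal rank of the first relevant document in the top k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """Binary-gain NDCG: discounted gain divided by the best achievable gain."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical ranking in which the only relevant document appears second.
ranked = ["d3", "d1", "d2"]
relevant = {"d1"}
print(mrr_at_k(ranked, relevant), ndcg_at_k(ranked, relevant))
```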
"Bayesian BM25 (bb25) has demonstrated the potential to outperform classic BM25 in hybrid search."
On the English dataset (SQuAD), combining bb25 with a dense retriever (BGE-M3) scores higher than the BM25 + Dense baseline: +1.0 percentage points NDCG@10 (0.9149 vs. 0.9051). This suggests that the probabilistic score from bb25 blends more smoothly with vector scores, with less scale mismatch than raw BM25 scores in the same weighted sum.
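The scale-mismatch point can be illustrated with a small sketch of the two fusion strategies in the table. The scores below are made up for illustration and are not benchmark data; the function names, the weight `w=0.5`, and the RRF constant `k=60` (a commonly used default) are assumptions, not the benchmark's settings.

```python
def weighted_sum(sparse, dense, w=0.5):
    """Fuse two {doc_id: score} maps; missing documents count as 0."""
    doc_ids = set(sparse) | set(dense)
    return {d: w * sparse.get(d, 0.0) + (1.0 - w) * dense.get(d, 0.0) for d in doc_ids}


def rrf(rankings, k=60):
    """Reciprocal rank fusion: uses rank positions only and ignores score magnitudes."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return fused


def top(scores):
    """Return the doc_id with the highest fused score."""
    return max(scores, key=scores.get)


# Illustrative numbers only.
raw_bm25 = {"d1": 12.4, "d2": 3.1, "d3": 0.4}       # unbounded sparse scores
calibrated = {"d1": 0.55, "d2": 0.50, "d3": 0.10}   # probabilities in [0, 1]
dense = {"d1": 0.20, "d2": 0.90, "d3": 0.30}        # similarities in [0, 1]

print(top(weighted_sum(raw_bm25, dense)))    # the raw BM25 scale dominates the sum
print(top(weighted_sum(calibrated, dense)))  # both signals contribute comparably
print(top(rrf([["d1", "d2", "d3"], ["d2", "d3", "d1"]])))
```

With raw BM25 scores the sparse side decides the winner almost by itself, while calibrated scores let a strong dense match change the outcome. RRF avoids the scale problem as well, but it discards how large the score gaps are.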
Original paper:
https://www.researchgate.net/publication/400212695_Bayesian_BM25_A_Probabilistic_Framework_for_Hybrid_Text_and_Vector_Search
```bash
make build
```

Build a wheel with maturin:

```bash
python -m pip install maturin
maturin build --release
```
For Pyodide builds, see docs/pyodide.md.