BM25

What is BM25?

BM25 is a keyword-matching retrieval algorithm that ranks documents by how well they match a search query. The score depends on how often each query term appears in the document, how rare that term is across the whole collection, and how long the document is compared with the average. It is fast, predictable, and works well when search terms appear directly in the documents.
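As a rough illustration of how that scoring works, here is a minimal sketch of the common Okapi BM25 formula in plain Python. The function name bm25_scores, the whitespace tokenizer, the sample documents, and the defaults k1=1.5 and b=0.75 are assumptions chosen for the example; production engines use their own tokenization, defaults, and formula variants.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a basic Okapi BM25.

    k1 controls term-frequency saturation and b controls length
    normalization; both defaults are illustrative assumptions.
    """
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # a term missing from the document adds nothing
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is penalized in long documents.
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    "invoice aging report for the finance team",
    "payment overdue notice sent to customer",
    "quarterly payment schedule",
]
scores = bm25_scores("payment overdue", docs)
```

With these sample documents, the second document matches both query terms and scores highest, the third matches only "payment," and the first shares no terms with the query and scores zero.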

What are its limitations?

BM25 matches words, not meaning. If a user searches for "payment overdue" and the relevant document uses the phrase "invoice aging," the query and the document share no terms, so BM25 will not surface it. This is why modern enterprise search systems typically combine BM25 with semantic vector search in a hybrid setup: BM25 catches exact term matches, and semantic search handles paraphrases and related concepts. Used together, they cover more ground than either approach alone.
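One way to combine the two result lists is reciprocal rank fusion (RRF), sketched below. RRF is only one of several fusion methods and is named here as an example, not as what any particular system uses. The document IDs and both rankings are hypothetical and hard-coded; in practice they would come from a BM25 index and a vector index. The constant k=60 is a commonly used default, treated here as an assumption.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranked list.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by multiple retrievers rise to the top.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical results for the query "payment overdue".
# BM25 finds only documents that share words with the query.
bm25_ranking = ["overdue_notice", "payment_schedule"]
# A semantic retriever (assumed) also finds the "invoice aging" document.
semantic_ranking = ["invoice_aging", "overdue_notice", "payment_schedule"]

fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
```

In this toy case the fused list contains the "invoice aging" document that BM25 missed, while the document both retrievers agree on stays at the top.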