In previous posts, we have presented two tutorials on Okapi BM25 and BM25F, which are based on the Verbosity and Scope Hypotheses.


Here I would like to reference research on both sides of the Scope Hypothesis.

In the abstract of “Revisiting the relationship between document length and relevance”, Losada, D.E., Azzopardi, L. and Baillie, M. (2008) state:

“The scope hypothesis in Information Retrieval (IR) states that a relationship exists between document length and relevance, such that the likelihood of relevance increases with document length. A number of empirical studies have provided statistical evidence supporting the scope hypothesis. However, these studies make the implicit assumption that modern test collections are complete (i.e. all documents are assessed for relevance). As a consequence the observed evidence is misleading. In this paper we perform a deeper analysis of document length and relevance taking into account that test collections are incomplete. We first demonstrate that previous evidence supporting the scope hypothesis was an artefact of the test collection, where there is a bias towards longer documents in the pooling process. We evaluate whether this length bias affects system comparison when using incomplete test collections. The results indicate that test collections are problematic when considering MAP as a measure of effectiveness but are relatively robust when using bpref. The implications of the study indicate that retrieval models should not be tuned to favour longer documents, and that designers of new test collections should take measures against length bias during the pooling process in order to create more reliable and robust test collections.”


However, in the abstract of “Enhancing ad-hoc relevance weighting using probability density estimation”, Zhou, Huang, and He (2011) state:

“Classical probabilistic information retrieval (IR) models, e.g. BM25, deal with document length based on a trade-off between the Verbosity hypothesis, which assumes the independence of a document’s relevance of its length, and the Scope hypothesis, which assumes the opposite. Despite the effectiveness of the classical probabilistic models, the potential relationship between document length and relevance is not fully explored to improve retrieval performance. In this paper, we conduct an in-depth study of this relationship based on the Scope hypothesis that document length does have its impact on relevance. We study a list of probability density functions and examine which of the density functions fits the best to the actual distribution of the document length. Based on the studied probability density functions, we propose a length-based BM25 relevance weighting model, called BM25L, which incorporates document length as a substantial weighting factor. Extensive experiments conducted on standard TREC collections show that our proposed BM25L markedly outperforms the original BM25 model, even if the latter is optimized.”
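To make the length trade-off mentioned in this abstract concrete, here is a minimal Python sketch of the usual BM25 per-term weight. It is only an illustration, not the BM25L model from the paper. The function name, the defaults k1 = 1.2 and b = 0.75, and the IDF variant (with 1 added inside the logarithm so the value stays positive) are my own assumptions for the example.

```python
import math

def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Classic BM25 weight of one query term in one document (illustrative sketch)."""
    # IDF variant assumed here: 1 is added inside the log to keep it positive.
    idf = math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0)
    # Length normalization: b = 0 ignores document length,
    # b = 1 normalizes fully against the average document length.
    norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return idf * tf * (k1 + 1.0) / (tf + k1 * norm)

# Same term frequency, one average-length document and one five times longer.
for b in (0.0, 0.75, 1.0):
    short_doc = bm25_term_score(3, 100, 100, 1000, 10, b=b)
    long_doc = bm25_term_score(3, 500, 100, 1000, 10, b=b)
    print(f"b={b}: short={short_doc:.3f} long={long_doc:.3f}")
```

With b = 0 the two documents score the same, so length is not penalized at all, which is the Scope end of the trade-off. With b = 1 the longer document is penalized the most, which is the Verbosity end. Values in between mix the two.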

My take…

I haven’t reviewed BM25L vs. BM25F yet. Still, the question raised by the Scope Hypothesis is intriguing. From what I can tell (and this is solely my opinion), if an author writes more about a topic, or about several topics, in a given document, he or she is more likely to use more instances of index terms. A cluster of the top index term density values (IDs) spread over said document should give some insight into its scope. We have developed a tool that computes these clusters. We are now testing whether that translates into improved relevance.

Assuming that Web IR systems out there (e.g., search engines) use these algorithms or derivatives of them: What would be the implications for content writers trying to understand algos based on the Verbosity and Scope Hypotheses? Hello, copywriters, SEOs, etc. This puppy is nice to watch.