Yesterday we published two great tutorials on the BM25 and BM25F algorithms.
The take-home points from the theory behind these algorithms:
1. A term (e.g., a keyword) has more information gain when it occurs for the very first time.
2. A term is likely to carry more weight in a title field than in other fields.
3. The weight of a term and its occurrence frequency are not linearly related.
4. A linear combination of field scores that destroys term dependencies is contraindicated (see BM25F).
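Points 3 and 4 can be illustrated with a minimal Python sketch. It is a simplification, not a full BM25 or BM25F implementation: IDF and document-length normalization are left out, the function names are ours, and the field weights (title 3.0, body 1.0) and k1 = 1.2 are assumed values chosen only for illustration.

```python
def tf_saturation(tf, k1=1.2):
    """Saturating term-frequency factor in the style of BM25.

    Length normalization is omitted to keep the sketch short.
    """
    return tf * (k1 + 1) / (tf + k1)


def bm25f_style_weight(field_tfs, field_weights, k1=1.2):
    """Add up the weighted field frequencies first, then saturate once."""
    pseudo_tf = sum(field_weights[f] * tf for f, tf in field_tfs.items())
    return tf_saturation(pseudo_tf, k1)


def linear_field_combination(field_tfs, field_weights, k1=1.2):
    """Saturate each field separately, then add the weighted field scores."""
    return sum(field_weights[f] * tf_saturation(tf, k1)
               for f, tf in field_tfs.items())


if __name__ == "__main__":
    # Point 3: each extra occurrence adds less than the previous one.
    for tf in (1, 2, 5, 10, 50):
        print(tf, round(tf_saturation(tf), 3))

    # Point 4: the same term occurs once in the title and once in the body.
    tfs = {"title": 1, "body": 1}
    weights = {"title": 3.0, "body": 1.0}  # assumed weights, not tuned values
    print("combine, then saturate:", round(bm25f_style_weight(tfs, weights), 3))
    print("saturate, then combine:", round(linear_field_combination(tfs, weights), 3))
```

In this sketch a single saturated frequency can never exceed k1 + 1, while the linear combination of field scores can, because each field starts its own saturation curve from zero. That is one way to see why combining field scores linearly conflicts with the non-linear relation in point 3.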
Most SEOs are well aware of points 1 and 2.
Because a term carries more information gain in its first occurrences, a document about specific terms should mention them at the beginning, particularly in the title tag. For testing purposes, and since end users assume that a large headline is the actual title of a document (which it is not), we like to repeat the title tag content in an h1 header placed prominently at the beginning of the copy. Keywords from the title are then repeated early in the document body. In this way, one can write for both end users and search engines. If a search engine uses some form of the above algorithms (we don’t know whether any do), that base is covered, too. You don’t have to adopt this strategy unless you want to. It is just our way of conducting tests, but it is a flexible approach.
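For those who do want to try it, a test page can be checked with a small script. The following is only a sketch using Python's standard html.parser module: the class and function names are hypothetical, it looks at the first h1 only, and it compares the two strings after lowercasing and collapsing whitespace, which is our assumption about what "repeating the title" should mean.

```python
from html.parser import HTMLParser


class TitleH1Parser(HTMLParser):
    """Collects the text of the title tag and of the first h1 tag."""

    def __init__(self):
        super().__init__()
        self._current = None   # tag whose text is being collected
        self._h1_done = False  # True once the first h1 has been closed
        self.title = ""
        self.h1 = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" or (tag == "h1" and not self._h1_done):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            if tag == "h1":
                self._h1_done = True
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1 += data


def normalize(text):
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.split()).lower()


def title_matches_h1(html):
    """Return True if the page has a non-empty title repeated in its first h1."""
    parser = TitleH1Parser()
    parser.feed(html)
    parser.close()
    title = normalize(parser.title)
    return title != "" and title == normalize(parser.h1)


if __name__ == "__main__":
    page = (
        "<html><head><title>BM25 Tutorial</title></head>"
        "<body><h1>BM25 Tutorial</h1><p>BM25 is a ranking function.</p></body></html>"
    )
    print(title_matches_h1(page))
```

The check says nothing about how a search engine treats the page; it only confirms that the test page follows the layout described above.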