“IDF is simply neither a pure heuristic, nor the
theoretical mystery many have made it out to be.
We have a pretty good idea why it works as well
as it does.” –Stephen E. Robertson
Here is a sneak preview of IR Watch for June 2008. It should be in subscribers' inboxes during the day or tomorrow at the latest.
This issue discusses inverse document frequency (IDF) within the context of co-occurrence theory and term independence/dependence assumptions. Issues and misconceptions related to this measure are addressed. Initially we planned to include ongoing work we are conducting on specificity measures, but we have chosen not to, since this is not the appropriate forum.
IRW-2008-06: Understanding Inverse Document Frequency (IDF)
In this issue:
Robertson-Sparck Jones Early Work on IDF
What IDF Is Not
What IDF Really Is
On Terms Independence
On Terms Dependence
Estimating the IDF of a Phrase
News, Research, and Events