Now that the semester is over, we can take on other projects. After a little break from the blog, it is good to be back. We are putting the final touches on this month's issue of IR Watch – The Newsletter. During the break, dozens of new subscribers signed up.
The piece takes on several IDF myths and misconceptions promoted by SEOs and explains what IDF is and is not. Here is an excerpt:
One recurrent misconception found across online media channels (search marketing blogs, forums, etc.) is the assertion that IDF can be used to assess how important or relevant a term might be to the content of a document. This claim has no basis.
It should be stressed that, as a measure of term specificity over a collection of N documents, IDF is not a local but a global measure. IDF evaluates the discriminating power of a term within a collection of documents. A term ti might be relevant or important to the content of a document. However, if this document is part of a collection wherein all documents contain ti, the term loses its discriminating power: the number of documents containing it, ni, equals N, so IDFi = log(N/ni) = log(1) = 0.
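To make the arithmetic concrete, here is a minimal Python sketch of IDFi = log(N/ni). The function name `idf`, the toy three-document collection, and the choice of the natural logarithm are assumptions made for illustration only; they are not taken from the newsletter piece.

```python
import math

def idf(term, collection):
    """Return log(N / n_i) for a term over a collection of documents.

    Each document is modeled as a set of terms (an assumption of this sketch).
    """
    n_docs = len(collection)                            # N
    n_i = sum(1 for doc in collection if term in doc)   # documents containing the term
    if n_i == 0:
        raise ValueError("term does not occur in the collection")
    return math.log(n_docs / n_i)

# Hypothetical toy collection: "search" appears in every document.
collection = [
    {"search", "engine", "ranking"},
    {"search", "index", "crawler"},
    {"search", "query", "ranking"},
]

idf_search = idf("search", collection)    # n_i = N, so log(1) = 0
idf_ranking = idf("ranking", collection)  # n_i = 2, so log(3/2)
idf_crawler = idf("crawler", collection)  # n_i = 1, so log(3)
```

Note that the value for "search" is zero no matter how central that term is to any single document. The result depends only on how many documents in the collection contain the term, which is why IDF is a global measure and not a local one.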
Somehow, these marketers are mistaking IDF for the RSJ model, or who knows what else, possibly to promote themselves or whatever they sell, as is often the case.