It is surprising how often even serious information retrieval researchers and journals cite papers that were never written!

This is the thesis of David Dubin’s great 2004 article,
The Most Influential Paper Gerard Salton Never Wrote

Dubin wrote:

“In giving credit to Salton for the vector model, a number of authors cite an overview paper titled “A Vector Space Model for Information Retrieval,” which some show as published in the JASIS in 1975 and others as published in the Communications of the Association for Computing Machinery (CACM) in 1975. In fact, no such article was ever published, and citations to it usually represent a confusion of two 1975 articles (Salton, Wong, & Yang, 1975; Salton, Yang, & Yu, 1975), neither of which were overviews of the VSM as it is generally understood (see section 5 below). Some of Salton’s own colleagues have been guilty of this mistake: both Cardie et al. and Singhal cite the CACM version, for example (Singhal, 2001; Cardie, Ng, Pierce, & Buckley, 2000). The paper is even cited in a few of the very last articles on which Salton is listed as a coauthor (Singhal, Salton, Mitra, & Buckley, 1996; Singhal & Salton, 1995). These papers were published close to or shortly after the time of his death, and so the errors cannot be blamed on Salton (remembered by his colleagues as a very careful and meticulous writer).”

Somehow, far too many IR researchers miscite Salton’s 1975 paper, which is titled “A vector space model for automatic indexing”. This causes digital libraries to create a spurious record that ends up attached to many cross-referenced articles.

I searched Google for “a vector space model for information retrieval” + salton, and indeed there are many reputable publications and researchers citing a paper that was never published! What a shame.

That says a lot about the researchers, editors, and reviewers who were too lazy to check the accuracy of their references.
