Back on 2009/04/03 we wrote a nice comparison of LDA, LSI, and Vector Space theory (http://irthoughts.wordpress.com/2009/04/03/vector-space-probabilistic-lsi-and-lda/). LDA was also discussed by its creator (David Blei) at the 2006 IPAM Document Space Workshop (http://www.miislita.com/ipam/ipam-document-space-workshop.pdf). Years before, on 2006/08/25, we wrote a post in an old ASP-based blog warning users against SEOs selling snakeoil in the form of SVD, LSI, and LDA arguments. The problem with these approaches is that they don’t scale well for the Web. I ended that 2006 post with a prediction:
“At this point I got tired of highlighting more flaws in the claims of these search marketing firms. A sample list of the latest LSI myths is available for your perusal.
Next stop for these snakeoil marketers? How about PCA (Principal Component Analysis) or LDA (Latent Dirichlet Allocation)?”
That post was eventually referenced in a rebuttal I posted at that cesspool of quacks known as seomoz and later fully reproduced on this blog on 2007/05/03 (http://irthoughts.wordpress.com/2007/05/03/latest-seo-incoherences-lsi/).
It was only a matter of time before some Johnny-come-lately “discovered” LDA, Niagara Falls, and the Grand Canyon. Oh my God, what a “bombshell”.
Expect a new wave of marketers trying to game naïve cheerleaders and their clients with their latest crap.
Nothing new under the Sun. Will the next stop of these snakeoil marketers disguised as “scientists” be NMF? How about Diffusion Geometries?
One more thing, for those who really want to learn LDA: subscribe to Topic-Models at https://lists.cs.princeton.edu
This is a mailing list on LDA run by David Blei and others. I’ve been subscribed for many years now and the discussion on the topic is really useful.