I just came across this article


by Valerie DiCarlo and honestly don’t know where these marketers pick up all these misconceptions about LSI. Perhaps she has been misled by the usual suspects as well and is just trying to make honest comments she truly believes. Unfortunately, most of her claims are incorrect. I’m commenting on her lines, one by one.

“Latent Semantic Indexing (LSI) is a vital element in Search Engine Optimization (SEO) for better keyword rankings in search results.”

No, it is not.

That’s what many SEO firms claiming to sell “LSI-based” services wish prospective clients to believe: repeat a false statement often enough and many will take it as true. Even the SEO companies making such statements don’t really know what LSI is or how it works. On top of that, there is no such thing as “LSI-friendly” or “LSI-optimized” documents, nor can LSI be manipulated to improve rankings. I’m still waiting for an SEO to prove such LSI claims. The challenge has been issued in:


Many SEOs are now recanting the misleading LSI “explanations” they once told and sold to others, thanks to that challenge or “invitation”.

“LSI is based on the relationship, the “clustering” or positioning, the variations of terms and the iterations of your keyword phrases.”

Simply incorrect. This is not how LSI works and any result one gets from such strategies, if any, has little to do with LSI.

“Expertly knowing LSI and how it can be most useful and beneficial for your SEO and the importance it has with the algorithm updates to search engines like Google, MSN and Yahoo which will benefit your keyword research for best practice SEO.”

Expertly knowing LSI will prevent one from making the above statements.

“Those doing keyword research over the years have always known to use synonyms and “long tail” keyword terms which is a simpler “explanation” to LSI. “

In the early, outdated LSI papers that SEOs often misquote, the role of synonyms was overstated. Today we know that at the heart of LSI is a high-order co-occurrence phenomenon that takes place across a collection of documents and redistributes weights across connectivity paths. This phenomenon can be explained in terms of graph theory.
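To make the idea concrete, here is a minimal sketch with NumPy. The three terms, the two-document collection, the raw (unweighted) counts, and the choice of k = 1 are all my own assumptions for illustration; a real LSI system would use a much larger collection and some term weighting scheme. The terms "boat" and "crane" never appear in the same document, yet they are linked through "harbor", and the rank-reduced term-term matrix assigns the pair a nonzero weight.

```python
import numpy as np

# Toy collection (hypothetical): doc 1 = {boat, harbor}, doc 2 = {harbor, crane}.
terms = ["boat", "harbor", "crane"]

# Term-document matrix: rows are terms, columns are documents, raw counts.
X = np.array([
    [1.0, 0.0],  # boat
    [1.0, 1.0],  # harbor
    [0.0, 1.0],  # crane
])

# First-order co-occurrence: how often two terms share a document.
first_order = X @ X.T

# Graph view: terms are nodes, an edge joins terms that share a document.
adjacency = (first_order > 0).astype(int)
np.fill_diagonal(adjacency, 0)
two_step = adjacency @ adjacency  # number of paths of length two

# Rank-k approximation of X through the SVD, then the term-term matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 1
X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
truncated = X_k @ X_k.T

i, j = terms.index("boat"), terms.index("crane")
print(first_order[i, j])          # 0.0: the terms never share a document
print(two_step[i, j])             # 1: one path, boat -> harbor -> crane
print(round(truncated[i, j], 3))  # 0.5: weight redistributed by truncation
```

Note that none of these terms are synonyms. The weight between "boat" and "crane" comes entirely from the connectivity path through "harbor".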

This phenomenon can be present regardless of whether the terms involved are synonyms, and regardless of whether end-users resort to a particular writing style. The synonym fallacy can be traced back to the early LSI papers. To learn how and why this fallacy started, read
SVD and LSI Tutorial 5: LSI Keyword Research and Co-Occurrence Theory.

LSI is not word co-citation either. Word co-citation, or first-order co-occurrence, is observed when terms co-occur in the same documents. Neither this type of co-occurrence nor high-order co-occurrence (in-transit co-occurrence) guarantees contextuality, as terms can occur in passages that discuss different topics within a given document.

Web documents, especially large ones, are prone to be multitopic, and thus terms can co-occur within different topics. Even web docs with a central article come with news stories, newsfeeds, blog feeds, ads, sales pitches, etc. These might undergo updates, links can be changed at will, and so forth. Thus, the mere fact that terms co-occur does not ensure topification.
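A small sketch of why the counting unit matters. The two passages and the helper name `cooccurrence` are hypothetical, made up for this illustration. Counted at the document level, "engine" and "rainforest" co-occur; counted at the passage level, they do not, because they belong to different topics that happen to share a page.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(units):
    """Count, for each unordered term pair, the number of units containing both."""
    counts = Counter()
    for text in units:
        tokens = sorted(set(text.lower().split()))
        counts.update(combinations(tokens, 2))
    return counts


# One multitopic "document" made of two passages on unrelated topics.
passages = [
    "jaguar engine horsepower",   # the car
    "jaguar habitat rainforest",  # the animal
]
document = " ".join(passages)

doc_level = cooccurrence([document])
passage_level = cooccurrence(passages)

pair = ("engine", "rainforest")  # pairs are stored in alphabetical order
print(doc_level[pair])      # 1: the terms share a document
print(passage_level[pair])  # 0: they never share a passage
```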

In such cases, term weights based on co-occurrence can be misleading. The key to similarity scores that incorporate co-occurrence is not that terms happen to be synonyms, or that they co-occur directly or in-transit, but that they do so among similar neighboring terms and within similar topics. This means that not all connectivity paths from an LSI term-term matrix are equally important; as such, the dense term-term matrix obtained from a truncated SVD must be artificially sparsified to extract useful topics.
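As a rough sketch of what sparsifying could look like, the snippet below zeroes out every entry of a truncated term-term matrix whose magnitude falls under a cutoff. The plain magnitude threshold, the cutoff value of 0.75, the toy matrix, and the function names `truncated_term_term` and `sparsify` are my own assumptions for illustration; which paths deserve to be kept in a real collection is a harder question than a single cutoff can answer.

```python
import numpy as np


def truncated_term_term(X, k):
    """Term-term matrix built from the rank-k SVD approximation of X."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    # X_k @ X_k.T simplifies to U_k diag(s_k^2) U_k^T.
    return U[:, :k] @ np.diag(s[:k] ** 2) @ U[:, :k].T


def sparsify(matrix, threshold):
    """Keep entries whose magnitude reaches the threshold, zero the rest."""
    return np.where(np.abs(matrix) >= threshold, matrix, 0.0)


# Toy term-document matrix (hypothetical): rows are terms, columns are docs.
X = np.array([
    [1.0, 0.0],
    [1.0, 1.0],
    [0.0, 1.0],
])

dense = truncated_term_term(X, k=1)
sparse = sparsify(dense, threshold=0.75)

print(np.round(dense, 3))
print(np.round(sparse, 3))
print(np.count_nonzero(dense), "->", np.count_nonzero(sparse))
```

In this toy case every entry of the dense matrix is nonzero, and the cutoff removes the weakest paths while keeping the stronger ones.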

Unfortunately, some misunderstandings regarding synonyms and co-occurrence in relation to LSI are currently being perpetuated at this SEOMOZ blog. Back to the claims of the above writer. She continues:

“The real bottom line is that Latent Semantic Indexing is currently a MUST in keyword research and SEO.”

Simply another misleading statement, since the last part is false (…Latent Semantic Indexing is currently a MUST in…SEO). First, SEOs would need to know what LSI is and how to compute LSI scores.

To sum up, this is just one example of how all these SEOs and marketers claim expertise in a subject they know little about. It is a consistently distorted and almost sinister way of promoting products and services across the websphere and blogosphere. One can spot these folks from a distance.