Are LDA and Google’s ranks well correlated?
After the hilarious example of this guy with the SEOMOZ LDA tool (http://smackdown.blogsblogsblogs.com/2010/09/09/proof-that-the-new-seomoz-tools-is-at-least-half-accurate/) I can only laugh out loud. Has anyone tried something like that?
Regarding the new fiasco with their LDA tool. Oh, no, another one… (http://www.seomoz.org/blog/lda-correlation-017-not-032): What can I say? They sound pathetic and apologetic. The words overhyped, shitty, sloppy, flawed, etc. are not enough to describe their “research work”.
What will happen now with those Mute Speakerphones that were misled? Those that listen to fools become one.
I don’t feel any sympathy for their 15 minutes of “honesty”. The damage was done already to naïve readers.
Also, note that this latest flaw was discovered by them. It was not the result of any peer review process from external referees, as those throwing a towel at them would like to believe.
As mentioned before, beware of SEOs’ statistical “studies” and their quack “science” (http://irthoughts.wordpress.com/2010/04/23/beware-of-seo-statistical-studies/), especially if coming from SEOMOZ.
Probably their snakeoil will make a comeback soon. (Oh, no. Again?)
If they still think they have a valid LDA implementation, why not announce it at David Blei’s Topic-Models, wherein a community of LDA experts will review it and compare it against other implementations?
Two things can happen:
(a) It will be reviewed.
(b) It will be ignored.
I “invite” them to do so.
Please, just don’t show up with your snakeoil, yellow shoes, your seo mom, paid cheerleaders, vested investors, overhyped claims, etc, etc.
PS.
More on their hype machine here: http://skitzzo.com/archives/seomoz-hype-machine.php
It appears that even Danny Sullivan is not buying SEOmoz’s “research” on LDA. According to this account, “He didn’t think it was the remarkable change that SEOmoz was making it out to be.” (http://outspokenmedia.com/internet-marketing-conferences/evening-forum-with-danny-sullivan/). He even confronted them and questioned their “highly correlated” numbers. And that was even before they recanted.
Small correction. I helped them catch both mistakes.
Correction not granted. It did not come through any external peer review process or moment. It was more of an Ex-lax moment after they did some comparison.
One more thing, and this is for those praising their “transparency” and “peer review”.
So far their “transparency” has been limited to putting out between-the-lines thoughts like “This number. No, it is this one. Oops, it is that one.”, “It was a programming error”, “I feel shitty”, “I’m begging for mercy”, blah, blah, blah… And even so, that was only after putting up a fight defending data they later found to be wrong.
The cause of the error, or code lines causing the error if any, has not been disclosed. Where is the “transparency” here? I give them this much for their “transparency”: my two middle “digitals”. Give me a ‘hell yeah’.
Voodoo coding or quack “science”? Pick your best bet through a Monte Carlo process and don’t drink too much of your kool-aid, modern Jim Jones followers.
On a personal note, Hendrickson can go out and try to attack, discredit, or misquote me all he wants. He can go to search marketing outlets and try to attribute to me statements I’ve never made, statements inferred only by his biased mind, as other SEOs have tried before. Big deal; nothing new here. That doesn’t do a thing to me and my spirit. If that is his defense for his sloppy “research”, it is not working and does nothing good for his credibility. Is that all you have?
Funny that search marketers praise me when I debunk the myths, scams, and pseudo-science of other SEOs they perceive as their competition, as long as that debunking works for them. But when I expose their very own myths, scams, and pseudo-science then they react attacking me, showing they have learned nothing. Fine with me.
Examples of such reactions over the last ten years are many: my long fights with keyword density crapmasters, Markov Chain gamers, LSI snakeoilers, IDF/rare terms dumb-and-dumbers, Term Vector dummies, semantic distance creeps, synonym stuffing jerks, PageRank groupies, and now LDA-holic fools. Who is next? Diffusion geometry crooks and NMF-freaks? I will say this to them: Just bring it! Show me what you have.
Search marketers need to distinguish between academic, double-blind peer reviewers and drop-outs/under-achievers posing or portrayed as “scientists” before a captive audience of naïve readers; not to mention pseudo “senior scientists” and “guerrilla crapmasters”.
Add to the picture vested interests from elements, cheerleaders, and employees from the same online community and you get a link-philis of “me-too” posts and a “restore kudos rally”. That is not peer review, but a gonorrhea of marketing fools praising each other. In such circumstances any “peer review” attempt more likely becomes a “pleasing review”; read here
…this.replace(/\b(peer)\b/gi,’pleasing’); return this;…var crap=[];…for(var shit = 0;shit<crap.length;shit++){crap[shit]=2*(flawed metric);…window.onerror=try{hide error}…}….Put that chunk of code in an infinite loop until a client crashes and gives you their two other middle fingers.
Rant aside, on the substantive part…
Pick any r value produced by a linear regression model. If it is, for instance, r = 0.15, then the coefficient of determination is R² = r*r = 0.0225; i.e., only 2.25% of the variations in Y can be explained by variations in X. This means that about 97.75% of the variations in Y cannot be explained by variations in X and by the regression model. How “highly correlated” are variables coming from such a regression model? Give me a break!
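The arithmetic above is easy to check. What follows is only a minimal sketch in Python, assuming a simple linear regression with a single predictor, where the explained share of variance equals r squared. The function names are hypothetical and do not come from any library, and the r values other than 0.15 are illustrative, not data from any study.

```python
def explained_share(r: float) -> float:
    """Share of the variation in Y explained by X, i.e. r squared.

    Assumes simple linear regression with one predictor.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


def unexplained_share(r: float) -> float:
    """Share of the variation in Y left unexplained by the model."""
    return 1.0 - explained_share(r)


if __name__ == "__main__":
    # 0.15 is the example from the text; the other values are illustrative.
    for r in (0.15, 0.30, 0.50, 0.90):
        print(
            f"r = {r:.2f}: explained {explained_share(r):.2%}, "
            f"unexplained {unexplained_share(r):.2%}"
        )
```

Running the numbers this way makes the point hard to miss: under these assumptions, r has to climb past roughly 0.7 before the model explains even half of the variation in Y.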
In a real double-blind peer review process, reviewers and submitting authors are anonymous. Reviewers, as devil’s advocates, will try to discover and confirm which way is up (or down), and possibly will try to reproduce errors, pitfalls, etc., regardless of their nature (programming, formulae, mental lapses, logic errors, etc.).
Reviewers will also show the way things should be done and will point to more authoritative work already published in the same area, or will compare results with what is already available on the reviewed subject matter. This is not the case with search marketers promoting LDA hype, snakeoil, and self-promotion through online communities. No peer review was ever done here, as some have claimed at the usual search marketing discussion forums full of crooks and spammers.
There is a difference between a programming error and plugging the wrong numbers into a formula, using incorrectly preprocessed data, misquoting literature never read thoroughly, forcing data sets, or citing discredited or risky meta-analysis techniques (btw, a surprise coming your way on this ***). As long as they work on a black box, outsiders will never know the actual source of the error.
The folks at SEOMOZ are the ones prone to publish articles and posts with the terms “science”, “scientific”, and “studies” in them with overhyped claims. So, why do we even have to ask for what should be expected from a real scientific peer review process in the first place?
*** 01-26-2011 Note: The surprise is now available here: http://irthoughts.wordpress.com/2011/01/07/on-the-non-additivity-of-correlation-coefficients/
At this point, I’m not sure about you, but I cannot trust anything coming from them. Rest assured, anything coming from SEOMOZ on their LDA fiasco will come after these facts.
BTW: This post was updated several times to catch typos and refine lines; just in case some want to find holes in it.