We wish you all a great data mining 2012.
31 Saturday Dec 2011
Posted in Data Mining
We wish you all a great data mining 2012.
27 Tuesday Dec 2011
This morning I received confirmation from the editors of Communications in Statistics: Theory and Methods that they have accepted and will publish my peer-reviewed paper on a new model for statistical analysis. It should be out in 2012.
Once it is published, you will understand the SEO (* SEOmoz, I should say) nonsense of computing arithmetic averages of correlation coefficients and why some meta-analysis studies published in the past (* Hunter-Schmidt; Hedges-Olkin) are flawed and invalid.
It took me several meals and research hours to figure it out. I hope that IR researchers, data miners, and statistics colleagues find new applications for the model.
The model can be applied to many fields, including marketing, business, risk analysis, data mining, signal processing, engineering, clinical trials, and almost any field or knowledge domain that involves the calculation of weighted statistics. I look forward to discussing it online once it gets published.
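To make the point about averaging correlations concrete, here is a small sketch. It is not the Self-Weighting Model (which this post does not describe); it only illustrates, with made-up numbers, that the arithmetic mean of two correlation coefficients can be far from the correlation of the combined data. The two samples and the helper name `pearson` are assumptions for illustration only.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical sample A: small, perfectly positively correlated.
xa, ya = [1, 2, 3], [1, 2, 3]
# Hypothetical sample B: larger, perfectly negatively correlated.
xb, yb = [10, 11, 12, 13, 14], [14, 13, 12, 11, 10]

r_a = pearson(xa, ya)
r_b = pearson(xb, yb)

# Naive approach: arithmetic mean of the two coefficients.
mean_of_rs = (r_a + r_b) / 2

# Correlation computed on the combined observations.
pooled_r = pearson(xa + xb, ya + yb)

print("r_a =", r_a, "r_b =", r_b)
print("arithmetic mean of r values =", mean_of_rs)
print("correlation of pooled data  =", pooled_r)
```

The naive mean ignores sample sizes and the spread of each sample, so in this toy case it suggests no relationship at all while the pooled data show a strong positive one.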
Happy New Year.
PS. (*) I’ve edited this post to make these points obvious. So, the issue of arithmetically averaging correlations has been raised and killed for good before the scientific and statistical community.
PS. Just in: Last night (Jan-03-2012) I received news from one of the editors of the journal that the paper was assigned to issue 41 (8). Check for its title: The Self-Weighting Model (in Spanish, something like “El Modelo de Autoponderacion”). I forgot to mention that this journal is published biweekly, so things are moving fast. What a way to end 2011 and start 2012!!!
13 Tuesday Dec 2011
Posted in Data Mining, Programming, Spam
Yesterday we had a brainstorming session with our programmers on Google hacking. It is soooooo easy to grab PHP code, passwords, and databases from all over the Web, thanks to sloppy coders. For instance, do a search for
index.of
index.of/php
index.of/pswd
index.of/db
index.of/mda
index.of/pgp
or check the list at http://www.thenetworkadministrator.com/googlesearches.htm
These types of searches will spit out directory trees.
There are many “smart cookies” posting derivatives of these lists all over the Web.
And how about typos?
Try filetype: searches with extra or misplaced characters in the extension, like
0php
1php
phps
php.
etc.
Servers will spit out the entire PHP source code instead of executing it.
The worst offenders are large sites, such as those under .edu, .gov, and .org, not to mention large .com and .net sites.
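If you want to check your own site for this kind of typo before someone else finds it, a rough sketch along these lines may help. It walks a local document root and flags file names whose extension looks like a mistyped ".php". The function names and the matching rule are assumptions for illustration; the rule is a simple heuristic and will miss other leaks (for example ".php.bak" copies).

```python
import os
import sys

def looks_like_php_typo(filename):
    """Heuristic: True for names such as x.0php, x.phps, or x.php. (trailing dot).

    Files ending in exactly ".php" are treated as correctly named.
    """
    name = filename.lower()
    if name.endswith(".php"):
        return False
    # Drop trailing dots so "x.php." is examined as "x.php".
    stem, dot, ext = name.rstrip(".").rpartition(".")
    return dot == "." and "php" in ext

def find_suspect_files(root):
    """Yield paths under root whose names look like mistyped PHP files."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if looks_like_php_typo(filename):
                yield os.path.join(dirpath, filename)

if __name__ == "__main__":
    # Pass your own document root as the first argument; defaults to the
    # current directory.
    doc_root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in find_suspect_files(doc_root):
        print("suspect:", path)
```

Any file it reports is worth renaming or removing, since a web server that does not map the extension to the PHP handler will typically serve the file as plain text.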
Ho, Ho, Ho, Merry Christmas, Santa.