MHM is a tool for discovering sites on the same host or IP address, and for finding sites that are affiliated with each other or that might be your competitors. It is available at
It is great for discovering domain names branded with keywords or known brand names. It is also excellent for discovering spam communities, domainers, and more.
You may also use it to build micro-indexes and topic-specific collections (as we do), or to track down communities of interest to law enforcement agencies, recruiters, etc.
As part of the development of Minerazzi, we have published an article explaining two of our search modes: XOR and XNOR. Additional articles explaining other modes will soon follow.
We believe that IR and SEO practitioners will find these search modes particularly useful.
The beauty of XOR and XNOR searches is that they allow users to run complex co-occurrence searches in a straightforward manner. This is important because Latent Semantic Indexing relies on term-term co-occurrence relationships.
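To make the idea concrete, here is a minimal sketch of what XOR and XNOR searches could look like over a toy collection. It assumes the standard Boolean meanings (XOR: exactly one of the two terms is present; XNOR: both or neither are present). The documents, function names, and matching rule are made up for illustration and are not Minerazzi's actual implementation.

```python
# Toy collection; documents and terms are hypothetical.
docs = {
    1: "latent semantic indexing uses term co-occurrence",
    2: "semantic search without indexing tricks",
    3: "latent variables in statistics",
    4: "keyword density myths",
}

def contains(text, term):
    """True if the term appears as a whole word in the text."""
    return term in text.split()

def xor_search(docs, a, b):
    """Ids of documents containing exactly one of the two terms."""
    return [d for d, t in docs.items() if contains(t, a) != contains(t, b)]

def xnor_search(docs, a, b):
    """Ids of documents containing both terms or neither."""
    return [d for d, t in docs.items() if contains(t, a) == contains(t, b)]

print(xor_search(docs, "latent", "semantic"))   # [2, 3]
print(xnor_search(docs, "latent", "semantic"))  # [1, 4]
```

XOR isolates documents where the two terms do not co-occur, while XNOR groups the documents where they co-occur with those that mention neither term.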
Cyberchondriac = a person who compulsively searches the Internet for information on real or imagined symptoms of illness.
It looks like there is light at the end of the tunnel.
PlaceRaider has been called government spyware for smartphones. Expect copycats soon. Download the PlaceRaider article.
The abstract says:
“As smartphones become more pervasive, they are increasingly targeted by malware. At the same time, each new generation of smartphone features increasingly powerful onboard sensor suites. A new strain of ‘sensor malware’ has been developing that leverages these sensors to steal information from the physical environment; e.g., researchers have recently demonstrated how malware can ‘listen’ for spoken credit card numbers through the microphone, or ‘feel’ keystroke vibrations using the accelerometer. Yet the possibilities of what malware can ‘see’ through a camera have been understudied.”
“This paper introduces a novel ‘visual malware’ called PlaceRaider, which allows remote attackers to engage in remote reconnaissance and what we call ‘virtual theft.’ Through completely opportunistic use of the phone’s camera and other sensors, PlaceRaider constructs rich, three dimensional models of indoor environments. Remote burglars can thus ‘download’ the physical space, study the environment carefully, and steal virtual objects from the environment (such as financial documents, information on computer monitors, and personally identifiable information). Through two human subject studies we demonstrate the effectiveness of using mobile devices as powerful surveillance and virtual theft platforms, and we suggest several possible defenses against visual malware.”
I-Doser has been called an addictive electronic drug, and it is commonly hyped on social networks. It is actually nothing new, just a well-repackaged business.
You can get all kinds of e-drugs: from e-marihuana to e-….anything by just using earphones. A dangerous mixture if you are driving a car!
Such e-drugs are based on binaural beats, discovered in 1839 by Dove. These are slow modulations that are perceived when tones of different frequency are presented to each ear. Such auditory beats in the brain can have unexpected results, altering consciousness: A virtual LSD?
In 1973 Oster discovered that binaural beats can be detected by humans when carrier tones are below approximately 1000 Hz. According to Lane et al. (see references below):
WHEN two pure auditory signals of similar frequency are mixed together, the phase interference between their waveforms produces a composite signal with a frequency midway between the upper and lower frequencies and an amplitude modulation that occurs with a frequency equal to the difference between the two original frequencies. For example, mixing tones of 100 Hz and 110 Hz yields a signal with a perceived frequency of 105 Hz that rises and falls in amplitude with a frequency of 10 Hz. The amplitude-modulated composite signal is called an auditory beat.
A similar phenomenon occurs when auditory signals of similar frequency are presented separately to the left and right ear through stereo headphones. Although each ear hears only one of the frequencies, the listener perceives the middle frequency and the amplitude modulation, even though the auditory beat does not exist in physical space. This phenomenon, called a “binaural auditory beat,” and described more than 25 years ago (6), is created by the brain’s processing of the two separate auditory signals at the level of the olivary nuclei of the brainstem.
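The arithmetic in the first part of the quote (two tones physically mixed) can be checked numerically. The sketch below relies on the standard identity sin A + sin B = 2 sin((A+B)/2) cos((A−B)/2); the function names are my own and are not taken from Lane et al.

```python
import math

def mixed_signal(f1, f2, t):
    """Sum of two pure tones of frequencies f1 and f2 (Hz) at time t (s)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

def carrier_times_envelope(f1, f2, t):
    """The same signal written as a carrier at the mean frequency
    multiplied by a slowly varying envelope."""
    carrier_hz = (f1 + f2) / 2
    half_diff_hz = (f2 - f1) / 2
    envelope = 2 * math.cos(2 * math.pi * half_diff_hz * t)
    return envelope * math.sin(2 * math.pi * carrier_hz * t)

def beat_parameters(f1, f2):
    """Perceived (midway) frequency and beat rate, both in Hz."""
    return (f1 + f2) / 2, abs(f2 - f1)

# Compare both forms over one second sampled at 8 kHz.
times = [n / 8000 for n in range(8000)]
max_diff = max(abs(mixed_signal(100, 110, t) - carrier_times_envelope(100, 110, t))
               for t in times)

print(beat_parameters(100, 110))  # (105.0, 10)
print(max_diff < 1e-9)            # True
```

Note that the envelope cosine oscillates at 5 Hz, but loudness peaks twice per envelope cycle, which is why the beat is heard at 10 Hz, the difference between the two tones.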
It was only a matter of time before someone did the 2 + 2 math to make some quick cash; as the Spanish saying goes, hunger met necessity (“se juntó el hambre con la necesidad”). So now we can see low-level forms of life looking for an escape from their reality through I-Doser.
Hackers may soon be able to misuse these e-brain technologies to cause physical harm. A WMD in the making or an accident waiting to happen?
Binaural Auditory Beats Affect Vigilance Performance and Mood
Auditory Beats in the Brain
Inducing Altered States Auditory and Visual Stimulation
Entraining Tones and Binaural Beats
Which words pack more wallop, are more emphatic, are more beefy or juicy? Whatever you want to call it, if you are an SEO or copywriter, you probably know what I mean.
Well, the answer to such a question depends on what you are trying to accomplish.
According to the family of BM25 algorithms,
a term has more information gain during its first occurrences, especially if these occur earlier in a document. This presumes some kind of relationship between information gain and the position and distribution of words in a document.
Journalists and editors understand the concept. That’s why they like to answer the who, what, when, why, and how early in their copy, although not necessarily in that order.
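The "first occurrences matter most" part can be illustrated with the term-frequency component of the basic BM25 formula. This is only a sketch: it uses commonly quoted default parameters (k1 = 1.2, b = 0.75), leaves out the IDF factor, and the function name and document lengths are assumptions for illustration. The basic formula does not model word position; position-like effects usually enter through field-weighted variants such as BM25F.

```python
def bm25_tf(tf, k1=1.2, b=0.75, doc_len=100, avg_doc_len=100):
    """Term-frequency component of BM25 (IDF factor omitted)."""
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return tf * (k1 + 1) / (tf + length_norm)

# Score added by the 1st, 2nd, ..., 5th occurrence of the same term.
gains = [bm25_tf(n) - bm25_tf(n - 1) for n in range(1, 6)]

for n, gain in enumerate(gains, start=1):
    print(f"occurrence {n}: +{gain:.3f}")
```

Each additional occurrence contributes less than the previous one, and the component can never exceed k1 + 1, which is why repeating a keyword quickly stops paying off.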
And that’s why you see so many press release titles written in a ‘who-what’ way!
That strategy might work with search engines, but if you want to emphasize more specific keywords in a natural way you probably need a different keyword positioning strategy, at least if you write in English.
Says who? William Strunk, Jr., in his book The Elements of Style, as discussed by Joe Carrillo, and I quote:
In his original 1918 edition of The Elements of Style (that was long before E. B. White came up with a chapter on style that made him a co-author of the book), William Strunk, Jr. came up with this perplexing prescription in his discussion of the principles of exposition:
“The proper place for the word, or group of words, which the writer desires to make most prominent is usually the end of the sentence…The word or group of words entitled to this position of prominence is usually the logical predicate, that is, the new element in the sentence…”
Strunk gave the following example to illustrate his point:
The modifying phrase at the tail-end of the sentence: “This steel is principally used for making razors, because of its hardness.”
The logical predicate at the tail-end of the sentence: “Because of its hardness, this steel is principally used in making razors.”
And here is the eye-opening point:
For his final words on the subject, however, Strunk made the following provocative—and as I already said, perplexing—prescription:
“The principle that the proper place for what is to be made most prominent is the end applies equally to the words of a sentence, to the sentences of a paragraph, and to the paragraphs of a composition.”
Carrillo’s essay is an excellent one. He later wrote a follow-up post, and I quote:
In spoken English, we can emphasize the ideas we want to emphasize by giving them a stronger stress, leveling off our voice when enunciating minor or neutral ones, and downplaying the points that simply don’t support our contention. In writing, however, the process is rarely that simple. We can achieve emphasis only with our choice of words and how we array them into word clusters, into clauses and phrases, and ultimately into sentences and paragraphs. Mechanical devices exist that help, of course, like underlining, boldface type, italics, headlines and subheadlines, and—in today’s savvy word-processing routines—even colors, clip-arts, and emoticons. But as the aspiring writer soon discovers, much of the emphasis we seek has to be built into the very contours of the individual words as they unfold on the page.
There are three basic word-positioning principles we must know for maximum emphasis in writing English sentences: first, the initial and terminal positions of sentences are by nature more emphatic than their middles; second, when we construct a complex sentence, the main clause gets more emphasis than subordinate clauses; and third, when everything is written and done, the last words of the sentence are normally the most emphatic of all. These are structurally inherent in the English language itself, as we will see more clearly when we study them in closer detail.
Carrillo then mentions three important concepts:
1. The initial and terminal positions of sentences are prime.
2. The main clause gets more emphasis than subordinate clauses.
3. The last words of the sentence are normally the most emphatic.
The takeaway
Clearly, all this shows that, although interrelated, information gain, keyword wallop, and relevancy are not the same thing. Relevancy is more along the lines of “aboutness”, “eliteness”, and a few other semantic concepts.
The problem is that there is a relevance perception divide between machines and end-users, a topic that we have discussed before. See this link:
Still thinking about the keyword density/spamming crap?
This morning the editors of Communications in Statistics: Theory and Methods confirmed that they have accepted and will publish my peer-reviewed paper on a new model for statistical analysis. It should be out in 2012.
Once it is published, you will understand the SEO (* SEOmoz, I should say) nonsense of computing arithmetic averages of correlation coefficients, and why some meta-analysis studies published in the past (* Hunter-Schmidt; Hedges-Olkin) are flawed and invalid.
It took me several meals and research hours to figure it out. I hope that IR researchers, data miners, and statistics colleagues find new applications for the model.
The model can be applied to many fields, including marketing, business, risk analysis, data mining, signal processing, engineering, clinical trials, and almost any field or knowledge domain that involves the calculation of weighted statistics. I look forward to discussing it online once it is published.
Happy New Year.
PS. (*) I’ve edited this post to make these points obvious. So, the issue of arithmetically averaging correlations has been raised and killed for good before the scientific and statistical community.
PS. Just in: Last night (Jan-03-2012) I received news from one of the editors of the journal that the paper was assigned to issue 41 (8). Check for its title: The Self-Weighting Model (in Spanish, something like “El Modelo de Autoponderación”). I forgot to mention that this journal is published biweekly, so things are moving fast. What a way of ending 2011 and starting 2012!!!
Since at this time we haven’t launched an official blog, this post goes…
We are excited to announce several updates to the Minerazzi crawler. This is the online version of the indexing crawler used by the Minerazzi search engine (beta).
The long-term goal is to turn this version into a multifunctional mining platform and a crawler for the IT masses; i.e., a crawler to be used by IR researchers, data miners, webmasters, developers, etc. That is, a crawler that even Web designers and the average public can use.
You’re welcome to give it a try. Keep in mind the tool is still in beta. While you are there, feel free to also test the multiple whois domain name tool.