Crimes Against Meta Data: and sites

While building a financial miner, we found sites committing a lot of crimes against meta data. The most recent are courtesy of the and sites. Perhaps they are the result of bad copy rewritten by software or humans?

These are great sites for finding financial and business information, but some of their pages contain poorly written meta tag data that make indexers go ga-ga gu-gu.

To illustrate, check the meta description tags of the pages at the following URLs:

Links and CSS style instructions declared as meta description data? Great!
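
For readers who want to check their own pages, here is a minimal sketch of how such descriptions might be flagged. It uses Python's standard html.parser module. The SUSPICIOUS pattern and the suspicious_descriptions helper are our own illustrative assumptions, not the checks any particular indexer runs.

```python
import re
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Collect the content of every <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.descriptions = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "description":
            self.descriptions.append(attr.get("content", ""))


# Hypothetical heuristics: HTML tags, raw links, or CSS rule blocks
# have no business inside a description meant for human readers.
SUSPICIOUS = re.compile(
    r"<\s*/?\s*[a-z][^>]*>|https?://|[\w.#-]+\s*\{[^}]*\}",
    re.IGNORECASE,
)


def suspicious_descriptions(html):
    """Return the meta descriptions that look like markup, links, or CSS."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    return [d for d in parser.descriptions if SUSPICIOUS.search(d)]
```

Feeding the function the HTML source of a page returns only the descriptions that trip one of the heuristics; an empty list means nothing looked out of place.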

Finally…Enjoy the Ride!

Finally, Minerazzi is here and open for business, after 1 year in beta.
Enjoy the ride without registration at

Sentiment Viz: A Great Tweet Sentiment Visualization Tool

While developing new tools for our platform, we came across Prof. Christopher Healey’s work on data mining and visualization. For those interested in these subjects, Prof. Healey has an incredible site, full of superb resources for data mining the Web. For instance, he has developed several great tools for analyzing sentiment from tweets. These are available at

  1. Healey, C., Ramaswamy, S. (Accessed on 6-17-2014). Visualizing Twitter Sentiment.
  2. Healey, C. (Accessed on 6-17-2014). Sentiment Viz. Tweet Sentiment Visualization Tool.

Amazon’s, Facebook’s, and Twitter’s URL naming patterns

Discovering Amazon’s, Facebook’s, and Twitter’s URL naming patterns is easy with MHM,

An MHM search for retrieves

Whereas an MHM search for retrieves

Finally, an MHM search for returns

Finding Resources with MHM

Following our previous post on our recent host mining tool, available at

If a search retrieves few results, try submitting one of those results as the initial query, or try one of the alternate searches the tool suggests. An example follows:

A search for “” returns 0 results. One of the alternate searches is “”. Submitting this new query returns 260+ results.


Praise MHM.

Improving MHM, our hosts mining tool

We have improved our Minerazzi Hosts Miner (MHM) available at

The tool now provides alternate searches. We found that these alternate searches sometimes retrieve additional resources.

Among other useful applications, the tool simplifies the building of topic-specific collections and micro-indexes.

For instance, querying retrieves 6 results:

MHM then suggests several alternate searches. One of these is

Querying this new address (which at the time of writing resolves to, retrieves 74 new results:


MHM: An Interesting Host Mining Tool

MHM is a tool for discovering sites that share the same host or IP address, sites that are affiliated with each other, and sites that might be your competitors. It is available at

It is great for discovering domain names branded with keywords or known brand names. It is also excellent for discovering spam communities, domainers, and more.

You may also use it to build micro-indexes and topic-specific collections (as we do) or to chase down communities of interest to law enforcement agencies, recruiters, etc.
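
To give a feel for the general idea of finding sites on the same IP, here is a minimal sketch that groups a list of hostnames by the address they resolve to. It is only an illustration and not how MHM works internally. The group_by_ip name and its resolve parameter are our own assumptions; by default it uses the standard socket.gethostbyname call, which needs network access.

```python
import socket
from collections import defaultdict


def group_by_ip(hostnames, resolve=socket.gethostbyname):
    """Group hostnames by resolved IPv4 address.

    Only addresses shared by two or more hostnames are returned.
    `resolve` is injectable so the function can be tried offline.
    """
    groups = defaultdict(list)
    for name in hostnames:
        try:
            ip = resolve(name)
        except OSError:
            # Covers socket.gaierror: skip names that do not resolve.
            continue
        groups[ip].append(name)
    return {ip: sorted(names) for ip, names in groups.items() if len(names) > 1}
```

Note that this only compares hostnames you already know about. Discovering previously unknown sites on a given host requires an index of hosts, which is the part a tool like MHM provides.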

A Domain Intelligence Tool

A new domain intelligence tool is available now.
This tool checks if a brand, product, service, subdomain, initials, or keywords have been registered as a domain name. It helps you to secure the Web presence of your intellectual property while helping you to identify cybersquatters, domain brokers, and domainers.
Update: Minor glitches fixed today. Have fun :)
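
For the curious, the kind of check described above can be roughly approximated as follows. This is a sketch under our own assumptions and not the tool's actual implementation: candidate_domains and find_resolving are hypothetical helpers, the list of TLDs is arbitrary, and a DNS lookup is only a rough proxy for registration, since a registered domain may have no address record at all.

```python
import re
import socket
from itertools import product


def candidate_domains(terms, tlds=("com", "net", "org")):
    """Build candidate domain names from brands, products, or keywords."""
    labels = []
    for term in terms:
        words = re.findall(r"[a-z0-9]+", term.lower())
        if not words:
            continue
        # Try both the joined form and the hyphenated form of each term.
        for label in ("".join(words), "-".join(words)):
            if label not in labels:
                labels.append(label)
    return [f"{label}.{tld}" for label, tld in product(labels, tlds)]


def find_resolving(domains, resolve=socket.gethostbyname):
    """Return the candidates that currently resolve to an address."""
    found = []
    for domain in domains:
        try:
            resolve(domain)
        except OSError:
            continue  # Did not resolve; possibly unregistered.
        found.append(domain)
    return found
```

A candidate that resolves is worth a closer look, for example with a WHOIS query, before drawing any conclusion about who holds it.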