One useful application of Minerazzi’s URL Scoring Tool, which we just launched, is gathering IP intelligence on a list of banned domains. Domains that resolve to the same or similar IPs are usually in a shared hosting environment, have common ownership, or both. This is one more piece of information that can help you identify who is behind a set of domain names.
To do this, search Google for [banned domains] and follow a result that points to a list of domain names banned by a web property. Then paste the list into the MUST textarea and submit it. Make sure the URLs are carriage-return delimited, that is, one per line. In general, you can run the same analysis on lists of parked domains, hacker sites, registrar companies, affiliate program URLs, etc.
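For readers who want to try the same idea offline, here is a minimal Python sketch of grouping a list of domains by the IP they resolve to. It is not how the URL Scoring Tool is implemented; the function names `group_by_ip` and `shared_ips` and the `"unresolved"` label are hypothetical, and the sketch expects bare host names rather than full URLs.

```python
import socket
from collections import defaultdict


def group_by_ip(domains, resolve=socket.gethostbyname):
    """Map each resolved IPv4 address to the list of domains that share it.

    `resolve` is injectable so the grouping logic can be tried without DNS.
    """
    groups = defaultdict(list)
    for line in domains:
        domain = line.strip().lower()
        if not domain:
            continue  # skip blank lines in the pasted list
        try:
            ip = resolve(domain)
        except OSError:
            # socket.gaierror is a subclass of OSError: the lookup failed
            ip = "unresolved"
        groups[ip].append(domain)
    return dict(groups)


def shared_ips(groups):
    """Keep only the addresses that host two or more of the listed domains."""
    return {
        ip: names
        for ip, names in groups.items()
        if ip != "unresolved" and len(names) > 1
    }
```

A list pasted one domain per line can be passed as `group_by_ip(text.splitlines())`. A shared IP only suggests shared hosting or common ownership, so treat the output as a lead to follow up on, not as proof.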
Here is the result of checking this old list of Banned Domains by Reddit. To properly interpret the results, visit our tool’s page.
Here is a nice example of how gathering IPs can be used for business intelligence purposes.
Want to know what happened with those Old Glory URLs from the Golden Age? Now you can, with the URL Scoring Tool. Just submit up to 100 of them.
The tool will check whether they are still accessible, redirect, or have similar IPs. Nice for mining legacy web properties.
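As a rough illustration of the accessible/redirect distinction, the Python sketch below requests a URL without following redirects and labels the response by its HTTP status code. The labels and the helper names `classify` and `fetch_status` are assumptions made for this example; the tool's own checks and scoring may work differently.

```python
import http.client
from urllib.parse import urlsplit


def classify(status_code):
    """Label an HTTP status code the way a simple link checker might."""
    if status_code is None:
        return "unreachable"  # no response at all (DNS failure, timeout)
    if 200 <= status_code < 300:
        return "accessible"
    if 300 <= status_code < 400:
        return "redirect"
    return "error"  # 4xx and 5xx responses, plus anything unexpected


def fetch_status(url, timeout=5):
    """Send a HEAD request and return the raw status code, or None on failure."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()
```

`http.client` does not follow redirects on its own, which is why a 301 or 302 is visible here as a status code. Some servers reject HEAD requests, so a real checker would probably fall back to GET.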
Minerazzi industry-specific and region-specific working demos and new resources. Available now!
Learning about Grover’s Algorithm: Quantum Database Search.
- Grover, L. K. (1996). A fast quantum mechanical algorithm for database search.
- Grover, L. K. (1997). Quantum mechanics helps in searching for a needle in a haystack. Phys. Rev. Lett. 79, 325.
- Lavor, C., Manssur, L.R.U., and Portugal, R. (2008). Grover’s Algorithm: Quantum Database Search.
- Wikipedia. Grover’s algorithm.
Learn about the power of X Searches (short for XOR and XNOR searches) for keyword discovery, disambiguation, clustering, information retrieval, and data mining in general.
This is a follow-up to the Beauty of XOR and XNOR Searches post. It describes possible applications of these search modes to Information Retrieval, Search Marketing, and Web Mining. The post is a snippet taken from http://www.minerazzi.com/help/xor-xnor.php
An IR researcher can test the performance of an LSI algorithm with a sample of documents retrieved through XOR and XNOR searches. Said sample should be rich in co-occurrence cases. Using a similar procedure, search marketers or Web intelligence specialists can identify sets of documents that emphasize keywords somehow related through different co-occurrence paths.
An interesting application is to extract all the unique terms (or just the high-frequency ones) from a text source and construct an XOR query with them. We may refer to this as XORing a text source. This should help identify a network of co-occurrence paths over a collection, and which documents might be relevant to specific combinations of terms from the original source.
The text source can be a title, description, abstract, or paragraph of a document, or even an entire document. However, XORing a large document might be computer-intensive.
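The term-extraction step can be sketched in a few lines of Python. This is only an illustration: the function name `xor_query`, the `min_count` threshold, and the `term1 XOR term2` query syntax are assumptions, so check the help page for the syntax the search box actually accepts.

```python
import re
from collections import Counter


def xor_query(text, min_count=1, operator="XOR"):
    """Join the unique terms of `text` into a single query string.

    The `term1 XOR term2 ...` output format is assumed for illustration.
    """
    # Lowercase and keep alphanumeric runs; punctuation acts as a separator.
    terms = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(terms)
    # Counter keeps first-seen order, so the query mirrors the source text.
    unique = [term for term in counts if counts[term] >= min_count]
    return f" {operator} ".join(unique)
```

For example, `xor_query("Quantum search, quantum database")` returns `quantum XOR search XOR database`, and raising `min_count` to 2 keeps only `quantum`. Passing `operator="XNOR"` builds the XNOR variant. The sketch does no stop-word filtering, so a long source will produce a long query, in line with the cost caveat above.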
A similar exercise can be done by XNORing a text source. In both cases, the resultant output can be used to identify prospective competitors; i.e., documents relevant to similar concepts or belonging to companies within the same business space.
We are currently testing the XOR and XNOR search modes as a query disambiguation strategy.
PS. Today, 1-9-2014, we added new material that discusses these search modes for disambiguation and clustering. :)
Who said that IR and LSI cannot be fun?
Detecting Cyberbullying: Query Terms and Techniques