Yesterday we launched the US University Sites Collection (http://www.minerazzi.com/usc). This is a miner built with the Minerazzi platform that allows users to search, mine, and recrawl all top university sites from the US. Note that in this miner your query should be about university sites. In general, our topic-specific miners are not meant for generic queries. All this is explained on our site.
You can try it by using [ stanford ], [ cornell ], or [ MIT ] as the query. Then just click the Search Inside link of any of the search results and you are on your way to discovering or building collections out of these university sites. You can also use any of the platform's dozen extraction tools to mine each result.
You can do the same with any of our miners. For instance, using the Information Retrieval Collection miner (http://www.minerazzi.com/irc), search for [ Gerard Salton ]. You will soon see dozens of Salton articles and be on a journey of discovering useful resources. If you go to the first result and click the chained arrow icon at the far right, the Search Inside tool will recrawl that result.
Then go to the Internal Links list, find the link result that says “Browse All”, and click the same icon again to see over 1,400 results from Cornell’s dSpace community list. Soon you will be discovering more resources.
You can also search for [ ecommons ] and mine Cornell’s eCommons database, as well as any subsequent database, directory, or library resource that you come across during the discovery journey.
Note that with Minerazzi, users can search and mine records straight from the search result pages, something that cannot be done with Google, Bing or Yahoo. So the platform might benefit researchers, librarians, students, and the general public.
To sum up, what the Minerazzi platform proposes is a new search paradigm: to turn searchers into data miners.