The Information Retrieval Collection (IRC)

A new miner is available at Minerazzi.com: The Information Retrieval Collection (http://www.minerazzi.com/irc). What can you do with it? Use this … More

Building topic-specific collections, the easy way

We have improved the Minerazzi platform (http://www.minerazzi.com) by adding new features. These include an internal filter for deduplicating URLs, which … More

Lessons learned from building an IR collection

We are currently building the Information Retrieval Collection (IRC) with the Minerazzi platform. URLs pointing to resources like articles from … More

Improving the Data Structures and Algorithms Collection

We have almost doubled the index of the Data Structures and Algorithms (DSAC) miner. In addition, we are moving to … More

Unveiling Link Honey Pots with Minerazzi

In Web Spam Taxonomy, Gyongyi and Garcia-Molina describe several web spam techniques, one being honey pots. They describe these as … More

Minerazzi: Allowing Users to Recrawl Search Results

Effective immediately, Minerazzi (http://www.minerazzi.com) allows users to recursively recrawl search results. Why is recrawling so important? The purpose of allowing … More
