
The news miner (http://www.minerazzi.com/news) was built for indexing and mining newspapers. However, you can use it to mine news aggregation sites like HuffingtonPost, DrudgeReport, Topix, Google News, Yahoo News, Bing News, and many more. Just visit the above link and search for any of those sites.

After that, you can recursively crawl these sites with Minerazzi’s Search Inside and Recrawl It tools. The two tools are complementary, so if one returns no results, try the other.

To illustrate, HuffingtonPost and DrudgeReport are two of the most user-friendly and content-rich news sites on the Web. They are great sources for building news collections on topics like politics.

By searching for [ huffingtonpost ] or for [ drudgereport ] you can discover additional news services and even follow specific authors and their posts. You can then start building curated collections of news services, authors, and their posts.

When building collections from news services, if a remote host is busy you may want to retry it at another time. However, if the remote host denies you service, you are out of luck. This is not really a drawback: there are zillions of friendly hosts out there that will provide you with rich content, so the ones that refuse connections are expendable.
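
If you script your own collection building, the retry-or-drop rule above can be sketched in a few lines of Python. This is only an illustration, not how Minerazzi itself works: the function name fetch_with_retry, the fetch callable, and the choice of which HTTP status codes count as "busy" (429, 503) or "denied" (401, 403) are all assumptions made for the example.

```python
import time
from typing import Callable, Optional, Tuple

# Assumption: these status codes stand in for "host is busy" and
# "host denies service". Adjust them to what your hosts actually return.
BUSY = {429, 503}
DENIED = {401, 403}


def fetch_with_retry(
    fetch: Callable[[str], Tuple[int, str]],
    url: str,
    max_attempts: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return the page body, or None if the host should be dropped.

    `fetch` is a hypothetical callable that takes a URL and returns
    a (status_code, body) pair; plug in your own HTTP client.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = fetch(url)
        if status == 200:
            return body
        if status in DENIED:
            # The host refuses service: treat it as expendable.
            return None
        if status in BUSY and attempt < max_attempts:
            # The host is busy: wait a little longer each time, then retry.
            sleep(delay_seconds * attempt)
            continue
        # Out of attempts, or a status this sketch does not handle.
        return None
    return None
```

A busy host gets up to max_attempts tries with a growing pause between them, while a host that denies service is dropped after the first response so you can move on to a friendlier one.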