Week 8 Agenda

Lecture Session

To understand the present we need to look at the past.

Thus, in this lecture we will take an in-depth look at early and current search engine architectures and their “floor plans”. We will look at some of the published papers that started what we have today. Some hard-to-find material will be used. These are actual pieces of history that answer a few “how-did-they-do-that” questions.

Since we cannot cover all the search engine architectures we would like to, I have selected a few of them. They are classified into three categories: early, old glory, and current. The last two systems listed (Lucene and Terrier) are open source projects.

I might add a few more. Either way, this lecture might span two weeks.

Early:

Archie
ALIWEB
WWW Wanderer
WWW Worm
JumpStation
RBSE

Old Glory:

WebCrawler
Lycos

Current:

Google
Lucene
Terrier

Lab Session

Complete the previous lab.
