Week 8 Agenda
Lecture Session
To understand the present we need to look at the past.
Thus, in this lecture we will take an in-depth look at early and current search engine architectures and their “floor plans”. We will look at some of the published papers that started what we have today. Some hard-to-find material will be used. These are actual pieces of history that answer a few “how-did-they-do-that” questions.
Since we cannot cover as many search engine architectures as we would like, I have selected a few of them. They are classified into three categories: early, old glory, and current. The last two systems listed, Lucene and Terrier, are open source projects.
I might add a few more. Either way, this lecture might span two weeks.
Early:
Archie
ALIWEB
WWW Wanderer
WWW Worm
JumpStation
RBSE
Old Glory:
WebCrawler
Lycos
Current:
Google
Lucene
Terrier
Lab Session
Complete the previous lab.