Week 3 Agenda
Lecture Session
Document Indexing
Web Crawling Techniques
Lab Session
For this lab, students should have already signed up to download Terrier from http://ir.dcs.gla.ac.uk/terrier. We can use the Desktop API as is, but for development we need Java installed on the local machine.
Lab instructions for using the API will be provided in class. Please read the Terrier documentation in advance. Bring a directory (folder) full of documents from the pupr.edu site, or from your favorite site, to play with. These documents will be analyzed during the lab.
This lab report is due next week.
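One possible way to build the folder of documents requested above is a small download script. The following is only a minimal sketch using the Python standard library: the function names `url_to_filename` and `save_pages`, the output folder name, and the example URL are assumptions made for illustration, and it is not a full crawler (it does not follow links or check robots.txt).

```python
import os
import re
import urllib.error
import urllib.request
from urllib.parse import urlparse


def url_to_filename(url):
    """Turn a URL into a flat, filesystem-safe file name (hypothetical helper)."""
    parsed = urlparse(url)
    path = parsed.path or "/"
    if path.endswith("/"):
        path += "index.html"  # give directory-style URLs a file name
    raw = parsed.netloc + path
    # Replace every run of characters that is unsafe in file names with "_".
    return re.sub(r"[^A-Za-z0-9._-]+", "_", raw).strip("_")


def save_pages(urls, out_dir):
    """Download each URL into out_dir and return the list of saved paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for url in urls:
        target = os.path.join(out_dir, url_to_filename(url))
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                data = response.read()
        except (urllib.error.URLError, OSError):
            continue  # skip pages that cannot be fetched
        with open(target, "wb") as handle:
            handle.write(data)
        saved.append(target)
    return saved


# Example call (the URL list is a placeholder; list the pages you want):
# save_pages(["http://www.pupr.edu/"], "lab_docs")
```

The resulting folder can then be selected as the collection to index in Terrier Desktop.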
Greetings Professor,
Do we also have to submit the labs in digital format, but in a compressed file, so that it can be passed from one flash drive to another?
Thank you.
Hi, Luis:
Yes.
Hello Professor,
I’ve been playing around with Terrier, searching the contents of the collection, and I have learned how to index a specific collection. While the desktop_terrier.bat searches return the correct results, I haven’t been able to get the same from interactive_terrier.bat. It keeps returning “No results”.
I renamed the /etc/terrier.properties file and modified it to suit my document paths. However, with or without modifying the file, it does not return results.
I have searched the Terrier wiki and forums, and even Google, but I haven’t found a solution. Can you shed any light on this issue?
Thanks,
Gina
Hi, Gina:
Try this:
1. Index some files with Terrier Desktop and then do a search.
2. Keeping this one open, try using the interactive_terrier.bat for the same search.
I already did; it’s still not returning any results.
Hi, Gina:
I just ran a fresh search for “inverted file” without a problem. This is what I did. I’m using the Windows version of Terrier from a USB removable drive, and I tested it on Vista and on XP.
1. Ran Terrier, indexing its own documentation.
2. Searched for inverted file as a query.
3. Double-clicked the bin/interactive_terrier.bat file.
4. Searched for inverted file as a query.
After the standard headers, I got:
Set TERRIER_HOME to be J:\terr
WARNING: The file terrier.prop
rrier\etc\terrier.properties
Assuming the value of terrier
INFO – time to intialise index
Please enter your query: inverted file
Displaying 1-82 result
0 326 112 2.908176030920721
1 753 471 2.8691891125133053
2 426 212 2.776936903771998
3 759 477 2.747474445016238
4 424 210 2.691801832968892
5 745 463 2.478234538724751
6 741 459 2.4410476260123852
7 734 452 2.4026089191907287
8 1000 665 2.351714630117981
9 427 213 2.3437764460259336
10 402 188 2.3294244235277524
11 975 640 2.2693842354980465
12 425 211 2.2625182087305125
13 422 208 2.2102952746378635
14 301 87 2.067004898225863
15 429 215 2.06550056347662
16 548 279 2.024848127420526
17 245 78 2.024848127420526
18 197 75 2.007975485950941
19 76 40 1.9954047945790332
20 404 190 1.979005166160501
21 703 421 1.9562281732712443
22 339 125 1.9536831254119893
23 434 220 1.9235037768768704
24 756 474 1.7579234589850747
25 28 10 1.7505691014173455
26 439 225 1.7188073021593937
27 406 192 1.6978730490041303
28 436 222 1.6698926255833755
29 747 465 1.6273903541804449
30 338 124 1.5987128520302278
31 22 5 1.5618933140413125
32 19 2 1.4967606279394443
33 751 469 1.480498642467794
34 124 63 1.4586748548300594
35 437 223 1.4486166892075465
36 749 467 1.424296235607222
37 468 254 1.3813768603244259
38 752 470 1.3636373922215335
39 304 90 1.2695069276928859
40 707 425 1.2595481602180154
41 968 633 1.2427295999967598
42 340 126 1.238697939749856
43 758 476 1.2361806398396333
44 718 436 1.2171117485149614
45 112 51 1.2044548267644286
46 303 89 1.191849419444905
47 1013 678 1.1900218043921074
48 716 434 1.175694715373152
49 305 91 1.1144461445417837
50 373 159 1.071949456510424
51 350 136 1.0442178750271376
52 120 59 1.032249223156422
53 39 21 1.0289066140518082
54 754 472 1.021245562355564
55 431 217 1.0112196888981333
56 40 22 0.9783128316736401
57 33 15 0.952665942151352
58 130 69 0.9323554684113402
59 401 187 0.9298191911149045
60 606 324 0.9031691289936831
61 662 380 0.8991516469240083
62 736 454 0.8835616760231381
63 323 109 0.882295943944168
64 126 65 0.8768326329975594
65 403 189 0.8603341765453334
66 667 385 0.853341277999745
67 742 460 0.8361417417726925
68 668 386 0.8064130750674726
69 412 198 0.7467488439890779
70 123 62 0.745745069118931
71 1007 672 0.6855982352785376
72 121 60 0.6414169643360372
73 115 54 0.612631867415782
74 717 435 0.5569985846392175
75 117 56 0.5425438654855919
76 128 67 0.5311998902670805
77 118 57 0.4688456765488282
78 127 66 0.44246701888818213
79 113 52 0.42745882075231073
80 125 64 0.34905485374098644
81 75 39 0.20774592741474804
Please enter your query:
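To compare the interactive output with what Terrier Desktop shows, it can help to load the result lines into a script. The sketch below is an illustration only: it assumes each result line has four whitespace-separated fields (a rank, two document identifier columns whose exact meaning is not stated here, and a score), and the names `parse_results` and `is_ranked` are hypothetical.

```python
import re

# rank, two identifier columns, score; any other console line is ignored.
RESULT_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+([-+0-9.eE]+)\s*$")


def parse_results(text):
    """Return (rank, id_a, id_b, score) tuples for every result line in text."""
    rows = []
    for line in text.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            rank, id_a, id_b, score = match.groups()
            rows.append((int(rank), id_a, id_b, float(score)))
    return rows


def is_ranked(rows):
    """Check that scores never increase from one rank to the next."""
    scores = [row[3] for row in rows]
    return all(a >= b for a, b in zip(scores, scores[1:]))


sample = """Please enter your query: inverted file
Displaying 1-82 result
0 326 112 2.908176030920721
1 753 471 2.8691891125133053
16 548 279 2.024848127420526
17 245 78 2.024848127420526
Please enter your query:"""

rows = parse_results(sample)
print(len(rows), is_ranked(rows))
```

Pasting the full console output into `sample` gives one tuple per displayed result, which makes it easy to check the result count and the score ordering.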