At the last *Search Engines Architecture* lecture we discussed LSI and Terrier. Great questions were raised; some of them follow:

**Q:** How many dimensions to keep?

**A:** This is done by trial and error. I have a research project on this topic. None of the current ways of addressing the problem convinces me.

**Q:** How do we compute a truncated version of the initial matrix, **A**?

**A:** After *SVDing* **A**, truncate **U**, **S**, and **V** by retaining the first k columns of **U** and **V** (rows of **V** transpose) and the first k diagonal elements of **S**. Multiply these as discussed in class to get **A** truncated.

**Q:** To compute the query vector in the reduced space, do we need to compute **A** truncated for each query?

**A:** No. The new coordinates of this vector are defined as

**q_{k} = q^{T}U_{k}S_{k}^{-1}**

This means that **A** can be called from the cache. See the fast track tutorial

http://www.miislita.com/information-retrieval-tutorial/lsi-keyword-research-fast-track-tutorial.pdf

over at the MiIslita.com site.
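The query projection formula above can be sketched as follows. This is a minimal sketch with a made-up toy matrix and query; in practice **U_k** and **S_k** come from the cached SVD:

```python
import numpy as np

# Same hypothetical 4x3 term-document matrix as before
A = np.array([
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k]
S_k_inv = np.diag(1.0 / s[:k])  # S_k is diagonal, so inversion is elementwise

# Hypothetical query over the same 4 terms (e.g., terms 1 and 4 present)
q = np.array([1.0, 0.0, 0.0, 1.0])

# Project the query into the reduced space: q_k = q^T U_k S_k^{-1}
q_k = q @ U_k @ S_k_inv

# Document coordinates in the reduced space are the rows of V_k
V_k = Vt[:k, :].T

# Rank documents by cosine similarity to the query
def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

sims = [cosine(q_k, d) for d in V_k]
print(np.round(sims, 3))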

**Q:** Do I need to compute **A** truncated each time a new document is added or previous ones are modified?

**A:** For small matrices the answer is YES. However, for huge matrices we can resort to updating/appending techniques. Some of these add document vectors without recomputing the previous matrix. Past a certain point, though, this can compromise orthogonality.
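One such appending technique is folding-in, which projects a new document with the same formula used for queries. A minimal sketch, again with a made-up toy matrix and document:

```python
import numpy as np

# Same hypothetical 4x3 term-document matrix as before
A = np.array([
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k]
S_k_inv = np.diag(1.0 / s[:k])

# Hypothetical new document over the same 4 terms
d = np.array([0.0, 1.0, 1.0, 0.0])

# Fold-in: d_k = d^T U_k S_k^{-1}, no SVD recomputation
d_k = d @ U_k @ S_k_inv

# Append the new coordinates to the existing document rows of V_k
V_k = Vt[:k, :].T
V_k_new = np.vstack([V_k, d_k])
print(V_k_new.shape)
```

The catch mentioned above: folded-in rows are not recomputed from a fresh SVD, so after many additions the columns of the extended **V_k** drift away from orthogonality and a full recomputation becomes advisable.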

**Q:** How do I use Desktop Terrier?

**A:** Follow the instructions provided in the updated version of Lab Report 2.