I am writing a tutorial series on Cluster Analysis. It is my pleasure to announce that the
Association and Scalar Clusters Tutorial – Part 1: Back Mapping Term Clusters to Documents was uploaded few days ago.
Online publication was announced in advanced to subscribers of the IR Watch – The Newsletter, so they already have an edge over regular readers and visitors of Mi Islita
In this tutorial you will learn how to extract association and scalar clusters from a term-document matrix. A “reaction” equation approach is used to break down the classification problem to a sequence of steps. From the initial matrix, two similarity matrices are constructed, and from these association and scalar clusters are identified. A back mapping technique is then used to classify documents based on their degree of pertinence to the clusters. Matched documents are treated as distributions over topics. Applications to topic discovery, term disambiguation, and document classification are discussed.
During last night lecture (Web Mining Course), I applied the back mapping technique to scalar clusters generated from LSI. The technique provides additional information and reasons as to how and why documents score as observed after implementing SVD. A clear connection with Fuzzy Set Theory was made.
Students taking the Web Mining Course will find this tutorial quite handy.