
One of the most interesting problems in data mining and cluster analysis is how to transform similarities into distances without violating the triangle inequality that a distance metric must satisfy.

Some of the transformations found in the literature are based on heuristics and tricks of the trade, or based on assumptions applicable to a given knowledge domain. This topic is discussed in our tutorial on distance and similarity (http://www.minerazzi.com/tutorials/distance-similarity-tutorial.pdf).

We have incorporated into our Cosine Similarity Calculator a simple methodology that transforms cosine similarities into distances while meeting the requirements for a distance metric. It all boils down to mean-centering the variables.
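This post does not spell out the exact formula the calculator uses, so the following is only a minimal sketch of one well-known approach, not necessarily the tool's implementation. It assumes that each vector is mean-centered, that the cosine similarity of the centered vectors (which equals the Pearson correlation r) is computed, and that the distance is taken as sqrt(2(1 - r)). That quantity is the Euclidean distance between the centered, unit-length vectors, so it obeys the triangle inequality. The function names are hypothetical and chosen for illustration.

```python
import math


def center(x):
    """Subtract the mean from every component (mean-centering)."""
    mean = sum(x) / len(x)
    return [xi - mean for xi in x]


def unit(x):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    if norm == 0:
        raise ValueError("zero-length vector has no direction")
    return [xi / norm for xi in x]


def cosine(x, y):
    """Cosine similarity: dot product of the unit-length vectors."""
    return sum(a * b for a, b in zip(unit(x), unit(y)))


def centered_cosine_distance(x, y):
    """Assumed transform: d = sqrt(2 * (1 - r)), r = cosine of centered vectors."""
    r = cosine(center(x), center(y))
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 2.0 * (1.0 - r)))


if __name__ == "__main__":
    a = [1, 2, 3, 4]
    b = [2, 1, 4, 3]
    c = [4, 3, 1, 2]
    d_ab = centered_cosine_distance(a, b)
    d_bc = centered_cosine_distance(b, c)
    d_ac = centered_cosine_distance(a, c)
    print(d_ab, d_bc, d_ac)
    print("triangle inequality holds:", d_ac <= d_ab + d_bc)
```

Under this assumed transform, perfectly correlated vectors are at distance 0 and perfectly anti-correlated vectors are at the maximum distance of 2.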

Check our revamped and improved Cosine Similarity Calculator tool now at