(Some) mathematical measures

This page collects, mainly for convenience, some useful measures and metrics that come up in several contexts.

The harmonic mean

It is the reciprocal of the mean of reciprocals:

$$h = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
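As a minimal sketch of the formula above, the helper below (the name `harmonic` is made up for illustration) computes the harmonic mean directly and compares it with `statistics.harmonic_mean` from the Python standard library. It assumes all values are strictly positive.

```python
from statistics import harmonic_mean


def harmonic(xs):
    """Reciprocal of the arithmetic mean of the reciprocals.

    Hypothetical helper; assumes every x in xs is strictly positive.
    """
    return len(xs) / sum(1 / x for x in xs)


# Two legs of equal length driven at 40 and 60 (same speed unit):
# the average speed over the whole trip is the harmonic mean, about 48.
speeds = [40, 60]
print(harmonic(speeds))
print(harmonic_mean(speeds))
```

The harmonic mean is never larger than the arithmetic mean of the same positive values (here 48 against 50).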

Common measures of distance/similarity

Euclidean

The Euclidean distance between two vectors $v_a = (x^i)_a$ and $v_b = (x^i)_b$ is the $l_2$ norm of the vector connecting them (it measures its length):

$$d = \sqrt{\sum_i (x^i_a - x^i_b)^2} = ||v_a - v_b||_2$$
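A small sketch of this definition in plain Python follows; the function name `euclidean` is an illustrative choice, and the two inputs are assumed to have the same number of components.

```python
import math


def euclidean(a, b):
    """l_2 norm of the difference between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# The classic 3-4-5 right triangle
print(euclidean((0, 0), (3, 4)))  # 5.0
```

From Python 3.8 onwards the standard library offers `math.dist`, which computes the same quantity.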

Hamming

The Hamming distance is the number of positions at which two lists/strings of equal length differ:

$$A = 110101; \quad B = 111001; \quad d_{AB} = 2$$
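The example above can be checked with a short sketch; `hamming` is a hypothetical helper name, and rejecting inputs of different length is a design choice made here, not something the text prescribes.

```python
def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        # Assumption: the distance is only defined for equal lengths
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# The strings differ in the third and fourth positions
print(hamming("110101", "111001"))  # 2
```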

Jaccard (index)

Given two finite sets A and B, the Jaccard index gives a measure of how much they overlap, as

$$J_{AB} = \frac{|A \cap B|}{|A \cup B|}$$
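With Python's built-in sets the index is nearly a one-liner. In the sketch below the name `jaccard` is illustrative, and raising an error when both sets are empty is an assumption (the ratio is 0/0 in that case, and conventions differ).

```python
def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption: treat the 0/0 case as undefined
        raise ValueError("Jaccard index is undefined for two empty sets")
    return len(a & b) / len(union)


# Two shared elements out of four distinct ones
print(jaccard({1, 2, 3}, {2, 3, 4}))  # 0.5
```

Identical sets give 1 and disjoint sets give 0.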

Manhattan

Also called cityblock, the Manhattan distance between two points is the $l_1$ norm of the vector connecting them. It corresponds to the length of the shortest path a car would take between these two points in Manhattan (which has a grid layout):

$$d = \sum_i |u_i - v_i|$$
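A minimal sketch of the sum above, with an illustrative function name and the assumption that both points have the same number of coordinates:

```python
def manhattan(u, v):
    """Sum of the absolute coordinate differences (l_1 norm)."""
    return sum(abs(a - b) for a, b in zip(u, v))


# 3 blocks along one axis plus 4 blocks along the other
print(manhattan((1, 2), (4, 6)))  # 7
```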

Minkowski

The Minkowski distance is a generalisation of both the Euclidean ($p = 2$) and the Manhattan ($p = 1$) distances to a generic $p$:

$$d = \left(\sum_i |x_i - y_i|^p\right)^{1/p}$$
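The sketch below shows how a single function covers both special cases. The name `minkowski` and the check on `p` are choices made for this example; the check is there because for p < 1 the triangle inequality no longer holds.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between equal-length vectors."""
    if p < 1:
        # Assumption: restrict to p >= 1, where this is a proper distance
        raise ValueError("p must be at least 1")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)


x, y = (0, 0), (3, 4)
print(minkowski(x, y, 1))   # Manhattan case: 3 + 4
print(minkowski(x, y, 2))   # Euclidean case: sqrt(9 + 16)
print(minkowski(x, y, 50))  # large p: close to the largest difference, 4
```

As p grows, the largest coordinate difference dominates the sum, so the result approaches the Chebyshev distance.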

Cosine

The cosine similarity is given by the cosine of the angle $\theta$ spanned by the two vectors:

$$d = \cos \theta = \frac{\bar u \cdot \bar v}{|\bar u| |\bar v|}$$

So two vectors pointing in the same direction would have a cosine similarity of 1, and vectors at $90^{\circ}$ would have a cosine similarity of 0.
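Both limiting cases can be checked numerically. This is only a sketch: `cosine_similarity` is an illustrative name, and refusing zero-length vectors is an assumption, since the angle is undefined for them.

```python
import math


def cosine_similarity(u, v):
    """Dot product divided by the product of the two vector lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: a zero vector has no direction, so reject it
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


print(cosine_similarity((1, 0), (0, 1)))  # perpendicular vectors: 0.0
print(cosine_similarity((1, 2), (2, 4)))  # same direction: 1 up to rounding
```

Only the direction matters: (1, 2) and (2, 4) have different lengths but the same similarity to any third vector.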

Chebyshev

It is also called chessboard distance. In the game of chess, the Chebyshev distance between the centers of two squares is the minimum number of moves a king needs to go from one square to the other.

$$\max_i |u_i - v_i|$$

See the figure here: it reports in red the Chebyshev distance from the square where the king sits (well, there's a drawing for it ...) to every other cell. Note that the king can move horizontally, vertically and diagonally.
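The king example can be reproduced with a few lines. In this sketch squares are written as hypothetical (column, row) integer pairs, and `chebyshev` is an illustrative name.

```python
def chebyshev(u, v):
    """Largest absolute coordinate difference between two points."""
    return max(abs(a - b) for a, b in zip(u, v))


# King on (0, 0), target 3 columns and 5 rows away:
# 3 diagonal moves cover both axes at once, then 2 more along the rows.
print(chebyshev((0, 0), (3, 5)))  # 5
```

A diagonal move reduces both coordinate differences at the same time, which is why the larger of the two fixes the number of moves.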