Singular value decomposition

What is it?

A singular-value decomposition (SVD) of a high-dimensional matrix gives a low-dimensional representation of it. Given a matrix $M$, an $m \times n$ matrix of rank $r$, we can find matrices $U$, $\Sigma$ and $V$ such that we have the decomposition $M = U \Sigma V^t$. These matrices satisfy the following conditions:
1. $U$ is an $m \times r$ matrix and is column-orthonormal (each column is a unit vector and the dot product of any two columns is 0);
2. $V$ is an $n \times r$ matrix, also column-orthonormal;
3. $\Sigma$ is an $r \times r$ diagonal matrix and the elements on its diagonal are called the singular values of $M$.
$U$, $\Sigma$ and $V$ are called the SVD components of $M$. The singular values represent "hidden concepts" which connect the other two matrices together. SVD is often employed in recommender systems.
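As a quick check of these conditions, here is a minimal sketch in NumPy; the matrix and its values are made up for illustration:

```python
import numpy as np

# A small ratings-style matrix (made-up values; rows = people, columns = movies).
M = np.array([[5.0, 5.0, 0.0],
              [4.0, 4.0, 0.0],
              [0.0, 1.0, 5.0],
              [0.0, 0.0, 4.0]])

# Reduced ("economy") SVD: np.linalg.svd returns U, the singular values s
# in descending order, and V^t.
U, s, Vt = np.linalg.svd(M, full_matrices=False)
Sigma = np.diag(s)

# The decomposition reconstructs M.
assert np.allclose(M, U @ Sigma @ Vt)

# Column orthonormality: U^t U = I and V^t V = I.
assert np.allclose(U.T @ U, np.eye(U.shape[1]))
assert np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0]))
```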

A small example, in words

If $M$ contains the ratings given by people to movies, with people on the rows and movies on the columns, then $U$ connects people to concepts, $V$ connects movies to concepts and $\Sigma$ gives the strength of each concept.
A person not represented in $M$ wants to know which movies they would like based on the ones they have seen. If $q$ is the row vector holding the movies the person rated, $qV$ maps the person to the concept space, and this can be mapped back to the movie space by multiplying by $V^t$.
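Continuing the NumPy snippet above, with a made-up ratings vector for the new person:

```python
# The new person's ratings over the same three movies (0 = not rated).
q = np.array([5.0, 0.0, 0.0])

concepts = q @ Vt.T        # q V: the person expressed in concept space
predicted = concepts @ Vt  # (q V) V^t: estimated affinity for every movie
print(predicted)
```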
To find users similar to a given user, we can map all users to the concept space using $V$ and then apply some similarity measure there.
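For instance, cosine similarity between all pairs of users in the concept space (again continuing the snippet above):

```python
# Map every user (row of M) to the concept space via V, then compare
# the rows with cosine similarity.
users = M @ Vt.T
unit = users / np.linalg.norm(users, axis=1, keepdims=True)
similarity = unit @ unit.T  # entry (i, j) = cosine similarity of users i and j
print(np.round(similarity, 2))
```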

Where is the dimensionality reduction part?

If $M$ is a very large matrix, its SVD components will be large as well. But in real-world uses of the technique, approximate decompositions are employed instead, so as to reduce the original dimensionality.
Setting the $s$ smallest singular values to 0 and eliminating the corresponding $s$ columns of $U$ and $V$, we obtain an approximate representation. This procedure works well because it can be shown that it minimises the root mean square error between $M$ and its approximation, or equivalently the Frobenius norm of their difference.
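In code, the truncation looks like this (continuing the snippet above and keeping only the largest singular value):

```python
# Keep the k largest singular values (s is already sorted in descending
# order) and drop the matching columns of U and V (i.e. rows of Vt).
k = 1
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
M_approx = U_k @ np.diag(s_k) @ Vt_k  # rank-k approximation of M
```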
Let $M = PQR$ be a generic decomposition; then $m_{ij} = \sum_k \sum_l p_{ik} q_{kl} r_{lj}$ and

$$\begin{aligned} ||M||^2 &= \sum_i \sum_j m_{ij}^2 \\ &= \sum_i \sum_j \Big( \sum_k \sum_l p_{ik} q_{kl} r_{lj} \Big)^2 \\ &= \sum_i \sum_j \sum_k \sum_l \sum_n \sum_m p_{ik} q_{kl} r_{lj} \, p_{in} q_{nm} r_{mj} \end{aligned}$$
Now, if $P$, $Q$ and $R$ are the SVD components of $M$, then $Q$ is diagonal, $P$ is column-orthonormal and $R$ is row-orthonormal (note the transpose in the SVD equation). Orthonormality means $\sum_i p_{ik} p_{in}$ is 1 when $k = n$ and 0 otherwise, and similarly $\sum_j r_{lj} r_{mj}$ is 1 when $l = m$ and 0 otherwise, while diagonality makes $q_{kl} = 0$ unless $k = l$; the six-fold sum therefore collapses to

$$||M||^2 = \sum_k q_{kk}^2 \ .$$
Calling $P = U$, $Q = \Sigma$ and $R = V^t$, with $\sigma_i$ being the $i$-th diagonal element of $\Sigma$, if we keep the first $n$ elements of $\Sigma$ and set the remaining ones to 0, obtaining a $\Sigma'$, we get

$$M' = U \Sigma' V^t \ ,$$
which is an approximation of $M$. The matrix giving the resulting errors is $M - M' = U (\Sigma - \Sigma') V^t$, whose norm is (following from the above)

$$||M - M'||^2 = \sum_k (\Sigma - \Sigma')_{kk}^2 \ .$$

The matrix $\Sigma - \Sigma'$ has 0 in the first $n$ diagonal elements and $\sigma_i$ in the remaining ones, with $n < i \leq r$. So, $||M - M'||^2$ is the sum of the squares of the elements of $\Sigma$ which were set to 0. In order to minimise it, we pick the smallest elements of $\Sigma$.
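A quick numeric check of this identity on the truncation computed above:

```python
# The squared Frobenius norm of the error equals the sum of the squares
# of the dropped singular values.
error = np.linalg.norm(M - M_approx, 'fro') ** 2
assert np.isclose(error, np.sum(s[k:] ** 2))
```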

Computing the SVD of a matrix

The computation is strictly connected to the eigenvalues of the symmetric matrices $M^t M$ and $M M^t$.
If $M = U \Sigma V^t$, then $M^t = V \Sigma^t U^t$, and because $\Sigma$ is diagonal, $\Sigma = \Sigma^t$, so $M^t = V \Sigma U^t$. Using the orthonormality of $U$ ($U^t U = I$), this means

$$M^t M = V \Sigma U^t U \Sigma V^t = V \Sigma^2 V^t \ ,$$

and multiplying both sides by $V$ on the right (using $V^t V = I$),

$$M^t M V = V \Sigma^2 \ .$$
Now, $\Sigma^2$ is diagonal, with entries being the squares of $\Sigma$'s entries. This last equation says that the columns of $V$ are eigenvectors of $M^t M$ and $\Sigma^2$ is the matrix of the corresponding eigenvalues. So, by computing the eigendecomposition of $M^t M$ we obtain the eigenvectors and the singular values; only $U$ remains.
Similarly,

$$M M^t = U \Sigma V^t V \Sigma U^t = U \Sigma^2 U^t \ ,$$

so

$$M M^t U = U \Sigma^2 \ ,$$

which means that the columns of $U$ are eigenvectors of $M M^t$.
$M^t M$ is an $n \times n$ matrix, $M M^t$ is $m \times m$, and $n, m \geq r$. Hence, $M^t M$ and $M M^t$ have $n - r$ and $m - r$ additional eigenvectors which do not show up in $U$, $V$ and $\Sigma$. Since $r$ is the rank of $M$, all the corresponding eigenvalues are 0.
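A sketch of this route in NumPy on the made-up matrix from earlier. One caveat not mentioned above: the sign of each eigenvector is arbitrary, so rather than taking $U$ from the eigenvectors of $M M^t$ directly, it is recovered here from $M V = U \Sigma$, which keeps the signs consistent:

```python
# Eigendecomposition of the symmetric matrix M^t M: the eigenvalues are
# the squared singular values and the eigenvectors are the columns of V.
# np.linalg.eigh returns eigenvalues in ascending order, so re-sort.
eigvals, eigvecs = np.linalg.eigh(M.T @ M)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

r = np.linalg.matrix_rank(M)
sing = np.sqrt(np.clip(eigvals[:r], 0.0, None))  # singular values
V_r = eigvecs[:, :r]

# Recover U column by column from M v_i = sigma_i u_i.
U_r = (M @ V_r) / sing

# U Sigma V^t reconstructs M up to floating-point error.
assert np.allclose(M, U_r @ np.diag(sing) @ V_r.T)
```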