Regression metrics

RMSE: root mean square error

This is probably the most common metric used to assess the quality of a regression task. The RMSE is calculated as
$$RMSE = \sqrt{\frac{ \sum_{i=1}^n (y_i - \hat{y_i})^2 }{n}} \ ,$$
where $n$ is the number of samples in the set, $y$ the actual value and $\hat y$ the predicted value (the difference between the actual and the predicted value being called the residual). This metric represents the square root of the average of the squared differences between the actual and the predicted values.
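
The RMSE formula can be sketched in a few lines of plain Python. This is only an illustration: the function name `rmse` and the sample values are assumptions made for the example, not part of the original text.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error of two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Residuals: actual value minus predicted value, sample by sample.
    residuals = [y - y_hat for y, y_hat in zip(y_true, y_pred)]
    # Average the squared residuals, then take the square root.
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Illustrative data (made up for this example).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]

# Squared residuals are 1, 0, 1, 4, so this is sqrt(6 / 4), roughly 1.2247.
print(rmse(y_true, y_pred))
```

Because the residuals are squared before averaging, a single large error weighs more heavily on the RMSE than several small ones.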

RSS: residual sum of squares

The RSS is calculated as
$$RSS = \sum_{i=1}^n (y_i - \hat{y_i})^2$$
The RSS measures the unexplained variation, that is, the variation in the data not captured by the model.
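
As a minimal sketch, the RSS is the same sum of squared residuals that appears under the square root in the RMSE, without the division by $n$. The function name `rss` and the data are assumptions chosen for illustration.

```python
import math


def rss(y_true, y_pred):
    """Residual sum of squares of two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    # Sum of squared differences between actual and predicted values.
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))


# Illustrative data (made up for this example).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]

total = rss(y_true, y_pred)
print(total)  # 1 + 0 + 1 + 4 = 6.0

# The RMSE follows from the RSS by averaging and taking the square root.
print(math.sqrt(total / len(y_true)))
```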

$R^2$: coefficient of determination

The coefficient of determination, usually denoted $R^2$, expresses the proportion of the variance in the dependent variable that is predictable from the independent variable(s). It is a number less than or equal to 1, with 1 being the best case.
With $\hat{y_i}$ the predicted values and $y_i$ the actual values, we calculate the average of the actual values
$$\bar y = \frac{1}{n} \sum_{i=1}^n y_i \ ,$$
the total sum of squares
$$SS_{TOT} = \sum_{i=1}^n (y_i - \bar y)^2$$
and the explained sum of squares
$$SS_{exp} = \sum_{i=1}^n (\hat{y_i} - \bar y)^2$$
With the definition of the RSS from above, we have
$$R^2 = 1 - \frac{RSS}{SS_{TOT}}$$
The second term is the ratio of the unexplained variance to the total variance in the data, so $R^2$ is the fraction of the total variance that is explained by the model.
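
The computation of the coefficient of determination from the RSS and the total sum of squares can be sketched as follows. The function name `r_squared` and the data are assumptions for illustration only. The zero-variance check is a choice made for this sketch, since the ratio is undefined when all actual values are identical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, computed as 1 - RSS / SS_TOT."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average of the actual values.
    y_bar = sum(y_true) / len(y_true)
    # Residual sum of squares: variation not captured by the model.
    rss = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    # Total sum of squares: variation of the actual values around their mean.
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1 - rss / ss_tot


# Illustrative data (made up for this example).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]

# RSS = 6 and SS_TOT = 20, so the result is close to 0.7.
print(r_squared(y_true, y_pred))

# A perfect prediction gives 1; always predicting the mean gives 0.
print(r_squared(y_true, y_true))
print(r_squared(y_true, [6.0, 6.0, 6.0, 6.0]))
```

With this definition the value can also be negative, which happens when the model fits the data worse than simply predicting the mean of the actual values.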

MAE: mean absolute error

The MAE is calculated as
$$MAE = \frac{1}{n} \sum_{i=1}^n |y_i - \hat{y_i}| \ ,$$
that is, as the average of the absolute differences between the actual and the predicted values.
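
A minimal sketch of the MAE, assuming the usual definition in which each residual enters through its absolute value so that positive and negative errors do not cancel. The function name `mae` and the data are assumptions for illustration.

```python
def mae(y_true, y_pred):
    """Mean absolute error of two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average of the absolute residuals.
    return sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred)) / len(y_true)


# Illustrative data (made up for this example).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]

print(mae(y_true, y_pred))  # (1 + 0 + 1 + 2) / 4 = 1.0

# For comparison, the plain (signed) mean of the residuals lets errors cancel.
signed_mean = sum(y - y_hat for y, y_hat in zip(y_true, y_pred)) / len(y_true)
print(signed_mean)  # (1 + 0 - 1 - 2) / 4 = -0.5
```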

