Blurring, inverting, thresholding an image

We're going to use the same sample image used in other pages in this section - a photo I took of two pens on a desk. You will need to import some stuff:

import cv2
from matplotlib import pyplot as plt

Read the image, show it

# read an image with OpenCV
image = cv2.imread('pens.jpg')
# transform into RGB as OpenCV reads in BGR (and Matplotlib uses RGB)
RGB_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# show it
plt.imshow(RGB_image)
plt.show()
My sample photo


Blurring, also called smoothing, is an operation typically used to reduce noise in the source image. It consists of applying a filter to the image. Let's see what this means.

The operation goes like this:

$$g(i,j) = \sum_{h,l} f(i+h, j+l) \, K(h,l) \ ,$$

where $g(i,j)$ represents the transformed value of the pixel at position $(i,j)$, $K(h,l)$ is the kernel and $f$ represents the original pixels. The sum weighs each pixel in the neighbourhood of $(i,j)$ by the corresponding kernel entry. The kernel is a matrix which determines how the neighbourhood should be factored in.
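To make the sum concrete, here is a minimal sketch in plain Python (no OpenCV), assuming the image and the kernel are lists of rows and that the offsets are measured from the kernel centre. The name `apply_kernel` is made up for this illustration, and border pixels are simply skipped, which is not what a real library does.

```python
def apply_kernel(image, kernel):
    """Return g(i, j) = sum over (h, l) of f(i + h, j + l) * K(h, l).

    image and kernel are lists of rows. The offsets h and l are measured from
    the kernel centre, so the kernel must have odd height and width.
    Border pixels, whose neighbourhood would leave the image, are skipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    rh, rw = kh // 2, kw // 2  # how far the kernel reaches from its centre
    rows, cols = len(image), len(image[0])
    result = []
    for i in range(rh, rows - rh):
        out_row = []
        for j in range(rw, cols - rw):
            total = 0.0
            for h in range(-rh, rh + 1):
                for l in range(-rw, rw + 1):
                    total += image[i + h][j + l] * kernel[h + rh][l + rw]
            out_row.append(total)
        result.append(out_row)
    return result


f = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
# A kernel that only looks at the right-hand neighbour: g(i, j) = f(i, j + 1)
shift = [
    [0, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
]
g = apply_kernel(f, shift)
print(g)
```

Only the two interior pixels of the middle row have a full 3x3 neighbourhood here, so the result has one row with two values.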

In the following, we'll go through some common types of filters and we'll try them out on the image.

Common types of filters and trying them out

Normalised Box Filter

Each pixel value gets transformed into the mean of its neighbours, each of which contributes with equal weight. The kernel $K$ is a matrix with the same value at each place, and this value is

$$k = \frac{1}{k_w k_h} \ ,$$

where $k_w, k_h$ are the dimensions (width and height) of the matrix. In other words, this filter considers a rectangular neighbourhood of dimension $k_w \times k_h$ around each pixel, averages the intensity values inside it and assigns the result to the pixel.

# a normalised box filter with kernel 50X50 and showing result
nb = cv2.blur(RGB_image, (50, 50))
Normalised box filter blurring
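As a quick check of the box filter arithmetic, the sketch below builds the kernel by hand and applies it to one pixel of a tiny image. It is plain Python rather than OpenCV, and `box_kernel` and `box_mean_at` are hypothetical helper names used only here.

```python
def box_kernel(kw, kh):
    """Return a kh x kw kernel whose entries all equal 1 / (kw * kh)."""
    value = 1.0 / (kw * kh)
    return [[value] * kw for _ in range(kh)]


def box_mean_at(image, i, j, kw, kh):
    """Weighted sum of the kw x kh neighbourhood centred on pixel (i, j)."""
    kernel = box_kernel(kw, kh)
    total = 0.0
    for h in range(kh):
        for l in range(kw):
            total += image[i + h - kh // 2][j + l - kw // 2] * kernel[h][l]
    return total


image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
kernel = box_kernel(3, 3)
centre = box_mean_at(image, 1, 1, 3, 3)
print(centre)  # the plain average of the nine values, up to rounding
```

Because the kernel entries sum to 1, the overall brightness of the image is preserved by the filter.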

Gaussian Filter

It's the most popular but not the fastest. The kernel is given by a Gaussian in 2 dimensions, so that at a point $(x, y)$ it is:

$$K(x, y) = \frac{1}{2 \pi \sigma_x \sigma_y} e^{-\frac{(x - \mu_x)^2}{2 \sigma_x^2}} e^{-\frac{(y - \mu_y)^2}{2 \sigma_y^2}}$$

Here $(\mu_x, \mu_y)$ is the centre of the kernel and $\sigma_x, \sigma_y$ are the standard deviations along the two axes. This way, the pixel in the middle is given the largest weight, and the weight decreases with the distance from it following a normal (Gaussian) profile. Look at the docs for the implementation of this filter in OpenCV3.

# a Gaussian filter with kernel 49X49 and showing result
# Note that the third (required) arg is sigma_x; if it is 0, both sigma_x and sigma_y are calculated from the
# kernel size. Also note the kernel size must be odd, so that the kernel has a central pixel
gb = cv2.GaussianBlur(RGB_image, (49, 49), 0)
Gaussian filter blurring
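To see what such a kernel looks like, the following sketch evaluates the formula on a small grid. It assumes the same sigma on both axes and a mean at the central entry, and it rescales the weights to sum to 1 instead of using the analytic prefactor. `gaussian_kernel` is a made-up name, and the numbers are not guaranteed to match the kernel OpenCV builds internally.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel centred on the middle entry.

    The weights are rescaled to sum to 1, so the 1 / (2 pi sigma^2) prefactor
    cancels out and is not computed.
    """
    centre = size // 2
    weights = [
        [
            math.exp(-((x - centre) ** 2 + (y - centre) ** 2) / (2 * sigma ** 2))
            for x in range(size)
        ]
        for y in range(size)
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


kernel = gaussian_kernel(5, 1.0)
for row in kernel:
    print(" ".join(f"{w:.4f}" for w in row))
```

The printed grid has its largest value in the middle and falls off symmetrically towards the corners.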

Median Filter

Each pixel gets replaced with the median of its neighbours (those in a square around it).

Median filter blurring
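In OpenCV this filter is available as cv2.medianBlur, which takes the image and a single odd kernel size. What it computes can be sketched in plain Python as below; `median_filter` is a hypothetical name, and leaving the border pixels unchanged is a simplification chosen for brevity.

```python
from statistics import median


def median_filter(image, size):
    """Replace each interior pixel with the median of its size x size square.

    size must be odd. Border pixels are copied unchanged for simplicity.
    """
    r = size // 2
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in image]
    for i in range(r, rows - r):
        for j in range(r, cols - r):
            window = [
                image[i + h][j + l]
                for h in range(-r, r + 1)
                for l in range(-r, r + 1)
            ]
            result[i][j] = median(window)
    return result


# A flat dark patch with one very bright outlier ("salt" noise)
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
]
cleaned = median_filter(noisy, 3)
print(cleaned[1])
```

The outlier disappears entirely, whereas an averaging filter would spread it over its neighbours. This is why the median filter is a common choice against isolated noisy pixels.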

Bilateral Filter

Avoids (to a certain extent) smoothing the edges in a picture, which the box and Gaussian filters above do not. It considers neighbouring pixels with a weight that depends both on their distance and on how similar their intensity is. In a region of pixels similar in intensity, it replaces the pixel with the average of the neighbourhood, acting similarly to the other filters; in a region where there is a boundary between two intensity areas, that is, where pixels on one side are noticeably brighter than those on the other side, it gives a weight close to 1 to pixels on the same side as the pixel being filtered and a weight close to 0 to the others. See the references for a detailed explanation and the OpenCV3 docs for the API.

Note that it is quite a slow algorithm, especially for large neighbourhood diameters.

# a bilateral filter with a neighbourhood of diameter 15 and sigmas of 2 in both colour space and coordinate space
bb = cv2.bilateralFilter(RGB_image, 15, 2, 2)
Bilateral filter blurring
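The edge-preserving behaviour is easier to see in one dimension. The sketch below is a simplified bilateral filter on a list of intensities, with each weight being the product of a Gaussian in distance and a Gaussian in intensity difference. `bilateral_1d` and its parameter names are assumptions made for this illustration, not the OpenCV implementation.

```python
import math


def bilateral_1d(signal, radius, sigma_space, sigma_colour):
    """Bilateral filter on a 1-D list of intensities.

    Each neighbour's weight is the product of a Gaussian in distance and a
    Gaussian in intensity difference; neighbours outside the list are ignored.
    """
    result = []
    for i, centre in enumerate(signal):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w_space = math.exp(-((i - j) ** 2) / (2 * sigma_space ** 2))
            w_colour = math.exp(-((signal[j] - centre) ** 2) / (2 * sigma_colour ** 2))
            weight = w_space * w_colour
            weighted_sum += weight * signal[j]
            weight_total += weight
        result.append(weighted_sum / weight_total)
    return result


# A noisy step edge: dark pixels on the left, bright pixels on the right
step = [10, 12, 9, 11, 200, 202, 199, 201]
smoothed = bilateral_1d(step, radius=2, sigma_space=2.0, sigma_colour=10.0)
print([round(s, 1) for s in smoothed])
```

Pixels across the edge differ by about 190 in intensity, so their colour weight is practically 0 and the step stays sharp, while the small fluctuations on each side get averaged out.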


Inverting an image means subtracting each pixel value from 255, so that white becomes black and vice versa.

# make the sample image grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# invert it
inverted = 255 - gray
# plot both the gray image and its inverted version
f, (ax1, ax2) = plt.subplots(1, 2, sharey=True)
ax1.imshow(gray, cmap='gray')
ax2.imshow(inverted, cmap='gray')
plt.show()
Grayscale image and its inverted version


Thresholding is a way to modify the pixels of an image based on a given threshold in their intensity/colour.

See the API call cv2.threshold: it needs the grayscale image as the first argument, the threshold $t$ as the second argument and, as the third argument, the value to assign in the case of binary and binary inverted thresholding. It returns a pair (the threshold used and the thresholded image), which is why the examples below take the element at index 1.

See the OpenCV docs for an explanation of thresholding with graphics.

Simple Thresholding

Applied to grayscale images, it sets the pixel to a new value if it exceeds a given threshold and to another value otherwise. Calling $p(x, y)$ the pixel, $a$ the value to set and $t$ the threshold, the modes are:

  • binary:

    $$p'(x, y) = \begin{cases} a & \text{if } p(x,y) > t \\ 0 & \text{otherwise} \end{cases}$$
  • binary inverted:

    $$p'(x, y) = \begin{cases} 0 & \text{if } p(x,y) > t \\ a & \text{otherwise} \end{cases}$$
  • threshold truncated:

    $$p'(x, y) = \begin{cases} t & \text{if } p(x,y) > t \\ p(x,y) & \text{otherwise} \end{cases}$$
  • threshold to zero:

    $$p'(x, y) = \begin{cases} p(x,y) & \text{if } p(x,y) > t \\ 0 & \text{otherwise} \end{cases}$$
  • threshold to zero inverted:

    $$p'(x, y) = \begin{cases} 0 & \text{if } p(x,y) > t \\ p(x,y) & \text{otherwise} \end{cases}$$
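The five modes can be compared side by side on a handful of pixel values. This is a plain Python sketch of the definitions above; `threshold_pixel` and the mode strings are labels invented for this example and are not OpenCV identifiers.

```python
def threshold_pixel(p, t, a, mode):
    """Apply one of the five simple thresholding modes to a single pixel p."""
    if mode == "binary":
        return a if p > t else 0
    if mode == "binary_inv":
        return 0 if p > t else a
    if mode == "trunc":
        return t if p > t else p
    if mode == "tozero":
        return p if p > t else 0
    if mode == "tozero_inv":
        return 0 if p > t else p
    raise ValueError(f"unknown mode: {mode}")


pixels = [30, 100, 101, 220]
for mode in ("binary", "binary_inv", "trunc", "tozero", "tozero_inv"):
    print(mode, [threshold_pixel(p, t=100, a=255, mode=mode) for p in pixels])
```

Note that a pixel exactly equal to the threshold (100 here) counts as not exceeding it, because the comparison is strict.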

Trying the binary thresholding

# binary threshold: put 255 (white) if pixel passes 100 threshold, 0 (black) otherwise
dest = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY)[1]
Binary thresholding

Trying the binary inverted thresholding

The code is the same; the only change is the use of cv2.THRESH_BINARY_INV.

Binary inverted thresholding

Trying the truncated thresholding

The code is the same, it uses cv2.THRESH_TRUNC.

Truncated thresholding

Trying the thresholding to 0

Same code, uses cv2.THRESH_TOZERO.

Thresholding to 0

Trying thresholding to 0 inverted

Same code, uses cv2.THRESH_TOZERO_INV.

Thresholding to 0 inverted

Adaptive thresholding

Applied to grayscale images, the threshold is calculated locally so it is different for each region and this accounts for different conditions like illumination. Modes are binary and binary inverted, as above, with the difference that $t = t(x, y)$. The pixel gets set to a specified new value. Methods are:

  • Adaptive mean: $t(x, y)$ is the average of the neighbourhood of pixel $p(x, y)$, the neighbourhood being a square of specified size around the pixel

  • Adaptive gaussian: $t(x, y)$ is a weighted sum, with gaussian weights, of the neighbourhood of pixel $p(x, y)$. The standard deviation depends on the block size.

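# adaptive mean thresholding with binary method and a neighborhood of 3X3
# note that the second arg is the value assigned to pixels above the local threshold (100 here, not 255)
# and that the last arg (required) gets subtracted from the mean for computing the threshold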
dest = cv2.adaptiveThreshold(gray, 100, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 3, 0)
Adaptive thresholding with a binary method.
# adaptive gaussian thresholding with binary method and a neighborhood of 3X3
# note that last arg (required) gets subtracted from the gaussian-weighted sum for computing the threshold
dest = cv2.adaptiveThreshold(gray, 100, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 3, 0)
Adaptive thresholding with a gaussian method.
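To show why a local threshold copes with uneven illumination, here is a plain Python sketch of the adaptive mean method in binary mode. `adaptive_mean_threshold` is a hypothetical name, and leaving out-of-image neighbours out of the mean is a simplification of the border handling a real library would do.

```python
def adaptive_mean_threshold(image, a, block_size, c):
    """Binary adaptive thresholding with t(x, y) = local mean - c.

    block_size must be odd. Neighbours outside the image are left out of the
    mean, which is a simplification of real border handling.
    """
    r = block_size // 2
    rows, cols = len(image), len(image[0])
    result = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - r), min(rows, y + r + 1))
                for xx in range(max(0, x - r), min(cols, x + r + 1))
            ]
            t = sum(window) / len(window) - c
            out_row.append(a if image[y][x] > t else 0)
        result.append(out_row)
    return result


# The same vertical stroke photographed in poor light and in strong light
dark = [[20, 20, 60, 20, 20] for _ in range(3)]
bright = [[200, 200, 240, 200, 200] for _ in range(3)]

print(adaptive_mean_threshold(dark, a=255, block_size=3, c=0)[0])
print(adaptive_mean_threshold(bright, a=255, block_size=3, c=0)[0])
```

Both patches give the same result, with only the stroke set to 255. A single global threshold of 100 would instead turn the dark patch entirely black and the bright patch entirely white.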

Otsu's Binarization Thresholding

This (see the OpenCV docs) is a global thresholding method, but the value of the threshold is not chosen by hand: it is computed from the histogram of the image as the value that minimises the weighted within-class variance, where the two classes are the pixels below and above the threshold. For a bimodal image, this value falls between the two peaks of the histogram. For this reason, the method is not good on non-bimodal images.

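# Otsu binarisation: the threshold passed (100) is ignored, Otsu's method computes it from the image
dest = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
Otsu binarisation thresholding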
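The search Otsu's method performs can be sketched in a few lines of plain Python. The version below maximises the between-class variance, which is equivalent to minimising the within-class variance. `otsu_threshold` is a made-up name, and the sketch assumes integer intensities between 0 and 255 given as a flat list.

```python
def otsu_threshold(pixels):
    """Return the threshold t that maximises the between-class variance.

    pixels is a flat list of integer intensities in 0..255. Class 0 holds the
    pixels with value <= t, class 1 the pixels with value > t.
    """
    histogram = [0] * 256
    for p in pixels:
        histogram[p] += 1
    total = len(pixels)
    total_sum = sum(value * count for value, count in enumerate(histogram))

    best_t, best_variance = 0, -1.0
    count0, sum0 = 0, 0
    for t in range(256):
        count0 += histogram[t]
        sum0 += t * histogram[t]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, so there is no split to score
        mean0 = sum0 / count0
        mean1 = (total_sum - sum0) / count1
        variance = (count0 / total) * (count1 / total) * (mean0 - mean1) ** 2
        if variance > best_variance:
            best_t, best_variance = t, variance
    return best_t


# A bimodal sample: a dark cluster around 50 and a bright cluster around 200
sample = [48, 50, 50, 52, 49, 51, 198, 200, 202, 199, 201, 200]
t = otsu_threshold(sample)
print(t)
```

Every threshold in the empty gap between the two clusters separates them equally well, so the sketch returns the first such value it meets; other implementations may break this tie differently.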


  1. A detailed explanation of the bilateral filter, University of Edinburgh, School of Informatics

  2. The explanation of thresholding in the OpenCV documentation

  3. N Otsu, A Threshold Selection Method from Gray-Level Histograms, IEEE Transactions on Systems, Man and Cybernetics, 9, 1979.