Sorting algorithms
Sorting algorithms arrange the elements of a list (array) in a desired order. This can be done in a variety of ways, some far more efficient than others.
We will illustrate how different sorting algorithms work using the array

$\boxed{6}\ \boxed{5}\ \boxed{3}\ \boxed{1}\ \boxed{8}\ \boxed{7}\ \boxed{2}\ \boxed{4}$

as the example. This array is used in the Wikipedia pages of these algorithms. We will assume that the task is to sort the array in ascending order (from the smallest to the largest element), but all the discussion can be easily mirrored for the opposite ordering.
Note that this overview is far from covering all existing sorting algorithms!
For the code, we will need a few imports (Pyplot for the plots at the end, plus the modules used in the timing experiment):

```python
import random
import time

import numpy as np
from matplotlib import pyplot as plt
```

Mergesort

It is a divide and conquer algorithm based on repeated comparison: the array (of length $n$) is first divided down to its single elements, and then bigger and bigger chunks are merged back together until the original array is recomposed, sorted.

The algorithm

1. In the initial stage, the array is divided iteratively into halves until each part is down to a single element. If the array's size is not even, the two initial halves will have different lengths. In our example case we have

   $\boxed{6}\ \boxed{5}\ \boxed{3}\ \boxed{1}\quad\boxed{8}\ \boxed{7}\ \boxed{2}\ \boxed{4}$

2. Now we create couples (subsets of two elements) by merging the single elements two by two: we compare adjacent elements and sort each pair

   $\boxed{5}\boxed{6}\quad\boxed{1}\boxed{3}\quad\boxed{7}\boxed{8}\quad\boxed{2}\boxed{4}$

3. Now it is time to merge into subsets of four elements. We compare adjacent couples: compare the first element of the first couple with the first element of the second, choose the smaller and set it aside (there is no point in comparing the second element of the first couple with the first of the second, as couples are already ordered); then compare whatever remains of the first couple with whatever remains of the second, and so on. Specifically, in our example we need to merge couples $\boxed{5}\boxed{6}$ and $\boxed{1}\boxed{3}$, so we proceed by comparing 5 to 1, and choose 1; we are left with $\boxed{5}\boxed{6}$ and $\boxed{3}$, so we compare 5 to 3 and choose 3; only $\boxed{5}\boxed{6}$ remains, which gets appended, giving $\boxed{1}\boxed{3}\boxed{5}\boxed{6}$. Doing the same for the other two couples yields

   $\boxed{1}\boxed{3}\boxed{5}\boxed{6}\quad\boxed{2}\boxed{4}\boxed{7}\boxed{8}$

4. Now merge into a subset of eight with the same procedure: compare the first elements of the two 4-subsets and choose the smaller; then keep comparing whatever is left of the first subset with whatever is left of the second. Eventually, this leads to

   $\boxed{1}\boxed{2}\boxed{3}\boxed{4}\boxed{5}\boxed{6}\boxed{7}\boxed{8}$
A great visual representation of the procedure can be seen in the visual examples.

Complexity

Mergesort is an $O(n \log n)$ algorithm in the worst, best and average cases. This is because there are $\log n$ levels of merging (for an array of size $n$ you go through $\log n$ regroupings of elements, from the initial couples up to the final full length $n$; e.g., in our case, with $n = 8$, we merge into couples, then into two sections of 4, then into the full length, which makes 3 regroupings) and each level requires $O(n)$ comparisons.
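Another way to see this is through the standard divide and conquer recurrence for the running time, where $O(n)$ is the cost of merging two sorted halves:

$$T(n) = 2\,T(n/2) + O(n) \implies T(n) = O(n \log n),$$

since the $O(n)$ merging cost is paid once at each of the $\log n$ levels.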

An implementation

Note that this implementation, which is recursive, is inspired by the StackOverflow reference.
```python
def mergesort(my_array):

    # base case: an array of 0 or 1 elements is already sorted
    if len(my_array) <= 1:
        return my_array

    new = []
    midpoint = len(my_array) // 2

    # recursively sort the two halves
    left = mergesort(my_array[:midpoint])
    right = mergesort(my_array[midpoint:])

    # merge: repeatedly take the smaller of the two front elements
    while len(left) > 0 and len(right) > 0:
        if left[0] <= right[0]:
            new.append(left.pop(0))
        else:
            new.append(right.pop(0))

    # append whatever remains of the non-exhausted half
    new += left + right

    return new

array = [6, 5, 3, 1, 8, 7, 2, 4]  # the example array
mergesort(array)
```

Bubble sort

It is called this way because larger items iteratively "bubble" up towards the end of the array. The algorithm works by walking the array repeatedly and swapping adjacent items whenever they are in the wrong order.

The algorithm

- First round of walking the array:
  - Start from the example array above and compare the first two elements: 5 < 6, so we swap them, creating
    $\boxed{5}\boxed{6}\boxed{3}\boxed{1}\boxed{8}\boxed{7}\boxed{2}\boxed{4}$
  - After this, we proceed to the next pair in line: 3 < 6, so we produce another swap
  - We continue like this, always comparing the next pair in line, eventually ending up with
    $\boxed{5}\boxed{3}\boxed{1}\boxed{6}\boxed{7}\boxed{2}\boxed{4}\boxed{8}$
    which concludes the first round (note that the largest element, 8, has bubbled up to its final position)
- Second round of walking the array:
  - We proceed in exactly the same way, comparing pairs from the start again
- ...
- We will eventually end up with a totally sorted array
A great visual representation of the procedure can be seen in the visual examples.

Complexity

The complexity of bubble sort is $O(n^2)$ in the worst and average case, and $O(n)$ in the best case (a single pass with no swaps, meaning the array is already sorted). All the repeated checks make it quite an inefficient algorithm in general.

An implementation

This implementation is also recursive: each call performs one full pass of swaps, and it recurses until a pass finds the array already sorted.

```python
def bubblesort(my_array):

    my_array_copy = my_array.copy()
    n = len(my_array_copy)

    # base case: stop when the array is sorted (note <=, to handle duplicates)
    if all([my_array_copy[i] <= my_array_copy[i+1] for i in range(n-1)]):
        return my_array_copy

    # one pass: swap adjacent elements that are in the wrong order
    for i in range(n-1):
        if my_array_copy[i] > my_array_copy[i + 1]:
            my_array_copy[i], my_array_copy[i+1] = my_array_copy[i+1], my_array_copy[i]

    # recurse for the next pass
    return bubblesort(my_array_copy)
```
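Since every pass adds a level of recursion, this version hits Python's recursion limit on long arrays (we will run into this in the timing section below). A minimal iterative variant, not part of the original text, avoids the problem:

```python
def bubblesort_iterative(my_array):
    # hypothetical iterative alternative to the recursive bubblesort above
    my_array_copy = my_array.copy()
    n = len(my_array_copy)

    swapped = True
    while swapped:
        # keep passing over the array until a full pass makes no swaps
        swapped = False
        for i in range(n - 1):
            if my_array_copy[i] > my_array_copy[i + 1]:
                my_array_copy[i], my_array_copy[i + 1] = my_array_copy[i + 1], my_array_copy[i]
                swapped = True

    return my_array_copy
```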

Insertion sort

Insertion sort works by pushing each element back to its correct location, comparing it with the elements before it.

The algorithm

1. In the case of our array, start with 6, which is the first number
2. Now, 5 is smaller than 6, so we move it before it, obtaining
   $\boxed{5}\boxed{6}\boxed{3}\boxed{1}\boxed{8}\boxed{7}\boxed{2}\boxed{4}$
3. Proceed to the next number, which is 3, and move it to the front, obtaining
   $\boxed{3}\boxed{5}\boxed{6}\boxed{1}\boxed{8}\boxed{7}\boxed{2}\boxed{4}$
4. Now on to 1, which gets moved to the front in the same way
5. Now 8, which stays where it is
6. Now 7, which gets placed right before the 8, ending up with
   $\boxed{1}\boxed{3}\boxed{5}\boxed{6}\boxed{7}\boxed{8}\boxed{2}\boxed{4}$
7. ...
8. Eventually you'll have a sorted array

Complexity

In the worst and average case this is an $O(n^2)$ algorithm; in the best case it is an $O(n)$ one (it does $n-1$ comparisons, which is the case when the array is already sorted).

An implementation

```python
def insertionsort(my_array):

    my_array_copy = my_array.copy()
    n = len(my_array_copy)

    for i in range(1, n):
        # swap element i backwards until it sits right after a smaller element
        j = i
        while j > 0 and my_array_copy[j] < my_array_copy[j - 1]:
            my_array_copy[j], my_array_copy[j - 1] = my_array_copy[j - 1], my_array_copy[j]
            j -= 1

    return my_array_copy
```

Selection sort

This algorithm is somewhat similar to insertion sort. It works by conceptually splitting the array into two lists: the sorted one, built iteratively, and the rest, which shrinks until it becomes empty. Have a look at the viz in the visual examples.

The algorithm

1. Initially, the sorted list is empty
2. Find the smallest element in the unsorted list and swap it with the leftmost one, so that in our example we get
   $\boxed{1}\boxed{5}\boxed{3}\boxed{6}\boxed{8}\boxed{7}\boxed{2}\boxed{4}$
   This way the sorted list contains 1 and the rest has one element less
3. Proceed to the next position and keep doing the same until everything is sorted

Complexity

The complexity is $O(n^2)$ in the best, worst and average case. This is because you have $n-1$ comparisons to find the smallest element, $n-2$ to find the second smallest, and so on, effectively meaning you will do

$$(n-1) + (n-2) + \ldots + 1 = \frac{n(n-1)}{2}$$

comparisons.
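An implementation

The text above gives no code for selection sort; here is a minimal sketch in the same style as the other implementations (copy the input, return a sorted list), where the name selectionsort is our own choice:

```python
def selectionsort(my_array):

    my_array_copy = my_array.copy()
    n = len(my_array_copy)

    for i in range(n - 1):
        # find the index of the smallest element in the unsorted part
        smallest = i
        for j in range(i + 1, n):
            if my_array_copy[j] < my_array_copy[smallest]:
                smallest = j

        # swap it with the leftmost unsorted element, growing the sorted part by one
        my_array_copy[i], my_array_copy[smallest] = my_array_copy[smallest], my_array_copy[i]

    return my_array_copy
```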

Heapsort

Heapsort (see ref) follows the same philosophy as selection sort, but uses a heap structure to find the minimum rather than a linear time lookup.

The algorithm

1. Build the (min) heap
2. Build the sorted array by repeatedly extracting the root of the heap (the minimum element) and appending it to the array
3. Keep going, updating the heap each time

Complexity

The complexity of heapsort is $O(n \log n)$ in all cases. This is because the heap is built in $O(n)$ time and extracting the minimum each time is an $O(\log n)$ operation, given that the height of the heap is $\log n$; with $n$ extractions this gives $O(n \log n)$, so a total of $O(n + n \log n) \approx O(n \log n)$.
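An implementation

The text gives no code for heapsort; a minimal sketch that leans on Python's standard heapq module (a binary min heap, with $O(n)$ heapify and $O(\log n)$ heappop) could be:

```python
import heapq

def heapsort(my_array):
    # build a min heap out of a copy of the array, in O(n)
    heap = my_array.copy()
    heapq.heapify(heap)

    # repeatedly extract the root (the minimum), O(log n) per extraction
    return [heapq.heappop(heap) for _ in range(len(heap))]
```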

Quicksort

It is generally more efficient in practice than mergesort and heapsort; it was published in 1962 (see ref).

The algorithm

Note that many great step-by-step examples of quicksort can be found on the internet.
The algorithm is a divide and conquer one: it works by recursively finding the right split point of the array and then sorting the two resulting subarrays.
1. Pick an item from the array, called the pivot: in our case let's choose the first one (6). The goal is to find its correct place in the sorted array, where all items before it are smaller and all items after it are bigger
2. To arrange the elements so that all those smaller than the pivot go before it and all those greater go after it (the partition phase), we need to locate the split point (the correct location of the pivot):
   - Start by placing the right and left markers on the last element of the array (4) and on the pivot itself, respectively; we need to check their relation to start with
   - 4 < 6, so we swap them; the array becomes
     $\boxed{4}\boxed{5}\boxed{3}\boxed{1}\boxed{8}\boxed{7}\boxed{2}\boxed{6}$
   - Now the markers sit on the pivot (6) and on the first element (4); since we have moved the pivot to the right, we turn our attention to what happens on its left: 4, 5, 3, 1 are all smaller than the pivot, 8 isn't, so we swap 8 and 6:
     $\boxed{4}\boxed{5}\boxed{3}\boxed{1}\boxed{6}\boxed{7}\boxed{2}\boxed{8}$
   - Continuing this process, we now turn our attention to the right of the pivot: 7 is bigger, 2 isn't, so we swap 2 and 6:
     $\boxed{4}\boxed{5}\boxed{3}\boxed{1}\boxed{2}\boxed{7}\boxed{6}\boxed{8}$
   - At this point we turn our attention to the left of the pivot again and see that 7 is bigger, so we swap, obtaining
     $\boxed{4}\boxed{5}\boxed{3}\boxed{1}\boxed{2}\boxed{6}\boxed{7}\boxed{8}$
   - Now 6 is in the right place and there is nothing left to swap around it. We proceed by doing exactly the same on the two subarrays before and after 6

Complexity

The complexity is $O(n \log n)$ in the best and average case, and $O(n^2)$ in the worst case.
The worst case occurs when the pivot divides the array into groups of 0 and $n-1$ items, which then recursively become groups of 0 and $n-2$ items, and so on, leading to $O(n^2)$ comparisons. With a first-element pivot, this happens for already sorted arrays. The best case occurs when each split divides the array into two groups of comparable length, which implies each recursive call processes an array of half the size: $\log n$ levels of calls are made as you split $n$ recursively until you reach length 1 (for instance, with $n = 32$ you go to 16, 8, 4, 2, 1 in 5 steps, which is $\log 32$). At each level there are then $O(n)$ comparisons overall, so we get $O(n \log n)$.
The average case can be proven by solving the recurrence relation; you can read it on Wikipedia.
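In recurrence form, the two extremes read

$$T(n) = T(n-1) + O(n) \implies T(n) = O(n^2) \quad \text{(worst case: one side of each split is empty)}$$

$$T(n) = 2\,T(n/2) + O(n) \implies T(n) = O(n \log n) \quad \text{(best case: balanced splits, as in mergesort)}$$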

Implementation

This code here is admittedly awful (quite imperative). Also note that, unlike the previous ones, this function modifies the original array, so to not lose the original order it is better to pass a copy.
```python
def quicksort(my_array, pivot_index, end_index):
    # sorts my_array[pivot_index:end_index] in place, pivoting on its first element

    if end_index - pivot_index <= 1:
        return

    pivot = my_array[pivot_index]
    l, r = pivot_index + 1, end_index - 1

    while l <= r - 1:

        # move the right marker left to the next element smaller than the pivot
        i = r
        while i > pivot_index:
            if my_array[i] < pivot:
                break
            i -= 1
        r = i

        # move the left marker right to the next element >= the pivot
        i = l
        while i < end_index:
            if my_array[i] >= pivot:
                break
            i += 1
        l = i

        if l <= r:
            my_array[l], my_array[r] = my_array[r], my_array[l]

    # the markers have met or crossed: place the pivot at its split point
    if l == r:
        if my_array[l] < pivot:
            my_array[l], my_array[pivot_index] = my_array[pivot_index], my_array[l]
    else:
        my_array[r], my_array[pivot_index] = my_array[pivot_index], my_array[r]

    new_pivot_index = my_array.index(pivot, pivot_index, end_index)

    # recurse on the subarrays before and after the pivot
    quicksort(my_array, pivot_index, new_pivot_index)
    quicksort(my_array, new_pivot_index + 1, end_index)

    return
```

What does Python use

The sort list method and the sorted built-in function in Python use a very efficient algorithm called Timsort, designed by Tim Peters in 2002 specifically for use in Python itself.
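For example (sorted returns a new list, while the sort method works in place):

```python
array = [6, 5, 3, 1, 8, 7, 2, 4]

print(sorted(array))  # a new sorted list: [1, 2, 3, 4, 5, 6, 7, 8]
print(array)          # the original is untouched

array.sort()          # sorts in place and returns None
print(array)          # [1, 2, 3, 4, 5, 6, 7, 8]
```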

Measuring time (!)

Let's measure the time taken by the implementations we wrote above, and by Timsort, on arrays of different lengths. Note that our recursive bubble sort easily exceeds Python's recursion limit on long arrays, so we keep the lengths below 1000 (for much longer arrays we would have to drop it, or switch to the iterative variant sketched above).
From the plot below you can see both the quadratic trend of insertion sort and the fact that Timsort is the most efficient one. Note also that our implementation of mergesort includes some calls to .pop, which may add some overhead.

Now let's run these

We are going to run all these implementations on arrays of various lengths and measure the time taken. Note that the implementations above may add some unneeded overhead, given the way they have been written, so the results here shouldn't be taken at face value.
```python
mergesort_t = []
bubblesort_t = []
insertionsort_t = []
quicksort_t = []
timsort_t = []

# n_range = [10, 100, 1000, 10000]
n_range = np.arange(1, 1000, 100)

for n in n_range:

    # a random array of n distinct elements
    an_array = random.sample(range(n), n)

    t = time.process_time()
    sorted_array = mergesort(an_array)
    mergesort_t.append(time.process_time() - t)

    t = time.process_time()
    sorted_array = bubblesort(an_array)
    bubblesort_t.append(time.process_time() - t)

    t = time.process_time()
    sorted_array = insertionsort(an_array)
    insertionsort_t.append(time.process_time() - t)

    t = time.process_time()
    quicksort(an_array.copy(), 0, len(an_array))  # works in place, so pass a copy
    quicksort_t.append(time.process_time() - t)

    t = time.process_time()
    sorted_array = sorted(an_array)
    timsort_t.append(time.process_time() - t)
```
```python
plt.figure(figsize=(15, 6))

# log scale on the y axis, so all curves fit on one readable plot
plt.semilogy(n_range, mergesort_t, marker='o', label='mergesort')
plt.plot(n_range, bubblesort_t, marker='o', label='bubblesort')
plt.plot(n_range, insertionsort_t, marker='o', label='insertionsort')
plt.plot(n_range, quicksort_t, marker='o', label='quicksort')
plt.plot(n_range, timsort_t, marker='o', label='timsort')

plt.legend()
plt.title('Time taken by sorting algorithm versus length of array')
plt.xlabel('Array length')
plt.ylabel('Seconds')
plt.show()
```
Time taken by the various sorting implementations vs. array length

References

2. J. W. J. Williams, Algorithm 232: Heapsort, Communications of the ACM 7, 1964
3. C. A. R. Hoare, Quicksort, The Computer Journal 5(1), 1962

Visual examples
