Big O notation

In mathematics and computer science, Big O notation is a way of comparing rates of growth of different functions. It is often used to compare the efficiency of different algorithms, which is done by calculating how much memory is needed, and how much time it takes to complete.

The Big O notation is often used in identifying how complex a problem is, also known as the problem's complexity class. The mathematician Paul Bachmann (1837-1920) was the first to use this notation, in the second edition of his book "Analytische Zahlentheorie", in 1896. Edmund Landau (1877-1938) made the notation popular. For this reason, when people talk about a Landau symbols, they refer to this notation.

Big O notation is named after the term "order of the function", which refers to the growth of functions. Big O notation is used to find the upper bound (the highest possible amount) of the function's growth rate, meaning it works out the longest time it will take to turn the input into the output. This means an algorithm can be grouped by how long it can take in a worst-case scenario, where the longest route will be taken every time.

More specifically, given two positive functions $f(x)$ and $g(x)$ , we say that $f$ is in the big O of $g$ (written $f\in O(g)$ ) if for large enough $x$ , $f(x)\leq k\cdot g(x)$ for some constant $k$ .^[1]^[2]^[3]

Big O is an expression that finds worst-case scenario run-time, showing how efficient an algorithm is without having to run the program on a computer. This is also useful due to the fact that different computers may have different hardware, and therefore need different amounts of time to complete it. Since Big O always assumes the worst-case, it can show a consistent measurement of speed: regardless of the hardware, $O(1)$ is always going to complete faster than $O(n!)$ , because they have different levels of efficiency.

Examples[change | change source]

The following examples all use code written in Python. Note that this is not a complete list of Big O types.

Constant[change | change source]

$O(1)$ always takes the same amount of time regardless of input. For example, take a function that accepts an integer (called x) and returns double its value:

def double(x):
    return x * 2   #Return the value of x times 2

After accepting the input, this function will always take one step to return an output. It is constant because it will always take the same amount of time, so it is $O(1)$ .

Linear[change | change source]

$O(n)$ increases according to the size of the input, represented by $n$ . For example, for the following function that accepts n and returns every number from 1 to n:

def count(n):
    i = 1           #Create a counter called "i" with a value of 1
    while i <= n:   #While i is less-than or equal to n
        print(i)    #Print the value of i
        i = i + 1   #Redefine i as "the value of i + 1"

If we were to input the value of 5, then this would output $1,2,3,4,5$ , requiring 5 loops to complete. Similarly, if we input 100, then it would output $1,2,3...98,99,100$ , requiring 100 loops to complete. If the input is $n$ , then the algorithm's run time is exactly $n$ loops every time, therefore it is $O(n)$ .

Factorial[change | change source]

$O(n!)$ increases in factorial amounts, meaning the time taken increases massively with the input. For example, say we wish to visit five cities around the world and want to see every possible ordering (permutation). An algorithm we could write using Python's itertools library is as follows:

import itertools    #Import the itertools library
cities = ['London', 'Paris', 'Berlin', 'Amsterdam', 'Rome'] #An array of our chosen cities

def permutations(cities):                    #Taking an array of cities as input:
    for i in itertools.permutations(cities): #For each permutation of our items (assigned to variable "i")
        print(i)                             #Output i

This algorithm will calculate each unique permutation of our cities and then output it. Examples of output will include:

('London', 'Paris', 'Berlin', 'Amsterdam', 'Rome')
('London', 'Paris', 'Berlin', 'Rome', 'Amsterdam')
('London', 'Paris', 'Amsterdam', 'Berlin', 'Rome')
...
('Rome', 'Amsterdam', 'Paris', 'Berlin', 'London')
('Rome', 'Amsterdam', 'Berlin', 'London', 'Paris')
('Rome', 'Amsterdam', 'Berlin', 'Paris', 'London')

Here, our input list is 5 items long, and for every selection our remaining options decreases by 1. In other words, our 5 inputs choose $5\times 4\times 3\times 2\times 1$ items (or $5!$ ). If our input is $n$ cities long, then the number of outputs is $n!$ ; in general, assuming that we go through every permutation, then we will require $O(n!)$ loops to complete it.

Little-o notation[change | change source]

A related concept to big-O notation is little-o notation. Big-O is used to say that a function does not grow faster than another function, while little-o is used to say that a function grows slower than another function. If two functions grow at the same rate, big-O can be used but little-o cannot. The difference between big-O and little-o is similar to the difference between $\leq$ and $<$ .

If $f(x)$ grows slower than $g(x)$ , then $f(x)\in O(g(x))$ and $f(x)\in o(g(x))$ .
If $f(x)$ grows at the same rate as $g(x)$ , then $f(x)\in O(g(x))$ but $f(x)\not \in o(g(x))$ .
If $f(x)$ grows faster than $g(x)$ , then $f(x)\not \in O(g(x))$ and $f(x)\not \in o(g(x))$ .