WTF is time complexity? š¤
Hereās a subject I never thought Iād write about. But life is full of surprises.
People who learn web development through coding bootcamp arenāt (usually) familiar with time complexity. They might have read the words, but thatās it. I know I tried to look up the Wikipedia page only to fall asleep at the end of the first paragraph. When I finally woke up, I thought āNevermind, Iāll never have to deal with it anywayā.
And boy, was I wrong. I was soon asked during a coding interview to write a pairing algorithm and to calculate its time complexity š¤. It made me realize that time complexity is more than just a fancy word thrown by an engineer to make you sweat. Knowing the gist of it can be useful outside the coding-interviews-world too.
So, for all my fellow bootcampers out there, here are the very basics of time complexity - so you can shine like a diamond during your next cocktail party.
Time complexity for noobs
Time complexity is how you measure your code's execution time depending on the size of the input.
Thatās it. You can stop reading right now and go back to your life.
Still with me? Ok, wannabe-CS-nerd, hereās an example:
- If your code handles a 10-integer array and a 10,000-integer array during the same amount of time, well done, your codeās time complexity is š.
- If your code takes 100x longer to handle a 10x larger array, sorry, your codeās time complexity is š© 1.
A much-needed example
Letās take the coding interview classic I had to tackle:
Say you get an array of 10,000 random integers, positive and negative. You want to get all couples (without duplicates) of elements summing to a given number.
Hereās a first solution:
Letās break it down:
repeated_combination(2)
generates an enumerator containing all repeated pairs of integers from the 10,000 array (roughly 50,000,000 elements š±).- The
find_all
returns an array with all occurrences checking the block. uniq
removes the duplicated pairs
But whatās more important is: how much time does this method take to run depending on the size of the input?
Weāll benchmark our code with benchmark-ips. This gem provides you with the number of time your code runs per second.
To add benchmark-ips
, first run:
Then add the benchmark to your code. Here, Iāll run my find_pairs(array, sum)
with four different inputs:
- a 10-integer array
- a 100-integer array
- a 1,000-integer array
- a 10,000-integer array
Hereās the output:
š As you can see, it takes my code 80x longer to run between a 10-integer array and a 100-integer array. But then, it takes 100x longer between the 100-integer array and the 1,000-integer array. And 100x longer between the 1,000-integer array and the 10,000-integer array.
Itās safe to say that its time complexity is š©.
You can already picture the user looking at a blank screen for 10 seconds while your code runs. And that, friends, is not good at all.
Moving from the āš and š©ā notation to the Big-0 notation
Time complexity has its own scale: the Big-0 notation 2.
From best to worst:
0(1)
: the size of the input doesnāt impact your codeās runtime0(log n)
: 10x the input and your code takes 2x longer to run0(n)
: 10x the input and your code takes 10x longer to run0(n log n)
: 10x the input and your code takes 50x longer to run0(n^2)
: 10x the input and your code takes 100x longer to run0(2^n)
: 10x the input and your user has fallen asleep while your code still runs
Thereās really no need to learn it by heart or to be able to calculate the right Big-0 on top of your head (the first reason is: CS engineers and math nerds donāt even agree on the calculus).
Itās better to understand some rules of thumb:
- the
n
refers to the input size - if you iterate once on an array, the time to run is linearly proportional on the number of elements in the array (hence
0(n)
) - code with a 2-level nested loop usually fits the
0(n^2)
(code with a 3-level nested loop would be0(n^3)
and so on) - time complexity indicates possible performance issues based on the input size
If you want to get a more in-depth explanation, go and check A Rubyistās Guide to Big-O Notation by Honeybadger. Itās neat š.
Getting back to our real-life example
Remember that?
To calculate the time complexity of a piece of code, you just define the time complexity of each element, and the worst bit wins. So letās break down find_pairs()
and see whatās going on under the hood.
āļø First, letās create a 4-integer array
āļø Then, letās define a method find_pairs()
and call repeated_combination(2)
on the array to create pairs
Letās stop there for a moment. repeated_combination(2)
iterates over each element in array
and pair it with each element of array
. repeated_combination(2)
creates a nested-loop.
The first loop first stops on -6
then a second loop starts and iterates over each element. Once the second loop is done, we get back to the first loop which moves onto 4
where the second loop kicks in again. Rinse and repeat until repeated_combination(2)
has iterated over each element and returned all combinations.
A loop inside a loop? š The time complexity of this method is 0(n^2)
.
āļøāļø Now letās find_all
pairs whose integers sum to a given number.
find_all
iterates over the enumerator returned by repeated_combination(2)
and checks for pairs that match the block.
A single loop means 0(n)
. Since 0(n)
is much faster than the O(n^2)
from repeated_combination(2)
, it will be of little consequence in the grand scheme of things.
āļøāļø Finally, letās remove duplicates with uniq
In this example, we donāt have any duplicates. But when you run the algorithm on a 10,000-integer array, chances youāll have some.
uniq
only iterates once on the array, so itās time complexity is 0(n)
. Once again, it wonāt affect much the time complexity of the whole method.
š Crunching the numbers
Now that we know the time complexity of each element, letās calculate the complexity of the whole method.
0(n^2)
+ 0(n)
+ 0(n)
= 0(n^2)
As I said before 0(n)
is so much faster than 0(n^2)
that it wonāt affect the final time complexity substantially.
So now, I know that my codeās time complexity is 0(n^2)
. You can stick with my š© notation but 0(n^2)
will make you look slightly smarter.
This is where knowing the gist of time complexity comes in handy. If you know your code becomes exponentially slower when the input grows, then you know youāll have performance issues when you scale. So you have to make your code better.
Making the algorithm better
Instead of using some of Rubyās built-in methods, Iāll use Set. Set creates a collection of unordered values without duplicates with fast lookup capabilities.
Letās break it down:
- I create an empty Set to store
results
. - I transform my input array into a Set and store it into
input
. - I iterate over
input
. - I take the first integer (
element
) and subtract it fromsum
. This gives me a potential pairing integer (other_element
). - If
other_element
is included ininput
(my input array turned to a Set), then I make sure thatelement
andother_element
are paired from smallest to biggest, and pushed inresults
. Why bother to sort them? Because if I have duplicates,results
being a Set will only keep the first occurrence š.
Hereās the benchmark for this new code:
Much better! Now, my code only takes 10x longer to run when the input is multiplied by 10. Its time complexity has become linear (0(n)
) and takes less than 0,006s to handle a 10,000-integer input š„³.
Well, that was super geeky. I hope itāll help fellow bootcampers to wrap their head around time complexity and how it can be useful.
Did I missed something? Lemme know on Mastodon,
RĆ©mi