WTF time complexity? 🤔
Here’s a subject I never thought I’d write about. But life is full of surprises.
People who learn web development through a coding bootcamp aren’t (usually) familiar with time complexity. They might have read the words, but that’s it. I know I tried to read the Wikipedia page, only to fall asleep at the end of the first paragraph. When I finally woke up, I thought “Never mind, I’ll never have to deal with it anyway”.
And boy, was I wrong. I was soon asked during a coding interview to write a pairing algorithm and to calculate its time complexity 🤔. It made me realize that time complexity is more than just a fancy term thrown around by engineers to make you sweat. Knowing the gist of it can be useful outside the world of coding interviews too.
So, for all my fellow bootcampers out there, here are the very basics of time complexity - so you can shine like a diamond during your next cocktail party.
Time complexity for noobs
Time complexity describes how your code's execution time grows as the size of the input grows.
That’s it. You can stop reading right now and go back to your life.
Still with me? Ok, wannabe-CS-nerd, here’s an example:
- If your code handles a 10-integer array and a 10,000-integer array in the same amount of time, well done, your code’s time complexity is 👌.
- If your code takes 100x longer to handle a 10x larger array, sorry, your code’s time complexity is 💩 1.
A much-needed example
Let’s take the coding interview classic I had to tackle:
Say you get an array of 10,000 random integers, positive and negative. You want to get all pairs of elements (without duplicates) that sum to a given number.
Here’s a first solution:
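A minimal sketch of what such a solution could look like, assuming the method is named find_pairs(array, sum) (the name used in the benchmark section) and simply chains the three calls broken down next. The sample array is made up for illustration:

```ruby
# Brute-force sketch: build every pair, keep the ones that sum to `sum`,
# then drop exact duplicates.
def find_pairs(array, sum)
  array
    .repeated_combination(2)            # every pair; an element may pair with itself
    .find_all { |a, b| a + b == sum }   # keep the pairs adding up to the target
    .uniq                               # remove identical pairs
end

p find_pairs([-6, 4, 2, 8], 2) # => [[-6, 8]]
```

Note that uniq only removes identical pairs: if the input contains repeated values, [4, 6] and [6, 4] can both survive.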
Let’s break it down:
- repeated_combination(2) generates an enumerator containing all repeated pairs of integers from the 10,000-integer array (roughly 50,000,000 elements 😱).
- find_all returns an array of all the pairs for which the block returns true.
- uniq removes the duplicated pairs.
But what’s more important is: how much time does this method take to run depending on the size of the input?
We’ll benchmark our code with benchmark-ips. This gem tells you how many times your code runs per second.
To use benchmark-ips, first install the gem with gem install benchmark-ips.
Then add the benchmark to your code. Here, I’ll run my find_pairs(array, sum) with four different inputs:
- a 10-integer array
- a 100-integer array
- a 1,000-integer array
- a 10,000-integer array
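A sketch of such a benchmark. To keep it runnable without installing anything, it uses Ruby’s standard-library Benchmark module instead of benchmark-ips, so it reports seconds per run rather than runs per second. find_pairs is an assumed brute-force implementation, random_integers and seconds_for are hypothetical helpers, and the 10,000-integer case is left out because it takes a while:

```ruby
require "benchmark"

# Assumed brute-force implementation under test.
def find_pairs(array, sum)
  array.repeated_combination(2).find_all { |a, b| a + b == sum }.uniq
end

# Builds an array of `size` random integers, positive and negative.
def random_integers(size)
  Array.new(size) { rand(-size..size) }
end

# Wall-clock seconds taken by one call to find_pairs on `size` integers.
def seconds_for(size)
  array = random_integers(size)
  Benchmark.realtime { find_pairs(array, 10) }
end

# Add 10_000 to the list if you are patient.
[10, 100, 1_000].each do |size|
  puts format("%6d integers: %.6f s", size, seconds_for(size))
end
```

Comparing the timings from one line to the next gives the growth factor the rest of this section talks about.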
Here’s the output:
😓 As you can see, my code takes 80x longer to run on a 100-integer array than on a 10-integer array. Then it takes 100x longer on the 1,000-integer array than on the 100-integer array, and 100x longer again on the 10,000-integer array.
It’s safe to say that its time complexity is 💩.
You can already picture the user looking at a blank screen for 10 seconds while your code runs. And that, friends, is not good at all.
Moving from the “👌 and 💩” notation to the Big-O notation
Time complexity has its own scale: the Big-O notation 2.
From best to worst:
- O(1): the size of the input doesn’t impact your code’s runtime
- O(log n): 10x the input and your code takes only slightly longer to run
- O(n): 10x the input and your code takes 10x longer to run
- O(n log n): 10x the input and your code takes a bit more than 10x longer to run
- O(n^2): 10x the input and your code takes 100x longer to run
- O(2^n): 10x the input and your user has fallen asleep while your code still runs
There’s really no need to learn it by heart or to be able to calculate the right Big-O off the top of your head (the first reason is: CS engineers and math nerds don’t even agree on how to calculate it).
It’s better to understand some rules of thumb:
- n refers to the input size
- if you iterate once over an array, the time to run is linearly proportional to the number of elements in the array (hence O(n))
- code with a 2-level nested loop usually fits O(n^2) (code with a 3-level nested loop would be O(n^3), and so on)
- time complexity indicates possible performance issues based on the input size
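The first two rules of thumb can be checked by counting steps instead of measuring time. A small sketch, where single_loop_steps and nested_loop_steps are hypothetical helpers written for this illustration:

```ruby
# One pass over the array: the step count grows like n.
def single_loop_steps(array)
  steps = 0
  array.each { steps += 1 }
  steps
end

# A loop inside a loop: the step count grows like n^2.
def nested_loop_steps(array)
  steps = 0
  array.each { array.each { steps += 1 } }
  steps
end

[10, 100, 1_000].each do |n|
  array = Array.new(n, 0)
  puts "n=#{n}: single=#{single_loop_steps(array)}, nested=#{nested_loop_steps(array)}"
end
```

Each time n is multiplied by 10, the single loop does 10x more steps and the nested loop does 100x more.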
If you want to get a more in-depth explanation, go and check A Rubyist’s Guide to Big-O Notation by Honeybadger. It’s neat 👌.
Getting back to our real-life example
To calculate the time complexity of a piece of code, you just work out the time complexity of each element, and the worst bit wins. So let’s break down find_pairs() and see what’s going on under the hood.
☝️ First, let’s create a 4-integer array
✌️ Then, let’s define a method find_pairs() and call repeated_combination(2) on the array to create pairs
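A sketch of these first two steps. Only -6 and 4 appear in the walkthrough; the other two integers are made up, and at this stage find_pairs only builds the pairs:

```ruby
# -6 and 4 come from the walkthrough; 2 and 8 are made up for illustration.
array = [-6, 4, 2, 8]

# First stage only: return an enumerator over every pair of elements.
def find_pairs(array)
  array.repeated_combination(2)
end

pairs = find_pairs(array).to_a
puts pairs.size # => 10
pairs.each { |pair| p pair }
```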
Let’s stop there for a moment.
repeated_combination(2) iterates over each element in array and pairs it with each element of array. In other words, repeated_combination(2) creates a nested loop.
The first loop first stops on -6, then a second loop starts and iterates over each element. Once the second loop is done, we get back to the first loop, which moves on to 4, where the second loop kicks in again. Rinse and repeat until repeated_combination(2) has iterated over each element and returned all combinations.
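To make the hidden nested loop visible, here is a hand-rolled sketch that builds the same pairs with two explicit loops. pairs_with_nested_loops is a hypothetical helper, not how Ruby implements repeated_combination internally; the inner loop starts at the current element so that each pair shows up only once:

```ruby
# Builds every pair of elements with two explicit loops.
def pairs_with_nested_loops(array)
  pairs = []
  array.each_with_index do |first, index|
    # Inner loop: pair `first` with itself and every element after it.
    array[index..-1].each { |second| pairs << [first, second] }
  end
  pairs
end

array = [-6, 4, 2, 8] # only -6 and 4 come from the walkthrough
p pairs_with_nested_loops(array).sort == array.repeated_combination(2).to_a.sort # => true
```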
A loop inside a loop? 👉 The time complexity of this method is O(n^2).
☝️✌️ Now let’s find_all pairs whose integers sum to a given number.
find_all iterates over the enumerator returned by repeated_combination(2) and checks for pairs that match the block.
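A single loop means linear time: find_all makes one pass over the pairs it receives. One more pass doesn’t grow any faster than the O(n^2) work of repeated_combination(2), so it will be of little consequence in the grand scheme of things.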
✌️✌️ Finally, let’s remove duplicates with uniq.
In this example, we don’t have any duplicates. But when you run the algorithm on a 10,000-integer array, chances are you’ll have some.
uniq only iterates once over the matching pairs, so its cost is linear in the number of matches. Once again, it won’t much affect the time complexity of the whole method.
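The last two steps, sketched on a small hand-written list of pairs (the pairs and the target sum are made up, and matching_pairs is a hypothetical helper wrapping find_all):

```ruby
# Keeps only the pairs whose two integers add up to `sum`.
def matching_pairs(pairs, sum)
  pairs.find_all { |a, b| a + b == sum }
end

# Made-up pairs, with a repeated match to give uniq something to do.
pairs = [[-6, 8], [4, 2], [2, 8], [4, 2]]

matches = matching_pairs(pairs, 6)
p matches      # => [[4, 2], [4, 2]]
p matches.uniq # => [[4, 2]]
```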
🖐 Crunching the numbers
Now that we know the time complexity of each element, let’s calculate the complexity of the whole method.
As I said before, the single loops don’t outgrow the O(n^2) step, so they won’t affect the final time complexity substantially: the worst bit wins.
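To put numbers on it: repeated_combination(2) on n elements yields n(n+1)/2 pairs, which is where the “roughly 50,000,000” figure for 10,000 integers comes from. A quick sketch, with pair_count as a hypothetical helper:

```ruby
# Number of pairs repeated_combination(2) yields for n elements: n(n+1)/2.
def pair_count(n)
  n * (n + 1) / 2
end

[10, 100, 1_000, 10_000].each do |n|
  puts "#{n} integers -> #{pair_count(n)} pairs"
end
# 10 integers -> 55 pairs
# 100 integers -> 5050 pairs
# 1000 integers -> 500500 pairs
# 10000 integers -> 50005000 pairs
```

Every 10x increase in the input means roughly 100x more pairs to build and check, which is the O(n^2) behaviour.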
So now, I know that my code’s time complexity is O(n^2). You can stick with my 💩 notation but O(n^2) will make you look slightly smarter.
This is where knowing the gist of time complexity comes in handy. If you know your code becomes quadratically slower when the input grows (10x the input, 100x the runtime), then you know you’ll have performance issues when you scale. So you have to make your code better.
Making the algorithm better
Instead of using some of Ruby’s built-in methods, I’ll use Set. Set creates a collection of unordered values without duplicates with fast lookup capabilities.
Let’s break it down:
- I create an empty Set to store results
- I transform my input array into a Set and store it into input
- I iterate over input
- I take the first integer (element) and subtract it from sum. This gives me a potential pairing integer (other_element)
- If other_element is included in input (my input array turned to a Set), then I make sure that element and other_element are paired from smallest to biggest, and pushed in results. Why bother to sort them? Because if I have duplicates, results being a Set will only keep the first occurrence 👌.
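The steps above can be sketched as follows. The variable names come from the breakdown; the method name find_pairs_with_set and the final conversion back to an array are assumptions:

```ruby
require "set"

# Set-based sketch: one pass over the input, with a fast lookup for each element.
def find_pairs_with_set(array, sum)
  results = Set.new     # keeps each pair only once
  input = array.to_set  # fast membership checks

  input.each do |element|
    other_element = sum - element
    next unless input.include?(other_element)

    # Sorting makes [2, 4] and [4, 2] identical, so the Set stores one of them.
    results << [element, other_element].sort
  end

  results.to_a
end

p find_pairs_with_set([-6, 4, 2, 8], 6) # => [[2, 4]]
```

Each element triggers one lookup in input, so the work grows in step with the number of integers rather than with the number of possible pairs.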
Here’s the benchmark for this new code:
Much better! Now, my code only takes 10x longer to run when the input is multiplied by 10. Its time complexity has become linear (O(n)) and it takes less than 0.006s to handle a 10,000-integer input 🥳.
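A faster version is only useful if it still finds the same pairs. Here is a sketch that checks both approaches against each other and times them with the standard-library Benchmark module. brute_force_pairs, set_pairs and normalize are hypothetical names, and both implementations are assumed versions of the two solutions:

```ruby
require "benchmark"
require "set"

# Assumed version of the first solution.
def brute_force_pairs(array, sum)
  array.repeated_combination(2).find_all { |a, b| a + b == sum }.uniq
end

# Assumed version of the Set-based solution.
def set_pairs(array, sum)
  input = array.to_set
  results = Set.new
  input.each do |element|
    other_element = sum - element
    results << [element, other_element].sort if input.include?(other_element)
  end
  results.to_a
end

# Sorts each pair and the whole list so both outputs can be compared.
def normalize(pairs)
  pairs.map(&:sort).uniq.sort
end

array = Array.new(1_000) { rand(-1_000..1_000) }
puts normalize(brute_force_pairs(array, 10)) == normalize(set_pairs(array, 10))
puts format("brute force: %.4f s", Benchmark.realtime { brute_force_pairs(array, 10) })
puts format("with Set:    %.4f s", Benchmark.realtime { set_pairs(array, 10) })
```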
Well, that was super geeky. I hope it’ll help fellow bootcampers wrap their heads around time complexity and how it can be useful.
Did I miss something? Lemme know on Twitter,