Why do interviewers focus on data structures
We all have this one question, why in the holy hell interviewers do focus and stress about data structures in an interview. We hardly use them in real life.
But what if I tell you, choosing an improper data structure is the biggest mistake that a developer can make while designing or writing code for a scalable system.
Before bragging much about this discussion let’s get a hands on, what made me understand the importance of choosing the right data structure.
So as a E-Commerce company we provide customers the ability to book flights, hotels, cars, events happening in the city etc.
But in India, no business runs without discounting. Configuring discounts is straight forward for all other verticals except hotels.
A person manually sets what should be the discount to be applied for a list of hotels(Hotels all over the world that ranges sums up to a lakh or two). [ 30% for hotels 1,2,3].
When I say list of hotels, the list may range till 40000 hotels having 20% discount.
So when a user selects a specific hotel in the portal, we evaluate the hotel by searching the hotel-Id in list of hotels and deliver them the discount that the hotel has.
This is where I made one of the costliest mistake of choosing a wrong data structure.
Being a guy who is new to the system, I under estimated the scale at which Goibibo is working currently and I choose list data structure instead of map when storing these hotel Ids in the cache. So when a user selects a hotel, I load all these hotel-Ids into RAM and then make a linear search to check if the hotel is in the list.
At this point there are two mistakes I’ve made.
- Thinking that a linear search is of O(n) complexity where n is no of elements in the list. It’s wrong. For a string datatype the linear search takes O(m*n) where m is length of each string. In our case the length of m(hotel-Id) is 20 letters. (String comparison is of O(m) complexity where m is no of characters in the string)
- Even if I make a list comparison I should’ve tried binary search instead of linear which could decrease my complexity to O(m*log(n))
Well both the list comparison results in bad to worst performance and all of a sudden our application started spiking like a heart attack patient.
Response time of API, what should have been 5 ms had now reached to 35 ms to 40 ms.
As a result we immediately did a performance(bench marking) check on each part of code and came to know that the list comparison is the culprit and I immediately changed that to a map data structure.
Boom, our application started responding at 5 ms speed as expected.
At this point I’ve understood how important is to choose a right data structure and what happens when we hit the wrong one.