My experience from interviewing as a Data Scientist in 2021 in Bangalore, India.

The coding round has become an integral part of Data Science interviews. Ubiquitous as it may be, it is also a dreaded round for many. With this post, I aim to fight fear with information by sharing the different types of coding interviews and questions I encountered recently.

Let us look at the different formats of execution and the questions asked, and understand which concept each question is testing.

Format 1: Live coding

Photo by ThisisEngineering RAEng on Unsplash

You are asked to open an editor (Jupyter notebook) and share your screen with the interviewer. It’s appreciated when a candidate talks through their process to keep the interviewer on…

Understand threshold invariance and beware of one common mistake

This post will help you understand the advantage of AUC over other metrics, how it’s calculated (using ROC), and why it’s necessary to calculate it so.

Nature's own ROC (Image by ykaiavu from Pixabay)


If you have built a classifier, you have most certainly measured the performance of the model using metrics like accuracy, precision, recall, or F score. But each of these metrics is calculated only after defining the cutoff probability (like 0.5) at which to measure it.
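To make the cutoff dependence concrete, here is a minimal sketch in plain Python. The labels and scores are made-up toy values, and `accuracy_at` and `auc` are hypothetical helper names, not functions from any library. The AUC here is computed as the fraction of (positive, negative) pairs that the scores rank correctly, which is one standard way to express it.

```python
def accuracy_at(y_true, scores, cutoff):
    """Accuracy after turning scores into 0/1 predictions at a given cutoff."""
    preds = [1 if s >= cutoff else 0 for s in scores]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. No cutoff is involved.
    """
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only).
y_true = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]

# Same ranking, but every score is halved.
scaled = [s / 2 for s in scores]

for cutoff in (0.35, 0.5):
    print(cutoff, accuracy_at(y_true, scores, cutoff))

print("AUC original:", auc(y_true, scores))
print("AUC scaled:  ", auc(y_true, scaled))
print("Accuracy at 0.5, scaled:", accuracy_at(y_true, scaled, 0.5))
```

Accuracy moves when the cutoff moves, and it moves again when the scores are rescaled, while the pairwise AUC depends only on the ordering of the scores and so stays the same.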

Advantage of AUC over other metrics

What about when you have two competing models and want to compare their performance? What if model 1 performs best at a 0.5 cutoff, and model 2 performs best…

An explanation for why the bagging fraction is 63.2%

If you have read about Bootstrap and Out of Bag (OOB) samples in Random Forest (RF), you would most certainly have read that the fraction of observations in the ‘bag’ when you build RF with bootstrap is around 63.2%.

This post is a crisp explanation for the origins of the number 63.2%.
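As a quick sanity check on that number, here is a small simulation sketch using only the Python standard library. The sample size, the number of trials, and the helper name `in_bag_fraction` are assumptions chosen for illustration. It draws n indices with replacement from n observations and measures what fraction of distinct observations ends up in the bag, then compares that with 1 - (1 - 1/n)^n and its large-n limit 1 - 1/e.

```python
import math
import random


def in_bag_fraction(n, rng):
    """Draw n indices with replacement and return the share of distinct ones."""
    drawn = {rng.randrange(n) for _ in range(n)}
    return len(drawn) / n


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 10_000               # number of observations (assumed)
trials = 20              # number of bootstrap samples (assumed)

avg = sum(in_bag_fraction(n, rng) for _ in range(trials)) / trials

# Chance that a given observation is picked at least once in n draws.
theory = 1 - (1 - 1 / n) ** n
# Limit of the expression above as n grows.
limit = 1 - math.exp(-1)

print(f"simulated: {avg:.4f}")
print(f"theory:    {theory:.4f}")
print(f"1 - 1/e:   {limit:.4f}")
```

All three values should land close to 0.632 for a sample this large.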

How Random can the Forest be? (Photo by David Kovalenko on Unsplash)

The post is organized as:

  1. Recap of RF terminologies
  2. Example of Bootstrap
  3. Generalizing the example
  4. Simulation
  5. Conclusion

Recap of RF terminologies

RF is an ensemble learning technique based on Bagging.

Bagging = Bootstrap + Aggregation

Bootstrap means that instead of training on all the observations, each tree of RF is trained on…

Using a regex parser based on grammar to extract key phrases

The goal of this article is to introduce the concept of POS chunking with the example of Amazon review tags.

I am planning to upgrade from a 2017 Moto G5 Plus to a new phone. In my research, I ended up going through a lot of phones listed on Amazon and scouring their reviews.

Screenshot captured by Author

And just like me, you’d have noticed a list of tags on top of the verbose reviews. These tags saved me a lot of time by highlighting the most talked-about points regarding the phone.
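To give a feel for how such tags could be pulled out of a review, here is a minimal chunking sketch that uses only Python's `re` module. Everything in it is an assumption made for illustration: the tokens are tagged by hand rather than by a real POS tagger, the grammar "zero or more adjectives followed by one or more nouns" is just one plausible rule, and `tag_code` and `key_phrases` are hypothetical helper names. In practice a library such as NLTK, with its `RegexpParser`, is the usual tool for this.

```python
import re

# Hand-tagged tokens (Penn Treebank style tags, assumed for illustration).
tagged = [
    ("the", "DT"), ("battery", "NN"), ("life", "NN"), ("is", "VBZ"),
    ("great", "JJ"), ("and", "CC"), ("the", "DT"), ("fast", "JJ"),
    ("charging", "NN"), ("works", "VBZ"),
]


def tag_code(tag):
    """Collapse a POS tag into one character so a regex can scan the sequence."""
    if tag.startswith("JJ"):
        return "J"   # adjective
    if tag.startswith("NN"):
        return "N"   # noun
    return "O"       # anything else


# Chunk grammar: optional adjectives followed by at least one noun.
CHUNK = re.compile(r"J*N+")


def key_phrases(tagged_tokens):
    """Return the word sequences whose tags match the chunk grammar."""
    codes = "".join(tag_code(tag) for _, tag in tagged_tokens)
    # One character per token, so match offsets are token indices.
    return [
        " ".join(word for word, _ in tagged_tokens[m.start():m.end()])
        for m in CHUNK.finditer(codes)
    ]


print(key_phrases(tagged))
```

The lone adjective "great" is skipped because the grammar requires a noun, while "battery life" and "fast charging" come out as candidate tags.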

Using Full Text Search

Photo by Marten Newhall on Unsplash

If you’ve used SQL to perform a text search, you have probably used the LIKE operator. Its limitation is that it only matches the literal pattern you give it. Luckily for us, SQL offers a feature, SQL FULL TEXT INDEX, that offers fuzzy text search capability on any column that contains raw text. This is a godsend for NLP projects.

I, for one, am a big fan of the NLP libraries offered by Python: scikit-learn and spaCy.

But before one steps into the deep waters of text processing, it's good to dip your toe…
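Here is a small sketch of the difference, using Python's built-in `sqlite3` module. Note the assumptions: SQLite's FTS5 extension is used as a stand-in for whichever full text index your database provides, the table names and review texts are made up, and FTS5 may not be compiled into every SQLite build, so the sketch falls back to `None` when it is missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO reviews (body) VALUES (?)",
    [
        ("The battery life is great",),
        ("Great life out of this battery",),
        ("Camera is average",),
    ],
)


def like_search(term):
    """Substring search: the words must appear exactly as typed, in order."""
    rows = conn.execute(
        "SELECT id FROM reviews WHERE body LIKE ? ORDER BY id",
        (f"%{term}%",),
    ).fetchall()
    return [row[0] for row in rows]


def fts_search(query):
    """Token search with FTS5; returns None if this build lacks FTS5."""
    try:
        conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS reviews_fts USING fts5(body)")
    except sqlite3.OperationalError:
        return None
    conn.execute("DELETE FROM reviews_fts")
    conn.execute("INSERT INTO reviews_fts (rowid, body) SELECT id, body FROM reviews")
    rows = conn.execute(
        "SELECT rowid FROM reviews_fts WHERE reviews_fts MATCH ? ORDER BY rowid",
        (query,),
    ).fetchall()
    return [row[0] for row in rows]


like_result = like_search("battery life")
fts_result = fts_search("battery life")
print("LIKE:", like_result)
print("FTS: ", fts_result)
```

The LIKE query only finds the review containing the literal string "battery life", whereas the full-text query treats the two words as separate tokens and also finds the review where they appear in a different order.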

It's important to know the difference

Photo by Ethan Dow on Unsplash

Quite often, when we venture into something new, we are faced with doubts and fears. We are even tempted to give up, saying it is impossible. At such a time, if we can draw the line between the impossible and the unknown, we can quite easily transcend the fear.

More often than not, the task is only unknown. At such times, we need to list the things to do and find the best person or resource to guide us. And slowly, what was impossible starts becoming a reality.

On the other hand, if it really is impossible, still try. At least now we know that it's impossible in this way. Perhaps it’ll be possible some other way.

Divya Choudhary
