Resources
This page collects websites with resources for data science and programming. I've been asked everything under the sun in interviews, and I wanted to create a one-stop reference page for any data science-related question.
Ensemble Methods
Random Forest vs. Gradient Boosting - The two videos at the bottom of this post (via Udacity) explain how both the Random Forest and Gradient Boosting algorithms work. This is the clearest explanation of the two I've come across.
Case Studies
Predicting Customer Churn With Sci-Kit Learn - This is an incredibly helpful blog post that works through a churn problem in several steps. First the author builds a classification model that optimizes accuracy on a churn dataset. Then he moves to recall. Then he looks at churn probability rather than a binary classification and combines the probability with each customer's value so that he can calculate an expected loss. Finally he develops production-level code that spits out a CSV file of results. Great post, with all of the code included.
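To make the expected-loss step concrete, here's a minimal sketch of the idea in plain Python. It is not the code from the post: the `Customer` class, the field names, and the numbers are all hypothetical, and it assumes a model has already produced a churn probability for each customer.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str          # hypothetical identifier
    churn_probability: float  # assumed model output, between 0 and 1
    monthly_value: float      # assumed revenue figure for this customer


def expected_loss(customer):
    """Expected revenue lost: churn probability times customer value."""
    return customer.churn_probability * customer.monthly_value


def rank_by_expected_loss(customers):
    """Sort customers so the largest expected loss comes first."""
    return sorted(customers, key=expected_loss, reverse=True)


# Made-up example data, purely for illustration.
customers = [
    Customer("a", 0.90, 10.0),
    Customer("b", 0.30, 100.0),
    Customer("c", 0.05, 50.0),
]

ranked = rank_by_expected_loss(customers)
for c in ranked:
    print(c.customer_id, round(expected_loss(c), 2))
```

The point of ranking this way is that a customer who is very likely to churn but is worth little (like "a" here) can matter less than a moderately risky, high-value customer (like "b").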
A Comprehensive Beginner's Guide to Create a Time Series Forecast - One topic that seems relatively undercovered to me in a lot of data science tutorials is time series models. This post, complete with Python code, introduces you to concepts like stationarity, moving averages, decomposition and ARIMA forecasting. It's a good jumping-off point for learning more about time series forecasting.
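Two of the concepts mentioned there, moving averages and making a series more stationary, can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions, not the approach from the post: the function names and the toy series are my own, and first differencing is just one common way to remove a trend.

```python
def moving_average(series, window):
    """Trailing moving average over a fixed window size."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def difference(series, lag=1):
    """Subtract the value `lag` steps back from each observation."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Toy series with a steady upward trend (made-up values).
series = [10, 12, 14, 16, 18, 20]

smoothed = moving_average(series, window=3)
differenced = difference(series)

print(smoothed)
print(differenced)
```

On this toy series the differenced values are constant, which is the intuition behind differencing: a linear trend in the original data disappears once you look at the step-to-step changes.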
A/B Testing With Hierarchical Models in Python -
SQL
44 Essential SQL Questions - There are a ton of SQL tutorials out there, but this list prepares you for questions someone may ask you in an interview.
SQL Interview Questions - This is another good one that lists potential SQL interview questions. I'd say the hardest questions in the list above are harder (and more in-depth) than the hardest questions here, but in terms of pure quantity there are more questions here, covering all skill levels. Highly recommended.
Statistics
Common Probability Distributions: The Data Scientist's Crib Sheet - Here's a handy quick read that describes some common distributions and how they're related.
Linear Algebra
The Essence of Linear Algebra - This video series from 3Blue1Brown is a great way to gain an intuitive understanding of linear algebra.
Example Questions
40 Interview Questions Asked At Startups in Machine Learning/Data Science - I don't see anything here in particular that gears these questions towards startup companies; they could be used in really any interview situation for a data scientist. I think high-level questions like these are really helpful for those getting into data science because they force you to seek clarity in what you're learning. Oftentimes we're taught a list of different algorithms in a curriculum but don't have much time to digest what's being given to us. Questions like these fill in the "why" and help us become both more attractive job candidates and better thinkers in solving the problems given to us.
109 Commonly Asked Data Science Interview Questions - This is a good one because it collects all types of potential questions, from statistics and programming to past behavior and culture.
Data Culture
Five Building Blocks of a Data-Driven Culture - This is a great article about the steps needed to build a 'data-driven culture' in a company, discussing things like having strict definitions for different metrics across the company and having a single, centralized source of data for people in the company. Great talking points to bring up in an interview to show that you're aware of what a healthy data culture looks like.
The Data Science Handbook - This "pay as you wish" ebook contains interviews with 25 data scientists from companies like Airbnb, Facebook, and Palantir that cover their different backgrounds and approaches to data science. I thought the George Roumeliotis (Airbnb) and John Foreman (Mailchimp) interviews in particular were helpful for talking points about establishing a data-driven culture. I also really recommend John Foreman's 'Data Smart', which is a great introduction to data science concepts via Excel.
Analytics
Building an Analytics Data Stack - This blog post from Andrew Bartholomew is an extremely helpful rundown of how to build an analytics stack from scratch.