Top Data Science Interview Questions & Answers

阿新 • • Published: 2018-12-28

1. Using Python, write a program/function that prints the least integer that is not present in a given list and cannot be represented as the sum of any subset of the list's elements.

E.g., for a = [1,2,5,7] the least integer that is neither in the list nor the sum of a subset of it is 4, and for a = [1,2,2,5,7] the least non-representable integer is 18.

import itertools

values = [1, 2, 5, 7]

# Collect the sum of every subset (the empty subset contributes 0).
subset_sums = set()
for size in range(len(values) + 1):
    for subset in itertools.combinations(values, size):
        subset_sums.add(sum(subset))

# Scan upward; the first integer that is not a subset sum is the answer.
for candidate in range(max(subset_sums) + 2):
    if candidate not in subset_sums:
        print(candidate)
        break
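The brute-force program above enumerates every subset, which takes time exponential in the length of the list. A possible faster alternative is sketched here, assuming every element is a positive integer (an assumption the question does not state). It sorts the list and tracks the smallest sum that cannot yet be formed. The function name `least_unreachable_sum` is made up for this illustration.

```python
def least_unreachable_sum(values):
    """Smallest positive integer that is not the sum of any subset.

    Assumes all values are positive integers.
    """
    reach = 1  # every integer in [1, reach - 1] can be formed so far
    for value in sorted(values):
        if value > reach:
            # Nothing remaining is small enough to form `reach`.
            break
        # Adding `value` extends the formable range to [1, reach + value - 1].
        reach += value
    return reach


print(least_unreachable_sum([1, 2, 5, 7]))
print(least_unreachable_sum([1, 2, 2, 5, 7]))
```

The greedy step works because, once every total below `reach` can be formed, an element no larger than `reach` fills in the next `value` totals without leaving a gap.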

2. Is more data always better?

The way to answer this kind of question is to reason logically at several levels.

At a fundamental level, having more data means additional storage, more computational power, and higher memory requirements. Hence, there is a cost attached to more data. A cost-benefit analysis of this scenario can help make an informed decision.

At a more specific level, considering the nature of the data, quality is an important metric. If your data is biased, collecting more of the same biased data is of no use.

From a model perspective, we need to consider what additional data does to the existing model. If a model suffers from high bias (underfitting), more data will not improve test results beyond a certain point unless more features are added.

3. What is the difference between an inner join, left join/right join, and full join?

For this question, let us take an example of two database tables:

Table 1 contains the names and unique IDs of all the people who love pizza.
Table 2 contains the names and unique IDs of all the people who are software developers.

Inner Join: This consists of only the people who are both software developers and pizza lovers.

Left Join: This consists of all the people who love pizza, whether or not they are software developers. Developer columns are empty (NULL) for those who are not.

Right Join: This consists of all software developers, whether or not they love pizza. Pizza columns are empty (NULL) for those who do not.

Full Join: This consists of all the people from both tables, matched where possible.
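The four joins can be mimicked with plain Python dictionaries keyed by ID. This is a rough sketch with invented sample rows; the helper `join` is hypothetical and stands in for what a database does, with `None` playing the role of NULL.

```python
# Invented sample data: ID -> name.
pizza_lovers = {1: "Ana", 2: "Ben", 3: "Cleo"}
developers = {2: "Ben", 3: "Cleo", 4: "Dev"}


def join(left, right, ids):
    """Build (id, left_name, right_name) rows; a missing side becomes None."""
    return [(i, left.get(i), right.get(i)) for i in sorted(ids)]


# Inner join: IDs present in both tables.
inner_join = join(pizza_lovers, developers, pizza_lovers.keys() & developers.keys())
# Left join: every ID from the left (pizza) table.
left_join = join(pizza_lovers, developers, pizza_lovers.keys())
# Right join: every ID from the right (developer) table.
right_join = join(pizza_lovers, developers, developers.keys())
# Full join: IDs from either table.
full_join = join(pizza_lovers, developers, pizza_lovers.keys() | developers.keys())

for name, rows in [("inner", inner_join), ("left", left_join),
                   ("right", right_join), ("full", full_join)]:
    print(name, rows)
```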

4. Write a query that returns the name of each department and a count of the number of employees in each.

EMPLOYEES containing: Emp_ID (Primary key) and Emp_Name
EMPLOYEE_DEPT containing: Emp_ID (Foreign key) and Dept_ID (Foreign key)
DEPTS containing: Dept_ID (Primary key) and Dept_Name

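SELECT a.Dept_Name, COUNT(b.Emp_ID)
FROM DEPTS a LEFT JOIN EMPLOYEE_DEPT b ON a.Dept_ID = b.Dept_ID
GROUP BY a.Dept_Name

The left join from DEPTS keeps departments that have no employees, and COUNT(b.Emp_ID) counts only matched rows, so those departments are reported with a count of 0.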
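To try a department-count query without a database server, Python's built-in `sqlite3` module can hold the three tables in memory. The sample rows below are invented for illustration. The query shown is one way to write it, using a left join from DEPTS so that a department with no employees still appears with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEES (Emp_ID INTEGER PRIMARY KEY, Emp_Name TEXT);
CREATE TABLE DEPTS (Dept_ID INTEGER PRIMARY KEY, Dept_Name TEXT);
CREATE TABLE EMPLOYEE_DEPT (
    Emp_ID INTEGER REFERENCES EMPLOYEES(Emp_ID),
    Dept_ID INTEGER REFERENCES DEPTS(Dept_ID)
);
-- Invented sample rows; 'Legal' deliberately has no employees.
INSERT INTO EMPLOYEES VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cleo');
INSERT INTO DEPTS VALUES (10, 'Engineering'), (20, 'Sales'), (30, 'Legal');
INSERT INTO EMPLOYEE_DEPT VALUES (1, 10), (2, 10), (3, 20);
""")

query = """
SELECT d.Dept_Name, COUNT(ed.Emp_ID)
FROM DEPTS d
LEFT JOIN EMPLOYEE_DEPT ed ON d.Dept_ID = ed.Dept_ID
GROUP BY d.Dept_ID, d.Dept_Name
ORDER BY d.Dept_Name
"""
rows = conn.execute(query).fetchall()
for dept_name, employee_count in rows:
    print(dept_name, employee_count)
```

Counting `ed.Emp_ID` rather than `COUNT(1)` matters here: an unmatched department still produces one joined row, but its `Emp_ID` is NULL and is not counted.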

5. What is regularization? Explain L1 and L2 regularization.

Regularization adds a penalty to the model's loss as model complexity increases. The regularization parameter penalizes all the parameters except the intercept, so that the model generalizes beyond the training data. This prevents overfitting.

Both L1 and L2 regularization use a penalty to avoid overfitting. The major difference between the two is the way the penalty is defined. L1 (Lasso) can shrink some coefficients exactly to zero and so remove features from the model, whereas L2 (Ridge) shrinks coefficients toward zero but generally keeps every feature.

Lasso regression adds the absolute value (magnitude) of each coefficient as a penalty term to the loss function.

Ridge regression adds the squared value of each coefficient as a penalty term to the loss function.

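Because the L1 penalty grows linearly rather than quadratically, it punishes large coefficients less harshly than L2 and can drive small, uninformative ones to exactly zero. This makes L1 a common choice for noisy data with many irrelevant features.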
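The difference between the two penalties is easiest to see for a single coefficient. The sketch below assumes a simplified one-dimensional problem (an illustration, not a full regression solver): given an unregularized estimate `z`, it returns the value that minimizes the squared distance to `z` plus either penalty. The function names are hypothetical.

```python
def l1_penalty(weights, lam):
    """Lasso penalty: lam times the sum of absolute coefficient values."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """Ridge penalty: lam times the sum of squared coefficient values."""
    return lam * sum(w * w for w in weights)


def ridge_1d(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + 0.5 * lam * w**2."""
    return z / (1.0 + lam)


def lasso_1d(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + lam * abs(w) (soft-thresholding)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


unregularized = [2.0, -0.3, 0.05]  # made-up coefficient estimates
lam = 0.5
print("lasso:", [lasso_1d(z, lam) for z in unregularized])
print("ridge:", [ridge_1d(z, lam) for z in unregularized])
```

With these inputs the two small coefficients come out as exactly 0.0 under the L1 rule, while the L2 rule only scales every coefficient down by the same factor.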


The sole motivation of this blog article is to provide answers to some Data Science Interview Questions. I aim to make this a living document, so any updates and suggested changes can always be included. Please provide relevant feedback.
