Cracking The Data Science Interview

阿新 • Published: 2018-12-29

Divide-and-Conquer Algorithm

Divide and conquer is a very popular algorithmic paradigm. A typical divide-and-conquer algorithm solves a problem in three steps:

Divide: Break the given problem into subproblems of the same type.
Conquer: Recursively solve these subproblems.
Combine: Appropriately combine the answers.
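
As a minimal sketch of the three steps, consider finding the largest element of a list. The function name `dc_max` and the half-open index convention are assumptions made for illustration; a plain loop would do the same job, so the point here is only the shape of the recursion.

```python
from typing import List


def dc_max(values: List[int], lo: int, hi: int) -> int:
    """Return the largest element of values[lo:hi] (hi exclusive, range non-empty)."""
    if hi - lo == 1:
        # Base case: a single element is its own maximum.
        return values[lo]
    mid = (lo + hi) // 2                   # Divide: split the range in two
    left_best = dc_max(values, lo, mid)    # Conquer: solve the left half
    right_best = dc_max(values, mid, hi)   # Conquer: solve the right half
    return max(left_best, right_best)      # Combine: keep the larger answer


data = [3, 9, 2, 7]
print(dc_max(data, 0, len(data)))
```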

The following are some standard divide-and-conquer algorithms:

1 — Binary Search is a searching algorithm for a sorted array. In each step, the algorithm compares the input element x with the value of the middle element of the array. If the values match, it returns the index of the middle element. Otherwise, if x is less than the middle element, the algorithm recurses on the left side of the middle element; else it recurses on the right side.

2 — Quicksort is a sorting algorithm. The algorithm picks a pivot element and rearranges the array so that all elements smaller than the pivot move to its left and all greater elements move to its right. Finally, the algorithm recursively sorts the subarrays to the left and right of the pivot.

3 — Merge Sort is also a sorting algorithm. The algorithm divides the array into two halves, recursively sorts them, and finally merges the two sorted halves.

4 — Closest Pair of Points: The problem is to find the closest pair of points in a set of points in the x-y plane. The problem can be solved in O(n²) time by calculating the distance between every pair of points and keeping the minimum. The divide-and-conquer algorithm solves the problem in O(n log n) time.

5 — Strassen’s Algorithm is an efficient algorithm for multiplying two matrices. The simple method for multiplying two n × n matrices needs three nested loops and takes O(n³) time. Strassen’s algorithm multiplies two matrices in O(n^2.8074) time.

6 — The Cooley–Tukey Fast Fourier Transform (FFT) algorithm is the most common FFT algorithm. It is a divide-and-conquer algorithm that runs in O(n log n) time.

7 — Karatsuba algorithm for fast multiplication: It multiplies two n-digit numbers using at most 3n^(log₂ 3) ≈ 3n^1.585 single-digit multiplications in general (and exactly n^(log₂ 3) when n is a power of 2). It is therefore faster than the classical algorithm, which requires n² single-digit products.
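
Several of the algorithms listed above are short enough to sketch directly. The block below gives illustrative Python versions of binary search, quicksort, merge sort, and Karatsuba multiplication. All function names are assumptions, the sorts return new lists rather than sorting in place to keep the recursion easy to read, and binary search assumes its input is already sorted in ascending order.

```python
from typing import List, Optional


def binary_search(arr: List[int], x: int, lo: int = 0, hi: Optional[int] = None) -> int:
    """Return an index of x in the ascending-sorted list arr, or -1 if absent."""
    if hi is None:
        hi = len(arr) - 1
    if lo > hi:                       # empty range: x is not present
        return -1
    mid = (lo + hi) // 2
    if arr[mid] == x:
        return mid
    if x < arr[mid]:                  # recurse on the left side of the middle
        return binary_search(arr, x, lo, mid - 1)
    return binary_search(arr, x, mid + 1, hi)   # recurse on the right side


def quicksort(arr: List[int]) -> List[int]:
    """Return a sorted copy of arr, partitioning around a middle pivot."""
    if len(arr) <= 1:
        return list(arr)
    pivot = arr[len(arr) // 2]
    smaller = [v for v in arr if v < pivot]
    equal = [v for v in arr if v == pivot]
    larger = [v for v in arr if v > pivot]
    return quicksort(smaller) + equal + quicksort(larger)


def merge_sort(arr: List[int]) -> List[int]:
    """Return a sorted copy of arr by sorting both halves and merging them."""
    if len(arr) <= 1:
        return list(arr)
    mid = len(arr) // 2
    left, right = merge_sort(arr[:mid]), merge_sort(arr[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])           # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged


def karatsuba(x: int, y: int) -> int:
    """Multiply two non-negative integers using three recursive products."""
    if x < 10 or y < 10:              # single-digit operand: multiply directly
        return x * y
    half = max(len(str(x)), len(str(y))) // 2
    base = 10 ** half
    x_hi, x_lo = divmod(x, base)      # x = x_hi * base + x_lo
    y_hi, y_lo = divmod(y, base)
    low = karatsuba(x_lo, y_lo)
    high = karatsuba(x_hi, y_hi)
    # (x_lo + x_hi)(y_lo + y_hi) - low - high equals the two cross terms.
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi) - low - high
    return high * base * base + cross * base + low
```

The list-building quicksort uses extra memory compared with the in-place partitioning described in the text; it is chosen here only because it makes the divide, conquer, and combine steps visible on separate lines.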

Counting Inversions Problem

We will consider a problem that arises in the analysis of rankings, which are becoming important to a number of current applications. For example, a number of sites on the Web make use of a technique known as collaborative filtering, in which they try to match your preferences (for books, movies, restaurants) with those of other people out on the Internet. Once the Web site has identified people with “similar” tastes to yours (based on a comparison of how you and they rate various things), it can recommend new things that these other people have liked. Another application arises in meta-search tools on the Web, which execute the same query on many different search engines and then try to synthesize the results by looking for similarities and differences among the various rankings that the search engines return.

A core issue in applications like this is the problem of comparing two rankings. You rank a set of n movies, and then a collaborative filtering system consults its database to look for other people who had “similar” rankings. But what’s a good way to measure, numerically, how similar two people’s rankings are? Clearly an identical ranking is very similar, and a completely reversed ranking is very different; we want something that interpolates through the middle region.

Let’s consider comparing your ranking and a stranger’s ranking of the same set of n movies. A natural method would be to label the movies from 1 to n according to your ranking, then order these labels according to the stranger’s ranking, and see how many pairs are “out of order.” More concretely, we will consider the following problem. We are given a sequence of n numbers a_1, …, a_n; we will assume that all the numbers are distinct. We want to define a measure that tells us how far this list is from being in ascending order; the value of the measure should be 0 if a_1 < a_2 < … < a_n, and should increase as the numbers become more scrambled.
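
The relabeling step can be sketched as follows. The function name `relabel` and the movie titles are made up for illustration, and the sketch assumes both rankings contain exactly the same items, listed from most to least preferred.

```python
from typing import List, Sequence


def relabel(mine: Sequence[str], theirs: Sequence[str]) -> List[int]:
    """Label items 1..n by their position in `mine`, then list the labels in `theirs` order."""
    label = {item: pos for pos, item in enumerate(mine, start=1)}
    return [label[item] for item in theirs]


# Hypothetical rankings of the same four movies.
my_ranking = ["Alien", "Brazil", "Casablanca", "Dune"]
stranger_ranking = ["Brazil", "Alien", "Dune", "Casablanca"]
print(relabel(my_ranking, stranger_ranking))  # [2, 1, 4, 3]
```

If the two rankings agree completely, the result is the ascending sequence 1, 2, …, n; the more the stranger disagrees with you, the more scrambled the sequence becomes.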

A natural way to quantify this notion is by counting the number of inversions. We say that two indices i < j form an inversion if a_i > a_j, that is, if the two elements a_i and a_j are “out of order.” We will seek to determine the number of inversions in the sequence a_1, …, a_n.
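
The definition translates directly into an O(n²) check of every pair. The text above stops at the problem statement, so the second function below is an assumption about where the discussion is heading: a commonly used divide-and-conquer approach that counts inversions while merge sorting, in O(n log n) time. Both function names are illustrative.

```python
from typing import List, Tuple


def count_inversions_brute(a: List[int]) -> int:
    """Count pairs i < j with a[i] > a[j] by checking every pair."""
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])


def sort_and_count(a: List[int]) -> Tuple[List[int], int]:
    """Return (sorted copy of a, number of inversions in a)."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, inv_left = sort_and_count(a[:mid])      # inversions inside the left half
    right, inv_right = sort_and_count(a[mid:])    # inversions inside the right half
    merged: List[int] = []
    i = j = inv_split = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every remaining left element,
            # so it forms an inversion with each of them.
            merged.append(right[j])
            j += 1
            inv_split += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv_left + inv_right + inv_split


sequence = [2, 4, 1, 3, 5]
print(count_inversions_brute(sequence))   # 3: the pairs (2,1), (4,1), (4,3)
print(sort_and_count(sequence)[1])        # 3
```

An ascending sequence yields 0 inversions, and a fully reversed sequence of n elements yields n(n-1)/2, which matches the requirement that the measure grow as the list becomes more scrambled.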
