
Top AI Interview Questions & Answers — Part 3

1. How would you transfer data from one Hadoop cluster to another?

This question tests your hands-on experience with Hadoop. Migrating data from one cluster to another is not a frequent task, so only a person with deep expertise is likely to have tackled it.

DistCp (distributed copy) is a tool provided by Hadoop for copying large data sets between distributed file systems, both within and across clusters. The command submits a regular MapReduce job that performs a file-by-file copy; MapReduce also handles its distribution, error handling and recovery, and reporting.

hadoop distcp [source] [destination]

Here [source] and [destination] are HDFS URLs.

2. What are Fact Tables?

A fact table record captures a measurement or a metric. For example, FACT_PURCHASED gives us the number of units purchased by date, by store and by product for a company. The tables that provide context around the measurements and metrics in a fact table are dimension tables: in the same scenario, DIM_TIME provides date details around the purchase and DIM_STORE provides the store details. A fact table usually holds foreign keys referencing the required dimension tables, plus the measurement or metric value for that record.
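A minimal in-memory sketch of the star schema described above, using the FACT_PURCHASED, DIM_TIME and DIM_STORE names from the text (DIM_PRODUCT and all column names such as date_key are illustrative assumptions, not a real schema):

```python
# Dimension tables: primary key -> descriptive attributes.
DIM_TIME = {1: {"date": "2024-01-15", "quarter": "Q1"}}
DIM_STORE = {10: {"name": "Downtown", "city": "Austin"}}
DIM_PRODUCT = {100: {"name": "Widget", "category": "Hardware"}}

# Each fact row holds foreign keys into the dimension tables
# plus the measured value (units purchased).
FACT_PURCHASED = [
    {"date_key": 1, "store_key": 10, "product_key": 100, "units": 25},
]

def describe(fact):
    """Resolve a fact row against its dimensions (a star-schema join)."""
    return {
        "date": DIM_TIME[fact["date_key"]]["date"],
        "store": DIM_STORE[fact["store_key"]]["name"],
        "product": DIM_PRODUCT[fact["product_key"]]["name"],
        "units": fact["units"],
    }

print(describe(FACT_PURCHASED[0]))
```

In a warehouse this join would be done in SQL, but the shape is the same: narrow fact rows of keys and measures, wide dimension rows of attributes.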

3. Give some problems or scenarios where MapReduce concept works well and where it doesn’t work.

To answer this question, we should understand the motive behind it. The interviewer wants to know whether we understand MapReduce and similar frameworks well enough to distinguish where to use MapReduce and where not to. This is aimed at preventing the hammer-and-nail problem: to a person with a hammer, every problem looks like a nail.

A MapReduce program consists of both a Map method and a Reduce method. The Map method takes a set of data and converts it into another set of data via filtering or sorting operations, where individual elements are broken down into key value pairs. The Reduce method takes the Map’s output as input and performs a summary operation.
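The Map and Reduce phases can be sketched in a few lines of plain Python; this is a toy word count, not Hadoop's Java API, and the shuffle step that a real framework performs across the cluster is simulated here with a local sort and group:

```python
from itertools import groupby
from operator import itemgetter

def map_phase(record):
    # Emit (key, value) pairs: here, one (word, 1) per word.
    for word in record.split():
        yield (word, 1)

def reduce_phase(key, values):
    # Summary operation over all values seen for one key.
    return (key, sum(values))

def run_mapreduce(records):
    # Shuffle/sort: group intermediate pairs by key, as the
    # framework would between the Map and Reduce phases.
    pairs = sorted(p for r in records for p in map_phase(r))
    return dict(
        reduce_phase(k, (v for _, v in group))
        for k, group in groupby(pairs, key=itemgetter(0))
    )

print(run_mapreduce(["big data", "big cluster"]))
# → {'big': 2, 'cluster': 1, 'data': 1}
```

The point of the structure is that map_phase runs independently on each record and reduce_phase independently on each key, which is what lets the framework spread the work across many machines.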

At a system level, the MapReduce System orchestrates the processing by marshalling the distributed servers, running the various tasks in parallel, managing all communications and data transfer between the various parts of the system, and providing the redundancy and fault tolerance.

Having understood this process in detail, it is easy to see where MapReduce falls short. First, since all the input data must be available up front, MapReduce cannot handle streaming data. Second, some computations need to happen in memory to be effective, and these cannot be handled by MapReduce. Third, in machine learning we rely on iterative processing that converges toward a result: as the iterations run, we get closer to the desired answer, and MapReduce cannot be used directly for iterative processing. These are the top three areas where MapReduce may not work well.

Subscribe to our Acing AI newsletter. I promise not to spam, and it's FREE!

Thanks for reading!

The sole motivation of this blog article is to provide answers to some Data Science interview questions. I aim to make this a living document, so updates and suggested changes can always be incorporated. Please provide relevant feedback.