Rewriting Excel for the era of big(ger) data
The spreadsheet may very well be the biggest innovation since the personal computer itself. Spreadsheets are used by professionals in virtually every sector that processes information, and they are entrusted with everything from shopping lists to billion-dollar, life-or-death decision-support systems. They are used for all sorts of tasks that spreadsheets were, and weren't, designed to do: modeling, simulation, information storage, and extract-transform-load, to name a few. Spreadsheets are the quantitative lingua franca of the business world.
Spreadsheets are so ubiquitous because the mental model of a spreadsheet is so easy to grasp, even for non-programmers. A spreadsheet is, after all, nothing more than a grid that contains numbers, and formulas that use those numbers. Spreadsheets were modelled after blackboard calculations. Their power comes from the fact that, using this simple grid, it is possible to compute virtually anything (flight simulators, a processor emulator and K-means clustering are just a few examples of such Excel abuse). The spreadsheet is one of the few modes of computation that is easy to use and understand, yet extremely powerful. And most importantly, it doesn't require you to think like a computer, as virtually all programming and query languages do (despite several attempts to change that).
The spreadsheet is one of the few modes of computation that is easy to use and understand, yet extremely powerful.
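To make that mental model concrete, here is a minimal sketch in Python (with hypothetical cell names and values) of what a spreadsheet boils down to: a grid of plain values and formulas over other cells, evaluated recursively.

    # A minimal model of a spreadsheet: each cell holds either a plain
    # value or a formula that reads other cells (names are hypothetical).
    cells = {
        "A1": 10,
        "A2": 32,
        "A3": lambda get: get("A1") + get("A2"),  # like =A1+A2
    }

    def get(ref):
        """Evaluate a cell: values are returned as-is, formulas recurse."""
        cell = cells[ref]
        return cell(get) if callable(cell) else cell

    print(get("A3"))  # 42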
Spreadsheets have some downsides as well. For one, logic and data are not clearly separated. This becomes a problem when you want to generalize your spreadsheet, for example to larger data sets. Because spreadsheet formulas always take a fixed set of cells as their inputs, it is not easy to (automatically) accommodate growth in data size. (Excel actually does a decent job here: it assumes that data added adjacent to a range should be treated like the other items in that range and rewrites references to the range accordingly, and its tables feature makes this explicit.) The sketch below illustrates the problem.
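Continuing the toy model above (still with hypothetical data), a fixed-range aggregate silently ignores newly appended rows until someone, or a heuristic like Excel's, rewrites the reference.

    # A "fixed range" formula: B1 names exactly the cells A1:A3.
    cells = {
        "A1": 10, "A2": 20, "A3": 30,
        "B1": lambda get: sum(get(r) for r in ("A1", "A2", "A3")),  # like =SUM(A1:A3)
    }

    def get(ref):
        cell = cells[ref]
        return cell(get) if callable(cell) else cell

    print(get("B1"))  # 60

    # Appending a row changes nothing: the formula still names exactly
    # A1:A3, so A4 is ignored until the reference is rewritten to A1:A4.
    cells["A4"] = 40
    print(get("B1"))  # still 60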
Increasingly, people who analyze data for a living, like myself, find themselves in situations where they need to work with data sets that Excel just cannot stomach anymore. The concept of a spreadsheet simply doesn't scale (try analyzing files with more than a few million cells in a recent version of Excel and you'll agree). Because any cell could, in theory, influence the value of any other cell, Excel needs to work out the dependencies between cells before it can compute results. Beyond a certain size, big data can only be analyzed using a divide-and-conquer approach. Because spreadsheets contain so many interdependencies, they are not so easy to divide (much less to conquer!).
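A rough sketch of why this is hard to parallelize: before recalculating, a spreadsheet engine has to order all cells by their dependencies, and because any cell may reference any other, the resulting graph rarely splits into independent pieces. (The cells below are hypothetical.)

    from graphlib import TopologicalSorter

    # Map each cell to the cells its formula reads.
    depends_on = {
        "A1": set(),         # plain value
        "B1": {"A1"},        # e.g. =A1*2
        "C1": {"A1", "B1"},  # e.g. =A1+B1
    }

    # A cell can only be evaluated after everything it references, so the
    # whole graph must be ordered before any result can be computed.
    print(list(TopologicalSorter(depends_on).static_order()))
    # ['A1', 'B1', 'C1']

    # Edges like C1 -> A1 can cross any split of the grid, which is why
    # the graph rarely decomposes into chunks that machines could
    # compute independently.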
For many organizations, big data is “anything that doesn’t fit in Excel anymore”.
There are numerous tools and platforms available that can grok data at this scale. Unfortunately, none of them comes close to the ease of use of a spreadsheet: they either require you to think like a computer, or fail to provide a simple alternative mental model.
What the world needs is not another Hadoop, but a new Excel for bigger data: an intuitive big data analysis tool with a simple mental model, but the same powerful capabilities.