The gift that sports gives to AI

Posted by 阿新 • Published: 2018-12-29

Clark Alexander and Sofya Akhmametyeva are sitting in an airy, modern downtown Chicago office, describing decathlons.

“How do you say who is the best athlete in a decathlon?” asks Alexander, a math professor at DePaul University and a mathematical engineer at Nousot, an AI-based tech startup. “You take every event that those athletes do. Then you score all of them, and the highest total gives you the best athlete. But in our case, the highest total gives you the best algorithm.”

Decathlons are both an analogy for and the basis of a solution that Alexander and Akhmametyeva didn’t set out to build, but designed and published nonetheless: a common, comprehensive standard to quantitatively assess the performance of clustering algorithms.

The two mathematicians and computer scientists’ original goal was development, not measurement. Nousot had already built an autonomous forecasting algorithm that used deep learning to deliver high initial accuracy and then improve over time, and the company wanted to do the same with a clustering algorithm.

“Clustering is perfect for big data,” says Akhmametyeva, Nousot’s lead machine learning engineer. “There’s a ton of data out there, and the user doesn’t have to be lost in it. An algorithm figures out the groupings in the data, and the user creates stories from the groupings.”

In fact, users have turned such groupings into real-world change. Meaningful data clusters (those groups of elements that reveal something conclusive or useful) have helped people and organizations to do things like develop vaccines, discover species, run election campaigns, and see a tsunami coming, even before the advent of AI.

Now that AI is here, so is the technology to build algorithms that find even more precise and powerful groups in ever-growing volumes of data, with little or no human input. But soon after Alexander and Akhmametyeva began the work of creating such an algorithm, they discovered that they had to move the goalposts.

“The baseline for architecting our algorithm had to address all the existing performance metrics for clustering algorithms, and improve on them,” says Akhmametyeva, who also founded AIR, a company that builds interfaces between humans and robots. “So our approach was to first identify all the quantitative metrics for determining how well a clustering algorithm does.”

That approach grew perplexing right away. “We kept not finding quantitative measures,” Alexander says.

“Every clustering algorithm is good at identifying about four types of clusters and not so good with about three others,” he continues, referring to the seven features by which clusters are commonly defined: stability, noise, complexity, homogeneity, intercluster distance, covolume, and shape.

“Every paper we read concluded with something like ‘this algorithm is good because it checks off more boxes that we care about,’ ” says Alexander. “They were qualitative evaluations — almost pros and cons — for a task that is inherently numerical.”

So Alexander and Akhmametyeva relegated algorithm development to step two, and made algorithm assessment step one, resolving to build the broad and rigorous evaluation framework that they hadn’t found. Landing on the idea of multi-event athletic competitions as a scoring model, they created a heptathlon for clustering algorithms, complete with seven “events”: those aforementioned seven cluster features that data scientists look for in unlabeled data.

Math gets in on the gift-giving too

Within their heptathlon-inspired assessment framework, Alexander and Akhmametyeva devised new, scrupulously constructed math to quantify each cluster feature.

“Each of the seven features now has a numerical range that we designed and implemented, and we can articulate what those numbers mean,” says Akhmametyeva. “Going forward, researchers can score algorithms with real measures that are tied to real values. They can pick up nuances in performance where previous methods could not.”
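As a rough illustration of what "a real measure tied to a real value" can look like, here is a minimal Python sketch for one feature, intercluster distance. The article does not give the authors' formulas, so this is not their measure: it assumes a conventional stand-in, the smallest Euclidean distance between any two cluster centroids. The names `centroid` and `min_intercluster_distance` are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Point = Sequence[float]


def centroid(points: List[Point]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def min_intercluster_distance(clusters: Dict[str, List[Point]]) -> float:
    """Smallest centroid-to-centroid distance across all pairs of clusters.

    Assumption: this is a stand-in measure chosen for illustration,
    not the measure defined by Alexander and Akhmametyeva.
    Requires at least two clusters; otherwise min() raises ValueError.
    """
    centers = [centroid(pts) for pts in clusters.values()]
    return min(
        math.dist(a, b)  # math.dist is available from Python 3.8
        for i, a in enumerate(centers)
        for b in centers[i + 1:]
    )


# Two toy clusters with centroids (0, 1) and (3, 5).
clusters = {
    "a": [(0.0, 0.0), (0.0, 2.0)],
    "b": [(3.0, 5.0), (3.0, 5.0)],
}
print(min_intercluster_distance(clusters))
```

A larger value means the closest pair of clusters sits farther apart, which is the kind of nuance a yes-or-no checklist cannot express.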

The step-by-step math for each feature is illustrated here, with the shape feature — and the especially complex math required to measure it — getting its own paper here.

The “clustering heptathlon,” if you will, works as you might imagine. Just as a decathlete receives a point score for her performance in each individual event, a clustering algorithm, when put through the assessment framework, receives a point score for its performance in identifying clusters according to each of the seven features.

And just as a decathlete’s combined score determines his final standing in a competition, an algorithm’s combined score across the framework’s seven cluster features determines its performance overall.

Still, in order to appeal to a broad audience, the framework had to elegantly handle diverse units of measurement and allow researchers to designate certain cluster features as more critical than others. With these requirements in mind, Alexander and Akhmametyeva baked three parameters into the system: scale, reference point, and weight.

Scale adjusts points so that an overall score isn’t skewed when widely scattered scores (as those for intercluster distance can be) are added to tightly compact ones (as those for covolume can be). Reference point accounts for the fact that high scores are best for some features (like stability), while low scores are best for others (like noise). Weight lets any cluster feature carry more or less importance, depending upon the goals of a project. Researchers can add other parameters as well, such as a maximum range of scores.
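To make the three parameters concrete, here is a minimal Python sketch of a heptathlon-style total. The article does not publish the scoring formula, so everything here is an assumption: the linear rule `weight * direction * (raw - reference) / scale`, the `Event` class, the `higher_is_better` flag, and the sample numbers are all hypothetical and only show how scale, reference point, and weight could interact.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class Event:
    """One hypothetical 'event': the scoring rule for a single cluster feature."""
    reference: float               # raw value that earns zero points
    scale: float                   # raw units per point; evens out wide vs. narrow ranges
    weight: float = 1.0            # importance of this feature for the project
    higher_is_better: bool = True  # False for features like noise

    def points(self, raw: float) -> float:
        # Assumed linear rule, not the authors' formula.
        direction = 1.0 if self.higher_is_better else -1.0
        return self.weight * direction * (raw - self.reference) / self.scale


def total_score(events: Mapping[str, Event], raw: Mapping[str, float]) -> float:
    """Sum the per-feature points, as a decathlon sums per-event points."""
    return sum(event.points(raw[name]) for name, event in events.items())


# Three of the seven features, with made-up settings and raw measurements.
events = {
    "stability": Event(reference=0.0, scale=0.25),
    "noise": Event(reference=1.0, scale=0.25, higher_is_better=False),
    "intercluster_distance": Event(reference=0.0, scale=50.0, weight=2.0),
}
raw = {"stability": 0.75, "noise": 0.25, "intercluster_distance": 100.0}
print(total_score(events, raw))
```

In this sketch the large raw value for intercluster distance does not swamp the total, because its scale is large, while its weight of 2.0 still marks it as the feature this imaginary project cares about most.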

“We wanted to give users choices,” Akhmametyeva says. “So our assessment is like an autonomous car. You want it to drive, but sometimes you want to override it. Having both options is important.”

Putting the gifts to good use

With a well-defined clustering algorithm assessment, courtesy of the heptathlon, and a rigorous quantitative measure for each cluster feature, courtesy of advanced mathematics, Alexander and Akhmametyeva turned to their original objective: building an autonomous clustering algorithm that would score well against the assessment in any clustering project, using any type of data.

Here again the pair refer to all the papers they studied. The assessment criteria used in those studies may have been soft, but the clustering algorithms themselves were not: they were strong, but specialized.

“What we found is that all the authors had built up their algorithms to fit their research purposes,” says Alexander. Indeed, today there are scores of clustering algorithms that perform really well at finding certain types of clusters but not others. Each is like a decathlete with world-class sprinting skills but not distance running skills, or excellent jumping technique but not throwing technique.

Alexander and Akhmametyeva recognized the opportunity and the raw material to create the super-athlete of clustering algorithms.

“What we’ve been able to do is study what all of these authors have done, and pick and choose the best functional pieces underneath their algorithms,” Alexander says. “We took those and wove them together into our own product.”

Their autonomous clustering algorithm is currently in beta, with wide release scheduled for the first quarter of 2018. “I really wanted to name it after Jessica Ennis, the Olympic heptathlon champion from the UK,” says Alexander, “but I couldn’t make ENNIS a proper acronym.”

He’s taking suggestions.

Really putting the gifts to good use

An autonomous clustering algorithm will undoubtedly outperform humans at finding meaningful groups in massive and growing quantities of data. People excel at spotting patterns quickly, but they can’t observe and learn from hundreds of thousands, even millions, of data sets as AI-powered algorithms can.

But this does not spell the end of the human contribution to cluster analysis. Quite the opposite: it offers a kind of new beginning that highlights and even demands the human contribution. The machines merely find the clusters. People are uniquely qualified to decide how and where to use the knowledge those clusters bring. We, not the machines, will innovate and deploy the systems, services, products, and treatments that high-quality cluster discovery makes possible.

Three industries in particular are poised to benefit from the improved cluster analysis that an all-purpose clustering algorithm enables, according to Alexander and Akhmametyeva.

The first is waste management. “We’re throwing out a lot of things that have value,” says Alexander. “For example, coffee grounds add richness to soil, but we don’t have an efficient way to collect them and get them where they need to go. Now we can cluster collection efforts from coffee shops and other places in the best possible way.”

Second: medicine. “Ideally, you’d synthesize a drug for one individual in order to treat them best,” Alexander continues. “With a clustering algorithm that carves out extremely precise groups, we can keep getting closer to that.”

Energy is the third industry that Alexander and Akhmametyeva are excited about transforming with cluster analysis. “We can group buildings to a smart grid by type, size, hours of energy consumption, and a lot of other variables,” explains Alexander, “and then optimize energy load shift.”

Happy holidays, cluster analysis. You’ve been upgraded.
