
Do Not Use Random Guessing As Your Baseline Classifier

I recently received the following question via email:

Hi Jason, quick question. A case of class imbalance: 90 cases of thumbs up 10 cases of thumbs down. How would we calculate random guessing accuracy in this case?

We can answer this question using some basic probability (I opened Excel and typed in some numbers).


Let’s say the split is 90%-10% for class 0 and class 1. Let’s also say that you will guess randomly using the same ratio.

The theoretical accuracy of random guessing on a two-class classification problem is:

accuracy = P(class is 0) * P(you guess 0) + P(class is 1) * P(you guess 1)

We can test this on our example 90%-10% split:

accuracy = (0.9 * 0.9) + (0.1 * 0.1)
         = 0.82
         = 0.82 * 100 or 82%

To check the math, you can plug in a 50%-50% split of your data, and the result matches your intuition:

accuracy = (0.5 * 0.5) + (0.5 * 0.5)
         = 0.5
         = 0.5 * 100 or 50%
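The arithmetic above can also be checked in a few lines of Python. This is only a minimal sketch: the function names random_guess_accuracy and simulate_random_guessing are made up for illustration, and the simulation assumes that both the true labels and the guesses are drawn independently with the same class ratio.

```python
import random


def random_guess_accuracy(class_probs):
    """Expected accuracy when guesses follow the same ratio as the classes.

    Sums P(class is c) * P(you guess c) over all classes.
    """
    return sum(p * p for p in class_probs)


def simulate_random_guessing(class_probs, n_trials=100_000, seed=1):
    """Estimate the same accuracy empirically (illustrative helper).

    Labels and guesses are sampled independently with the given ratio.
    """
    rng = random.Random(seed)
    classes = list(range(len(class_probs)))
    actual = rng.choices(classes, weights=class_probs, k=n_trials)
    guesses = rng.choices(classes, weights=class_probs, k=n_trials)
    correct = sum(a == g for a, g in zip(actual, guesses))
    return correct / n_trials


print(round(random_guess_accuracy([0.9, 0.1]), 2))  # 0.82
print(round(random_guess_accuracy([0.5, 0.5]), 2))  # 0.5
print(simulate_random_guessing([0.9, 0.1]))         # close to 0.82
```

The simulated value will vary slightly with the seed and the number of trials, but it should land near the theoretical 82%.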

If we look on Google, we find a similar question on Cross Validated “What is the chance level accuracy in unbalanced classification problems?” with an almost identical answer. Again, a nice confirmation.

Interesting, but there is an important takeaway point from all of this.

Don’t Use Random Guessing As A Baseline

If you are looking for a classifier to use as a baseline accuracy, don’t use random guessing.

There is a classifier called Zero Rule (or 0R or ZeroR for short). It is the simplest rule you can use on a classification problem: it always predicts the majority class in your dataset (i.e. the mode).

In the example above with a 90%-10% split for class 0 and class 1, it would predict class 0 for every prediction and achieve an accuracy of 90%. This is 8 percentage points better than the 82% theoretical accuracy of random guessing with the same class ratio.

Use the Zero Rule method as a baseline.
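As a rough illustration, a Zero Rule baseline fits in a few lines of plain Python. The helper names zero_rule_fit and accuracy are hypothetical, and the labels are the 90/10 example from the question rather than real data. For brevity the sketch scores the baseline on the same labels it was fit on; in practice you would find the majority class on the training set and evaluate on a held-out test set.

```python
from collections import Counter


def zero_rule_fit(train_labels):
    """Return the majority class (the mode) of the training labels."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# The example from the question: 90 cases of class 0, 10 cases of class 1.
labels = [0] * 90 + [1] * 10

majority_class = zero_rule_fit(labels)
predictions = [majority_class] * len(labels)  # predict the mode every time

print(majority_class)                 # 0
print(accuracy(labels, predictions))  # 0.9
```

Any model you train should beat this number before it is worth considering.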

Also, in imbalanced classification problems like this, you should use metrics other than Accuracy such as Kappa or Area under ROC Curve.
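To see why accuracy alone can mislead here, consider Cohen's Kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below is a minimal plain-Python version. The function name cohen_kappa is illustrative, the second set of predictions is invented for contrast, and returning 0.0 when chance agreement is already perfect is an assumption of this sketch, not a standard.

```python
from collections import Counter


def cohen_kappa(actual, predicted):
    """Cohen's Kappa: (observed - expected) / (1 - expected)."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement: sum over classes of P(actual is c) * P(predicted is c).
    expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    if expected == 1.0:
        return 0.0  # assumption of this sketch for the degenerate case
    return (observed - expected) / (1 - expected)


actual = [0] * 90 + [1] * 10

# Zero Rule: always predict the majority class.
zero_rule_predictions = [0] * 100
print(cohen_kappa(actual, zero_rule_predictions))  # 0.0

# A made-up classifier that finds 5 of the 10 minority cases.
better_predictions = [0] * 90 + [1] * 5 + [0] * 5
print(round(cohen_kappa(actual, better_predictions), 2))  # 0.64
```

The Zero Rule baseline scores 90% accuracy but a Kappa of 0, which reflects that it has learned nothing beyond the class ratio.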

For more information about alternative performance measures on classification problems see the post:

For more on working with imbalanced classification problems see the post:

Do you have any questions about this post? Ask in the comments.



