What is the Weka Machine Learning Workbench
Machine learning is an iterative process rather than a linear one: each step has to be revisited as more is learned about the problem under investigation. This iteration can require many different tools, programs and scripts, one for each step.
A machine learning workbench is a platform or environment that supports and facilitates a range of machine learning activities, reducing or removing the need for multiple tools.
Some statistical and machine learning workbenches like R provide very advanced tools but require a lot of manual configuration in the form of scripts and programming. The tools can also be fragile, written by and for academics rather than written to be robust and used in production environments.
What is Weka
The Weka machine learning workbench is a modern platform for applied machine learning. Weka is an acronym that stands for Waikato Environment for Knowledge Analysis. It is also the name of a New Zealand bird, the weka.
Five features of Weka that I like to promote are:
- Open Source: It is released as open source software under the GNU GPL. It is dual licensed and Pentaho Corporation owns the exclusive license to use the platform for business intelligence in their own product.
- Graphical Interface: It has a Graphical User Interface (GUI). This allows you to complete your machine learning projects without programming.
- Command Line Interface: All features of the software can be used from the command line. This can be very useful for scripting large jobs.
- Java API: It is written in Java and provides an API that is well documented and promotes integration into your own applications. Note that the GNU GPL means that in turn your software would also have to be released as GPL.
- Documentation: There are books, manuals, wikis and MOOCs that can train you how to use the platform effectively.
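To make the command line point concrete, here is a minimal sketch of how a scripted job might assemble a Weka command. It assumes `weka.jar` sits in the working directory and uses the J48 decision tree on `data/iris.arff` purely as an illustration. The class name `WekaCommandLine` and the paths are hypothetical, and the sketch only builds and prints the command rather than running it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assembles the arguments for a cross-validated
// Weka classifier run. Nothing here depends on Weka being installed.
final class WekaCommandLine {

    // wekaJar, classifier and arffFile are supplied by the caller;
    // -t names the training file and -x the number of cross-validation folds.
    static List<String> build(String wekaJar, String classifier, String arffFile, int folds) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(wekaJar);
        cmd.add(classifier);
        cmd.add("-t");
        cmd.add(arffFile);
        cmd.add("-x");
        cmd.add(Integer.toString(folds));
        return cmd;
    }

    public static void main(String[] args) {
        // Assumed locations; adjust to wherever Weka is installed.
        List<String> cmd = build("weka.jar", "weka.classifiers.trees.J48", "data/iris.arff", 10);
        System.out.println(String.join(" ", cmd));
        // To launch it for real, something like:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Looping over several classifier names or dataset files with a helper like this is one way to script a larger batch of runs.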
The main reason I promote Weka is that a beginner can go through the process of applied machine learning using the graphical interface without having to do any programming. This is a big deal because getting a handle on the process, handling data and experimenting with algorithms is what a beginner should be learning about, not yet another scripting language.
Introduction to the Weka GUI
Now I want to show off the graphical user interface a bit and encourage you to download and have a play with Weka. The workbench provides three main ways to work on your problem: the Explorer for playing around and trying things out, the Experimenter for controlled experiments, and the KnowledgeFlow for graphically designing a pipeline for your problem.
Weka Explorer
The Explorer is where you play around with your data and think about what transforms to apply to it and what algorithms you want to run in experiments.
The Explorer interface is divided into six tabs:
- Preprocess: Load a dataset and manipulate the data into a form that you want to work with.
- Classify: Select and run classification and regression algorithms to operate on your data.
- Cluster: Select and run clustering algorithms on your dataset.
- Associate: Run association algorithms to extract insights from your data.
- Select Attributes: Run attribute selection algorithms on your data to select those attributes that are relevant to the feature you want to predict.
- Visualize: Visualize the relationship between attributes.
Weka Experimenter
This interface is for designing experiments with your selection of algorithms and datasets, running experiments and analyzing the results.
The tools for analyzing results are very powerful, allowing you to consider and compare results that are statistically significant over multiple runs.
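To give a feel for what comparing results over multiple runs involves, here is a minimal sketch of a textbook paired t statistic computed from per-run scores of two algorithms. This is not Weka's own tester, which applies its own corrections; the class name `PairedTStatistic` and the accuracy values in `main` are hypothetical and exist only for illustration.

```java
// Hypothetical helper: plain paired t statistic over matched runs.
final class PairedTStatistic {

    // a[i] and b[i] are the scores of two algorithms on the same run i.
    // Returns mean(d) / sqrt(var(d) / n), where d[i] = a[i] - b[i].
    // If all differences are identical the variance is zero and the
    // result is infinite or NaN.
    static double compute(double[] a, double[] b) {
        if (a.length != b.length || a.length < 2) {
            throw new IllegalArgumentException("need two arrays of equal length >= 2");
        }
        int n = a.length;

        // Mean of the per-run differences.
        double mean = 0.0;
        for (int i = 0; i < n; i++) {
            mean += a[i] - b[i];
        }
        mean /= n;

        // Sample variance of the differences (n - 1 in the denominator).
        double sumSq = 0.0;
        for (int i = 0; i < n; i++) {
            double dev = (a[i] - b[i]) - mean;
            sumSq += dev * dev;
        }
        double variance = sumSq / (n - 1);

        return mean / Math.sqrt(variance / n);
    }

    public static void main(String[] args) {
        // Made-up accuracies for two algorithms over five runs.
        double[] algoA = {0.94, 0.92, 0.95, 0.93, 0.96};
        double[] algoB = {0.91, 0.92, 0.90, 0.92, 0.93};
        System.out.println("t = " + compute(algoA, algoB));
    }
}
```

A larger absolute value of the statistic suggests the difference between the two algorithms is less likely to be down to chance across runs; the Experimenter takes care of this kind of bookkeeping for you.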
Knowledge Flow
Applied machine learning is a process and the Knowledge Flow interface allows you to graphically design that process and run the designs that you create. This includes the loading and transforming of input data, running of algorithms and the presentation of results.
It’s a powerful interface and metaphor for solving complex problems graphically.
Tips for Getting Started
Here are some tips for getting up and running fast:
Download Weka Right Now
It supports the three main platforms: Windows, OS X and Linux. Find the distribution for your platform, download it, install it and start it up. You might have to install Java first. The installation includes many standard experimental datasets (in the data directory) that you can load and practice on.
Read the Weka Documentation
The download includes a PDF manual (WekaManual.pdf) that can get you up to speed very quickly. It is very detailed and comprehensive, with screenshots. There is also plenty of supplementary documentation online.
Don’t forget the book. If you get into Weka, then buy the book. It provides an introduction to applied machine learning as well as an introduction to the Weka platform itself. Highly recommended.
Extensions and Plugins for Weka
There are a lot of plugin algorithms, extensions and even platforms that build on Weka.
Online Courses on Weka
There are two online courses that teach data mining with Weka.
Rushdi Shams has an amazing channel of YouTube videos showing you how to do lots of specific tasks in Weka.
Have you used Weka? Leave a comment and share your experiences.