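[Machine Learning] [Apriori Algorithm - 2] Python Implementation of the Apriori Algorithm + Code Walkthrough

阿新 • Published: 2019-02-20

1. Apriori algorithm: principles explained

2. Implementing the Apriori algorithm in Python

2.1 Key Python techniques for the algorithm

Implementing Apriori relies on the following Python techniques: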

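1) Converting a 2-D list into a set

2) Testing whether list A is a subset of list B

Here A and B are 1-D sequences, and A must be an ordered subset of B: for example, [1, 3] is an ordered subset of [1, 2, 3] but not of [3, 2, 1].

3) Deriving [[1, 2], [1, 3], [2, 3]] from [[1], [2], [3]]

4) Deriving [[1,2,3], [1,2,4], [1,3,4], [2,3,4]] from [[1, 2], [1, 3], [2, 3], [2, 4], [3, 4]]

This is the core operation of Apriori: it derives the new support data set from the old one.

5) Another key operation is counting how many times a sequence occurs in a list of sequences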
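Points 1), 2) and 5) in the list above can be sketched with a few small standalone helpers. This is only a minimal illustration, not the code used in the class below; the function names (flatten_to_set, is_ordered_subset, count_occurrences) are hypothetical, and each transaction is assumed to contain an item at most once.

```python
goods = [[1, 2, 5], [2, 4], [2, 3], [1, 2, 4], [1, 3],
         [2, 3], [1, 3], [1, 2, 3, 5], [1, 2, 3, 5], [1, 2, 3]]

def flatten_to_set(rows):
    """Point 1: collapse a 2-D list into the set of distinct items."""
    return {item for row in rows for item in row}

def is_ordered_subset(a, b):
    """Point 2: True if the items of a appear in b in the same relative order.

    An empty a is treated as "not a subset", matching the article's _isSubset.
    """
    it = iter(b)
    # "x in it" advances the iterator, so every later item of a
    # has to be found after the position of the previous one.
    return len(a) > 0 and all(x in it for x in a)

def count_occurrences(seq, rows):
    """Point 5: number of rows that contain seq as an ordered subset."""
    return sum(is_ordered_subset(seq, row) for row in rows)

print(sorted(flatten_to_set(goods)))         # [1, 2, 3, 4, 5]
print(is_ordered_subset([1, 3], [1, 2, 3]))  # True
print(is_ordered_subset([1, 3], [3, 2, 1]))  # False
print(count_occurrences([1, 2], goods))      # 5
```

Dividing the count from count_occurrences by the number of transactions gives the support of the sequence, here 5 / 10 = 0.5 for [1, 2].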

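Note: section 2.2 below is the concise Python code, and section 2.3 is the version that prints every step of the computation. To understand how the algorithm works, read the code in 2.3; for anything else, use the code in 2.2.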
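Points 3) and 4) of the list above are the candidate-generation step. The sketch below isolates the join rule that the class's _work() method applies, without the support filter, so that the two examples from the list can be reproduced directly. The function name join_level is hypothetical, and the itemsets are assumed to be sorted lists.

```python
def join_level(prev):
    """Build the next level of candidate itemsets from the previous level.

    Each itemset e is paired only with the itemsets behind it (be):
    - 1-itemsets are simply concatenated;
    - otherwise, if the last item of e occurs in be (but not as its last item),
      the part of be after that item is appended to e.
    Duplicates are dropped; no support filtering is done here.
    """
    out = []
    for i, e in enumerate(prev):
        for be in prev[i + 1:]:
            if len(e) == 1:
                new_e = e + be
            elif e[-1] in be and be[-1] != e[-1]:
                new_e = e + be[be.index(e[-1]) + 1:]
            else:
                continue  # e and be cannot be combined
            if new_e not in out:
                out.append(new_e)
    return out

print(join_level([[1], [2], [3]]))
# [[1, 2], [1, 3], [2, 3]]
print(join_level([[1, 2], [1, 3], [2, 3], [2, 4], [3, 4]]))
# [[1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4]]
```

Note that this rule is not the textbook Apriori join, which merges two itemsets sharing the same prefix and then prunes candidates that have an infrequent subset; it is shown here only because it is the rule the code in this article uses.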

2.2 Concise Python code

# -*- coding: utf-8 -*-
"""
@author: Tom
Talk is cheap, show me the code
Aim: implement the Apriori algorithm
"""

import numpy as np

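class CApriori(object):
    '''
    Implementation of the Apriori algorithm.
    '''
    def __init__(self, goods, minSupport):
        self.goods = goods           # list of transactions (each one a list of items)
        # minimum support; itemsets whose support is below this value are filtered out
        self.minSupport = minSupport

        self.N = len(goods)          # number of transactions (len() also works for ragged lists)
        self.goodsSet = set([])      # set of items, each element is a single item
        self.max_len  = 0            # number of items in the longest transaction
        # support data set; each element is [frequent itemset, support count], where
        # frequent itemset = list of items, support count = support * number of transactions
        self.supportData = []

        self._init() # initialise
        self._work() # iterate until the final support data set is found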
        
    def _isSubset(self, A, B):
        '''Check whether sequence A is an ordered subset of sequence B (see Note).
        :param A: 1-D sequence
        :param B: 1-D sequence
        :return: True if A is an ordered subset of B, otherwise False
        :Note [1, 3] is an ordered subset of [1, 2, 3]; [3, 1] is not
        '''
        A,B = list(A),list(B)
        if len(A) == 0:
            return False

        pre_ind = -1
        for e in A:
            if e not in B: # not a subset
                return False
            elif B.index(e) < pre_ind: # order is violated
                return False
            pre_ind = B.index(e)

        return True

    def _support(self, item, goods):
        '''
        :param item: candidate itemset
        :param goods: list of transactions
        :return: support of the itemset (fraction of transactions that contain it)
        '''
        subset_cnt = [self._isSubset(item, e) for e in goods]
        cnt = subset_cnt.count(True)
        support = cnt * 1.0 / self.N
        return support
        
    def _init(self):
        '''Initialise the support data set and the iteration limit (max_len).
        '''
        self.supportData = []
        # iteration limit: length of the longest transaction
        for item in self.goods:
            if len(item) > self.max_len:
                self.max_len = len(item)
        # all purchased items flattened into a 1-D list
        goods_data = []
        for e in self.goods:
            goods_data.extend(e)

        # set of distinct items
        self.goodsSet = set(goods_data)

        # initial data set: [frequent 1-itemset, support count]
        # sorted() makes the order deterministic; an item is assumed to occur at most once per transaction
        for e in sorted(self.goodsSet):
            cnt = goods_data.count(e)  # support count
            support = cnt *1.0 / self.N
            if (support >= self.minSupport):
                self.supportData.append([[e], cnt])
        return self.supportData, self.max_len
        
    def _uniq(self, supportData):
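        '''Remove duplicate frequent itemsets from the support data set. Example of how duplicates arise:
         [1, 2, 3] and [1, 3, 5] combine into the itemset [1, 2, 3, 5]
         [1, 2, 3] and [2, 3, 5] combine into the itemset [1, 2, 3, 5]
        '''
        newSupportData = []
        data = []  # itemsets seen so far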
        for e in supportData:
            if e[0] not in data:
                data.append(e[0])
                newSupportData.append(e)
        return newSupportData
        
    def _work(self):
        '''Apriori main routine: find the frequent itemsets and their support counts, i.e. the support data set.
        '''
        preData = self.supportData

        # build the next level of frequent itemsets from the current one
        new_supportData = []
        for i in range(len(preData)):
            e = preData[i][0] # old frequent itemset, current item in current supportdata
            # pair e only with the itemsets behind it, so that no pair is tested twice
            for j in range(i + 1, len(preData)):
                be = preData[j][0] #item at the back of current item
                # candidate itemset for the new level, new_e
                new_e = []
                if 1 == len(e): # old itemsets are the initial 1-itemsets
                    new_e = e + be
                elif be.count(e[-1]) > 0 and be[-1] != e[-1]:
                    ind = be.index(e[-1])
                    new_e = e + be[ind+1:len(be)]
                if 0 == len(new_e):
                    continue
                # support filter
                support = self._support(new_e, self.goods)
                if (support >= self.minSupport):
                    new_supportData.append([new_e, support*self.N]) # [frequent itemset, support count]
        # update the support data set with the de-duplicated new level
        self.supportData = self._uniq(new_supportData)
        if 0 == len(self.supportData) or self.max_len == len(self.supportData[0][0]):
            return self.supportData #exit apriori algorithm
        else:
            return self._work() # start the next iteration
        
    def GetSupportData(self):
        return self.supportData

if __name__=='__main__':
    goods = [[1, 2, 5],
             [2, 4],
             [2, 3],
             [1, 2, 4],
             [1, 3],
             [2, 3],
             [1, 3],
             [1, 2, 3, 5],
             [1, 2, 3, 5],
             [1, 2, 3]]
    minSupport = 0.2
    apr = CApriori(goods, minSupport)
    
    supportData = apr.GetSupportData()
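    print('Minimum support:', minSupport)
    print('Transactions:\n', goods)
    # dtype=object is required for this ragged list on NumPy >= 1.24;
    # the printed form of the array varies with the NumPy version
    print('Support data set found by Apriori:\n', np.array(supportData, dtype=object))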

Run results

Minimum support: 0.2
Transactions:
 [[1, 2, 5], [2, 4], [2, 3], [1, 2, 4], [1, 3], [2, 3], [1, 3], [1, 2, 3, 5], [1, 2, 3, 5], [1, 2, 3]]
Support data set found by Apriori:
 [[[1, 2, 3, 5] 2.0]]
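The result above contains only the itemsets of the last level reached, because each iteration overwrites the support data set. As an independent cross-check, the following brute-force sketch enumerates every itemset and keeps those that meet the minimum support. It is not part of the article's class; the name frequent_itemsets is hypothetical, and the approach is exponential in the number of items, so it is only suitable for a toy data set like this one.

```python
from itertools import combinations

goods = [[1, 2, 5], [2, 4], [2, 3], [1, 2, 4], [1, 3],
         [2, 3], [1, 3], [1, 2, 3, 5], [1, 2, 3, 5], [1, 2, 3]]
min_support = 0.2

def frequent_itemsets(transactions, min_support):
    """Return {itemset (tuple): count} for every itemset meeting min_support."""
    n = len(transactions)
    items = sorted({x for t in transactions for x in t})
    rows = [set(t) for t in transactions]
    max_len = max(len(t) for t in transactions)
    result = {}
    for k in range(1, max_len + 1):
        for cand in combinations(items, k):
            # count the transactions that contain every item of the candidate
            cnt = sum(1 for r in rows if r.issuperset(cand))
            if cnt / n >= min_support:
                result[cand] = cnt
    return result

found = frequent_itemsets(goods, min_support)
longest = max(len(k) for k in found)
print(len(found))                                             # 17
print({k: v for k, v in found.items() if len(k) == longest})  # {(1, 2, 3, 5): 2}
```

The longest frequent itemset and its count agree with the output of CApriori; the other 16 entries are the frequent 1-, 2- and 3-itemsets from the earlier levels.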

2.3 Python implementation with step-by-step output

Written by hand; the code is as follows:

# -*- coding: utf-8 -*-
"""
@author: Tom
Talk is cheap, show me the code
Aim: implement the Apriori algorithm
"""

import numpy as np

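class CApriori(object):
    '''
    Implementation of the Apriori algorithm.
    '''
    def __init__(self, goods, minSupport):
        self.goods = goods           # list of transactions (each one a list of items)
        # minimum support; itemsets whose support is below this value are filtered out
        self.minSupport = minSupport

        self.N = len(goods)          # number of transactions (len() also works for ragged lists)
        self.goodsSet = set([])      # set of items, each element is a single item
        self.max_len  = 0            # number of items in the longest transaction
        self.debug_cnt = 0           # iteration counter, for debugging only; can be removed
        # support data set; each element is [frequent itemset, support count], where
        # frequent itemset = list of items, support count = support * number of transactions
        self.supportData = []

        self._init() # initialise
        self._work() # iterate until the final support data set is found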
        
    def _isSubset(self, A, B):
        '''Check whether sequence A is an ordered subset of sequence B (see Note).
        :param A: 1-D sequence
        :param B: 1-D sequence
        :return: True if A is an ordered subset of B, otherwise False
        :Note [1, 3] is an ordered subset of [1, 2, 3]; [3, 1] is not
        '''
        A,B = list(A),list(B)
        if len(A) == 0:
            return False

        pre_ind = -1
        for e in A:
            if e not in B: # not a subset
                return False
            elif B.index(e) < pre_ind: # order is violated
                return False
            pre_ind = B.index(e)

        return True

    def _support(self, item, goods):
        '''
        :param item: candidate itemset
        :param goods: list of transactions
        :return: support of the itemset (fraction of transactions that contain it)
        '''
        subset_cnt = [self._isSubset(item, e) for e in goods]
        cnt = subset_cnt.count(True)
        support = cnt * 1.0 / self.N
        return support
        
    def _init(self):
        '''Initialise the support data set and the iteration limit (max_len).
        '''
        N,goods,minSupport = self.N, self.goods,self.minSupport
        self.supportData = []

        # iteration limit: length of the longest transaction
        for item in goods:
            if len(item) > self.max_len:
                self.max_len = len(item)

        # all purchased items flattened into a 1-D list
        goods_data = []
        for e in goods:
            goods_data.extend(e)

        # set of distinct items
        self.goodsSet = set(goods_data)

        # initial data set: [frequent 1-itemset, support count]
        # sorted() makes the order deterministic; an item is assumed to occur at most once per transaction
        for e in sorted(self.goodsSet):
            cnt = goods_data.count(e)
            support = cnt *1.0 / N
            if (support >= minSupport):
                self.supportData.append([[e], cnt])

        #debug
        self.debug_cnt += 1
        print('=================Iteration:', self.debug_cnt)
        print('Transactions:\n', goods)
        print('Number of items in the longest transaction:', self.max_len)
        print('Item set:\n', self.goodsSet)
        print('Initial data set:\n', self.supportData)
        
    def _uniq(self, supportData):
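        '''Remove duplicate frequent itemsets from the support data set. Example of how duplicates arise:
         [1, 2, 3] and [1, 3, 5] combine into the itemset [1, 2, 3, 5]
         [1, 2, 3] and [2, 3, 5] combine into the itemset [1, 2, 3, 5]
        '''
        newSupportData = []
        data = []  # itemsets seen so far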
        for e in supportData:
            if e[0] not in data:
                data.append(e[0])
                newSupportData.append(e)
        return newSupportData
        
    def _work(self):
        '''Main Apriori routine: derive the new support data set (frequent itemsets and
        support counts) from the old one, and repeat until the search is complete.
        '''
        self.debug_cnt += 1
        print('\n=================Iteration:', self.debug_cnt)
        N,goods,minSupport = self.N, self.goods,self.minSupport
        preData = self.supportData

        # build the next level of frequent itemsets from the current one
        new_supportData = []
        for i in range(len(preData)):
            print('\n',preData[i][0],'go to find new frequent itemsets:')
            # old frequent itemset e
            e = preData[i][0] #current item in current supportdata
            # pair e only with the itemsets behind it, so that no pair is tested twice
            for j in range(i + 1, len(preData)):
                be = preData[j][0] #item at the back of current item
                # candidate itemset for the new level, new_e
                new_e = []
                if 1 == len(e): # old itemsets are the initial 1-itemsets
                    new_e = e + be
                elif be.count(e[-1]) > 0 and be[-1] != e[-1]:
                    ind = be.index(e[-1])
                    new_e = e + be[ind+1:len(be)]
                if 0 == len(new_e):
                    print('\t',e,'and',be ,'cannot be combined into a new frequent itemset.')
                    continue
                # support filter
                support = self._support(new_e, goods)
                # dtype=object: [itemset, count] is ragged, which NumPy >= 1.24 rejects otherwise
                entry = np.array([new_e, support*N], dtype=object)
                if (support >= minSupport):
                    new_supportData.append([new_e, support*N])
                    print('\t',e,'and',be ,'combine into itemset:',new_e,'support:',support,'passes the support filter, added:', entry)
                else: #debug
                    print('\t',e,'and',be ,'combine into itemset:',new_e,'support:',support,'fails the support filter, discarded:', entry)
        # update the support data set with the de-duplicated new level
        self.supportData = self._uniq(new_supportData)
        print('\nnew_supportData:\n', np.array(new_supportData, dtype=object))
        if 0 == len(self.supportData) or self.max_len == len(self.supportData[0][0]):
            print('Apriori succeed, supportData:\n', np.array(self.supportData, dtype=object))
        else:
            return self._work()

        print('======exit Apriori======\n')
        return self.supportData
        
    def GetSupportData(self):
        return self.supportData

if __name__=='__main__':
    goods = [[1, 2, 5],
             [2, 4],
             [2, 3],
             [1, 2, 4],
             [1, 3],
             [2, 3],
             [1, 3],
             [1, 2, 3, 5],
             [1, 2, 3, 5],
             [1, 2, 3]]
    minSupport = 0.2
    apr = CApriori(goods, minSupport)
    
    supportData = apr.GetSupportData()
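    print('Minimum support:', minSupport)
    print('Transactions:\n', goods)
    # dtype=object is required for this ragged list on NumPy >= 1.24;
    # the printed form of the array varies with the NumPy version
    print('Support data set at minimum support %f:\n'%minSupport, np.array(supportData, dtype=object))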

3. Run results

=================Iteration: 1
Transactions:
 [[1, 2, 5], [2, 4], [2, 3], [1, 2, 4], [1, 3], [2, 3], [1, 3], [1, 2, 3, 5], [1, 2, 3, 5], [1, 2, 3]]
Number of items in the longest transaction: 4
Item set:
 {1, 2, 3, 4, 5}
Initial data set:
 [[[1], 7], [[2], 8], [[3], 7], [[4], 2], [[5], 3]]

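=================Iteration: 2

 [1] go to find new frequent itemsets:
         [1] and [2] combine into itemset: [1, 2] support: 0.5 passes the support filter, added: [[1, 2] 5.0]
         [1] and [3] combine into itemset: [1, 3] support: 0.5 passes the support filter, added: [[1, 3] 5.0]
         [1] and [4] combine into itemset: [1, 4] support: 0.1 fails the support filter, discarded: [[1, 4] 1.0]
         [1] and [5] combine into itemset: [1, 5] support: 0.3 passes the support filter, added: [[1, 5] 3.0]

 [2] go to find new frequent itemsets:
         [2] and [3] combine into itemset: [2, 3] support: 0.5 passes the support filter, added: [[2, 3] 5.0]
         [2] and [4] combine into itemset: [2, 4] support: 0.2 passes the support filter, added: [[2, 4] 2.0]
         [2] and [5] combine into itemset: [2, 5] support: 0.3 passes the support filter, added: [[2, 5] 3.0]

 [3] go to find new frequent itemsets:
         [3] and [4] combine into itemset: [3, 4] support: 0.0 fails the support filter, discarded: [[3, 4] 0.0]
         [3] and [5] combine into itemset: [3, 5] support: 0.2 passes the support filter, added: [[3, 5] 2.0]

 [4] go to find new frequent itemsets:
         [4] and [5] combine into itemset: [4, 5] support: 0.0 fails the support filter, discarded: [[4, 5] 0.0]

 [5] go to find new frequent itemsets:

new_supportData:
 [[[1, 2] 5.0]
 [[1, 3] 5.0]
 [[1, 5] 3.0]
 [[2, 3] 5.0]
 [[2, 4] 2.0]
 [[2, 5] 3.0]
 [[3, 5] 2.0]]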

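=================Iteration: 3

 [1, 2] go to find new frequent itemsets:
         [1, 2] and [1, 3] cannot be combined into a new frequent itemset.
         [1, 2] and [1, 5] cannot be combined into a new frequent itemset.
         [1, 2] and [2, 3] combine into itemset: [1, 2, 3] support: 0.3 passes the support filter, added: [[1, 2, 3] 3.0]
         [1, 2] and [2, 4] combine into itemset: [1, 2, 4] support: 0.1 fails the support filter, discarded: [[1, 2, 4] 1.0]
         [1, 2] and [2, 5] combine into itemset: [1, 2, 5] support: 0.3 passes the support filter, added: [[1, 2, 5] 3.0]
         [1, 2] and [3, 5] cannot be combined into a new frequent itemset.

 [1, 3] go to find new frequent itemsets:
         [1, 3] and [1, 5] cannot be combined into a new frequent itemset.
         [1, 3] and [2, 3] cannot be combined into a new frequent itemset.
         [1, 3] and [2, 4] cannot be combined into a new frequent itemset.
         [1, 3] and [2, 5] cannot be combined into a new frequent itemset.
         [1, 3] and [3, 5] combine into itemset: [1, 3, 5] support: 0.2 passes the support filter, added: [[1, 3, 5] 2.0]

 [1, 5] go to find new frequent itemsets:
         [1, 5] and [2, 3] cannot be combined into a new frequent itemset.
         [1, 5] and [2, 4] cannot be combined into a new frequent itemset.
         [1, 5] and [2, 5] cannot be combined into a new frequent itemset.
         [1, 5] and [3, 5] cannot be combined into a new frequent itemset.

 [2, 3] go to find new frequent itemsets:
         [2, 3] and [2, 4] cannot be combined into a new frequent itemset.
         [2, 3] and [2, 5] cannot be combined into a new frequent itemset.
         [2, 3] and [3, 5] combine into itemset: [2, 3, 5] support: 0.2 passes the support filter, added: [[2, 3, 5] 2.0]

 [2, 4] go to find new frequent itemsets:
         [2, 4] and [2, 5] cannot be combined into a new frequent itemset.
         [2, 4] and [3, 5] cannot be combined into a new frequent itemset.

 [2, 5] go to find new frequent itemsets:
         [2, 5] and [3, 5] cannot be combined into a new frequent itemset.

 [3, 5] go to find new frequent itemsets:

new_supportData:
 [[[1, 2, 3] 3.0]
 [[1, 2, 5] 3.0]
 [[1, 3, 5] 2.0]
 [[2, 3, 5] 2.0]]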

=================Iteration: 4

 [1, 2, 3] go to find new frequent itemsets:
         [1, 2, 3] and [1, 2, 5] cannot be combined into a new frequent itemset.
         [1, 2, 3] and [1, 3, 5] combine into itemset: [1, 2, 3, 5] support: 0.2 passes the support filter, added: [[1, 2, 3, 5] 2.0]
         [1, 2, 3] and [2, 3, 5] combine into itemset: [1, 2, 3, 5] support: 0.2 passes the support filter, added: [[1, 2, 3, 5] 2.0]

 [1, 2, 5] go to find new frequent itemsets:
         [1, 2, 5] and [1, 3, 5] cannot be combined into a new frequent itemset.
         [1, 2, 5] and [2, 3, 5] cannot be combined into a new frequent itemset.

 [1, 3, 5] go to find new frequent itemsets:
         [1, 3, 5] and [2, 3, 5] cannot be combined into a new frequent itemset.

 [2, 3, 5] go to find new frequent itemsets:

new_supportData:
 [[[1, 2, 3, 5] 2.0]
 [[1, 2, 3, 5] 2.0]]
Apriori succeed, supportData:
 [[[1, 2, 3, 5] 2.0]]
======exit Apriori======

Minimum support: 0.2
Transactions:
 [[1, 2, 5], [2, 4], [2, 3], [1, 2, 4], [1, 3], [2, 3], [1, 3], [1, 2, 3, 5], [1, 2, 3, 5], [1, 2, 3]]
Support data set at minimum support 0.200000:
 [[[1, 2, 3, 5] 2.0]]
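In iteration 4 of the log above, [1, 2, 3, 5] is generated twice and then reduced to a single entry by _uniq(). As a hedged alternative sketch, the same de-duplication can be done in one pass by keying a dict on the itemset converted to a tuple, which avoids the repeated list-membership test. The function name uniq_support_data is hypothetical, and the insertion-order guarantee of dict assumes Python 3.7 or later.

```python
def uniq_support_data(support_data):
    """Keep the first [itemset, count] entry for each distinct itemset, preserving order."""
    seen = {}
    for itemset, count in support_data:
        # lists are unhashable, so the tuple form of the itemset is used as the key
        seen.setdefault(tuple(itemset), [itemset, count])
    return list(seen.values())

new_support_data = [[[1, 2, 3, 5], 2.0], [[1, 2, 3, 5], 2.0]]
print(uniq_support_data(new_support_data))  # [[[1, 2, 3, 5], 2.0]]
```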

(end)
