SVM (Support Vector Machines) and a Nice Way to Plot Them

阿新 • Published: 2018-12-19

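SVM (support vector machines) in Python

Exploring linearly separable data

Generating some linearly separable data
Which straight line can separate these points?

The key idea of SVM: maximizing the minimum margin

Intuition
Training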

Exploring linearly separable data

Generating some linearly separable data

We use the make_blobs function from sklearn. Its parameters are:
1. n_samples: int, optional (default=100). The total number of points, divided equally among the clusters.
2. n_features: int, optional (default=2). The number of features for each sample.
3. centers: int or array of shape [n_centers, n_features], optional (default=3). The number of centers (classes) to generate, or the fixed center locations.
4. cluster_std: float or sequence of floats, optional (default=1.0). The standard deviation of the clusters. For example, to generate two classes where one is more spread out than the other, set cluster_std to [1.0, 3.0].
5. center_box: pair of floats (min, max), optional (default=(-10.0, 10.0)). The bounding box for each cluster center when centers are generated at random.
6. shuffle: boolean, optional (default=True). Shuffle the samples.
7. random_state: int, RandomState instance or None, optional (default=None). If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
In short, choosing the number of samples, the number of features, the number of classes and the per-class standard deviation is enough for most purposes.
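To make the effect of cluster_std concrete, here is a minimal sketch. It assumes a recent scikit-learn in which make_blobs is imported from sklearn.datasets; the sample count, the seed and the names X_demo and y_demo are arbitrary choices for illustration only.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Two clusters with different spreads: class 0 is tight, class 1 is wide.
X_demo, y_demo = make_blobs(n_samples=200, n_features=2, centers=2,
                            cluster_std=[1.0, 3.0], random_state=0)

print(X_demo.shape)   # one row per sample, one column per feature
print(y_demo.shape)   # one integer class label per sample

# Empirical per-feature standard deviation of each class.
for label in (0, 1):
    print(label, X_demo[y_demo == label].std(axis=0))
```

The printed standard deviations should come out close to the requested 1.0 and 3.0, though not exactly, because they are estimated from a finite sample.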

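# Jupyter notebook magic: render plots inline
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
# sklearn.datasets.samples_generator was removed in newer scikit-learn releases
from sklearn.datasets import make_blobs   # generates clustered sample data
X, y = make_blobs(n_samples=50, n_features=2, centers=2,
                  random_state=0, cluster_std=0.60)
print(X.shape)   # just to check the shape of X: (50, 2)
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, marker='o', cmap='summer')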

Scatter plot of the generated data

Which straight line can separate these points?

plt.figure(figsize=(10, 6))
xfit = np.linspace(-1, 3.5)
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
# a new, unlabeled point marked with a blue "x"
plt.plot([0.6], [2.1], 'x', color='blue', markeredgewidth=3, markersize=10)

# three candidate separating lines, given as (slope, intercept)
for m, b in [(1.1, 0.65), (0.5, 1.6), (-0.2, 2.9)]:
    plt.plot(xfit, m * xfit + b)

plt.xlim(-1, 3.5)

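Three lines are drawn here, and each of them separates the two classes. Yet the class assigned to the new point marked X depends on which line we pick, so simply finding some line that separates the training data is not enough.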
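The disagreement can be checked with plain arithmetic. The sketch below reuses the point (0.6, 2.1) and the three (slope, intercept) pairs from the plotting code; the variable names are only illustrative.

```python
# The new point marked with an "x" in the plot.
px, py = 0.6, 2.1

# (slope, intercept) of the three candidate lines from the plot.
lines = [(1.1, 0.65), (0.5, 1.6), (-0.2, 2.9)]

sides = []
for m, b in lines:
    # Compare the point's height with the line's height at the same x.
    side = "above" if py > m * px + b else "below"
    sides.append(side)
    print(f"y = {m}x + {b}: point is {side} the line")
```

The first two lines put the point on their upper side and the third puts it on its lower side, so the three classifiers do not agree on its label.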

The key idea of SVM: maximizing the minimum margin

Intuition

plt.figure(figsize=(8, 5))
xfit = np.linspace(-1, 3.5)
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')

# each line is (slope, intercept, half-width of the band drawn around it)
for m, b, d in [(1, 0.65, 0.33), (0.5, 1.6, 0.55), (-0.2, 2.9, 0.2)]:
    yfit = m * xfit + b
    plt.plot(xfit, yfit)
    # shade the band between the line and the nearest point
    plt.fill_between(xfit, yfit - d, yfit + d, edgecolor='blue',
                     color='#AAAAAA', alpha=0.5)

plt.xlim(-1, 3.5)


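In essence, SVM makes the shaded band as wide as possible: it maximizes the distance from the decision boundary to the nearest point.

Training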
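As a rough illustration of "distance to the nearest point", the sketch below measures, for each of the three candidate lines, the smallest perpendicular distance to any sample. The helper min_distance_to_line is a hypothetical name introduced here, not something from sklearn, and the data is regenerated so the snippet runs on its own. Note that the band half-widths d in the plot are measured vertically, while the distance computed here is perpendicular to the line.

```python
import numpy as np
from sklearn.datasets import make_blobs

X_demo, y_demo = make_blobs(n_samples=50, n_features=2, centers=2,
                            random_state=0, cluster_std=0.60)

def min_distance_to_line(points, m, b):
    """Smallest perpendicular distance from any point to the line y = m*x + b."""
    # Implicit form of the line: m*x - y + b = 0,
    # so the distance is |m*x - y + b| / sqrt(m^2 + 1).
    distances = np.abs(m * points[:, 0] - points[:, 1] + b) / np.sqrt(m ** 2 + 1)
    return distances.min()

for m, b in [(1, 0.65), (0.5, 1.6), (-0.2, 2.9)]:
    gap = min_distance_to_line(X_demo, m, b)
    print(f"y = {m}x + {b}: nearest point is {gap:.3f} away")
```

Among lines that separate the classes, the one with the largest such distance is the one SVM prefers.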

from sklearn.svm import SVC    # "Support vector classifier"
model = SVC(kernel='linear')   # use a linear kernel
model.fit(X, y)

Plotting the result
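Once a linear SVC is fitted, the separating line itself can be read back from the model. The sketch below is self-contained and uses clf, w, b and margin_width as illustrative names; coef_ is only available for the linear kernel, and the formula 2 / ||w|| is the standard width of the band between the two margin lines where the decision function equals -1 and +1.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X_demo, y_demo = make_blobs(n_samples=50, n_features=2, centers=2,
                            random_state=0, cluster_std=0.60)
clf = SVC(kernel='linear').fit(X_demo, y_demo)

w = clf.coef_[0]        # normal vector of the separating line (linear kernel only)
b = clf.intercept_[0]   # offset: the boundary is w[0]*x0 + w[1]*x1 + b = 0
margin_width = 2 / np.linalg.norm(w)   # distance between the two margin lines

print("w =", w, "b =", b)
print("margin width =", margin_width)
print("prediction for (0.6, 2.1):", clf.predict([[0.6, 2.1]]))
```

Unlike the three hand-picked lines, the fitted model gives a single answer for the point (0.6, 2.1).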

# plotting helper
def plot_svc_decision_function(model, ax=None, plot_support=True):
    """Plot the decision function for a 2D SVC"""
    if ax is None:
        ax = plt.gca()
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()

    # create grid to evaluate model
    x_grid = np.linspace(xlim[0], xlim[1], 30)
    y_grid = np.linspace(ylim[0], ylim[1], 30)
    YY, XX = np.meshgrid(y_grid, x_grid)
    xy = np.vstack([XX.ravel(), YY.ravel()]).T
    P = model.decision_function(xy).reshape(XX.shape)

    # plot decision boundary (solid) and margins (dashed)
    ax.contour(XX, YY, P, colors='k',
               levels=[-1, 0, 1], alpha=0.5,
               linestyles=['--', '-', '--'])

    # plot support vectors as large black markers
    if plot_support:
        ax.scatter(model.support_vectors_[:, 0],
                   model.support_vectors_[:, 1],
                   s=300, linewidth=1, facecolors='black')
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)

plt.figure(figsize=(10, 8))
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
plot_svc_decision_function(model)


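The solid line is the decision boundary we were looking for.
Notice the three large black points: they lie exactly on the margin lines, and they are our support vectors.
In Scikit-Learn they are stored in the support_vectors_ attribute of the fitted model.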

model.support_vectors_
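Besides support_vectors_, a fitted SVC also exposes support_ (the row indices of the support vectors in the training data) and n_support_ (how many support vectors each class contributes). A minimal self-contained sketch, with clf, X_demo and y_demo as illustrative names:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X_demo, y_demo = make_blobs(n_samples=50, n_features=2, centers=2,
                            random_state=0, cluster_std=0.60)
clf = SVC(kernel='linear').fit(X_demo, y_demo)

print(clf.support_vectors_)   # coordinates of the support vectors
print(clf.support_)           # their row indices in X_demo
print(clf.n_support_)         # number of support vectors per class

# Decision function values at the support vectors.
print(clf.decision_function(clf.support_vectors_))
```

Indexing the training data with support_ gives back the same rows as support_vectors_.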


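As long as the support vectors stay the same, adding more data points makes no difference: points that lie beyond the margin on the correct side have no effect on the decision boundary.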
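One way to check this claim is to refit the model on the support vectors alone and compare the result with the model trained on all points. This is a sketch under the assumption that the solver converges to the same solution up to its numerical tolerance; the names full and reduced are illustrative.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X_demo, y_demo = make_blobs(n_samples=50, n_features=2, centers=2,
                            random_state=0, cluster_std=0.60)

# Model trained on every point.
full = SVC(kernel='linear').fit(X_demo, y_demo)

# Model trained only on the points that ended up as support vectors.
sv_idx = full.support_
reduced = SVC(kernel='linear').fit(X_demo[sv_idx], y_demo[sv_idx])

print("all points:      ", full.coef_, full.intercept_)
print("support vectors: ", reduced.coef_, reduced.intercept_)
print("same predictions:",
      np.array_equal(full.predict(X_demo), reduced.predict(X_demo)))
```

The two sets of coefficients should agree up to small numerical differences, which shows that the remaining points did not shape the boundary.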
