Seaborn Distribution Data Visualization: Joint Scatter Distribution Plots

阿新 • Published: 2022-01-07

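Joint scatter distribution plots

These plots combine a scatter plot with histogram distribution plots.

jointplot()

Draws a plot of two variables with bivariate and univariate graphs; it is built on top of JointGrid().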

sns.jointplot(
    x,
    y,
    data=None,
    kind='scatter',
    stat_func=None,
    color=None,
    height=6,
    ratio=5,
    space=0.2,
    dropna=True,
    xlim=None,
    ylim=None,
    joint_kws=None,
    marginal_kws=None,
    annot_kws=None,
    **kwargs,
)
Docstring:
Draw a plot of two variables with bivariate and univariate graphs.

This function provides a convenient interface to the :class:`JointGrid`
class, with several canned plot kinds. This is intended to be a fairly
lightweight wrapper; if you need more flexibility, you should use
:class:`JointGrid` directly.

Parameters
----------
x, y : strings or vectors
    Data or names of variables in ``data``.
data : DataFrame, optional
    DataFrame when ``x`` and ``y`` are variable names.
kind : { "scatter" | "reg" | "resid" | "kde" | "hex" }, optional
    Kind of plot to draw.
stat_func : callable or None, optional
    *Deprecated*
color : matplotlib color, optional
    Color used for the plot elements.
height : numeric, optional
    Size of the figure (it will be square).
ratio : numeric, optional
    Ratio of joint axes height to marginal axes height.
space : numeric, optional
    Space between the joint and marginal axes
dropna : bool, optional
    If True, remove observations that are missing from ``x`` and ``y``.
{x, y}lim : two-tuples, optional
    Axis limits to set before plotting.
{joint, marginal, annot}_kws : dicts, optional
    Additional keyword arguments for the plot components.
kwargs : key, value pairings
    Additional keyword arguments are passed to the function used to
    draw the plot on the joint Axes, superseding items in the
    ``joint_kws`` dictionary.

Returns
-------
grid : :class:`JointGrid`
    :class:`JointGrid` object with the plot on it.

See Also
--------
JointGrid : The Grid class used for drawing this plot. Use it directly if
            you need more flexibility.
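
To make the height and ratio parameters concrete, the minimal sketch below works out the approximate side lengths of the joint and marginal axes. It assumes the figure side is simply split in the proportion ratio : 1 and ignores the space parameter and any figure margins; joint_layout is a hypothetical helper written for this illustration, not part of seaborn.

```python
def joint_layout(height=6.0, ratio=5.0):
    """Approximate side lengths (in inches) of the joint and marginal axes.

    Assumption: the figure side `height` is split as ratio : 1 between the
    joint axes and one marginal axes; spacing and margins are ignored.
    """
    marginal = height / (ratio + 1)
    joint = height - marginal
    return joint, marginal


# Values used in the examples of this article: height=8, ratio=5
joint, marginal = joint_layout(height=8, ratio=5)
print(f"joint axes: about {joint:.2f} in, marginal axes: about {marginal:.2f} in")
```

With a larger ratio the marginal plots become thinner relative to the joint plot, which is the effect the ratio argument controls.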

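# Combined scatter/distribution plot with jointplot()

# Imports used by the examples in this article
# (sci is assumed to be an alias for scipy.stats)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.stats as sci

# Build a DataFrame of random data
rs = np.random.RandomState(3)
df = pd.DataFrame(rs.randn(200, 2), columns=['A', 'B'])

# Draw the combined scatter/distribution plot with jointplot()
sns.jointplot(x=df['A'], y=df['B'],     # data for the x and y axes
              data=df,                  # source DataFrame
              color='k',
              s=50, edgecolor='w', linewidth=1,  # marker size, edge color and edge width (scatter only)
              kind='scatter',                    # default kind is "scatter"; others are "reg", "resid", "kde", "hex"
              space=0.2,                         # space between the joint and marginal axes
              height=8,                          # figure size (the figure is square)
              ratio=5,                           # ratio of joint axes height to marginal axes height
              stat_func=sci.pearsonr,            # Pearson correlation coefficient (deprecated per the docstring above)
              marginal_kws=dict(bins=15, rug=True))    # keyword arguments for the marginal plots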
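
The stat_func argument above passes sci.pearsonr, presumably scipy.stats.pearsonr, to annotate the plot with the Pearson correlation coefficient. As a rough illustration of what that number measures, here is a pure-Python sketch of the coefficient itself. The helper name pearson_r and the toy data are made up for this example, and the p-value that scipy also returns is not computed.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance times n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances times n)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy data (illustrative only): a perfectly linear relationship
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

r_linear = pearson_r(xs, ys)          # perfectly increasing relationship
r_reversed = pearson_r(xs, ys[::-1])  # perfectly decreasing relationship
print(r_linear, r_reversed)
```

Values near 1 or -1 indicate a strong linear relationship; for independent random columns such as A and B above, the coefficient should be close to 0.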

sns.jointplot(x=df['A'], y=df['B'],
              data=df,
              color='k',
              kind='reg',             # reg adds a linear regression line
              height=8,
              ratio=5,
              stat_func= sci.pearsonr, 
              marginal_kws=dict(bins=15, rug=True))

sns.jointplot(x=df['A'], y=df['B'],
              data=df,
              color='k',
              kind='resid',           # resid plots the residuals of a linear regression
              height=8,
              ratio=5, 
              marginal_kws=dict(bins=15, rug=True))
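
The "resid" kind shows what is left over after a straight line is fitted to y as a function of x. The sketch below is an assumption-laden illustration rather than seaborn's own code: it fits an ordinary least-squares line in pure Python and returns the residuals. linear_residuals and the sample values are hypothetical.

```python
def linear_residuals(xs, ys):
    """Residuals y - (intercept + slope * x) of an ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Illustrative data: roughly linear with small deviations
sample_x = [1.0, 2.0, 3.0, 4.0]
sample_y = [2.1, 3.9, 6.2, 7.8]
residuals = linear_residuals(sample_x, sample_y)
print(residuals)

# Points that lie exactly on a line leave no residual at all
exact_residuals = linear_residuals([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
```

If the residuals show no visible structure when plotted against x, a linear fit is a reasonable description of the data.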

sns.jointplot(x=df['A'], y=df['B'],
              data=df,
              color='k',
              kind='kde',             # kde: kernel density plot
              height=8,
              ratio=5)

sns.jointplot(x=df['A'], y=df['B'],
              data=df,
              color='k',
              kind='hex',             # hex: hexagonal-bin ("honeycomb") plot
              height=8,
              ratio=5)

g = sns.jointplot(x=df['A'], y=df['B'],
              data=df,
              color='k',
              kind='kde',             # kde: kernel density plot
              height=8,
              ratio=5,
              shade_lowest=False)

# Overlay a scatter plot (c --> color, s --> size)
g.plot_joint(plt.scatter, c='w', s=10, linewidth=1, marker='+')

JointGrid()

Sets up a grid of subplots for drawing a bivariate plot with marginal univariate plots. It serves the same purpose as jointplot() but is more flexible.

sns.JointGrid(
    x,
    y,
    data=None,
    height=6,
    ratio=5,
    space=0.2,
    dropna=True,
    xlim=None,
    ylim=None,
    size=None,
)
Docstring:      Grid for drawing a bivariate plot with marginal univariate plots.
Init docstring:
Set up the grid of subplots.

Parameters
----------
x, y : strings or vectors
    Data or names of variables in ``data``.
data : DataFrame, optional
    DataFrame when ``x`` and ``y`` are variable names.
height : numeric
    Size of each side of the figure in inches (it will be square).
ratio : numeric
    Ratio of joint axes size to marginal axes height.
space : numeric, optional
    Space between the joint and marginal axes
dropna : bool, optional
    If True, remove observations that are missing from `x` and `y`.
{x, y}lim : two-tuples, optional
    Axis limits to set before plotting.

See Also
--------
jointplot : High-level interface for drawing bivariate plots with
            several different default plot kinds.

# Set the style
sns.set_style('white')
# Load the data
tip_datas = sns.load_dataset('tips', data_home='seaborn-data')

# Create the plotting grid; it has three parts: one joint plotting area and two marginal plotting areas
g = sns.JointGrid(x='total_bill', y='tip', data=tip_datas)

# Joint area: scatter plot
g.plot_joint(plt.scatter, color='m', edgecolor='w', alpha=.3)

# Marginal areas: x and y axes
g.ax_marg_x.hist(tip_datas['total_bill'], color='b', alpha=.3)
g.ax_marg_y.hist(tip_datas['tip'], color='r', alpha=.3,
                 orientation='horizontal')

# Correlation coefficient annotation
from scipy import stats
g.annotate(stats.pearsonr)

# Draw grid lines
plt.grid(linestyle='--')

g = sns.JointGrid(x='total_bill', y='tip', data=tip_datas)
g = g.plot_joint(plt.scatter, color='g', s=40, edgecolor='white')
plt.grid(linestyle='--')
# Use a single function so that both marginal plots share the same style
g.plot_marginals(sns.distplot, kde=True, color='g')

g = sns.JointGrid(x='total_bill', y='tip', data=tip_datas)
# Joint area: density plot
g = g.plot_joint(sns.kdeplot, cmap='Reds_r')
plt.grid(linestyle='--')
# Use a single function so that both marginal plots share the same style
g.plot_marginals(sns.distplot, kde=True, color='g')

pairplot()

Plots pairwise relationships in a dataset, for example as a scatter-plot matrix; it is built on top of PairGrid().

sns.pairplot(
    data,
    hue=None,
    hue_order=None,
    palette=None,
    vars=None,
    x_vars=None,
    y_vars=None,
    kind='scatter',
    diag_kind='auto',
    markers=None,
    height=2.5,
    aspect=1,
    dropna=True,
    plot_kws=None,
    diag_kws=None,
    grid_kws=None,
    size=None,
)
Docstring:
Plot pairwise relationships in a dataset.

By default, this function will create a grid of Axes such that each
variable in ``data`` will be shared in the y-axis across a single row and
in the x-axis across a single column. The diagonal Axes are treated
differently, drawing a plot to show the univariate distribution of the data
for the variable in that column.

It is also possible to show a subset of variables or plot different
variables on the rows and columns.

This is a high-level interface for :class:`PairGrid` that is intended to
make it easy to draw a few common styles. You should use :class:`PairGrid`
directly if you need more flexibility.

Parameters
----------
data : DataFrame
    Tidy (long-form) dataframe where each column is a variable and
    each row is an observation.
hue : string (variable name), optional
    Variable in ``data`` to map plot aspects to different colors.
hue_order : list of strings
    Order for the levels of the hue variable in the palette
palette : dict or seaborn color palette
    Set of colors for mapping the ``hue`` variable. If a dict, keys
    should be values  in the ``hue`` variable.
vars : list of variable names, optional
    Variables within ``data`` to use, otherwise use every column with
    a numeric datatype.
{x, y}_vars : lists of variable names, optional
    Variables within ``data`` to use separately for the rows and
    columns of the figure; i.e. to make a non-square plot.
kind : {'scatter', 'reg'}, optional
    Kind of plot for the non-identity relationships.
diag_kind : {'auto', 'hist', 'kde'}, optional
    Kind of plot for the diagonal subplots. The default depends on whether
    ``"hue"`` is used or not.
markers : single matplotlib marker code or list, optional
    Either the marker to use for all datapoints or a list of markers with
    a length the same as the number of levels in the hue variable so that
    differently colored points will also have different scatterplot
    markers.
height : scalar, optional
    Height (in inches) of each facet.
aspect : scalar, optional
    Aspect * height gives the width (in inches) of each facet.
dropna : boolean, optional
    Drop missing values from the data before plotting.
{plot, diag, grid}_kws : dicts, optional
    Dictionaries of keyword arguments.

Returns
-------
grid : PairGrid
    Returns the underlying ``PairGrid`` instance for further tweaking.

See Also
--------
PairGrid : Subplot grid for more flexible plotting of pairwise
           relationships.
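
As a quick worked example of the height and aspect parameters, the sketch below estimates how many axes a square pair grid contains and roughly how large the figure is. It is only an approximation under stated assumptions: legend space and padding are ignored, and pair_grid_geometry is a hypothetical helper, not a seaborn function.

```python
def pair_grid_geometry(n_vars, height=2.5, aspect=1.0):
    """Approximate layout of a square pair grid with n_vars variables."""
    facet_width = aspect * height  # "Aspect * height gives the width"
    return {
        "n_axes": n_vars * n_vars,                # one facet per (row, column) pair
        "n_diagonal": n_vars,                     # univariate plots
        "n_off_diagonal": n_vars * (n_vars - 1),  # bivariate plots
        "approx_figsize": (n_vars * facet_width, n_vars * height),
    }


# Four numeric variables (as in the iris examples) with height=2
geometry = pair_grid_geometry(n_vars=4, height=2)
print(geometry)
```

Because the number of facets grows with the square of the number of variables, selecting a subset with vars keeps large datasets readable.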

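# Load the iris dataset
i_datas = sns.load_dataset('iris', data_home='seaborn-data')
i_datas

# Scatter-plot matrix
sns.pairplot(i_datas,
             kind='scatter',                 # plot kind (scatter plot: scatter, regression plot: reg)
             diag_kind='hist',               # plot kind on the diagonal (histogram: hist, density plot: kde)
             hue='species',                  # color the points by this column
             palette='husl',                 # color palette
             markers=['o','s','D'],          # marker styles, one per hue level
             height=2)                       # height of each facet in inches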

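# Regression-plot matrix
sns.pairplot(i_datas,
             kind='reg',                     # plot kind (scatter plot: scatter, regression plot: reg)
             diag_kind='kde',                # plot kind on the diagonal (histogram: hist, density plot: kde)
             hue='species',                  # color the points by this column
             palette='husl',                 # color palette
             markers=['o','s','D'],          # marker styles, one per hue level
             height=2)                       # height of each facet in inches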

# Select a subset of variables with vars
g = sns.pairplot(i_datas, vars=['sepal_width', 'sepal_length'],
                 kind='reg', diag_kind='kde',
                 hue='species', palette='husl')

# Combining several parameter settings
sns.pairplot(i_datas, diag_kind='kde', markers='+', hue='species',
             # keyword arguments for the scatter plots
             plot_kws=dict(s=50, edgecolor='b', linewidth=1),
             # keyword arguments for the diagonal plots
             diag_kws=dict(shade=True))

PairGrid()

Plots pairwise relationships in a dataset, for example as a scatter-plot matrix. It is more flexible than pairplot().

sns.PairGrid(
    data,
    hue=None,
    hue_order=None,
    palette=None,
    hue_kws=None,
    vars=None,
    x_vars=None,
    y_vars=None,
    diag_sharey=True,
    height=2.5,
    aspect=1,
    despine=True,
    dropna=True,
    size=None,
)
Docstring:     
Subplot grid for plotting pairwise relationships in a dataset.

This class maps each variable in a dataset onto a column and row in a
grid of multiple axes. Different axes-level plotting functions can be
used to draw bivariate plots in the upper and lower triangles, and the
marginal distribution of each variable can be shown on the diagonal.

It can also represent an additional level of conditionalization with the
``hue`` parameter, which plots different subsets of data in different
colors. This uses color to resolve elements on a third dimension, but
only draws subsets on top of each other and will not tailor the ``hue``
parameter for the specific visualization the way that axes-level functions
that accept ``hue`` will.

See the :ref:`tutorial <grid_tutorial>` for more information.
Init docstring:
Initialize the plot figure and PairGrid object.

Parameters
----------
data : DataFrame
    Tidy (long-form) dataframe where each column is a variable and
    each row is an observation.
hue : string (variable name), optional
    Variable in ``data`` to map plot aspects to different colors.
hue_order : list of strings
    Order for the levels of the hue variable in the palette
palette : dict or seaborn color palette
    Set of colors for mapping the ``hue`` variable. If a dict, keys
    should be values  in the ``hue`` variable.
hue_kws : dictionary of param -> list of values mapping
    Other keyword arguments to insert into the plotting call to let
    other plot attributes vary across levels of the hue variable (e.g.
    the markers in a scatterplot).
vars : list of variable names, optional
    Variables within ``data`` to use, otherwise use every column with
    a numeric datatype.
{x, y}_vars : lists of variable names, optional
    Variables within ``data`` to use separately for the rows and
    columns of the figure; i.e. to make a non-square plot.
height : scalar, optional
    Height (in inches) of each facet.
aspect : scalar, optional
    Aspect * height gives the width (in inches) of each facet.
despine : boolean, optional
    Remove the top and right spines from the plots.
dropna : boolean, optional
    Drop missing values from the data before plotting.

See Also
--------
pairplot : Easily drawing common uses of :class:`PairGrid`.
FacetGrid : Subplot grid for plotting conditional relationships.

# Build the subplot grid for the four variables passed in vars
g = sns.PairGrid(i_datas, hue='species', palette='hls',
                 vars=['sepal_length', 'sepal_width', 'petal_length', 'petal_width'])

# Draw the diagonal plots
g.map_diag(plt.hist,
           histtype='step',             # options: 'bar', 'barstacked', 'step', 'stepfilled'
           linewidth=1)

# Draw the off-diagonal plots
g.map_offdiag(plt.scatter, s=40, linewidth=1)

# Add a legend
g.add_legend()

g = sns.PairGrid(i_datas)

# Plots on the main diagonal
g.map_diag(sns.kdeplot)

# Plots in the upper triangle
g.map_upper(plt.scatter)

# Plots in the lower triangle
g.map_lower(sns.kdeplot, cmap='Blues_d')
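
The diagonal, upper-triangle and lower-triangle mappings used above each address a different set of grid cells. The sketch below enumerates those cells in pure Python, assuming row index i selects the y variable and column index j selects the x variable; split_grid_cells is a hypothetical helper written for illustration only.

```python
def split_grid_cells(n_vars):
    """Partition the (row, column) cells of an n_vars x n_vars grid."""
    cells = [(i, j) for i in range(n_vars) for j in range(n_vars)]
    diagonal = [(i, j) for i, j in cells if i == j]  # same variable on both axes
    upper = [(i, j) for i, j in cells if i < j]      # above the diagonal
    lower = [(i, j) for i, j in cells if i > j]      # below the diagonal
    return diagonal, upper, lower


# Four variables, matching the four numeric iris columns listed in vars above
diagonal, upper, lower = split_grid_cells(4)
print(len(diagonal), len(upper), len(lower))
```

Each off-diagonal pair of variables appears twice, once per triangle with the axes swapped, which is why drawing different plot types in the two triangles adds information rather than repeating it.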
