[Pandas] Working with Text Data in Pandas

By 阿新 • Published: 2018-12-10

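Working with Text Data

Series and Index are equipped with a set of string processing methods that make it easy to operate on each element of the array. Perhaps most importantly, these methods automatically exclude missing/NA values. They are accessed via the str attribute and generally have names matching the equivalent (scalar) built-in string methods: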

In [1]: s = pd.Series(['A', 'B', 'C', 'Aaba', 'Baca', np.nan, 'CABA', 'dog', 'cat'])

# Convert to lowercase
In [2]: s.str.lower()
Out[2]: 
0       a
1       b
2       c
3    aaba
4    baca
5     NaN
6    caba
7     dog
8     cat
dtype: object

# Convert to uppercase
In [3]: s.str.upper()
Out[3]: 
0       A
1       B
2       C
3    AABA
4    BACA
5     NaN
6    CABA
7     DOG
8     CAT
dtype: object

# Compute the length of each string
In [4]: s.str.len()
Out[4]: 
0    1.0
1    1.0
2    1.0
3    4.0
4    4.0
5    NaN
6    4.0
7    3.0
8    3.0
dtype: float64

In [5]: idx = pd.Index([' jack', 'jill ', ' jesse ', 'frank'])

# Strip leading and trailing whitespace
In [6]: idx.str.strip()
Out[6]: Index(['jack', 'jill', 'jesse', 'frank'], dtype='object')

# Strip leading whitespace
In [7]: idx.str.lstrip()
Out[7]: Index(['jack', 'jill ', 'jesse ', 'frank'], dtype='object')

# Strip trailing whitespace
In [8]: idx.str.rstrip()
Out[8]: Index([' jack', 'jill', ' jesse', 'frank'], dtype='object')

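The string methods on Index are especially useful for cleaning up or transforming DataFrame columns. For instance, you may have columns with leading or trailing whitespace: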

In [9]: df = pd.DataFrame(np.random.randn(3, 2), columns=[' Column A ', ' Column B '],
   ...:                   index=range(3))
   ...: 

In [10]: df
Out[10]: 
    Column A    Column B 
0   -1.425575   -1.336299
1    0.740933    1.032121
2   -1.585660    0.913812

Since df.columns is an Index object, we can use the .str accessor:

In [11]: df.columns.str.strip()
Out[11]: Index(['Column A', 'Column B'], dtype='object')

In [12]: df.columns.str.lower()
Out[12]: Index([' column a ', ' column b '], dtype='object')

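These string methods can then be used to clean up the columns as needed. Here we remove leading and trailing whitespace, lowercase all names, and replace any remaining whitespace with underscores: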

In [13]: df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')

In [14]: df
Out[14]: 
   column_a  column_b
0 -1.425575 -1.336299
1  0.740933  1.032121
2 -1.585660  0.913812
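The chain above can be wrapped in a small helper so that the same cleanup is easy to reuse. This is a minimal sketch rather than a pandas API: `clean_columns` is a hypothetical name, and `regex=False` is passed only to make the space replacement explicitly literal.

```python
import pandas as pd


def clean_columns(frame):
    """Return a copy of `frame` with stripped, lowercased, underscore-separated column names."""
    cleaned = frame.copy()
    cleaned.columns = (
        cleaned.columns.str.strip()
        .str.lower()
        .str.replace(' ', '_', regex=False)
    )
    return cleaned


# Illustrative data only.
raw = pd.DataFrame([[1, 2]], columns=[' Column A ', ' Column B '])
tidy = clean_columns(raw)
print(list(tidy.columns))
```

Working on a copy leaves the original frame untouched, which is convenient when the raw column names are still needed for reference.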

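Note: If you have a Series in which many elements are repeated (i.e. the number of unique elements in the Series is much smaller than its length), it can be faster to convert the original Series to one of type category and then use .str.<method> or .dt.<property> on it. The performance difference comes from the fact that, for a Series of type category, the string operations are performed on the .categories and not on each element of the Series.

Please note that a Series of type category with string .categories has some limitations compared to a Series of type string (e.g. you cannot add strings to each other: s + " " + s will not work if s is a Series of type category). Also, .str methods that operate on elements of type list are not available on such a Series.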
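As a rough illustration of the note on category, the sketch below applies the same .str method to a plain Series and to its categorical version and checks that the values agree. The data (`colors`) is made up for this example, and no timing is claimed here; any speed-up depends on how repetitive the data is.

```python
import pandas as pd

# Illustrative data: three distinct values repeated many times.
colors = pd.Series(['red', 'green', 'red', 'blue', 'green', 'red'] * 1000)
as_category = colors.astype('category')

# The same string method works through the .str accessor in both cases.
upper_plain = colors.str.upper()
upper_cat = as_category.str.upper()

same = upper_plain.tolist() == upper_cat.tolist()
print(len(as_category.cat.categories), same)
```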

Splitting and Replacing Strings

Methods like split return a Series of lists:

In [15]: s2 = pd.Series(['a_b_c', 'c_d_e', np.nan, 'f_g_h'])

In [16]: s2.str.split('_')
Out[16]: 
0    [a, b, c]
1    [c, d, e]
2          NaN
3    [f, g, h]
dtype: object

Elements in the split lists can be accessed using get or [] notation:

In [17]: s2.str.split('_').str.get(1)
Out[17]: 
0      b
1      d
2    NaN
3      g
dtype: object

In [18]: s2.str.split('_').str[1]
Out[18]: 
0      b
1      d
2    NaN
3      g
dtype: object

It is easy to expand this to return a DataFrame using expand.

In [19]: s2.str.split('_', expand=True)
Out[19]: 
     0    1    2
0    a    b    c
1    c    d    e
2  NaN  NaN  NaN
3    f    g    h

It is also possible to limit the number of splits:

In [20]: s2.str.split('_', expand=True, n=1)
Out[20]: 
     0    1
0    a  b_c
1    c  d_e
2  NaN  NaN
3    f  g_h

rsplit is similar to split except that it works in the reverse direction, i.e. from the end of the string to the beginning:

In [21]: s2.str.rsplit('_', expand=True, n=1)
Out[21]: 
     0    1
0  a_b    c
1  c_d    e
2  NaN  NaN
3  f_g    h
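A common use of rsplit with n=1 is peeling off a suffix when the separator may also occur earlier in the string. The sketch below uses made-up file names, and the column labels `stem` and `ext` are chosen here purely for illustration.

```python
import pandas as pd

# Hypothetical file names; the last entry is missing.
files = pd.Series(['report.final.csv', 'notes.txt', None])

# Split once, starting from the right, so only the last dot separates the extension.
parts = files.str.rsplit('.', n=1, expand=True)
parts.columns = ['stem', 'ext']
print(parts)
```

Using split('.', n=1) instead would cut at the first dot and leave `final.csv` in the second column.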

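By default, replace treats the pattern as a regular expression (this is the default in the pandas version documented here; since pandas 2.0 the default is regex=False, so pass regex=True explicitly on newer releases):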

In [22]: s3 = pd.Series(['A', 'B', 'C', 'Aaba', 'Baca',
   ....:                '', np.nan, 'CABA', 'dog', 'cat'])
   ....: 

In [23]: s3
Out[23]: 
0       A
1       B
2       C
3    Aaba
4    Baca
5        
6     NaN
7    CABA
8     dog
9     cat
dtype: object

In [24]: s3.str.replace('^.a|dog', 'XX-XX ', case=False)
Out[24]: 
0           A
1           B
2           C
3    XX-XX ba
4    XX-XX ca
5            
6         NaN
7    XX-XX BA
8      XX-XX 
9     XX-XX t
dtype: object

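Some caution is needed to keep regular expressions in mind! For example, the following code will cause trouble because of the regular expression meaning of $: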

# Consider the following badly formatted financial data
In [25]: dollars = pd.Series(['12', '-$10', '$10,000'])

# This does what you would naively expect:
In [26]: dollars.str.replace('$', '')
Out[26]: 
0        12
1       -10
2    10,000
dtype: object

# But this does not:
In [27]: dollars.str.replace('-$', '-')
Out[27]: 
0         12
1       -$10
2    $10,000
dtype: object

# We need to escape the special character (for patterns longer than one character)
In [28]: dollars.str.replace(r'-\$', '-')
Out[28]: 
0         12
1        -10
2    $10,000
dtype: object

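New in version 0.23.0.

If you do want literal replacement of a string (equivalent to str.replace()), you can set the optional regex parameter to False rather than escaping each character. In this case both pat and repl must be strings:

# These lines are equivalent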
In [29]: dollars.str.replace(r'-\$', '-')
Out[29]: 
0         12
1        -10
2    $10,000
dtype: object

In [30]: dollars.str.replace('-$', '-', regex=False)
Out[30]: 
0         12
1        -10
2    $10,000
dtype: object
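When the literal text is not known in advance, another option is to escape it programmatically with re.escape from the standard library and keep regex matching on. The sketch below compares both routes on the same data and then converts the cleaned strings to integers; the variable names are chosen here for illustration, and regex is passed explicitly so the behavior does not depend on the installed pandas default.

```python
import re

import pandas as pd

dollars = pd.Series(['12', '-$10', '$10,000'])

# Literal replacement: '$' is not interpreted as a regex anchor.
literal = dollars.str.replace('$', '', regex=False)

# Regex replacement with the pattern escaped programmatically.
escaped = dollars.str.replace(re.escape('$'), '', regex=True)

# With the currency symbol and thousands separator gone, the strings parse as integers.
amounts = literal.str.replace(',', '', regex=False).astype(int)
print(amounts.tolist())
```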

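New in version 0.20.0.

The replace method can also take a callable as the replacement. It is called on every match of pat using re.sub(). The callable should expect one positional argument (a regex match object) and return a string.

# Reverse every lowercase alphabetic word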
In [31]: pat = r'[a-z]+'

In [32]: repl = lambda m: m.group(0)[::-1]

In [33]: pd.Series(['foo 123', 'bar baz', np.nan]).str.replace(pat, repl)
Out[33]: 
0    oof 123
1    rab zab
2        NaN
dtype: object

# Using regex groups
In [34]: pat = r"(?P<one>\w+) (?P<two>\w+) (?P<three>\w+)"

In [35]: repl = lambda m: m.group('two').swapcase()

In [36]: pd.Series(['Foo Bar Baz', np.nan]).str.replace(pat, repl)
Out[36]: 
0    bAR
1    NaN
dtype: object
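A named function works just as well as a lambda and is easier to document. The following sketch doubles every integer found in each string; `double_number` and the sample data are hypothetical, and regex=True is passed explicitly because callable replacements require regex matching.

```python
import numpy as np
import pandas as pd

labels = pd.Series(['a1 b22', 'c3', np.nan])


def double_number(match):
    """Receive a regex match object and return the matched integer, doubled, as a string."""
    return str(int(match.group(0)) * 2)


doubled = labels.str.replace(r'\d+', double_number, regex=True)
print(doubled.tolist())
```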

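New in version 0.20.0.

The replace method also accepts a compiled regular expression object from re.compile() as the pattern. All flags should be included in the compiled regular expression object.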

In [37]: import re

In [38]: regex_pat = re.compile(r'^.a|dog', flags=re.IGNORECASE)

In [39]: s3.str.replace(regex_pat, 'XX-XX ')
Out[39]: 
0           A
1           B
2           C
3    XX-XX ba
4    XX-XX ca
5            
6         NaN
7    XX-XX BA
8      XX-XX 
9     XX-XX t
dtype: object

Including a flags argument when calling replace with a compiled regular expression object will raise a ValueError.

In [40]: s3.str.replace(regex_pat, 'XX-XX ', flags=re.IGNORECASE)
---------------------------------------------------------------------------
ValueError: case and flags cannot be set when pat is a compiled regex

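Concatenation

There are several ways to concatenate a Series or Index, either with itself or with others, all based on cat(), i.e. Series.str.cat and Index.str.cat.

Concatenating a single Series into a string

The content of a Series (or Index) can be concatenated: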

In [41]: s = pd.Series(['a', 'b', 'c', 'd'])

In [42]: s.str.cat(sep=',')
Out[42]: 'a,b,c,d'

If not specified, the keyword sep for the separator defaults to the empty string, sep='':

In [43]: s.str.cat()
Out[43]: 'abcd'

By default, missing values are ignored. Using na_rep, they can be given a representation:

In [44]: t = pd.Series(['a', 'b', np.nan, 'd'])

In [45]: t.str.cat(sep=',')
Out[45]: 'a,b,d'

In [46]: t.str.cat(sep=',', na_rep='-')
Out[46]: 'a,b,-,d'

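Concatenating a Series and something list-like into a Series

The first argument to cat() can be a list-like object, provided that it matches the length of the calling Series (or Index).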

In [47]: s.str.cat(['A', 'B', 'C', 'D'])
Out[47]: 
0    aA
1    bB
2    cC
3    dD
dtype: object

Missing values on either side will result in missing values in the result as well, unless na_rep is specified:

In [48]: s.str.cat(t)
Out[48]: 
0     aa
1     bb
2    NaN
3     dd
dtype: object

In [49]: s.str.cat(t, na_rep='-')
Out[49]: 
0    aa
1    bb
2    c-
3    dd
dtype: object
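Combining sep and na_rep covers the typical case of joining two text columns element by element. A minimal sketch with made-up names follows; both Series share the default index, so alignment does not come into play.

```python
import numpy as np
import pandas as pd

first = pd.Series(['Ada', 'Alan', np.nan])
last = pd.Series(['Lovelace', 'Turing', 'Hopper'])

# Without na_rep, a missing value on either side makes the result missing.
strict = first.str.cat(last, sep=' ')

# With na_rep, the missing piece is replaced by a placeholder before joining.
filled = first.str.cat(last, sep=' ', na_rep='?')
print(filled.tolist())
```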

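Concatenating a Series and something array-like into a Series

New in version 0.23.0.

The parameter others can also be two-dimensional. In this case, the number of rows must match the length of the calling Series (or Index).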

In [50]: d = pd.concat([t, s], axis=1)

In [51]: s
Out[51]: 
0    a
1    b
2    c
3    d
dtype: object

In [52]: d
Out[52]: 
     0  1
0    a  a
1    b  b
2  NaN  c
3    d  d

In [53]: s.str.cat(d, na_rep='-')
Out[53]: 
0    aaa
1    bbb
2    c-c
3    ddd
dtype: object

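Concatenating a Series and an indexed object into a Series, with alignment

New in version 0.23.0.

For concatenation with a Series or DataFrame, it is possible to align the indexes before concatenation by setting the join keyword.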

In [54]: u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])

In [55]: s
Out[55]: 
0    a
1    b
2    c
3    d
dtype: object

In [56]: u
Out[56]: 
1    b
3    d
0    a
2    c
dtype: object

In [57]: s.str.cat(u)
Out[57]: 
0    ab
1    bd
2    ca
3    dc
dtype: object

In [58]: s.str.cat(u, join='left')
Out[58]: 
0    aa
1    bb
2    cc
3    dd
dtype: object

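Warning: If the join keyword is not passed, the method cat() will currently fall back to the behavior before version 0.23.0 (i.e. no alignment), but a FutureWarning will be raised if any of the involved indexes differ, since this default will change to join='left' in a future version.

The usual options are available for join (one of 'left', 'outer', 'inner', 'right'). In particular, alignment also means that the different lengths no longer need to coincide.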

In [59]: v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])

In [60]: s
Out[60]: 
0    a
1    b
2    c
3    d
dtype: object

In [61]: v
Out[61]: 
-1    z
 0    a
 1    b
 3    d
 4    e
dtype: object

In [62]: s.str.cat(v, join='left', na_rep='-')
Out[62]: 
0    aa
1    bb
2    c-
3    dd
dtype: object

In [63]: s.str.cat(v, join='outer', na_rep='-')
Out[63]: 
-1    -z
 0    aa
 1    bb
 2    c-
 3    dd
 4    -e
dtype: object
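The remaining two options behave analogously. The sketch below reuses the values of s and v from above and passes join explicitly, so the result does not depend on the default of the installed pandas version.

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])
v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])

# 'inner' keeps only the labels present in both indexes.
inner = s.str.cat(v, join='inner')

# 'right' uses the index of `others`; labels missing from `s` are filled via na_rep.
right = s.str.cat(v, join='right', na_rep='-')

print(inner)
print(right)
```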

The same alignment can be used when others is a DataFrame:

In [64]: f = d.loc[[3, 2, 1, 0], :]

In [65]: s
Out[65]: 
0    a
1    b
2    c
3    d
dtype: object

In [66]: f
Out[66]: 
     0  1
3    d  d
2  NaN  c
1    b  b
0    a  a

In [67]: s.str.cat(f, join='left', na_rep='-')
Out[67]: 
0    aaa
1    bbb
2    c-c
3    ddd
dtype: object

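Concatenating a Series and many objects into a Series

All one-dimensional list-likes can be arbitrarily combined in a list-like container (including iterators, dict-views, etc.):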

In [68]: s
Out[68]: 
0    a
1    b
2    c
3    d
dtype: object

In [69]: u
Out[69]: 
1    b
3    d
0    a
2    c
dtype: object

In [70]: s.str.cat([u, pd.Index(u.values), ['A', 'B', 'C', 'D'], map(str, u.index)], na_rep='-')
Out[70]: 
0    abbA1
1    bddB3
2    caaC0
3    dccD2
dtype: object

All elements must match in length to the calling Series (or Index), except those having an index if join is not None:

In [71]: v
Out[71]: 
-1    z
 0    a
 1    b
 3    d
 4    e
dtype: object

In [72]: s.str.cat([u, v, ['A', 'B', 'C', 'D']], join='outer', na_rep='-')
Out[72]: 
-1    --z-
 0    aaaA
 1    bbbB
 2    cc-C
 3    dddD
 4    --e-
dtype: object

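If join='right' is used on a list of others that contains different indexes, the union of these indexes will be used as the basis for the final concatenation: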

In [73]: u.loc[[3]]
Out[73]: 
3    d
dtype: object

In [74]: v.loc[[-1, 0]]
Out[74]: 
-1    z
 0    a
dtype: object

In [75]: s.str.cat([u.loc[[3]], v.loc[[-1, 0]]], join='right', na_rep='-')
Out[75]: 
-1    --z
 0    a-a
 3    dd-
dtype: object

Indexing with `.str`

You can use [] notation to index directly by position. If you index past the end of the string, the result will be NaN.

In [76]: s = pd.Series(['A', 'B', 'C', 'Aaba', 'Baca', np.nan,
   ....:                'CABA', 'dog', 'cat'])
   ....: 

In [77]: s.str[0]
Out[77]: 
0      A
1      B
2      C
3      A
4      B
5    NaN
6      C
7      d
8      c
dtype: object

In [78]: s.str[1]
Out[78]: 
0    NaN
1    NaN
2    NaN
3      a
4      a
5    NaN
6      A
7      o
8      a
dtype: object
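The same [] notation also accepts slices. The sketch below, on made-up codes, contrasts a slice (which simply returns a shorter string when the input is short) with a single position (which yields NaN past the end).

```python
import numpy as np
import pandas as pd

codes = pd.Series(['AB-123', 'CD-45', np.nan, 'E'])

prefix = codes.str[:2]   # slice: first two characters, or fewer if the string is short
last = codes.str[-1]     # negative positions count from the end
third = codes.str[2]     # single position: missing when the string is too short

print(pd.DataFrame({'prefix': prefix, 'last': last, 'third': third}))
```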

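Extracting Substrings

Extract first match in each subject (extract)

Warning: In version 0.18.0, extract gained the expand argument. When expand=False it returns a Series, Index, or DataFrame, depending on the subject and regular expression pattern (same behavior as pre-0.18.0). When expand=True it always returns a DataFrame, which is more consistent and less confusing from the perspective of a user. expand=True is the default since version 0.23.0.

The extract method accepts a regular expression with at least one capture group.

Extracting a regular expression with more than one group returns a DataFrame with one column per group.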

In [79]: pd.Series(['a1', 'b2', 'c3']).str.extract(r'([ab])(\d)', expand=False)
Out[79]: 
     0    1
0    a    1
1    b    2
2  NaN  NaN

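Elements that do not match return a row filled with NaN. Thus, a Series of messy strings can be "converted" into a like-indexed Series or DataFrame of cleaned-up or more useful strings, without needing get() to access tuples or re.match objects. The dtype of the result is always object, even if no match is found and the result only contains NaN.

Named groups like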

In [80]: pd.Series(['a1', 'b2', 'c3']).str.extract(r'(?P<letter>[ab])(?P<digit>\d)', expand=False)
Out[80]: 
  letter digit
0      a     1
1      b     2
2    NaN   NaN

and optional groups like

In [81]: pd.Series(['a1', 'b2', '3']).str.extract(r'([ab])?(\d)', expand=False)
Out[81]: 
     0  1
0    a  1
1    b  2
2  NaN  3

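can also be used. Note that any capture group names in the regular expression will be used for column names; otherwise capture group numbers will be used.

Extracting a regular expression with one group returns a DataFrame with one column if expand=True.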

In [82]: pd.Series(['a1', 'b2', 'c3']).str.extract(r'[ab](\d)', expand=True)
Out[82]: 
     0
0    1
1    2
2  NaN

It returns a Series if expand=False.

In [83]: pd.Series(['a1', 'b2', 'c3']).str.extract(r'[ab](\d)', expand=False)
Out[83]: 
0      1
1      2
2    NaN
dtype: object

Calling extract on an Index with a regex that has exactly one capture group returns a DataFrame with one column if expand=True.

In [84]: s = pd.Series(["a1", "b2", "c3"], ["A11", "B22", "C33"])

In [85]: s
Out[85]: 
A11    a1
B22    b2
C33    c3
dtype: object

In [86]: s.index.str.extract("(?P<letter>[a-zA-Z])", expand=True)
Out[86]: 
  letter
0      A
1      B
2      C

It returns an Index if expand=False.

In [87]: s.index.str.extract("(?P<letter>[a-zA-Z])", expand=False)
Out[87]: Index(['A', 'B', 'C'], dtype='object', name='letter')

Calling extract on an Index with a regex that has more than one capture group returns a DataFrame if expand=True.

In [88]: s.index.str.extract("(?P<letter>[a-zA-Z])([0-9]+)", expand=True)
Out[88]: 
  letter   1
0      A  11
1      B  22
2      C  33

It raises a ValueError if expand=False.

>>> s.index.str.extract("(?P<letter>[a-zA-Z])([0-9]+)", expand=False)
ValueError: only one regex group is supported with Index

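The table below summarizes the behavior of extract(expand=False) (input subject in the first column, number of groups in the regex in the first row):

Subject	1 group	>1 group
Index	Index	ValueError
Series	Series	DataFrame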
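Because extract returns strings, a typical follow-up step is converting a captured column to a numeric type. The sketch below uses made-up "key=value" strings; the group names `key` and `value` are chosen here for illustration, and expand=True is passed so the result is always a DataFrame.

```python
import pandas as pd

settings = pd.Series(['width=10', 'height=25', 'bad'])

# Named groups become column names; rows without a match are filled with NaN.
parsed = settings.str.extract(r'(?P<key>\w+)=(?P<value>\d+)', expand=True)

# The captured digits are still strings, so convert them explicitly.
parsed['value'] = pd.to_numeric(parsed['value'])
print(parsed)
```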

Extract all matches in each subject (extractall)

New in version 0.18.0.

Unlike extract (which returns only the first match),

In [89]: s = pd.Series(["a1a2", "b1", "c1"], index=["A", "B", "C"])

In [90]: s
Out[90]: 
A    a1a2
B      b1
C      c1
dtype: object

In [91]: two_groups = '(?P<letter>[a-z])(?P<digit>[0-9])'

In [92]: s.str.extract(two_groups, expand=True)
Out[92]: 
  letter digit
A      a     1
B      b     1
C      c     1

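the extractall method returns every match. The result of extractall is always a DataFrame with a MultiIndex on its rows. The last level of the MultiIndex is named match and indicates the order of the match in the subject.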

In [93]: s.str.extractall(two_groups)
Out[93]: 
        letter digit
  match             
A 0          a     1
  1          a     2
B 0          b     1
C 0          c     1

When each subject string in the Series has exactly one match,

In [94]: s = pd.Series(['a3', 'b3', 'c2'])

In [95]: s
Out[95]: 
0    a3
1    b3
2    c2
dtype: object

then extractall(pat).xs(0, level='match') gives the same result as extract(pat).

In [96]: extract_result = s.str.extract(two_groups, expand=True)

In [97]: extract_result
Out[97]: 
  letter digit
0      a     3
1      b     3
2      c     2

In [98]: extractall_result = s.str.extractall(two_groups)

In [99]: extractall_result
Out[99]: 
        letter digit
  match             
0 0          a     3
1 0          b     3
2 0          c     2

In [100]: extractall_result.xs(0, level="match")
Out[100]: 
  letter digit
0      a     3
1      b     3
2      c     2

Index also supports .str.extractall. It returns a DataFrame with the same result as Series.str.extractall on a Series with a default index (starting from 0).

New in version 0.19.0.

In [101]: pd.Index(["a1a2", "b1", "c1"]).str.extractall(two_groups)
Out[101]: 
        letter digit
  match             
0 0          a     1
  1          a     2
1 0          b     1
2 0          c     1

In [102]: pd.Series(["a1a2", "b1", "c1"]).str.extractall(two_groups)
Out[102]: 
        letter digit
  match             
0 0          a     1
  1          a     2
1 0          b     1
2 0          c     1
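Since the first index level of the extractall result identifies the subject, grouping on it gives a quick count of matches per subject. A minimal sketch, reusing the two-group pattern from above:

```python
import pandas as pd

s = pd.Series(['a1a2', 'b1', 'c1'], index=['A', 'B', 'C'])
two_groups = r'(?P<letter>[a-z])(?P<digit>[0-9])'

matches = s.str.extractall(two_groups)

# Level 0 of the MultiIndex is the original label; count the rows per label.
counts = matches.groupby(level=0).size()
print(counts)
```

Subjects with no match at all do not appear in the extractall result, so they would be absent from `counts` rather than listed with a zero.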

Testing for Strings that Match or Contain a Pattern

You can check whether elements contain a pattern:

In [103]: pattern = r'[0-9][a-z]'

In [104]: pd.Series(['1', '2', '3a', '3b', '03c']).str.contains(pattern)
Out[104]: 
0    False
1    False
2     True
3     True
4     True
dtype: bool

Or whether elements match a pattern:

In [105]: pd.Series(['1', '2', '3a', '3b', '03c']).str.match(pattern)
Out[105]: 
0    False
1    False
2     True
3     True
4    False
dtype: bool

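The distinction between match and contains is strictness: match relies on strict re.match (anchored at the start of the string), while contains relies on re.search.

Methods like match, contains, startswith, and endswith take an extra na argument so missing values can be considered True or False: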

In [106]: s4 = pd.Series(['A', 'B', 'C', 'Aaba', 'Baca', np.nan, 'CABA', 'dog', 'cat'])

In [107]: s4.str.contains('A', na=False)
Out[107]: 
0     True
1    False
2    False
3     True
4    False
5    False
6     True
7    False
8    False
dtype: bool
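Setting na=False matters most when the result is used as a boolean mask, because a mask containing missing values cannot be used for filtering directly. The sketch below filters the same data with a case-insensitive contains and with startswith; the variable names are chosen here for illustration.

```python
import numpy as np
import pandas as pd

s4 = pd.Series(['A', 'B', 'C', 'Aaba', 'Baca', np.nan, 'CABA', 'dog', 'cat'])

# Missing entries count as False, so the mask is purely boolean.
has_a = s4.str.contains('a', case=False, na=False)
starts_with_c = s4.str.startswith('C', na=False)

with_a = s4[has_a]
with_c = s4[starts_with_c]
print(with_a.tolist(), with_c.tolist())
```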

Creating Indicator Variables

You can extract dummy variables from string columns. For example, if they are separated by a '|':

In [108]: s = pd.Series(['a', 'a|b', np.nan, 'a|c'])

In [109]: s.str.get_dummies(sep='|')
Out[109]: 
   a  b  c
0  1  0  0
1  1  1  0
2  0  0  0
3  1  0  1

A string Index also supports get_dummies, which returns a MultiIndex.

New in version 0.18.1.

In [110]: idx = pd.Index(['a', 'a|b', np.nan, 'a|c'])

In [111]: idx.str.get_dummies(sep='|')
Out[111]: 
MultiIndex(levels=[[0, 1], [0, 1], [0, 1]],
           labels=[[1, 1, 0, 1], [0, 1, 0, 0], [0, 0, 0, 1]],
           names=['a', 'b', 'c'])
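In practice the indicator columns are usually attached back to the frame they came from. A minimal sketch with a made-up `tags` column follows; the frame and its column names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

items = pd.DataFrame({'id': [1, 2, 3],
                      'tags': ['red|big', 'small', np.nan]})

# One indicator column per distinct tag; a missing value yields a row of zeros.
dummies = items['tags'].str.get_dummies(sep='|')

# The result shares the index of `items`, so a plain join lines the rows up.
with_flags = items.join(dummies)
print(with_flags)
```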

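Method Summary

Method	Description
`cat()`	Concatenate strings
`rsplit()`	Split strings on the delimiter, working from the end of the string
`get()`	Index into each element (retrieve the i-th element)
`join()`	Join strings in each element of the Series with the passed separator
`get_dummies()`	Split strings on the delimiter, returning a DataFrame of dummy variables
`contains()`	Return a boolean array indicating whether each string contains the pattern/regex
`replace()`	Replace occurrences of pattern/regex/string with some other string or the return value of a callable given the occurrence
`repeat()`	Duplicate values (`s.str.repeat(3)` is equivalent to `x * 3`)
`pad()`	Add whitespace to the left, right, or both sides of strings
`center()`	Equivalent to `str.center`
`ljust()`	Equivalent to `str.ljust`
`rjust()`	Equivalent to `str.rjust`
`zfill()`	Equivalent to `str.zfill`
`wrap()`	Split long strings into lines with length less than a given width
`slice()`	Slice each string in the Series
`startswith()`	Equivalent to `str.startswith(pat)` for each element
`endswith()`	Equivalent to `str.endswith(pat)` for each element
`findall()`	Compute a list of all occurrences of the pattern/regex for each string
`match()`	Call `re.match` on each element, returning matched groups as a list
`extract()`	Call `re.search` on each element, returning a DataFrame with one row for each element and one column for each regex capture group
`extractall()`	Call `re.findall` on each element, returning a DataFrame with one row for each match and one column for each regex capture group
`len()`	Compute string lengths
`strip()`	Equivalent to `str.strip`
`rstrip()`	Equivalent to `str.rstrip`
`lstrip()`	Equivalent to `str.lstrip`
`lower()`	Equivalent to `str.lower`
`upper()`	Equivalent to `str.upper`
`find()`	Equivalent to `str.find`
`rfind()`	Equivalent to `str.rfind`
`index()`	Equivalent to `str.index`
`rindex()`	Equivalent to `str.rindex`
`swapcase()`	Equivalent to `str.swapcase`
`normalize()`	Return the Unicode normal form. Equivalent to `unicodedata.normalize`
`isalnum()`	Equivalent to `str.isalnum`
`isalpha()`	Equivalent to `str.isalpha`
`isdigit()`	Equivalent to `str.isdigit`
`isspace()`	Equivalent to `str.isspace`
`islower()`	Equivalent to `str.islower`
`isupper()`	Equivalent to `str.isupper`
`istitle()`	Equivalent to `str.istitle`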
