Looping or Iterating over All or Certain Columns of a DataFrame in Python-Pandas
阿新 • Published: 2020-10-25
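In this article, we discuss how to loop or iterate over all or certain columns of a DataFrame. There are several ways to accomplish this task.
First, let's create a DataFrame and take a look at it:
Code: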
    # Import the pandas package
    import pandas as pd

    # List of tuples: (name, age, section)
    students = [('Ankit', 22, 'A'),
                ('Swapnil', 22, 'B'),
                ('Priya', 22, 'B'),
                ('Shivangi', 22, 'B')]

    # Create a DataFrame object
    stu_df = pd.DataFrame(students,
                          columns=['Name', 'Age', 'Section'],
                          index=['1', '2', '3', '4'])

    # Show the DataFrame
    print(stu_df)
Now let's look at different ways to iterate over all or certain columns of the DataFrame:
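Method #1: Using DataFrame.items() (formerly iteritems()):
The DataFrame class provides a member function, iteritems(), that returns an iterator over all columns of the DataFrame. For each column, it yields a tuple containing the column name and the column contents as a Series. Note that iteritems() was deprecated in pandas 1.5 and removed in pandas 2.0; on current versions, use DataFrame.items(), which behaves the same way.
Code: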
    import pandas as pd

    # List of tuples: (name, age, section)
    students = [('Ankit', 22, 'A'),
                ('Swapnil', 22, 'B'),
                ('Priya', 22, 'B'),
                ('Shivangi', 22, 'B')]

    # Create a DataFrame object
    stu_df = pd.DataFrame(students,
                          columns=['Name', 'Age', 'Section'],
                          index=['1', '2', '3', '4'])

    # items() yields a (column name, Series) tuple for each column.
    # It replaces iteritems(), which was removed in pandas 2.0.
    for (columnName, columnData) in stu_df.items():
        print('Column Name : ', columnName)
        print('Column Contents : ', columnData.values)
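Because each iteration hands back the column as a Series, any Series method can be applied inside the loop. The following is a minimal sketch, assuming a pandas version where items() is available; the dictionary name unique_counts is only illustrative and does not come from the original article.

```python
import pandas as pd

# Same sample data as in the article
students = [('Ankit', 22, 'A'),
            ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'),
            ('Shivangi', 22, 'B')]
stu_df = pd.DataFrame(students,
                      columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

# Hypothetical summary: number of distinct values per column
unique_counts = {}
for column_name, column_data in stu_df.items():
    # column_data is a pandas Series, so Series methods such as nunique() work
    unique_counts[column_name] = int(column_data.nunique())

print(unique_counts)
```

Here every student has the same age, so the 'Age' column has a single distinct value, while 'Name' has four and 'Section' has two.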
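Method #2: Using the [] operator:
Iterating over a DataFrame yields its column names, so we can loop over the names and select each column with the [] operator.
Code: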
    import pandas as pd

    # List of tuples: (name, age, section)
    students = [('Ankit', 22, 'A'),
                ('Swapnil', 22, 'B'),
                ('Priya', 22, 'B'),
                ('Shivangi', 22, 'B')]

    # Create a DataFrame object
    stu_df = pd.DataFrame(students,
                          columns=['Name', 'Age', 'Section'],
                          index=['1', '2', '3', '4'])

    # Iterate over the column names
    for column in stu_df:
        # Select the column contents by name using the [] operator
        columnSeriesObj = stu_df[column]
        print('Column Name : ', column)
        print('Column Contents : ', columnSeriesObj.values)
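The loop in Method #2 works because iterating over a DataFrame walks its column labels, not its rows. A small sketch to confirm this, with illustrative variable names that are not part of the original article:

```python
import pandas as pd

students = [('Ankit', 22, 'A'),
            ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'),
            ('Shivangi', 22, 'B')]
stu_df = pd.DataFrame(students,
                      columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

# Iterating over the DataFrame itself yields the column labels
labels_from_iteration = [column for column in stu_df]

# The same labels are available through the columns attribute
labels_from_attribute = list(stu_df.columns)

print(labels_from_iteration)                           # ['Name', 'Age', 'Section']
print(labels_from_iteration == labels_from_attribute)  # True
```

Writing `for column in stu_df.columns:` is therefore equivalent, and some readers find it more explicit.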
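Method #3: Iterating over more than one column:
Suppose we need to iterate over only some of the columns. To do this, we can select several columns from the DataFrame and iterate over that selection.
Code: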
    import pandas as pd

    # List of tuples: (name, age, section)
    students = [('Ankit', 22, 'A'),
                ('Swapnil', 22, 'B'),
                ('Priya', 22, 'B'),
                ('Shivangi', 22, 'B')]

    # Create a DataFrame object
    stu_df = pd.DataFrame(students,
                          columns=['Name', 'Age', 'Section'],
                          index=['1', '2', '3', '4'])

    # Iterate over only two given columns of the DataFrame
    for column in stu_df[['Name', 'Section']]:
        # Select the column contents by name using the [] operator
        columnSeriesObj = stu_df[column]
        print('Column Name : ', column)
        print('Column Contents : ', columnSeriesObj.values)
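Since only the column names are needed to drive the loop, one possible variation is to iterate over a plain list of names instead of building an intermediate DataFrame with stu_df[['Name', 'Section']]. This is a sketch of that alternative, not the article's method; wanted_columns and selected are hypothetical names.

```python
import pandas as pd

students = [('Ankit', 22, 'A'),
            ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'),
            ('Shivangi', 22, 'B')]
stu_df = pd.DataFrame(students,
                      columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

# Hypothetical list of the columns we care about
wanted_columns = ['Name', 'Section']

selected = {}
for column in wanted_columns:
    # Select each column by name and keep its values as a Python list
    selected[column] = stu_df[column].tolist()

print(selected)
```

Both approaches visit the same columns in the same order; a name that does not exist in the DataFrame raises a KeyError in either case.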
Method #4: Iterating over columns in reverse order:
We can also iterate over the columns in reverse order.
Code:
    import pandas as pd

    # List of tuples: (name, age, section)
    students = [('Ankit', 22, 'A'),
                ('Swapnil', 22, 'B'),
                ('Priya', 22, 'B'),
                ('Shivangi', 22, 'B')]

    # Create a DataFrame object
    stu_df = pd.DataFrame(students,
                          columns=['Name', 'Age', 'Section'],
                          index=['1', '2', '3', '4'])

    # Iterate over the sequence of column names in reverse order
    for column in reversed(stu_df.columns):
        # Select the column contents by name using the [] operator
        columnSeriesObj = stu_df[column]
        print('Column Name : ', column)
        print('Column Contents : ', columnSeriesObj.values)
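The columns attribute is a pandas Index, which also supports slicing, so a negative-step slice is another way to get the reversed order. A minimal sketch of that variation, with reversed_labels as an illustrative name:

```python
import pandas as pd

students = [('Ankit', 22, 'A'),
            ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'),
            ('Shivangi', 22, 'B')]
stu_df = pd.DataFrame(students,
                      columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

# Slice the column Index with a step of -1 to reverse it
reversed_labels = list(stu_df.columns[::-1])

for column in reversed_labels:
    # Columns are visited from last to first
    print('Column Name : ', column)
    print('Column Contents : ', stu_df[column].values)
```

With the sample data, the loop visits 'Section', then 'Age', then 'Name'.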
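Method #5: Using the column index (iloc):
To iterate over the columns of a DataFrame by position, we can loop over a range from 0 up to (but not including) the number of columns and, for each position, select the column contents with iloc[].
Code: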
    import pandas as pd

    # List of tuples: (name, age, section)
    students = [('Ankit', 22, 'A'),
                ('Swapnil', 22, 'B'),
                ('Priya', 22, 'B'),
                ('Shivangi', 22, 'B')]

    # Create a DataFrame object
    stu_df = pd.DataFrame(students,
                          columns=['Name', 'Age', 'Section'],
                          index=['1', '2', '3', '4'])

    # Iterate over the positions 0 .. number of columns - 1
    # (shape[1] is the number of columns)
    for index in range(stu_df.shape[1]):
        print('Column Number : ', index)
        # Select the column by position using iloc[]
        columnSeriesObj = stu_df.iloc[:, index]
        print('Column Contents : ', columnSeriesObj.values)
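When both the position and the name of each column are needed, enumerate() over the column labels gives the two together. The sketch below is an illustrative variation rather than part of the original article; it also checks that positional selection with iloc[] and label selection with [] return the same data. The list name matches is hypothetical.

```python
import pandas as pd

students = [('Ankit', 22, 'A'),
            ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'),
            ('Shivangi', 22, 'B')]
stu_df = pd.DataFrame(students,
                      columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

matches = []
for position, label in enumerate(stu_df.columns):
    # Select the same column two ways: by position and by label
    by_position = stu_df.iloc[:, position]
    by_label = stu_df[label]
    # Series.equals() is True when both hold identical values
    matches.append(by_position.equals(by_label))
    print('Column Number : ', position, '| Column Name : ', label)

print(all(matches))
```

Positional access is mostly useful when column names are duplicated or unknown; otherwise selecting by label tends to be easier to read.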