Python-Pandas (6): Transforming the Data Index

阿新 • Published: 2019-02-15

import pandas as pd

# set_index() returns a new DataFrame indexed by the values in the specified column.
# By default it also drops that column from the DataFrame;
# passing drop=False keeps the FILM column as a regular column too.
fandango = pd.read_csv('fandango_score_comparison.csv')
print(type(fandango))
fandango_films = fandango.set_index('FILM', drop=False)
# print(fandango_films.index)
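
To make the effect of `drop` concrete, here is a minimal sketch on a tiny hand-made DataFrame. The film titles and scores are invented stand-ins, not rows from `fandango_score_comparison.csv`.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; titles and scores are made up.
df = pd.DataFrame({
    "FILM": ["Film A (2015)", "Film B (2015)", "Film C (2015)"],
    "score": [4.5, 3.0, 2.5],
})

# Default drop=True: FILM becomes the index and is removed from the columns.
dropped = df.set_index("FILM")

# drop=False: FILM becomes the index and also stays as a regular column.
kept = df.set_index("FILM", drop=False)

print(list(dropped.columns))
print(list(kept.columns))
print(list(kept.index))
```

In both cases `set_index()` returns a new DataFrame, so `df` itself still has its original integer index.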

# Slice using either bracket notation or loc[]
fandango_films["Avengers: Age of Ultron (2015)":"Hot Tub Time Machine 2 (2015)"]
fandango_films.loc["Avengers: Age of Ultron (2015)":"Hot Tub Time Machine 2 (2015)"]

# Specific movie
fandango_films.loc['Kumiko, The Treasure Hunter (2015)']

# Selecting list of movies
movies = ['Kumiko, The Treasure Hunter (2015)', 'Do You Believe? (2015)', 'Ant-Man (2015)']
fandango_films.loc[movies]

# When selecting multiple rows, a DataFrame is returned,
# but when selecting an individual row, a Series object is returned instead.
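
The return types described above can be checked with a small sketch. The index labels and scores below are invented for illustration and do not come from the Fandango data.

```python
import pandas as pd

# Hypothetical film-indexed frame; labels and values are made up.
films = pd.DataFrame(
    {"score": [4.5, 3.0, 2.5, 4.0]},
    index=["Film A (2015)", "Film B (2015)", "Film C (2015)", "Film D (2015)"],
)

# Label slices with loc include BOTH endpoints, unlike positional slices.
sliced = films.loc["Film A (2015)":"Film C (2015)"]

# A single label returns a Series (one row, indexed by column name).
one = films.loc["Film B (2015)"]

# A list of labels returns a DataFrame, with rows in the order of the list.
many = films.loc[["Film D (2015)", "Film A (2015)"]]

print(type(sliced).__name__, len(sliced))
print(type(one).__name__)
print(type(many).__name__, list(many.index))
```

If a DataFrame is needed even for a single film, passing a one-element list such as `films.loc[["Film B (2015)"]]` keeps the result two-dimensional.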

# The apply() method in pandas lets us run custom Python logic
# over every column (the default) or every row of a DataFrame.
# The function passed to apply() receives each column as a Series object,
# so it should be an operation that works on a whole Series at once.
import numpy as np

# dtypes returns the data type of every column as a Series
types = fandango_films.dtypes
# print(types)
# keep only the float columns; the index attribute returns just the column names
float_columns = types[types.values == 'float64'].index
# use bracket notation to restrict the DataFrame to the float columns
float_df = fandango_films[float_columns]
# print(float_df)
# `x` is a Series object representing one column
deviations = float_df.apply(lambda x: np.std(x))

print(deviations)
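
As a cross-check on the column-wise `apply()` above, here is a small sketch on invented data. It uses `select_dtypes()` as an alternative way to pick the float columns (a swap-in, not the method used in this post) and compares NumPy's default population standard deviation with pandas' default sample standard deviation.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names and values are made up.
df = pd.DataFrame({
    "title": ["a", "b", "c"],
    "votes": [10, 20, 30],
    "x": [1.0, 2.0, 3.0],
    "y": [2.0, 2.0, 2.0],
})

# Alternative to filtering on dtypes by hand: keep only float64 columns.
float_df = df.select_dtypes(include="float64")

# Each column is handed to the lambda as a Series; converting it to an
# ndarray makes np.std use its own default of ddof=0 (population std).
pop_std = float_df.apply(lambda col: np.std(col.to_numpy()))

# DataFrame.std() defaults to ddof=1 (sample std), so the numbers differ.
sample_std = float_df.std()

print(pop_std)
print(sample_std)
```

The two results differ only in the divisor (n versus n - 1), which is worth keeping in mind when comparing output from NumPy and pandas.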

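# axis=1 applies the function to each row instead of each column,
# giving one standard deviation per film across the two user scores
rt_mt_user = float_df[['RT_user_norm', 'Metacritic_user_nom']]
rt_mt_user.apply(lambda x: np.std(x), axis=1)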
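
A row-wise `apply()` can be sketched on a tiny invented frame as follows. The column names `user_a` and `user_b` and all values are hypothetical stand-ins for the two normalized user-score columns.

```python
import numpy as np
import pandas as pd

# Hypothetical scores for two films from two sources; values are made up.
scores = pd.DataFrame(
    {"user_a": [4.0, 3.0], "user_b": [2.0, 3.0]},
    index=["Film A (2015)", "Film B (2015)"],
)

# axis=1: the lambda receives one ROW at a time as a Series.
row_std = scores.apply(lambda row: np.std(row.to_numpy()), axis=1)

# The same numbers without a Python-level loop over rows.
vectorized = scores.std(axis=1, ddof=0)

print(row_std)
print(vectorized)
```

The result is a Series indexed by film, so a large value flags a film on which the two sources disagree. On bigger frames the built-in `std(axis=1, ddof=0)` is usually the faster choice.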
