pandas.read_excel Explained

阿新 • Published: 2019-02-10

# coding: utf-8
import pandas as pd

filefullpath = r"/home/geeklee/temp/all_gov_file/pol_gov_mon/downloads/1.xls"
# filefullpath = r"/home/geeklee/temp/all_gov_file/pol_gov_mon/downloads/26368f3a-ea03-46b9-8033-73615ed07816.xls"

# skiprows=[0] skips row 0 (the first row of the sheet), so the second row
# supplies the column labels. Omit skiprows to use the first row as the header.
df = pd.read_excel(filefullpath, skiprows=[0])

# sheetname selects which sheets to read; positions are counted from 0.
# sheetname=[0, 2] reads the sheets at positions 0 and 2; sheetname=None reads every sheet.
# (pandas 0.21 and later spell this keyword sheet_name.)
# df = pd.read_excel(filefullpath, sheetname=[0, 2], skiprows=[0])
# df = pd.read_excel(filefullpath, sheetname=None, skiprows=[0])

print(df)
print(type(df))
# When sheetname is a list or None, type(df) is dict, keyed by sheet,
# e.g. {0: DataFrame, 2: DataFrame} for sheetname=[0, 2].
# When a single sheet is read, type(df) is pandas.core.frame.DataFrame.

pandas.read_excel(io, sheetname=0, header=0, skiprows=None, skip_footer=0, index_col=None, names=None, parse_cols=None, parse_dates=False, date_parser=None, na_values=None, thousands=None, convert_float=True, has_index_names=None, converters=None, engine=None, squeeze=False, **kwds)

Read an Excel table into a pandas DataFrame

Parameters:

io : string, path object (pathlib.Path or py._path.local.LocalPath), file-like object, pandas ExcelFile, or xlrd workbook

    The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. For instance, a local file could be file://localhost/path/to/workbook.xlsx

sheetname : string, int, mixed list of strings/ints, or None, default 0

    Strings are used for sheet names, Integers are used in zero-indexed sheet positions.

    Lists of strings/integers are used to request multiple sheets.

    Specify None to get all sheets.

    str|int -> DataFrame is returned. list|None -> Dict of DataFrames is returned, with keys representing sheets.

    Available Cases

        Defaults to 0 -> 1st sheet as a DataFrame
        1 -> 2nd sheet as a DataFrame
        "Sheet1" -> 1st sheet as a DataFrame
        [0, 1, "Sheet5"] -> 1st, 2nd & 5th sheet as a dictionary of DataFrames
        None -> All sheets as a dictionary of DataFrames
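When a list or None is passed, the dictionary that comes back often needs to be stacked into a single table. The following is a minimal sketch of one way to do that, not part of the pandas API: `load_sheets` and `combine_sheets` are hypothetical helper names, the sample frames stand in for what `read_excel` would return, and the call uses the `sheet_name` spelling found in pandas 0.21 and later rather than the `sheetname` spelling in the signature above.

```python
import pandas as pd


def load_sheets(path, sheets=None):
    """Read several sheets at once (hypothetical helper).

    `sheets` may be a list of names/positions, or None for every sheet.
    The result is a dict of DataFrames keyed by sheet.
    """
    return pd.read_excel(path, sheet_name=sheets)


def combine_sheets(frames):
    """Stack a {sheet_key: DataFrame} dict into one frame with a 'sheet' column."""
    combined = pd.concat(frames, names=["sheet", "row"])
    return combined.reset_index(level="sheet").reset_index(drop=True)


# In-memory stand-ins for the dict that read_excel returns for a list or None.
frames = {
    "Sheet1": pd.DataFrame({"name": ["a", "b"], "count": [1, 2]}),
    "Sheet5": pd.DataFrame({"name": ["c"], "count": [3]}),
}
combined = combine_sheets(frames)
print(combined)
```

Keeping the sheet key as an ordinary column makes it easy to filter or group by the originating sheet afterwards.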

header : int, list of ints, default 0

    Row (0-indexed) to use for the column labels of the parsed DataFrame. If a list of integers is passed those row positions will be combined into a MultiIndex

skiprows : list-like

    Rows to skip at the beginning (0-indexed)

skip_footer : int, default 0

    Rows at the end to skip (0-indexed)

index_col : int, list of ints, default None

    Column (0-indexed) to use as the row labels of the DataFrame. Pass None if there is no such column. If a list is passed, those columns will be combined into a MultiIndex

names : array-like, default None

    List of column names to use. If file contains no header row, then you should explicitly pass header=None

converters : dict, default None

    Dict of functions for converting values in certain columns. Keys can either be integers or column labels, values are functions that take one input argument, the Excel cell content, and return the transformed content.
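As an illustration of the converters argument, the sketch below defines one converter that keeps an identifier column as zero-padded text. The column label `code`, the padding width of 6, and the helper names are assumptions made for the example; only the converter itself is exercised here, since it is an ordinary function of one cell value.

```python
import pandas as pd


def clean_code(cell):
    """Return the cell as zero-padded text, whether Excel stored it as a number or as text."""
    if isinstance(cell, float) and cell.is_integer():
        cell = int(cell)  # Excel hands back numeric cells as floats, e.g. 1234.0
    return str(cell).strip().zfill(6)


def read_with_converters(path):
    # "code" is a hypothetical column label; keys may also be integer positions.
    return pd.read_excel(path, converters={"code": clean_code})


print(clean_code(1234.0))
print(clean_code(" 001234 "))
```

Because the converter receives the raw cell content, it has to cope with both numeric and text cells in the same column.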

parse_cols : int or list, default None

        If None then parse all columns,
        If int then indicates last column to be parsed
        If list of ints then indicates list of column numbers to be parsed
        If string then indicates comma separated list of column names and column ranges (e.g. "A:E" or "A,C,E:F")
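To make the string form concrete, here is a small sketch that expands a column spec into zero-based positions. These helpers are illustrative only and are not pandas internals; they assume ranges include both endpoints, and `read_selected_columns` uses `usecols`, the name current pandas gives to the parameter that the signature above calls parse_cols.

```python
import pandas as pd


def column_letter_to_index(letters):
    """Convert an Excel column letter such as 'A' or 'AA' to a zero-based index."""
    index = 0
    for char in letters.strip().upper():
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index - 1


def expand_column_spec(spec):
    """Expand a spec like 'A,C,E:F' into a list of zero-based column positions."""
    indices = []
    for part in spec.split(","):
        if ":" in part:
            first, last = part.split(":")
            start = column_letter_to_index(first)
            stop = column_letter_to_index(last)
            indices.extend(range(start, stop + 1))  # both endpoints included
        else:
            indices.append(column_letter_to_index(part))
    return indices


def read_selected_columns(path, spec):
    # Hypothetical wrapper: passes the expanded positions as a list of ints.
    return pd.read_excel(path, usecols=expand_column_spec(spec))


print(expand_column_spec("A:E"))
print(expand_column_spec("A,C,E:F"))
```

In practice the string can be passed to read_excel directly; the expansion is shown only to spell out which columns a given spec selects.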

squeeze : boolean, default False

    If the parsed data only contains one column then return a Series

na_values : list-like, default None

    List of additional strings to recognize as NA/NaN

thousands : str, default None

    Thousands separator for parsing string columns to numeric. Note that this parameter is only necessary for columns stored as TEXT in Excel, any numeric columns will automatically be parsed, regardless of display format.

keep_default_na : bool, default True

    If na_values are specified and keep_default_na is False the default NaN values are overridden, otherwise they’re appended to

verbose : boolean, default False

    Indicate number of NA values placed in non-numeric columns

engine: string, default None

    If io is not a buffer or path, this must be set to identify io. Acceptable values are None or xlrd

convert_float : boolean, default True

    Convert integral floats to int (i.e., 1.0 -> 1). If False, all numeric data will be read in as floats: Excel stores all numbers as floats internally

has_index_names : boolean, default None

    DEPRECATED: for version 0.17+ index names will be automatically inferred based on index_col. To read Excel output from 0.16.2 and prior that had saved index names, use True.

Returns:

parsed : DataFrame or Dict of DataFrames

    DataFrame from the passed in Excel file. See notes in sheetname argument for more information on when a Dict of Dataframes is returned.
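Since the return type depends on how the sheet was requested, calling code that accepts either form can normalize it first. The sketch below shows one possible approach; `as_sheet_dict` is a hypothetical helper, and the frames are in-memory stand-ins for the results of read_excel.

```python
import pandas as pd


def as_sheet_dict(result, key=0):
    """Wrap a lone DataFrame in a dict so callers can always iterate over sheets."""
    if isinstance(result, pd.DataFrame):
        return {key: result}
    return dict(result)


# Stand-ins: one frame for the single-sheet case, a dict for the multi-sheet case.
single = pd.DataFrame({"a": [1, 2]})
several = {0: single, 2: pd.DataFrame({"a": [3]})}

for sheet, frame in as_sheet_dict(several).items():
    print(sheet, frame.shape)
```

With this in place, downstream code needs only one loop over sheets regardless of which form was requested.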
