Reading Excel data in Java with POI (the Excel file may be very large, so convert it to CSV first and then read it)

阿新 • Published: 2019-01-07

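———————————— Configuration ————————————

Add to jdbc.properties: excelUrl=/……directory containing the xlsx files/ (excelUrl + "xxxx.xlsx" is the full path)

Import the 6 jars under poi-3.16, the 5 jars under poi-3.16/lib, and the 2 jars under poi-3.16/ooxml-lib

Add Excel_reader.java and XLSX2CSV.java to the project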
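The excelUrl setting is only a directory prefix that gets concatenated with the file name. A minimal sketch of that lookup follows; the class ExcelPathDemo, its methods, and the directory /data/excel/ are hypothetical, and an in-memory string stands in for the real jdbc.properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class ExcelPathDemo {
    // Hypothetical helper: parses properties text such as the contents of jdbc.properties.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Hypothetical helper: joins the configured directory and the file name.
    static String resolve(Properties props, String fileName) {
        String dir = props.getProperty("excelUrl");
        if (dir == null) {
            throw new IllegalStateException("excelUrl is not set");
        }
        // Tolerate a missing trailing slash (the article assumes it is present).
        return dir.endsWith("/") ? dir + fileName : dir + "/" + fileName;
    }

    public static void main(String[] args) {
        Properties props = load("excelUrl=/data/excel/\n");
        System.out.println(resolve(props, "xxxx.xlsx"));
    }
}
```

The trailing-slash check is an addition of this sketch; the reader in this article concatenates the two strings directly, so the configured value must end with a separator.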

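———————————— Method ————————————

In the Excel_reader class:

xlsx_reader(String excel_name , ArrayList<Object> args)

// excel_name is the name of the xlsx file to read (including the extension); args is the list of column numbers to fetch

// Returns a two-dimensional ArrayList<ArrayList<String>>: the first dimension is the rows of the xlsx, the second is the cells of that row

// Empty cells are returned as null; convert them to --- or 0 yourself as needed

// Each element of args may be an int or a String. If args[i] is an int, column i of the returned array is column args[i] of the xlsx (counted from 0)

// If args[i] is a String, column i of the returned array is that string constant

// For example: xlsx_reader("崇明縣-表15：“夏淡”綠葉菜種植補貼-2014.xlsx", args)

// where args = [7, 8, 9, "綠肥"]

// The returned two-dimensional array then looks like this:

[小明 , 350401219948383**** , null , 綠肥]

[小紅 , 645354354354323**** , null , 綠肥]

[小蘭 , 445353453425643**** , null , 綠肥]

......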
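The column selection described above can be tried without any Excel file. The following is a sketch under stated assumptions: ColumnSelectDemo and select are hypothetical names, the input rows are hand-written instead of coming from POI, and an out-of-range index yields null, which is a choice of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ColumnSelectDemo {
    // Builds one output row per input row: an Integer entry picks the column
    // with that 0-based index, a String entry is copied as a constant.
    static List<List<String>> select(List<List<String>> rows, List<Object> args) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> row : rows) {
            List<String> picked = new ArrayList<>();
            for (Object arg : args) {
                if (arg instanceof String) {
                    picked.add((String) arg);
                } else if (arg instanceof Integer) {
                    int col = (Integer) arg;
                    // Out-of-range columns become null (an assumption of this sketch).
                    picked.add(col >= 0 && col < row.size() ? row.get(col) : null);
                } else {
                    throw new IllegalArgumentException("args may only contain Integer or String");
                }
            }
            result.add(picked);
        }
        return result;
    }

    public static void main(String[] argv) {
        List<List<String>> rows = new ArrayList<>();
        rows.add(Arrays.asList("id-1", "小明", null));
        List<Object> args = new ArrayList<>(Arrays.<Object>asList(1, 2, "綠肥"));
        System.out.println(select(rows, args));
    }
}
```

Here the list [1, 2, "綠肥"] plays the role of args: the first two output columns come from the sheet and the third is the constant.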

XLSX2CSV.java

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;

import javax.xml.parsers.ParserConfigurationException;

import org.apache.poi.openxml4j.exceptions.OpenXML4JException;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.util.CellAddress;
import org.apache.poi.ss.util.CellReference;
import org.apache.poi.util.SAXHelper;
import org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler;
import org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.SheetContentsHandler;
import org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor;
import org.apache.poi.xssf.model.StylesTable;
import org.apache.poi.xssf.usermodel.XSSFComment;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;  
  
/** 
 * A rudimentary XLSX -> CSV processor modeled on the 
 * POI sample program XLS2CSVmra from the package 
 * org.apache.poi.hssf.eventusermodel.examples. 
 * As with the HSSF version, this tries to spot missing 
 * rows and cells, and output empty entries for them. 
 * <p/> 
 * Data sheets are read using a SAX parser to keep the 
 * memory footprint relatively small, so this should be 
 * able to read enormous workbooks.  The styles table and 
 * the shared-string table must be kept in memory.  The 
 * standard POI styles table class is used, but a custom 
 * (read-only) class is used for the shared string table 
 * because the standard POI SharedStringsTable grows very 
 * quickly with the number of unique strings. 
 * <p/> 
 * For a more advanced implementation of SAX event parsing 
 * of XLSX files, see {@link XSSFEventBasedExcelExtractor} 
 * and {@link XSSFSheetXMLHandler}. Note that for many cases, 
 * it may be possible to simply use those with a custom 
 * {@link SheetContentsHandler} and no SAX code needed of 
 * your own! 
 */  
public class XLSX2CSV {  
    /** 
     * Uses the XSSF Event SAX helpers to do most of the work 
     * of parsing the Sheet XML, and outputs the contents 
     * as a (basic) CSV. 
     */  
    private class SheetToCSV implements SheetContentsHandler {
        private int currentRow = -1;
        private int currentCol = -1;

        // Adds one all-null row for every row number skipped in the sheet XML
        private void outputMissingRows(int number) {
            for (int i = 0; i < number; i++) {
                ArrayList<String> emptyRow = new ArrayList<String>();
                for (int j = 0; j < minColumns; j++) {
                    emptyRow.add(null);
                }
                output.add(emptyRow);
            }
        }

        @Override
        public void startRow(int rowNum) {
            // If there were gaps, output the missing rows
            outputMissingRows(rowNum - currentRow - 1);
            // Prepare for this row; the list is created after the gap rows
            // so the current row never shares a list with a filler row
            curstr = new ArrayList<String>();
            currentRow = rowNum;
            currentCol = -1;
        }

        @Override
        public void endRow(int rowNum) {
            // Pad with nulls until the row has at least minColumns cells
            for (int i = currentCol + 1; i < minColumns; i++) {
                curstr.add(null);
            }
            output.add(curstr);
        }

        @Override
        public void cell(String cellReference, String formattedValue,
                         XSSFComment comment) {
            // gracefully handle missing CellRef here in a similar way as XSSFCell does:
            // treat the cell as the one right after the previous cell
            if (cellReference == null) {
                cellReference = new CellAddress(currentRow, currentCol + 1).formatAsString();
            }

            // Did we miss any cells? Fill the gap with nulls
            int thisCol = (new CellReference(cellReference)).getCol();
            int missedCols = thisCol - currentCol - 1;
            for (int i = 0; i < missedCols; i++) {
                curstr.add(null);
            }
            currentCol = thisCol;

            // Numbers and strings are both stored as their formatted text
            curstr.add(formattedValue);
        }

        @Override
        public void headerFooter(String text, boolean isHeader, String tagName) {
            // Skip, no headers or footers in CSV
        }
    }
  
  
    ///////////////////////////////////////  
  
    private final OPCPackage xlsxPackage;  
  
    /** 
     * Number of columns to read starting with leftmost 
     */  
    private final int minColumns;  
  
    /**
     * Destination for data: one list per row of the sheet processed last
     */
    private ArrayList<ArrayList<String>> output;
    // Row currently being filled by the sheet handler
    private ArrayList<String> curstr;

    public ArrayList<ArrayList<String>> get_output() {
        return output;
    }

    /**
     * Creates a new XLSX -> CSV converter
     *
     * @param pkg        The XLSX package to process
     * @param minColumns The minimum number of columns per output row, or -1 for no minimum
     */
    public XLSX2CSV(OPCPackage pkg, int minColumns) {  
        this.xlsxPackage = pkg;  
        this.minColumns = minColumns;  
    }  
    
  
    /** 
     * Parses and shows the content of one sheet 
     * using the specified styles and shared-strings tables. 
     * 
     * @param styles 
     * @param strings 
     * @param sheetInputStream 
     */  
    public void processSheet(  
            StylesTable styles,  
            ReadOnlySharedStringsTable strings,  
            SheetContentsHandler sheetHandler,  
            InputStream sheetInputStream)  
            throws IOException, ParserConfigurationException, SAXException {  
        DataFormatter formatter = new DataFormatter();  
        InputSource sheetSource = new InputSource(sheetInputStream);  
        try {  
            XMLReader sheetParser = SAXHelper.newXMLReader();  
            ContentHandler handler = new XSSFSheetXMLHandler(  
                    styles, null, strings, sheetHandler, formatter, false);  
            sheetParser.setContentHandler(handler);  
            sheetParser.parse(sheetSource);  
        } catch (ParserConfigurationException e) {  
            throw new RuntimeException("SAX parser appears to be broken - " + e.getMessage());  
        }  
    }  
  
    /**
     * Initiates the processing of the XLSX workbook file.
     * The output list is reset for every sheet, so after this call
     * get_output() holds the rows of the last sheet only.
     *
     * @throws IOException
     * @throws OpenXML4JException
     * @throws ParserConfigurationException
     * @throws SAXException
     */
    public void process()
            throws IOException, OpenXML4JException, ParserConfigurationException, SAXException {
        ReadOnlySharedStringsTable strings = new ReadOnlySharedStringsTable(this.xlsxPackage);
        XSSFReader xssfReader = new XSSFReader(this.xlsxPackage);
        StylesTable styles = xssfReader.getStylesTable();
        XSSFReader.SheetIterator iter = (XSSFReader.SheetIterator) xssfReader.getSheetsData();
        int index = 0;
        while (iter.hasNext()) {
            output = new ArrayList<ArrayList<String>>();
            // try-with-resources closes the sheet stream even if parsing fails
            try (InputStream stream = iter.next()) {
                String sheetName = iter.getSheetName();
                System.out.println("Reading sheet: " + sheetName + " [index=" + index + "]:");
                processSheet(styles, strings, new SheetToCSV(), stream);
                System.out.println("Sheet read complete!");
            }
            ++index;
        }
    }
  
    
//    public static void main(String[] args) throws Exception {  
//      /*  if (args.length < 1) { 
//            System.err.println("Use:"); 
//            System.err.println("  XLSX2CSV <xlsx file> [min columns]"); 
//            return; 
//        }*/  
//  
//        File xlsxFile = new File("F:\\8月資料.xlsx");  
//        if (!xlsxFile.exists()) {  
//            System.err.println("Not found or not a file: " + xlsxFile.getPath());  
//            return;  
//        }  
//  
//        int minColumns = -1;  
//        if (args.length >= 2)  
//            minColumns = Integer.parseInt(args[1]);  
//  
//        // The package open is instantaneous, as it should be.  
//        OPCPackage p = OPCPackage.open(xlsxFile.getPath(), PackageAccess.READ);  
//        XLSX2CSV xlsx2csv = new XLSX2CSV(p, minColumns);
//        xlsx2csv.process();  
//        p.close();  
//    }  
}
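The gap handling in cell() depends on turning a reference such as C5 into a 0-based column index, which POI's CellReference does in the listing above. The sketch below reimplements only that conversion and the padding step in plain Java so that it runs without POI; GapFillDemo and its methods are hypothetical and not part of POI, and absolute references with $ are not handled.

```java
import java.util.ArrayList;
import java.util.List;

class GapFillDemo {
    // Converts the letters of a reference like "C5" or "AA10" to a 0-based column index.
    static int columnIndex(String cellReference) {
        int col = 0;
        for (int i = 0; i < cellReference.length(); i++) {
            char ch = Character.toUpperCase(cellReference.charAt(i));
            if (ch < 'A' || ch > 'Z') {
                break; // the row digits start here
            }
            col = col * 26 + (ch - 'A' + 1);
        }
        return col - 1;
    }

    // Appends value at the column named by cellReference, padding skipped columns with null.
    static void addCell(List<String> row, String cellReference, String value) {
        int target = columnIndex(cellReference);
        while (row.size() < target) {
            row.add(null);
        }
        row.add(value);
    }

    public static void main(String[] args) {
        List<String> row = new ArrayList<>();
        addCell(row, "A1", "x");
        addCell(row, "D1", "y"); // columns B and C were skipped in the sheet XML
        System.out.println(row); // prints [x, null, null, y]
    }
}
```

Padding with null instead of an empty string keeps "no cell in the file" distinguishable from "cell containing an empty string", which matches what the reader returns.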

Excel_reader.java

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Properties;

import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackageAccess;

public class Excel_reader {

	// ************* xlsx file reading function *************
	// Add excelUrl (the directory of the xlsx files) to jdbc.properties.
	// excel_name is the file name; args lists the columns to fetch (an Integer returns
	// the column with that 0-based index, a String is always returned as-is).
	// Returns one list per row, or null if the file is missing or args is invalid.
	public static ArrayList<ArrayList<String>> xlsx_reader(String excel_name, ArrayList<Object> args)
			throws IOException {
		// Read the Excel directory from jdbc.properties on the classpath
		Properties properties = new Properties();
		try (InputStream inStream = Excel_reader.class.getClassLoader().getResourceAsStream("jdbc.properties")) {
			if (inStream == null) {
				System.err.println("jdbc.properties not found on the classpath");
				return null;
			}
			properties.load(inStream);
		}
		String excelUrl = properties.getProperty("excelUrl");

		File xlsxFile = new File(excelUrl + excel_name);
		if (!xlsxFile.exists()) {
			System.err.println("Not found or not a file: " + xlsxFile.getPath());
			return null;
		}
		ArrayList<ArrayList<String>> excel_output = null;
		// try-with-resources releases the package even if parsing fails
		try (OPCPackage p = OPCPackage.open(xlsxFile.getPath(), PackageAccess.READ)) {
			XLSX2CSV xlsx2csv = new XLSX2CSV(p, 20); // every row is padded with null to at least 20 columns
			xlsx2csv.process();
			excel_output = xlsx2csv.get_output();
		} catch (Exception e) {
			e.printStackTrace();
		}
		if (excel_output == null) {
			// Parsing failed or the workbook contains no sheets
			excel_output = new ArrayList<ArrayList<String>>();
		}

		System.out.println(excel_name + " has been read");

		ArrayList<ArrayList<String>> ans = new ArrayList<ArrayList<String>>();
		// For every row that was read, build the requested columns
		for (int rowNum = 0; rowNum < excel_output.size(); rowNum++) {
			ArrayList<String> cur_output = excel_output.get(rowNum);
			ArrayList<String> curarr = new ArrayList<String>();
			for (int columnNum = 0; columnNum < args.size(); columnNum++) {
				Object obj = args.get(columnNum);
				if (obj instanceof String) {
					curarr.add(obj.toString());
				} else if (obj instanceof Integer) {
					int col = (Integer) obj;
					// Columns beyond the end of the row are treated as empty cells
					curarr.add(col >= 0 && col < cur_output.size() ? cur_output.get(col) : null);
				} else {
					System.err.println("Type error: args may only contain Integer or String");
					return null;
				}
			}
			ans.add(curarr);
		}

		return ans;
	}

//	// Returns the value of a cell from an .xlsx file as a String, depending on its data type
//	@SuppressWarnings("deprecation")
//	private static String getValue(XSSFCell xssfRow) {
//		if (xssfRow == null) {
//			return null;
//		}
//		if (xssfRow.getCellType() == xssfRow.CELL_TYPE_BOOLEAN) {
//			return String.valueOf(xssfRow.getBooleanCellValue());
//		} else if (xssfRow.getCellType() == xssfRow.CELL_TYPE_NUMERIC) {
//			double cur = xssfRow.getNumericCellValue();
//			long longVal = Math.round(cur);
//			Object inputValue = null;
//			if (Double.parseDouble(longVal + ".0") == cur)
//				inputValue = longVal;
//			else
//				inputValue = cur;
//			return String.valueOf(inputValue);
//		} else if (xssfRow.getCellType() == xssfRow.CELL_TYPE_BLANK
//				|| xssfRow.getCellType() == xssfRow.CELL_TYPE_ERROR) {
//			return "";
//		} else {
//			return String.valueOf(xssfRow.getStringCellValue());
//		}
//	}

}
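Because xlsx_reader returns null for empty cells, callers usually post-process the result before using it. A minimal sketch of that step, assuming a hypothetical helper NullFillDemo.fill that swaps every null for a placeholder such as --- or 0:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NullFillDemo {
    // Returns a copy of rows in which every null cell is replaced by placeholder.
    static List<List<String>> fill(List<? extends List<String>> rows, String placeholder) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> row : rows) {
            List<String> copy = new ArrayList<>(row.size());
            for (String cell : row) {
                copy.add(cell == null ? placeholder : cell);
            }
            result.add(copy);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> rows = new ArrayList<>();
        rows.add(Arrays.asList("小明", null, "綠肥"));
        System.out.println(fill(rows, "---"));
    }
}
```

With the real reader, the value returned by Excel_reader.xlsx_reader(excel_name, args) would be passed as rows, after checking that the reader itself did not return null for a missing file or an invalid args list.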
