easyexcel: solving the OOM that occurs when POI parses Excel

阿新 • Published: 2019-01-24

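I am an ordinary programmer at Alibaba. Early this year, through the company, I open-sourced a tool called easyexcel. It mainly addresses three problems with using POI to parse Excel files.

1. POI consumes a lot of memory (a large Excel file can need more than 1 GB), so the system easily runs into OOM.

2. POI is complicated to use and requires a lot of repetitive code.

3. Some POI bugs remain unfixed.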

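Because of these problems I wrote easyexcel, which is used fairly widely inside Alibaba. The internal version is built on POI's SAX mode and is already very stable, but it still consumes a fair amount of memory; POI SAX cannot fully solve the OOM problem. For the open-source version I rewrote the parsing of the 07 format myself, which lowers memory consumption further and solves the OOM problem at the root: in theory, no matter how large the Excel file is, memory consumption can be kept at the KB level.

The project has been open source for about four months, quite a few people use it, and they have filed a lot of bugs on git. I have written several frameworks, two of which are used fairly widely inside Alibaba, but I am a business developer: most of my time so far has gone into business work, and during the big promotions there was no time for upgrades. I worked through the backlog in one go this weekend, and the current stable version is 1.0.2. Thanks to everyone who offered suggestions, reported bugs, or proposed optimizations. From now on I will spend part of my weekends upgrading the project and open-sourcing other things I have written that I think are decent, and I hope for your continued support.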
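To illustrate why event-driven parsing keeps memory low, here is a minimal sketch that uses only the JDK's SAX parser. It is not easyexcel's or POI's actual implementation: the XML is a simplified stand-in for the sheet XML inside an .xlsx file (real sheet XML also involves shared strings and namespaces), and the names `StreamingRowDemo` and `readRows` are hypothetical. Only the row currently being read is held in memory before it is handed to the callback.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class StreamingRowDemo {

    /** Streams rows one at a time; only the current row is buffered. */
    static void readRows(byte[] xml, Consumer<List<String>> rowCallback) {
        DefaultHandler handler = new DefaultHandler() {
            private List<String> currentRow;
            private StringBuilder text;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("row".equals(qName)) {
                    currentRow = new ArrayList<>();
                } else if ("c".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Collect text only while inside a cell element.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("c".equals(qName)) {
                    currentRow.add(text.toString());
                    text = null;
                } else if ("row".equals(qName)) {
                    // Hand the finished row over and drop the reference.
                    rowCallback.accept(currentRow);
                    currentRow = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml), handler);
        } catch (Exception e) {
            throw new IllegalStateException("failed to parse sheet XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<sheetData>"
            + "<row><c>id</c><c>name</c></row>"
            + "<row><c>1</c><c>alice</c></row>"
            + "</sheetData>";
        readRows(xml.getBytes(StandardCharsets.UTF_8), row -> System.out.println(row));
    }
}
```

A DOM-style model would instead build every row and cell object before the caller sees any of them, which is where the memory goes for large files.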

Before using it, check which version is the latest, or search the mvn repository for the latest easyexcel release.

<dependency>
	<groupId>com.alibaba</groupId>
	<artifactId>easyexcel</artifactId>
    <version>1.0.2</version>
</dependency>

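Reading Excel

Parsing 03 and 07 format Excel files with easyexcel differs only in the ExcelTypeEnum; everything else is used in exactly the same way, and users do not need to know how the underlying parsing differs.

Without a Java model: each parsed row is returned as a List<String>, and the results are obtained in ExcelListener

Sample code for reading Excel: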
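Since the caller still has to choose the ExcelTypeEnum, one option is to inspect the first bytes of the file: .xls files are OLE2 compound documents, while .xlsx files are ZIP containers. The sketch below is not part of easyexcel; `ExcelTypeSniffer` and `DetectedType` are hypothetical names, and mapping the result to ExcelTypeEnum.XLS or ExcelTypeEnum.XLSX is left to the caller. A ZIP signature only shows that the file is a ZIP container, not that it is really a spreadsheet.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Arrays;

class ExcelTypeSniffer {

    /** Hypothetical result type; map it to ExcelTypeEnum in real code. */
    enum DetectedType { XLS, XLSX, UNKNOWN }

    // OLE2 compound document signature (the container used by .xls).
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // ZIP local file header signature (the container used by .xlsx).
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    static DetectedType detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return DetectedType.XLS;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return DetectedType.XLSX;
        }
        return DetectedType.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, int[] magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads up to 8 bytes and rewinds, so the stream can still be parsed afterwards. */
    static byte[] peekHeader(BufferedInputStream in) throws IOException {
        in.mark(8);
        byte[] buf = new byte[8];
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        in.reset();
        return Arrays.copyOf(buf, total);
    }

    public static void main(String[] args) throws IOException {
        // Fake header bytes standing in for the start of an .xlsx file.
        byte[] fakeXlsx = {0x50, 0x4B, 0x03, 0x04, 0x14, 0x00, 0x06, 0x00};
        BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(fakeXlsx));
        System.out.println(detect(peekHeader(in)));
    }
}
```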

    @Test
    public void testExcel2003NoModel() {
        InputStream inputStream = getInputStream("loan1.xls");
        try {
            // Each parsed row is handled in the listener
            ExcelListener listener = new ExcelListener();

            ExcelReader excelReader = new ExcelReader(inputStream, ExcelTypeEnum.XLS, null, listener);
            excelReader.read();
        } catch (Exception e) {
            // Do not swallow parse errors silently
            e.printStackTrace();
        } finally {
            try {
                inputStream.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

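Sample ExcelListener code:

/**
 * Parse listener.
 * invoke() is called back for every parsed row.
 * doAfterAllAnalysed() runs once the whole Excel file has been parsed.
 *
 * This is only a sample I wrote; adapt the class to your own logic.
 * @author jipengfei
 * @date 2017/03/14
 */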
public class ExcelListener extends AnalysisEventListener {

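    // Custom list that temporarily holds the parsed rows.
    // It can be read from the listener instance.
    private List<Object> datas = new ArrayList<Object>();
    public void invoke(Object object, AnalysisContext context) {
        System.out.println("Current row: " + context.getCurrentRowNum());
        System.out.println(object);
        datas.add(object); // Keep the row in the list for batch processing or later business logic.
        doSomething(object); // Process the row according to your own business needs
    }
    private void doSomething(Object object) {
        // 1. Save to the database or call an interface
    }
    public void doAfterAllAnalysed(AnalysisContext context) {
       // datas.clear(); // Release resources that are no longer needed once parsing ends
    }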
    public List<Object> getDatas() {
        return datas;
    }
    public void setDatas(List<Object> datas) {
        this.datas = datas;
    }
}
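Note that the sample listener keeps every row in `datas`, so for very large files the list itself grows with the file. One way to keep the batch-processing idea while bounding memory is to flush every N rows. The sketch below is an assumption, not part of easyexcel: `BatchBuffer` is a hypothetical helper, and in a listener `invoke()` would call `add()` while `doAfterAllAnalysed()` would call `flush()` for the remainder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private List<T> buffer = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one row and flushes automatically once the batch is full. */
    void add(T row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Hands the buffered rows to the sink; does nothing if the buffer is empty. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = buffer;
        buffer = new ArrayList<>(); // start a fresh buffer so the old batch can be collected
        sink.accept(batch);
    }

    public static void main(String[] args) {
        BatchBuffer<String> rows = new BatchBuffer<>(2, batch -> System.out.println("flush " + batch));
        rows.add("r1");
        rows.add("r2"); // batch is full here, so the sink is called
        rows.add("r3");
        rows.flush();   // flush the remainder at end of parsing
    }
}
```

With this shape, at most `batchSize` rows are alive at any time, regardless of how many rows the file contains.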

With a Java model mapping

The Java model is written as follows:

public class LoanInfo extends BaseRowModel {
    @ExcelProperty(index = 0)
    private String bankLoanId;
    
    @ExcelProperty(index = 1)
    private Long customerId;
    
    @ExcelProperty(index = 2,format = "yyyy/MM/dd")
    private Date loanDate;
    
    @ExcelProperty(index = 3)
    private BigDecimal quota;
    
    @ExcelProperty(index = 4)
    private String bankInterestRate;
    
    @ExcelProperty(index = 5)
    private Integer loanTerm;
    
    @ExcelProperty(index = 6,format = "yyyy/MM/dd")
    private Date loanEndDate;
    
    @ExcelProperty(index = 7)
    private BigDecimal interestPerMonth;

    @ExcelProperty(value = {"Level-1 header", "Level-2 header"})
    private BigDecimal sax;
}

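In @ExcelProperty(index = 3), the number maps the field to the Excel column with that number. Alternatively, a form such as @ExcelProperty(value = {"Level-1 header", "Level-2 header"}) covers the case where you do not know exactly which column maps to the field because its position is not fixed, but you do know the header text.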
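The index-based mapping can be pictured with a small reflection sketch. This is not easyexcel's code: the `@Column` annotation is a hypothetical stand-in for @ExcelProperty, `Loan` is a cut-down model, and only a few field types are converted.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;

class IndexMappingDemo {

    /** Hypothetical stand-in for @ExcelProperty(index = ...). */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Column {
        int index();
    }

    static class Loan {
        @Column(index = 0) String bankLoanId;
        @Column(index = 1) Long customerId;
        @Column(index = 3) BigDecimal quota;
    }

    /** Copies cells into annotated fields, using the annotation's index as the column number. */
    static <T> T mapRow(List<String> cells, Class<T> type) {
        try {
            T target = type.getDeclaredConstructor().newInstance();
            for (Field field : type.getDeclaredFields()) {
                Column column = field.getAnnotation(Column.class);
                if (column == null || column.index() >= cells.size()) {
                    continue; // unannotated field, or the row is shorter than expected
                }
                field.setAccessible(true);
                field.set(target, convert(cells.get(column.index()), field.getType()));
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot map row to " + type.getName(), e);
        }
    }

    private static Object convert(String raw, Class<?> type) {
        if (type == String.class) return raw;
        if (type == Long.class) return Long.valueOf(raw);
        if (type == Integer.class) return Integer.valueOf(raw);
        if (type == BigDecimal.class) return new BigDecimal(raw);
        throw new IllegalArgumentException("unsupported field type: " + type);
    }

    public static void main(String[] args) {
        Loan loan = mapRow(Arrays.asList("L001", "42", "2019/01/24", "1000.50"), Loan.class);
        System.out.println(loan.bankLoanId + " " + loan.customerId + " " + loan.quota);
    }
}
```

Mapping by header text would work the same way, except that the column number is first looked up from the header row instead of being read from the annotation.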

    @Test
    public void testExcel2003WithReflectModel() {
        InputStream inputStream = getInputStream("loan1.xls");
        try {
            // Each parsed row is handled in the listener
            AnalysisEventListener listener = new ExcelListener();

            ExcelReader excelReader = new ExcelReader(inputStream, ExcelTypeEnum.XLS, null, listener);

            excelReader.read(new Sheet(1, 2, LoanInfo.class));
        } catch (Exception e) {
            // Do not swallow parse errors silently
            e.printStackTrace();
        } finally {
            try {
                inputStream.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

    }

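The main difference between parsing with and without a model is that new Sheet(1, 2, LoanInfo.class) is constructed with a class. The class must extend BaseRowModel. For now BaseRowModel is empty; later releases may add some default data to it.

Writing Excel

Each row is a List<String>, no header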

  OutputStream out = new FileOutputStream("/Users/jipengfei/77.xlsx");
        try {
            ExcelWriter writer = new ExcelWriter(out, ExcelTypeEnum.XLSX,false);
            // Write the first sheet, sheet1: every row is a List<String>, no model mapping
            Sheet sheet1 = new Sheet(1, 0);
            sheet1.setSheetName("First sheet");
            writer.write(getListString(), sheet1);
            writer.finish();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                out.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

Each row is a Java model with a header: single-level header

[Figure: generated Excel format]

The model is written as follows:

public class ExcelPropertyIndexModel extends BaseRowModel {

    @ExcelProperty(value = "Name", index = 0)
    private String name;

    @ExcelProperty(value = "Age", index = 1)
    private String age;

    @ExcelProperty(value = "Email", index = 2)
    private String email;

    @ExcelProperty(value = "Address", index = 3)
    private String address;

    @ExcelProperty(value = "Gender", index = 4)
    private String sax;

    @ExcelProperty(value = "Height", index = 5)
    private String heigh;

    @ExcelProperty(value = "Remark", index = 6)
    private String last;
}

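In an annotation such as @ExcelProperty(value = "Name", index = 0), value is the header text, which by default is written to the header position of the Excel file, and index is the column number.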

 @Test
    public void test1() throws FileNotFoundException {
        OutputStream out = new FileOutputStream("/Users/jipengfei/78.xlsx");
        try {
            ExcelWriter writer = new ExcelWriter(out, ExcelTypeEnum.XLSX);
            // Write the first sheet, sheet1: every row is an ExcelPropertyIndexModel, and the header comes from its annotations
            Sheet sheet1 = new Sheet(1, 0,ExcelPropertyIndexModel.class);
            writer.write(getData(), sheet1);
            writer.finish();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                out.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

Each row is a Java model with a header: multi-level header

[Figure: generated Excel format]

The Java model is written as follows:

public class MultiLineHeadExcelModel extends BaseRowModel {

    @ExcelProperty(value = {"Header 1", "Header 1", "Header 31"}, index = 0)
    private String p1;

    @ExcelProperty(value = {"Header 1", "Header 1", "Header 32"}, index = 1)
    private String p2;

    @ExcelProperty(value = {"Header 3", "Header 3", "Header 3"}, index = 2)
    private int p3;

    @ExcelProperty(value = {"Header 4", "Header 4", "Header 4"}, index = 3)
    private long p4;

    @ExcelProperty(value = {"Header 5", "Header 51", "Header 52"}, index = 4)
    private String p5;

    @ExcelProperty(value = {"Header 6", "Header 61", "Header 611"}, index = 5)
    private String p6;

    @ExcelProperty(value = {"Header 6", "Header 61", "Header 612"}, index = 6)
    private String p7;

    @ExcelProperty(value = {"Header 6", "Header 62", "Header 621"}, index = 7)
    private String p8;

    @ExcelProperty(value = {"Header 6", "Header 62", "Header 622"}, index = 8)
    private String p9;
}

Writing the Excel file works the same way as above; just change ExcelPropertyIndexModel.class to MultiLineHeadExcelModel.class.

Writing multiple sheets in one Excel file
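In the multi-level model, each value array lists one column's header cells from top to bottom, so a header row is obtained by reading the same position from every column. The following sketch shows that transposition with made-up header texts. It is an illustration only, not easyexcel's implementation: `HeaderRowsDemo` is a hypothetical name, and repeating a short column's last header downwards is an assumption of this sketch. Equal neighbouring texts are presumably the cells that end up merged in the output; the merging itself is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeaderRowsDemo {

    /** Turns per-column header lists into header rows (top row first). */
    static List<List<String>> toHeaderRows(List<List<String>> columns) {
        int depth = 0;
        for (List<String> column : columns) {
            depth = Math.max(depth, column.size());
        }
        List<List<String>> rows = new ArrayList<>();
        for (int level = 0; level < depth; level++) {
            List<String> row = new ArrayList<>();
            for (List<String> column : columns) {
                // Assumption: a column with fewer levels repeats its last header text downwards.
                row.add(column.isEmpty() ? "" : column.get(Math.min(level, column.size() - 1)));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<String>> columns = Arrays.asList(
            Arrays.asList("Person", "Name"),
            Arrays.asList("Person", "Age"),
            Arrays.asList("City"));
        for (List<String> row : toHeaderRows(columns)) {
            System.out.println(row);
        }
    }
}
```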

 @Test
    public void test1() throws FileNotFoundException {

        OutputStream out = new FileOutputStream("/Users/jipengfei/77.xlsx");
        try {
            ExcelWriter writer = new ExcelWriter(out, ExcelTypeEnum.XLSX,false);
            // Write the first sheet, sheet1: every row is a List<String>, no model mapping
            Sheet sheet1 = new Sheet(1, 0);
            sheet1.setSheetName("First sheet");
            writer.write(getListString(), sheet1);

            // Write the second sheet, sheet2: the model carries header annotations, cells are merged
            Sheet sheet2 = new Sheet(2, 3, MultiLineHeadExcelModel.class, "Second sheet", null);
            sheet2.setTableStyle(getTableStyle1());
            writer.write(getModeldatas(), sheet2);

            // Write sheet3: the model has no annotations, the header is passed in dynamically
            List<List<String>> head = new ArrayList<List<String>>();
            List<String> headColumn1 = new ArrayList<String>();
            List<String> headColumn2 = new ArrayList<String>();
            List<String> headColumn3 = new ArrayList<String>();
            headColumn1.add("Column 1");
            headColumn2.add("Column 2");
            headColumn3.add("Column 3");
            head.add(headColumn1);
            head.add(headColumn2);
            head.add(headColumn3);
            Sheet sheet3 = new Sheet(3, 1, NoAnnModel.class, "Third sheet", head);
            writer.write(getNoAnnModels(), sheet3);
            writer.finish();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                out.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
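The dynamic header passed to the third sheet is a List<List<String>> with one inner list per column. Building it by hand is verbose, so a small helper can do it. `HeadBuilder` and `singleLevelHead` are hypothetical names and not part of easyexcel; the sketch only covers single-level headers.

```java
import java.util.ArrayList;
import java.util.List;

class HeadBuilder {

    /** Builds a single-level header: one inner list per column, each holding one title. */
    static List<List<String>> singleLevelHead(String... titles) {
        List<List<String>> head = new ArrayList<>();
        for (String title : titles) {
            List<String> column = new ArrayList<>();
            column.add(title);
            head.add(column);
        }
        return head;
    }

    public static void main(String[] args) {
        System.out.println(singleLevelHead("Column 1", "Column 2", "Column 3"));
    }
}
```

The returned list has the same shape as the `head` variable built step by step in the example above.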

Multiple tables in one sheet

@Test
    public void test2() throws FileNotFoundException {
        OutputStream out = new FileOutputStream("/Users/jipengfei/77.xlsx");
        try {
            ExcelWriter writer = new ExcelWriter(out, ExcelTypeEnum.XLSX,false);

            // Write table1 in sheet1: every row is a List<String>, no model mapping
            Sheet sheet1 = new Sheet(1, 0);
            sheet1.setSheetName("First sheet");
            Table table1 = new Table(1);
            writer.write(getListString(), sheet1, table1);
            writer.write(getListString(), sheet1, table1);

            // Write table2 in sheet1: the model carries header annotations
            Table table2 = new Table(2);
            table2.setTableStyle(getTableStyle1());
            table2.setClazz(MultiLineHeadExcelModel.class);
            writer.write(getModeldatas(), sheet1, table2);

            // Write table3 in sheet1: the model has no annotations and the header is passed in dynamically;
            // in this case the order of the model's fields matches the column order shown in Excel
            List<List<String>> head = new ArrayList<List<String>>();
            List<String> headColumn1 = new ArrayList<String>();
            List<String> headColumn2 = new ArrayList<String>();
            List<String> headColumn3 = new ArrayList<String>();
            headColumn1.add("Column 1");
            headColumn2.add("Column 2");
            headColumn3.add("Column 3");
            head.add(headColumn1);
            head.add(headColumn2);
            head.add(headColumn3);
            Table table3 = new Table(3);
            table3.setHead(head);
            table3.setClazz(NoAnnModel.class);
            table3.setTableStyle(getTableStyle2());
            writer.write(getNoAnnModels(), sheet1, table3);
            writer.write(getNoAnnModels(), sheet1, table3);

            writer.finish();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                out.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

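Test data analysis

The performance tests above show that easyexcel is somewhat slower than POI userModel mode in parsing time. The main reason is that I use reflection internally to map model fields; I added a cache along the way, and I think the remaining gap is acceptable. In memory consumption, however, the difference is much more pronounced: as files grow larger, easyexcel's memory consumption hardly increases, whereas POI userModel practically blows up. Imagine that parsing one Excel file takes 200 MB; with 20 people using it at the same time, a single machine would probably go down.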
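The cache for reflective field mapping mentioned above could look roughly like the following. This is a guess at the general technique, not easyexcel's actual code: `FieldCache` and `Sample` are hypothetical names. The fields of each model class are looked up once and reused for every subsequent row.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class FieldCache {

    private static final Map<Class<?>, List<Field>> CACHE = new ConcurrentHashMap<>();

    /** Returns the model's fields, scanning the class only on first use. */
    static List<Field> fieldsOf(Class<?> type) {
        return CACHE.computeIfAbsent(type, FieldCache::scan);
    }

    private static List<Field> scan(Class<?> type) {
        List<Field> fields = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler-generated fields
            }
            field.setAccessible(true); // done once here instead of once per row
            fields.add(field);
        }
        return Collections.unmodifiableList(fields);
    }

    /** Hypothetical model used only for this demonstration. */
    static class Sample {
        String name;
        int age;
    }

    public static void main(String[] args) {
        List<Field> first = fieldsOf(Sample.class);
        List<Field> second = fieldsOf(Sample.class);
        System.out.println(first.size() + " fields, cached: " + (first == second));
    }
}
```

As for the memory estimate, 20 concurrent parses at 200 MB each come to about 4 GB for parsing alone.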
