Using tesseract-ocr from Java for text recognition
阿新 • Published: 2018-11-21
Add the following jar dependencies to the Maven pom.xml:
<dependency>
    <groupId>net.java.dev.jna</groupId>
    <artifactId>jna</artifactId>
    <version>4.1.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/net.sourceforge.tess4j/tess4j -->
<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>3.2.1</version>
    <exclusions>
        <exclusion>
            <groupId>com.sun.jna</groupId>
            <artifactId>jna</artifactId>
        </exclusion>
    </exclusions>
</dependency>
The code:
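import java.io.File;

import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

public static void ocr(String filename) {
    try {
        File tifFile = new File(filename); // the file to recognize
        ITesseract instance = new Tesseract();
        // Point at the directory that contains the language data folder
        instance.setDatapath("/usr/local/share");
        instance.setLanguage("chi_sim"); // use Chinese (chi_sim is Simplified Chinese)
        System.out.println(tifFile.canRead()); // check that the file can be found and read
        String result = instance.doOCR(tifFile); // run the recognition
        System.out.println(result);
    } catch (TesseractException e) {
        e.printStackTrace();
    }
}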
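The code above prints canRead() to confirm that the image can be found. The same idea can be extended to the language data, since a missing data file is a common reason for recognition to fail. Below is a minimal sketch that uses only the standard library. OcrPreflight and check are hypothetical names, not part of tess4j, and /tmp/sample.tif is a placeholder path. The layout <dataPath>/tessdata/<language>.traineddata is an assumption based on the comment in the code above, so adjust it if your installation differs.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of tess4j): validates inputs before calling doOCR.
final class OcrPreflight {

    private OcrPreflight() {}

    // Returns null when both files are present, otherwise a short description of the problem.
    static String check(String imageFile, String dataPath, String language) {
        Path image = Paths.get(imageFile);
        if (!Files.isRegularFile(image) || !Files.isReadable(image)) {
            return "image not readable: " + image;
        }
        // Assumption: language data lives at <dataPath>/tessdata/<language>.traineddata
        Path trained = Paths.get(dataPath, "tessdata", language + ".traineddata");
        if (!Files.isRegularFile(trained)) {
            return "language data not found: " + trained;
        }
        return null;
    }

    public static void main(String[] args) {
        // Placeholder image path; the data path and language match the code above.
        String problem = check("/tmp/sample.tif", "/usr/local/share", "chi_sim");
        System.out.println(problem == null ? "ready for OCR" : problem);
    }
}
```

Calling check(...) before ocr(filename) and stopping when it returns a non-null message gives a clearer error than the stack trace from a failed recognition.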