A Java Program for Converting XML into TXT Row Data
阿新 • • Published: 2020-07-03
ZKe
-------------------
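The goal is to take all the attributes inside one block of XML data and turn them into a single line of a TXT file. As everyone knows, an XML file defines its data with HTML-like tags, as shown in the figure.
The attributes are id, article, discuss, insertTime, origin, person_id, time, and transmit, and each record is enclosed in a RECORD tag.
This is a typical bracket-matching problem. Define a boolean flag that marks where a record starts and ends, and declare a String variable as the data buffer. On a <RECORD> tag, clear the buffer; on a </RECORD> tag, write the buffer's value to the new file. On a tag for any of the attribute fields, append its content to the buffer.
Processing method: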
    private String oneLine = null;       // buffer for the record being assembled
    private boolean canPrint = false;    // true only right after a </RECORD> line
    private boolean inArticle = false;   // true while inside a multi-line <article>

    private void process(String line) {
        canPrint = false;                // reset so a finished record is written only once
        line = line.trim();              // trim first so indented tags are recognized
        if (line.startsWith("<RECORD>")) {
            oneLine = "";
            return;
        } else if (line.startsWith("</RECORD>")) {
            canPrint = true;
            return;
        } else if (line.startsWith("<RECORDS>") || line.startsWith("</RECORDS>")) {
            return;
        }
        if (inArticle) {                 // continuation of a multi-line article
            if (line.endsWith("</article>")) {
                oneLine += line.substring(0, line.length() - 10) + " | ";
                inArticle = false;
            } else {
                oneLine += line;
            }
        } else if (line.startsWith("<id>")) {
            oneLine += line.substring(4, line.length() - 5) + " | ";
        } else if (line.startsWith("<article>")) {
            if (line.indexOf("</article>") == -1) {   // article continues on later lines
                oneLine += line.substring(9);
                inArticle = true;
                return;
            }
            oneLine += line.substring(9, line.length() - 10) + " | ";
        } else if (line.startsWith("<discuss>")) {
            oneLine += line.substring(9, line.length() - 10) + " | ";
        } else if (line.startsWith("<insertTime>")) {
            oneLine += line.substring(12, line.length() - 13) + " | ";
        } else if (line.startsWith("<origin>")) {
            oneLine += line.substring(8, line.length() - 9) + " | ";
        } else if (line.startsWith("<person_id>")) {
            oneLine += line.substring(11, line.length() - 12) + " | ";
        } else if (line.startsWith("<time>")) {
            oneLine += line.substring(6, line.length() - 7) + " | ";
        } else if (line.startsWith("<transmit>")) {
            oneLine += line.substring(10, line.length() - 11);   // last field, no separator
        }
    }
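The XML data is read with a BufferedReader and the TXT file is written with a BufferedWriter. Pay attention to how the canPrint flag is controlled: a line is written only when a record has just been closed.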
    // Imports needed at the top of the file:
    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileNotFoundException;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;

    public void printToTXTFile() {
        File file = new File(this.path);
        File targetFile = new File("/root/myCodes/finalClassDesign/stardardAllData.txt");
        // try-with-resources closes both streams even if an exception is thrown
        try (BufferedReader br = new BufferedReader(new FileReader(file));
             BufferedWriter bw = new BufferedWriter(new FileWriter(targetFile))) {
            String line;
            while ((line = br.readLine()) != null) {
                process(line);
                if (canPrint) {
                    bw.write(oneLine);
                    bw.newLine();
                    canPrint = false;    // write each completed record only once
                }
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
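The rest is just the other members of the class, such as the source file path, the target file path, and the method calls. They are omitted here, so fill them in yourself.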
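For anyone who wants to try the flag-and-buffer idea without touching files, here is a minimal in-memory sketch. It is not the class from this post: `RecordFlattener` and `flatten` are hypothetical names, the sample data is made up, and it assumes every field sits on a single line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not the class from the post.
class RecordFlattener {
    // Field order as listed in the article.
    static final List<String> FIELDS = Arrays.asList(
        "id", "article", "discuss", "insertTime", "origin", "person_id", "time", "transmit");

    /** Turns each <RECORD>...</RECORD> block into one " | "-separated line. */
    static List<String> flatten(List<String> xmlLines) {
        List<String> out = new ArrayList<>();
        StringBuilder buf = null;                 // null while outside a <RECORD> block
        for (String raw : xmlLines) {
            String line = raw.trim();
            if (line.equals("<RECORD>")) {        // record starts: fresh buffer
                buf = new StringBuilder();
                continue;
            }
            if (line.equals("</RECORD>")) {       // record ends: emit buffer once
                if (buf != null) out.add(buf.toString());
                buf = null;
                continue;
            }
            if (buf == null) continue;            // ignore anything outside a record
            for (String f : FIELDS) {
                String open = "<" + f + ">";
                String close = "</" + f + ">";
                if (line.startsWith(open) && line.endsWith(close)) {
                    if (buf.length() > 0) buf.append(" | ");
                    // append only the text between the opening and closing tag
                    buf.append(line, open.length(), line.length() - close.length());
                    break;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample input for illustration only.
        List<String> sample = Arrays.asList(
            "<RECORDS>", "  <RECORD>", "    <id>1</id>",
            "    <article>hi</article>", "  </RECORD>", "</RECORDS>");
        System.out.println(flatten(sample));
    }
}
```

Using a null buffer as the "outside a record" state means a separate boolean flag is not needed in this variant, and a record can never be emitted twice.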
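The converted TXT file came out well. I used "|" as the separator, which does have a drawback: in regular expressions "|" is a metacharacter (the alternation operator), so it has to be escaped when splitting. You can simply switch to a comma ",", a semicolon ";", or even a slash "/".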
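If you would rather keep "|", the lines can still be split safely by quoting the separator. The following is a small sketch of that, assuming the " | " format produced above; `DelimiterDemo`, `splitFields`, and the sample line are hypothetical and only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original program.
class DelimiterDemo {
    // Made-up line in the " | "-separated format.
    static final String SAMPLE = "1 | hello world | 3 | 2020-07-03";

    /** Splits on the literal " | "; Pattern.quote stops '|' from acting as regex alternation. */
    static List<String> splitFields(String line) {
        // limit -1 keeps trailing empty fields instead of dropping them
        return Arrays.asList(line.split(Pattern.quote(" | "), -1));
    }

    public static void main(String[] args) {
        // Unquoted, "|" is an alternation of two empty patterns, so the split is not on the bar.
        System.out.println(Arrays.toString("a|b".split("|")));
        // Quoted, the line is split into its fields as intended.
        System.out.println(splitFields(SAMPLE));
    }
}
```

`String.split` always interprets its argument as a regular expression, so the same quoting is needed for other metacharacters such as "." or "*".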