How to make DTD validation use a local DTD file, or switch it off, when reading XML files with JDOM or dom4j
A-Xin • Published: 2019-02-02
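I. Before anything else
dom4j and JDOM handle this problem in exactly the same way; the only difference is that one uses SAXReader and the other SAXBuilder. This article uses JDOM as the example. For dom4j, make the same changes on the SAXReader instead.
II. When the problem occurs
If you use JDOM to read an XML file that declares a DTD while the network is unreachable, you get the error shown below.
1. The code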
import java.io.File;

import org.jdom.Document;
import org.jdom.input.SAXBuilder;

public class TestJdom {
    public static void main(String[] args) {
        File file = new File("./src/dom/aiwf_aiService.xml");
        if (file.exists()) {
            SAXBuilder builder = new SAXBuilder();
            try {
                // build() parses the file; the parser also tries to fetch
                // the DTD named in the DOCTYPE declaration.
                Document doc = builder.build(file);
                System.out.println(doc);
            } catch (Exception e) {
                e.printStackTrace();
            }
        } else {
            System.out.println("cannot find xml file: "
                    + file.getAbsolutePath());
        }
    }
}
2. The XML file
<!DOCTYPE workflow PUBLIC "-//OpenSymphony Group//DTD OSWorkflow 2.8//EN" "http://www.opensymphony.com/osworkflow/workflow_2_8.dtd">
<workflow>
...............
</workflow>
3. The error
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
    at java.net.Socket.connect(Socket.java:507)
    at java.net.Socket.connect(Socket.java:457)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:157)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:365)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:477)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:214)
    at sun.net.www.http.HttpClient.New(HttpClient.java:287)
    at sun.net.www.http.HttpClient.New(HttpClient.java:299)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:792)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:744)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:669)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:913)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:973)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:905)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:872)
    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:282)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(XMLDocumentScannerImpl.java:1021)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242)
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:453)
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:810)
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:789)
    at dom.TestJdom.main(TestJdom.java:26)
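III. Analysis of the cause
When build runs and JDOM reaches
<!DOCTYPE workflow PUBLIC "-//OpenSymphony Group//DTD OSWorkflow 2.8//EN" "http://www.opensymphony.com/osworkflow/workflow_2_8.dtd">
it tries to read the DTD file referenced there for validation. Because the network is unreachable, the attempt fails with a socket error.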
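To see this lookup happen without touching the network, the sketch below records which system IDs the parser asks for. It is only an illustration: it uses the JDK's built-in JAXP parser instead of JDOM (JDOM's SAXBuilder delegates to a SAX parser underneath), and the class name DtdRequestProbe is made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JDOM or dom4j.
final class DtdRequestProbe {
    static final String XML =
        "<!DOCTYPE workflow PUBLIC \"-//OpenSymphony Group//DTD OSWorkflow 2.8//EN\" "
        + "\"http://www.opensymphony.com/osworkflow/workflow_2_8.dtd\">"
        + "<workflow/>";

    /** Parses the XML and returns every system ID the parser tried to resolve. */
    static List<String> requestedSystemIds(String xml) {
        List<String> seen = new ArrayList<>();
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver((publicId, systemId) -> {
                seen.add(systemId);
                // Hand back an empty entity so nothing is fetched remotely.
                return new InputSource(new StringReader(""));
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        // Prints the system IDs the parser requested while parsing XML.
        System.out.println(requestedSystemIds(XML));
    }
}
```

The recorded list should include the DTD URL from the DOCTYPE, which is the request that fails when there is no network.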
IV. Solutions
1. Looking through the JDOM API, I first found this method:
builder.setValidation(false);
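This tells JDOM not to validate, but the error still occurred. After some searching, the reason turned out to be that the parser still downloads the DTD even when validation is off.
2. See the JDOM FAQ: http://www.jdom.org/docs/faq.html#a0100
This is the original text: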
How do I keep the DTD from loading? Even when I turn off validation the parser tries to load the DTD file.
Even when validation is turned off, an XML parser will by default load the external DTD file in order to parse the DTD for external entity declarations. Xerces has a feature to turn off this behavior named "http://apache.org/xml/features/nonvalidating/load-external-dtd" and if you know you're using Xerces you can set this feature on the builder.
builder.setFeature(
    "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
If you're using another parser like Crimson, your best bet is to set up an EntityResolver that resolves the DTD without actually reading the separate file.
import org.xml.sax.*;
import java.io.*;
public class NoOpEntityResolver implements EntityResolver {
    public InputSource resolveEntity(String publicId, String systemId) {
        return new InputSource(new StringBufferInputStream(""));
    }
}
Then in the builder
builder.setEntityResolver(new NoOpEntityResolver());
There is a downside to this approach. Any entities in the document will be resolved to the empty string, and will effectively disappear. If your document has entities, you need to setExpandEntities(false) code and ensure the EntityResolver only suppresses the DocType.
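The feature flag quoted in the FAQ can be tried out without JDOM. The following is a minimal sketch, assuming the JDK's built-in Xerces-derived parser is the one in use (a different parser may reject the feature name); SkipExternalDtd is a made-up class name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses plain JAXP rather than JDOM.
final class SkipExternalDtd {
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    static final String XML =
        "<!DOCTYPE workflow PUBLIC \"-//OpenSymphony Group//DTD OSWorkflow 2.8//EN\" "
        + "\"http://www.opensymphony.com/osworkflow/workflow_2_8.dtd\">"
        + "<workflow/>";

    /** Parses without validating and without loading the external DTD. */
    static String rootName(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);
            // Xerces-specific switch: do not load the external DTD at all.
            factory.setFeature(LOAD_EXTERNAL_DTD, false);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootName(XML));
    }
}
```

With JDOM, the corresponding call is the builder.setFeature(...) line from the FAQ.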
The FAQ tells us to define a class:
public class NoOpEntityResolver implements EntityResolver {
    public InputSource resolveEntity(String publicId, String systemId) {
        return new InputSource(new StringBufferInputStream(""));
    }
}
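A variant worth considering, given the FAQ's caveat that the resolver should only suppress the DocType: answer with an empty entity only for system IDs that look like a DTD, and return null for everything else so the parser resolves it normally. This is a sketch under assumptions: the ".dtd" suffix test is a simplification, DtdOnlyResolver is a made-up name, and StringReader stands in for StringBufferInputStream, which is deprecated.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver: suppresses DTD lookups only.
final class DtdOnlyResolver implements EntityResolver {
    static final String XML =
        "<!DOCTYPE workflow PUBLIC \"-//OpenSymphony Group//DTD OSWorkflow 2.8//EN\" "
        + "\"http://www.opensymphony.com/osworkflow/workflow_2_8.dtd\">"
        + "<workflow/>";

    /** Simplified check: treat any system ID ending in ".dtd" as a DTD. */
    static boolean looksLikeDtd(String systemId) {
        return systemId != null
            && systemId.toLowerCase(Locale.ROOT).endsWith(".dtd");
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (looksLikeDtd(systemId)) {
            // Empty content: the DTD is "read" without any I/O.
            return new InputSource(new StringReader(""));
        }
        // null asks the parser to resolve the entity in its default way.
        return null;
    }

    /** Parses the XML with this resolver and returns the root element name. */
    static String rootName(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver(new DtdOnlyResolver());
            return builder.parse(new InputSource(new StringReader(xml)))
                .getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the class implements org.xml.sax.EntityResolver, an instance can be passed to the same setEntityResolver call used in this article.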
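Calling builder.setEntityResolver(new NoOpEntityResolver()) hides the DTD from the parser, so the error no longer occurs. I tried it and it does work. Still, reading XML without any DTD validation is not ideal, so can we make it validate against a local DTD instead? Take this article's oswork example:
I copied the validation file workflow_2_8.dtd to the local machine. Can validation use that local copy?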
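3. Validating with a local DTD
There are two ways to do this.
Method 1: change the DOCTYPE declaration in the XML file. This is generally a bad idea, because the file no longer matches the standard declaration afterwards.
Method 2: substitute the DTD when the parser asks for it.
Did the FAQ approach above give you any ideas?
Take a look at the following code: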
import java.io.File;
import java.io.IOException;

import org.jdom.Document;
import org.jdom.input.SAXBuilder;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class TestJdom {
    public static void main(String[] args) {
        File file = new File("./src/dom/aiwf_aiService.xml");
        if (file.exists()) {
            SAXBuilder builder = new SAXBuilder();
            // With false the local DTD is loaded but the document is not
            // validated against it; pass true to actually validate.
            builder.setValidation(false);
            builder.setEntityResolver(new EntityResolver() {
                public InputSource resolveEntity(String publicId,
                        String systemId) throws SAXException, IOException {
                    // Answer every entity request with the local DTD copy.
                    return new InputSource("./workflow_2_8.dtd");
                }
            });
            try {
                Document doc = builder.build(file);
                System.out.println(doc);
            } catch (Exception e) {
                e.printStackTrace();
            }
        } else {
            System.out.println("cannot find xml file: "
                    + file.getAbsolutePath());
        }
    }
}
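That's right: we again implement our own EntityResolver (an anonymous class this time). The difference is that it hands the parser the local DTD.
Also, inside the anonymous class, writing it like this seems to read a little better:
InputSource is = new InputSource(stream);
is.setPublicId(publicId);
is.setSystemId(systemId);
return is;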
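Putting the pieces together, here is a minimal sketch of validating against a local DTD with validation actually switched on. Several things in it are assumptions for illustration only: it uses the JDK's JAXP parser rather than JDOM, the class LocalDtdValidation and its helpers are made up, and the DTD it writes is a two-line toy stand-in, not the real workflow_2_8.dtd.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class.
final class LocalDtdValidation {
    static final String SYSTEM_ID =
        "http://www.opensymphony.com/osworkflow/workflow_2_8.dtd";
    static final String DOCTYPE =
        "<!DOCTYPE workflow PUBLIC \"-//OpenSymphony Group//DTD OSWorkflow 2.8//EN\" \""
        + SYSTEM_ID + "\">";

    /** Writes a toy DTD (NOT the real OSWorkflow DTD) to a temp file. */
    static Path writeToyDtd() {
        try {
            Path dtd = Files.createTempFile("workflow_toy", ".dtd");
            dtd.toFile().deleteOnExit();
            String rules = "<!ELEMENT workflow (step*)>\n<!ELEMENT step EMPTY>\n";
            Files.write(dtd, rules.getBytes(StandardCharsets.UTF_8));
            return dtd;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Serves the known system ID from disk; anything else resolves normally. */
    static EntityResolver localDtd(Path dtdFile) {
        return (publicId, systemId) -> {
            if (!SYSTEM_ID.equals(systemId)) {
                return null;
            }
            InputSource is = new InputSource(Files.newInputStream(dtdFile));
            is.setPublicId(publicId);
            is.setSystemId(systemId);
            return is;
        };
    }

    /** Returns true if the XML parses and validates against the local DTD. */
    static boolean isValid(String xml, Path dtdFile) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setEntityResolver(localDtd(dtdFile));
            // Validation errors are only reported, so rethrow them.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Path dtd = writeToyDtd();
        System.out.println(isValid(DOCTYPE + "<workflow><step/></workflow>", dtd));
        System.out.println(isValid(DOCTYPE + "<workflow><bogus/></workflow>", dtd));
    }
}
```

Matching on the system ID rather than answering every request keeps other external entities working. In the JDOM version above, the equivalent change would be calling builder.setValidation(true) and returning the local InputSource only for the known DTD.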