How to make DTD validation use a local DTD file, or turn it off, when reading an XML file with JDOM or dom4j

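I. Before anything else

dom4j and JDOM handle this problem in exactly the same way. The only difference is that JDOM uses SAXBuilder while dom4j uses SAXReader. JDOM is used as the example here; for dom4j, substitute the corresponding class and the rest carries over.

II. When the problem occurs

When you use JDOM to read an XML file that declares a DTD while the network is unreachable, the following error appears.

1. The code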

package dom;

import java.io.File;

import org.jdom.Document;
import org.jdom.input.SAXBuilder;

public class TestJdom {
    public static void main(String[] args) {
        File file = new File("./src/dom/aiwf_aiService.xml");
        if (file.exists()) {
            SAXBuilder builder = new SAXBuilder();
            try {
                Document doc = builder.build(file);
                System.out.println(doc);
            } catch (Exception e) {
                e.printStackTrace();
            }
        } else {
            System.out.println("cannot find xml file:" + file.getAbsolutePath());
        }
    }
}

2. The XML file

<?xml version="1.0" encoding="GBK"?>
<!DOCTYPE workflow PUBLIC "-//OpenSymphony Group//DTD OSWorkflow 2.8//EN" "http://www.opensymphony.com/osworkflow/workflow_2_8.dtd">
<workflow>
...............
</workflow>


3. The error

java.net.SocketException: Permission denied: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
    at java.net.Socket.connect(Socket.java:507)
    at java.net.Socket.connect(Socket.java:457)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:157)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:365)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:477)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:214)
    at sun.net.www.http.HttpClient.New(HttpClient.java:287)
    at sun.net.www.http.HttpClient.New(HttpClient.java:299)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:792)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:744)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:669)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:913)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:973)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:905)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:872)
    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:282)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(XMLDocumentScannerImpl.java:1021)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242)
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:453)
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:810)
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:789)
    at dom.TestJdom.main(TestJdom.java:26)

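III. Analysis

When build runs and JDOM reaches the declaration
<!DOCTYPE workflow PUBLIC "-//OpenSymphony Group//DTD OSWorkflow 2.8//EN" "http://www.opensymphony.com/osworkflow/workflow_2_8.dtd">
the underlying parser tries to fetch the DTD file at that URL. Because the network is unreachable, the fetch fails with the socket error shown in the stack trace (note the HttpURLConnection frames called from XMLEntityManager.startDTDEntity).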
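
To confirm which external resource the parser asks for, a recording EntityResolver can be plugged in. The following is only a sketch under a few assumptions: it uses the JDK's built-in DocumentBuilder instead of JDOM so that it runs without extra jars, the class name DtdRequestProbe is made up for illustration, and the resolver hands back an empty source so that nothing is fetched over the network.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.xml.sax.InputSource;

// Hypothetical helper: records every system ID the parser tries to resolve.
public class DtdRequestProbe {
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE workflow PUBLIC \"-//OpenSymphony Group//DTD OSWorkflow 2.8//EN\" "
        + "\"http://www.opensymphony.com/osworkflow/workflow_2_8.dtd\">\n"
        + "<workflow/>";

    /** Parses XML with validation off and returns the system IDs that were requested. */
    static List<String> requestedSystemIds() throws Exception {
        List<String> requested = new ArrayList<>();
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false); // validation is off, yet the DTD is still requested
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            requested.add(systemId);
            // Return an empty DTD so the parser does not open a network connection.
            return new InputSource(new StringReader(""));
        });
        builder.parse(new InputSource(new StringReader(XML)));
        return requested;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(requestedSystemIds());
    }
}
```

If the list contains the workflow_2_8.dtd URL, the parser did try to load the external DTD even though validation was disabled.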

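IV. Solutions

1. At first I looked through the JDOM API and found this method:
builder.setValidation(false);
It tells JDOM not to validate, but the error still occurred. After some digging I found the reason: even with validation off, the parser still downloads the DTD.
2. See the JDOM FAQ: http://www.jdom.org/docs/faq.html#a0100
The original text reads: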
How do I keep the DTD from loading? Even when I turn off validation the parser tries to load the DTD file.

Even when validation is turned off, an XML parser will by default load the external DTD file in order to parse the DTD for external entity declarations. Xerces has a feature to turn off this behavior named "http://apache.org/xml/features/nonvalidating/load-external-dtd" and if you know you're using Xerces you can set this feature on the builder.

    builder.setFeature(
        "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

If you're using another parser like Crimson, your best bet is to set up an EntityResolver that resolves the DTD without actually reading the separate file.

    import org.xml.sax.*;
    import java.io.*;

    public class NoOpEntityResolver implements EntityResolver {
        public InputSource resolveEntity(String publicId, String systemId) {
            return new InputSource(new StringBufferInputStream(""));
        }
    }

Then in the builder...

    builder.setEntityResolver(new NoOpEntityResolver());

There is a downside to this approach. Any entities in the document will be resolved to the empty string, and will effectively disappear. If your document has entities, you need to setExpandEntities(false) code and ensure the EntityResolver only suppresses the DocType.

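
The first option from the FAQ, switching off the load-external-dtd feature, can be tried without JDOM. This is a rough sketch that assumes the JDK's bundled parser (which is derived from Xerces and recognizes this feature name); the class name SkipExternalDtd is invented for the example. With JDOM the equivalent call is the builder.setFeature(...) line quoted above.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: parses a document whose DTD URL is never contacted.
public class SkipExternalDtd {
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE workflow PUBLIC \"-//OpenSymphony Group//DTD OSWorkflow 2.8//EN\" "
        + "\"http://www.opensymphony.com/osworkflow/workflow_2_8.dtd\">\n"
        + "<workflow/>";

    /** Parses XML with validation and external DTD loading disabled; returns the root tag. */
    static String parseRootName() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        // Xerces-specific feature: do not load the external DTD in non-validating mode.
        factory.setFeature(LOAD_EXTERNAL_DTD, false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseRootName());
    }
}
```

On a parser that does not recognize the feature, setFeature is expected to throw, which is why the FAQ falls back to an EntityResolver for other parsers.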
The FAQ tells us to define a class like this (StringBufferInputStream is deprecated, so an empty StringReader is used here instead):

import java.io.StringReader;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

public class NoOpEntityResolver implements EntityResolver {
    // Answer every request with an empty source so no external file is read.
    public InputSource resolveEntity(String publicId, String systemId) {
        return new InputSource(new StringReader(""));
    }
}


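Calling builder.setEntityResolver(new NoOpEntityResolver()) keeps the parser from loading the DTD, so the error goes away. I tried it and it does work. Still, parsing XML without DTD validation is not ideal. Can we make the parser validate against a local DTD instead? In this article's case, I copied the DTD file workflow_2_8.dtd to the local machine. Can validation use that local copy?

3. Validating with a local DTD
There are two approaches.
Approach 1: change the DOCTYPE declaration in the XML file. In general this is a bad idea, because the document no longer carries the standard declaration.
Approach 2: substitute the DTD at parse time.
Does the FAQ approach above give you any ideas?
Take a look at the following code: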

package dom;

import java.io.File;
import java.io.IOException;

import org.jdom.Document;
import org.jdom.input.SAXBuilder;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class TestJdom {
    public static void main(String[] args) {
        File file = new File("./src/dom/aiwf_aiService.xml");
        if (file.exists()) {
            SAXBuilder builder = new SAXBuilder();
            // Validation is off here; the local DTD is loaded but not enforced.
            // Pass true instead if the document should be checked against it.
            builder.setValidation(false);
            builder.setEntityResolver(new EntityResolver() {
                // Redirect every external entity request to the local DTD copy.
                public InputSource resolveEntity(String publicId, String systemId)
                        throws SAXException, IOException {
                    return new InputSource("./workflow_2_8.dtd");
                }
            });
            try {
                Document doc = builder.build(file);
                System.out.println(doc);
            } catch (Exception e) {
                e.printStackTrace();
            }
        } else {
            System.out.println("cannot find xml file:" + file.getAbsolutePath());
        }
    }
}

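That's right: we again implement our own EntityResolver (as an anonymous class this time). The difference is that it points the parser at the local DTD.
Inside the anonymous class, the following form arguably reads better: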

InputStream stream = new FileInputStream("your dtd file path");
InputSource is = new InputSource(stream);
is.setPublicId(publicId);
is.setSystemId(systemId);
return is;
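
Putting the pieces together, the resolver can match on the public ID and serve the local copy only for that DTD, with validation actually switched on. What follows is a hedged sketch, not the author's code: it uses the JDK's DocumentBuilder rather than JDOM, the class name LocalDtdValidation is hypothetical, and the two-line DTD written to a temp file in main is a toy stand-in for the real workflow_2_8.dtd.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo: validate against a local DTD selected by public ID.
public class LocalDtdValidation {
    static final String PUBLIC_ID = "-//OpenSymphony Group//DTD OSWorkflow 2.8//EN";
    static final String DOCTYPE =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE workflow PUBLIC \"" + PUBLIC_ID + "\" "
        + "\"http://www.opensymphony.com/osworkflow/workflow_2_8.dtd\">\n";

    /** Returns true if xml validates against the DTD stored at localDtd. */
    static boolean isValid(String xml, Path localDtd) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            if (PUBLIC_ID.equals(publicId)) {
                InputSource source = new InputSource(Files.newInputStream(localDtd));
                source.setPublicId(publicId);
                source.setSystemId(systemId);
                return source;
            }
            return null; // anything else falls back to the parser's default lookup
        });
        boolean[] valid = {true};
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { }
            public void error(SAXParseException e) { valid[0] = false; } // validity errors
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return valid[0];
    }

    public static void main(String[] args) throws Exception {
        // Toy stand-in DTD (an assumption for the demo, not the real OSWorkflow DTD).
        Path dtd = Files.createTempFile("workflow_demo", ".dtd");
        Files.write(dtd, "<!ELEMENT workflow (step*)>\n<!ELEMENT step EMPTY>"
            .getBytes(StandardCharsets.UTF_8));
        System.out.println(isValid(DOCTYPE + "<workflow><step/></workflow>", dtd));
        System.out.println(isValid(DOCTYPE + "<workflow><bogus/></workflow>", dtd));
        Files.delete(dtd);
    }
}
```

Returning null for unknown public IDs keeps other entities resolvable in the usual way, which avoids the FAQ's caveat about every entity collapsing to the same source.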