XML Parsing and Protection Against XML External Entity Injection
XML External Entity Injection
Example:
InputStream is = Test01.class.getClassLoader().getResourceAsStream("evil.xml");//source
XMLInputFactory xmlFactory = XMLInputFactory.newInstance();
XMLEventReader reader = xmlFactory.createXMLEventReader(is); //sink
If evil.xml contains the following content, parsing it may result in XML external entity (XXE) injection:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >]><foo>&xxe;</foo>
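XML Parsing and XXE Protection
DOM
DOM stands for Document Object Model. A DOM-based XML parser converts an XML document into a collection of objects (usually called the DOM tree), and the application works with the XML document's data by operating on this object model.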
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
...
// Step 1: obtain a DOM parser factory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
System.out.println("class name: " + dbf.getClass().getName());
// Step 2: obtain a concrete DOM parser
DocumentBuilder db = dbf.newDocumentBuilder();
// Step 3: parse an XML document to obtain the Document object (the root node)
Document document = db.parse(new File("candidate.xml"));
NodeList list = document.getElementsByTagName("PERSON");
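To make the steps above concrete, here is a minimal, self-contained sketch of walking the resulting DOM tree. The class name DomWalkSketch, the sample document, and the NAME child element are assumptions made for illustration; the original candidate.xml is not shown, so an in-memory string stands in for it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DomWalkSketch {
    // Hypothetical sample data standing in for candidate.xml.
    static final String SAMPLE =
        "<PEOPLE><PERSON><NAME>Ann</NAME></PERSON>"
        + "<PERSON><NAME>Bob</NAME></PERSON></PEOPLE>";

    // Parses the XML text into a DOM tree and returns the NAME of every
    // PERSON element, joined with commas.
    static String personNames(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = db.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList list = document.getElementsByTagName("PERSON");
            StringBuilder names = new StringBuilder();
            for (int i = 0; i < list.getLength(); i++) {
                Element person = (Element) list.item(i);
                String name = person.getElementsByTagName("NAME").item(0).getTextContent();
                if (i > 0) {
                    names.append(",");
                }
                names.append(name);
            }
            return names.toString();
        } catch (Exception e) {
            // Wrapped so that callers of this sketch need no checked-exception handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(personNames(SAMPLE));
    }
}
```

Note that this sketch uses an unhardened factory because its input is a trusted constant; untrusted input should go through a factory configured as in the protection recommendations.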
Protection recommendation 1
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Declared outside the try block so the catch blocks can report it
String FEATURE = null;
try {
    // This is the preferred option. If DTDs (DOCTYPEs) are disallowed, almost all XML entity attacks are prevented
    FEATURE = "http://apache.org/xml/features/disallow-doctype-decl";
    dbf.setFeature(FEATURE, true);
    // Parse the XML here with a DocumentBuilder obtained from dbf
    ...
} catch (ParserConfigurationException e) {
    // This should catch a failed setFeature feature
    logger.info("ParserConfigurationException was thrown. The feature '" +
        FEATURE +
        "' is probably not supported by your XML processor.");
    ...
}
catch (SAXException e) {
    // On Apache, this should be thrown when disallowing DOCTYPE
    logger.warning("A DOCTYPE was passed into the XML document");
    ...
}
catch (IOException e) {
    // XXE that points to a file that doesn't exist
    logger.error("IOException occurred, XXE may still be possible: " + e.getMessage());
    ...
}
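As a rough check of this recommendation, the following sketch parses the XXE payload from the beginning of this article with disallow-doctype-decl enabled and reports whether the parser refused it. The class and method names are hypothetical, and the behavior described assumes the JDK's bundled (Xerces-derived) parser, which recognizes this feature.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

class DisallowDoctypeSketch {
    // The payload from the XXE example, kept in memory instead of evil.xml.
    static final String EVIL =
        "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE foo [<!ELEMENT foo ANY >"
        + "<!ENTITY xxe SYSTEM \"file:///etc/passwd\" >]><foo>&xxe;</foo>";
    // A document without any DOCTYPE, which should still parse.
    static final String BENIGN = "<foo>hello</foo>";

    // Returns true when the hardened parser refuses the document.
    static boolean isRejected(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            dbf.newDocumentBuilder()
               .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            // Raised as soon as the parser meets the DOCTYPE declaration,
            // before any entity is resolved.
            return true;
        } catch (ParserConfigurationException | IOException e) {
            // Feature not supported, or an I/O problem unrelated to the check.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("evil rejected:   " + isRejected(EVIL));
        System.out.println("benign rejected: " + isRejected(BENIGN));
    }
}
```

The parser may also print a "[Fatal Error]" line to standard error for the rejected document; that comes from its default error handler rather than from this sketch.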
Protection recommendation 2
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Declared outside the try block so the catch blocks can report it
String FEATURE = null;
try {
    // If DTDs cannot be disabled completely, take at least the following measures
    FEATURE = "http://xml.org/sax/features/external-general-entities";
    dbf.setFeature(FEATURE, false);
    FEATURE = "http://xml.org/sax/features/external-parameter-entities";
    dbf.setFeature(FEATURE, false);
    // and these as well, per Timothy Morgan's 2014 paper: "XML Schema, DTD, and Entity Attacks" (see reference below)
    dbf.setXIncludeAware(false);
    dbf.setExpandEntityReferences(false);
    // And, per Timothy Morgan: "If for some reason support for inline DOCTYPEs are a requirement, then ensure the entity settings are disabled (as shown above) and beware that SSRF attacks(http://cwe.mitre.org/data/definitions/918.html) and denial of service attacks (such as billion laughs or decompression bombs via "jar:") are a risk."
    ...
} catch (ParserConfigurationException e) {
    // This should catch a failed setFeature feature
    logger.info("ParserConfigurationException was thrown. The feature '" +
        FEATURE +
        "' is probably not supported by your XML processor.");
    ...
}
catch (SAXException e) {
    // On Apache, this should be thrown when disallowing DOCTYPE
    logger.warning("A DOCTYPE was passed into the XML document");
    ...
}
catch (IOException e) {
    // XXE that points to a file that doesn't exist
    logger.error("IOException occurred, XXE may still be possible: " + e.getMessage());
    ...
}
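The effect of these settings can be observed with a small experiment. The sketch below writes a marker string to a temporary file, points an external entity at that file, and parses the document with external entities disabled; the marker should not reach the application. The class name, the marker value, and the temporary file are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class ExternalEntitiesOffSketch {
    // Hypothetical content of the file the attacker tries to read.
    static final String SECRET = "s3cret-marker";

    // Returns the text content of <foo> as the application would see it.
    static String textSeenByApplication() {
        Path target = null;
        try {
            target = Files.createTempFile("xxe-demo", ".txt");
            Files.write(target, SECRET.getBytes(StandardCharsets.UTF_8));
            String xml = "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"" + target.toUri() + "\">]>"
                       + "<foo>&xxe;</foo>";

            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
            dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            dbf.setXIncludeAware(false);
            dbf.setExpandEntityReferences(false);

            Document doc = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException e) {
            // Some parsers may reject the document instead of skipping the entity.
            return "";
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        } finally {
            if (target != null) {
                target.toFile().delete();
            }
        }
    }

    public static void main(String[] args) {
        String seen = textSeenByApplication();
        System.out.println("secret leaked: " + seen.contains(SECRET));
    }
}
```

This only exercises file disclosure through a general entity. It does not show that the SSRF and denial-of-service risks mentioned in the quoted comment are covered.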
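SAX
SAX stands for Simple API for XML. Unlike DOM, SAX provides sequential access, which is a fast way to read XML data. As a SAX parser processes an XML document, it fires a series of events and invokes the corresponding event handler functions. The application accesses the document through these handlers, which is why SAX is also called an event-driven interface.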
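import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
...
// Step 1: obtain a SAX parser factory
SAXParserFactory factory = SAXParserFactory.newInstance();
// Step 2: obtain a SAX parser instance
SAXParser parser = factory.newSAXParser();
// Step 3: start parsing (MyHandler is the application's own handler class)
parser.parse(new File("student.xml"), new MyHandler());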
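The snippet above does not show MyHandler, so the following sketch uses a hypothetical RecordingHandler to illustrate the event-driven model: the parser calls startElement and endElement as it moves through the document, and the handler simply records those calls. The sample XML is an assumption standing in for student.xml.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventsSketch {
    // Hypothetical handler: records each element callback in document order.
    static class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }
    }

    // Parses the XML text and returns the sequence of callbacks that fired.
    static List<String> events(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            RecordingHandler handler = new RecordingHandler();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(events("<students><student>Ann</student></students>"));
    }
}
```

Unlike DOM, nothing is kept in memory except what the handler chooses to store, which is the main reason SAX suits large documents.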
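Protection recommendation
See the DocumentBuilderFactory recommendations above; the same feature URIs can be set through SAXParserFactory.setFeature.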
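A minimal sketch of carrying the DocumentBuilderFactory settings over to SAX might look like this. The class and method names are hypothetical. SAXParserFactory has no setExpandEntityReferences method, so only the feature flags and setXIncludeAware are applied.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxHardeningSketch {
    static final String EVIL =
        "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///etc/passwd\" >]><foo>&xxe;</foo>";

    // Builds a parser with DOCTYPEs disallowed and external entities disabled.
    static SAXParser hardenedParser() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
            factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            factory.setXIncludeAware(false);
            return factory.newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            // A feature is not supported by the XML processor in use.
            throw new IllegalStateException(e);
        }
    }

    // Returns true when the hardened parser refuses the document.
    static boolean isRejected(String xml) {
        try {
            hardenedParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler());
            return false;
        } catch (SAXException e) {
            return true;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("evil rejected: " + isRejected(EVIL));
    }
}
```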
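JDOM
JDOM (Java-based Document Object Model) is an open-source project. It is built around a tree structure and uses pure Java to parse, generate, serialize, and otherwise manipulate XML documents.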
import org.jdom.Attribute;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;
...
SAXBuilder builder = new SAXBuilder();
Document doc = builder.build(new File("jdom.xml"));
Element element = doc.getRootElement();
DOM4J
DOM4J (Document Object Model for Java) uses the Java Collections Framework and fully supports DOM, SAX, and JAXP.
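StAX
StAX (Streaming API for XML) is a pull-parsing technique for XML (pull parsing is the stream-based style of parsing in which the application pulls data from the parser). StAX includes two APIs for processing XML that offer different levels of abstraction: the cursor-based API and the iterator-based API.
The interface for the cursor-based API is javax.xml.stream.XMLStreamReader. It cannot be instantiated directly; to obtain an instance, use the javax.xml.stream.XMLInputFactory class.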
// Obtain an XMLInputFactory instance
XMLInputFactory factory = XMLInputFactory.newInstance();
// Start parsing
XMLStreamReader reader = factory.createXMLStreamReader(new FileReader("users.xml"));
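To show what "pulling" looks like with the cursor-based API, the sketch below advances the reader one event at a time and collects element names. The class name and the in-memory sample (standing in for users.xml) are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxCursorSketch {
    // Returns the local name of every start tag, in document order.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            // The application drives the loop: each next() call moves the
            // cursor to the following event and returns its type.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<users><user>Ann</user><user>Bob</user></users>"));
    }
}
```

The factory here is left at its defaults because the input is a trusted constant; for untrusted input the factory should be hardened first.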
Protection recommendation
XMLInputFactory factory = XMLInputFactory.newInstance();
factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // disables DTD support entirely
XMLStreamReader reader = factory.createXMLStreamReader(new FileReader("users.xml"));
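The same property applies to the iterator-based API (XMLEventReader) used in the opening example. The sketch below is one way to check it: it points an external entity at a temporary file containing a marker and confirms that the marker does not appear in the text read by the application. The class name, the marker, and the extra IS_SUPPORTING_EXTERNAL_ENTITIES setting are additions made here as a defense-in-depth assumption, not part of the original recommendation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class StaxHardeningSketch {
    // Hypothetical content of the file the attacker tries to read.
    static final String SECRET = "s3cret-marker";

    // Reads all character data with a hardened factory; returns "[rejected]"
    // if the parser refuses the document.
    static String readText(String xml) {
        StringBuilder text = new StringBuilder();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
            XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isCharacters()) {
                    text.append(event.asCharacters().getData());
                }
            }
        } catch (XMLStreamException e) {
            return "[rejected]";
        }
        return text.toString();
    }

    // Builds an XXE payload aimed at a temp file and returns what the parser exposes.
    static String leakAttempt() {
        Path target = null;
        try {
            target = Files.createTempFile("xxe-demo", ".txt");
            Files.write(target, SECRET.getBytes(StandardCharsets.UTF_8));
            String xml = "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"" + target.toUri() + "\">]>"
                       + "<foo>&xxe;</foo>";
            return readText(xml);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } finally {
            if (target != null) {
                target.toFile().delete();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("secret leaked: " + leakAttempt().contains(SECRET));
    }
}
```

Whether the parser skips the entity or throws an XMLStreamException may depend on the StAX implementation in use, which is why the sketch treats both outcomes as acceptable.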