XML: XPath parsing with dom4j

阿新 • Published: 2022-03-28

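Introduction:

XPath is used to navigate through the elements and attributes of an XML document.

Reference:

https://www.w3cschool.cn/xpath/xpath-syntax.html

XPath uses path expressions to navigate within XML documents
XPath includes a standard function library
XPath is a major element of XSLT
XPath is a W3C standard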

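Nodes:
In XPath there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document (root) nodes. An XML document is treated as a tree of nodes. The root of the tree is called the document node or root node.

Consider the following XML document: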

 <?xml version="1.0" encoding="ISO-8859-1"?>

 <bookstore>
  <book>
     <title lang="en">Harry Potter</title>
     <author>J K. Rowling</author>
     <year>2005</year>
     <price>29.99</price>
   </book>
</bookstore>

Examples of nodes in the XML document above:
 <bookstore> (root element node; the document node itself is its parent)
 <author>J K. Rowling</author> (element node)
 lang="en" (attribute node)
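
The node kinds listed above can be observed by evaluating a few path expressions and inspecting what comes back. The following is a minimal sketch, not part of the original article: it uses the JDK's built-in javax.xml.xpath API instead of dom4j so that it runs without extra jars, and the class name NodeKindsSketch and the helper describe are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; uses the JDK XPath API, not dom4j.
public class NodeKindsSketch {
    // The bookstore document from the article, written without whitespace.
    static final String XML =
        "<bookstore><book>"
        + "<title lang=\"en\">Harry Potter</title>"
        + "<author>J K. Rowling</author>"
        + "<year>2005</year><price>29.99</price>"
        + "</book></bookstore>";

    /** Returns "kind:name" for the first node selected by the expression. */
    public static String describe(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            Node node = (Node) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODE);
            return kind(node) + ":" + node.getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Maps the DOM node type constant to a readable label.
    private static String kind(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: return "document";
            case Node.ELEMENT_NODE: return "element";
            case Node.ATTRIBUTE_NODE: return "attribute";
            case Node.TEXT_NODE: return "text";
            default: return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("/"));               // the document node
        System.out.println(describe("/bookstore"));      // the root element
        System.out.println(describe("//title/@lang"));   // an attribute node
        System.out.println(describe("//author/text()")); // a text node
    }
}
```

Note that "/" selects the document node, while "/bookstore" selects the root element beneath it.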

Atomic values:
Atomic values are values that have no parent and no children.
Examples of atomic values:

 J K. Rowling

 "en"

Items:
An item is either an atomic value or a node.

Node relationships:

 <bookstore>

 <book>
   <title>Harry Potter</title>
   <author>J K. Rowling</author>
   <year>2005</year>
   <price>29.99</price>
 </book>

 </bookstore>

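Parent:
Every element and attribute has exactly one parent.
In the example above, the book element is the parent of the title, author, year and price elements.

Children:
An element node may have zero, one or more children.
The title, author, year and price elements are all children of the book element.

Siblings:
Nodes that share the same parent.
The title, author, year and price elements are all siblings.

Ancestors:
A node's parent, the parent's parent, and so on.
The ancestors of the title element are the book element and the bookstore element.

Descendants:
A node's children, the children's children, and so on.
The descendants of bookstore are the book, title, author, year and price elements.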
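
Each of these relationships maps onto a path expression. The sketch below is an illustration only: it again relies on the JDK's javax.xml.xpath API rather than dom4j, and the class name NodeRelationsSketch and the helper names are assumptions introduced here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; uses the JDK XPath API, not dom4j.
public class NodeRelationsSketch {
    static final String XML =
        "<bookstore><book>"
        + "<title>Harry Potter</title>"
        + "<author>J K. Rowling</author>"
        + "<year>2005</year>"
        + "<price>29.99</price>"
        + "</book></bookstore>";

    /** Returns the names of all nodes selected by the expression. */
    public static List<String> names(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeName());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Parent of title: the book element
        System.out.println("parent:      " + names("//title/.."));
        // Children of book: title, author, year and price
        System.out.println("children:    " + names("/bookstore/book/*"));
        // Siblings of year: every other child of the same parent
        System.out.println("siblings:    " + names("//year/../*[not(self::year)]"));
        // Ancestors of title: book and bookstore
        System.out.println("ancestors:   " + names("//title/ancestor::*"));
        // Descendants of bookstore: book and everything inside it
        System.out.println("descendants: " + names("/bookstore/descendant::*"));
    }
}
```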

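XPath axes

An axis defines a set of nodes relative to the current node.

Axis name	Result
ancestor	Selects all ancestors (parent, grandparent, etc.) of the current node.
ancestor-or-self	Selects all ancestors (parent, grandparent, etc.) of the current node and the current node itself.
attribute	Selects all attributes of the current node.
child	Selects all children of the current node.
descendant	Selects all descendants (children, grandchildren, etc.) of the current node.
descendant-or-self	Selects all descendants (children, grandchildren, etc.) of the current node and the current node itself.
following	Selects everything in the document after the closing tag of the current node.
following-sibling	Selects all siblings after the current node.
namespace	Selects all namespace nodes of the current node.
parent	Selects the parent of the current node.
preceding	Selects all nodes that appear before the start tag of the current node in the document, except its ancestors, attribute nodes and namespace nodes.
preceding-sibling	Selects all siblings before the current node.
self	Selects the current node.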
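
An axis is used inside a location step in the form axisname::nodetest. The following sketch tries a few of the axes from the table against the bookstore document; it is an illustration under stated assumptions, using the JDK's javax.xml.xpath API rather than dom4j, with a made-up class name AxesSketch and made-up helper methods.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; uses the JDK XPath API, not dom4j.
public class AxesSketch {
    static final String XML =
        "<bookstore><book>"
        + "<title lang=\"en\">Harry Potter</title>"
        + "<author>J K. Rowling</author>"
        + "<year>2005</year><price>29.99</price>"
        + "</book></bookstore>";

    private static Object eval(String expression, QName type) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc, type);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Evaluates the expression and returns its string value. */
    public static String string(String expression) {
        return (String) eval(expression, XPathConstants.STRING);
    }

    /** Evaluates the expression and returns its numeric value. */
    public static double number(String expression) {
        return (Double) eval(expression, XPathConstants.NUMBER);
    }

    public static void main(String[] args) {
        // attribute::lang is the unabbreviated form of @lang
        System.out.println(string("string(//title/attribute::lang)"));
        System.out.println(string("string(//title/@lang)"));
        // Nearest sibling after and before year
        System.out.println(string("name(//year/following-sibling::*[1])"));
        System.out.println(string("name(//year/preceding-sibling::*[1])"));
        // ancestor-or-self of title: bookstore, book and title itself
        System.out.println(number("count(//title/ancestor-or-self::*)"));
        // preceding excludes ancestors: only title, author and year precede price
        System.out.println(number("count(//price/preceding::*)"));
    }
}
```

The last call shows the difference between preceding and ancestor: book and bookstore start before price in the document, but as ancestors they are not on the preceding axis.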

XPath operators

The operators below can be used in XPath expressions:

Operator	Description	Example	Return value
|	Computes the union of two node-sets	//book | //cd	Returns a node-set containing all book and cd elements
+	Addition	6 + 4	10
-	Subtraction	6 - 4	2
*	Multiplication	6 * 4	24
div	Division	8 div 4	2
=	Equal	price=9.80	true if price is 9.80; false if price is 9.90.
!=	Not equal	price!=9.80	true if price is 9.90; false if price is 9.80.
<	Less than	price<9.80	true if price is 9.00; false if price is 9.90.
<=	Less than or equal to	price<=9.80	true if price is 9.00; false if price is 9.90.
>	Greater than	price>9.80	true if price is 9.90; false if price is 9.80.
>=	Greater than or equal to	price>=9.80	true if price is 9.90; false if price is 9.70.
or	Logical or	price=9.80 or price=9.70	true if price is 9.80; false if price is 9.50.
and	Logical and	price>9.00 and price<9.90	true if price is 9.80; false if price is 8.50.
mod	Modulus (remainder of a division)	5 mod 2	1
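
The operators in the table can be tried out directly. The sketch below is illustrative only: the small shop document, the class name OperatorsSketch and the helper methods are assumptions made for this example, and it uses the JDK's javax.xml.xpath API instead of dom4j. Relative expressions such as price=9.80 are evaluated with the book element as the current node.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; uses the JDK XPath API, not dom4j.
public class OperatorsSketch {
    // Made-up sample data: one book priced 9.80 and one cd priced 9.90.
    static final String XML =
        "<shop><book><price>9.80</price></book><cd><price>9.90</price></cd></shop>";

    private static Object eval(String expression, QName type) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            // Use the book element as the context node for relative paths.
            Node book = doc.getElementsByTagName("book").item(0);
            return XPathFactory.newInstance().newXPath().evaluate(expression, book, type);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Evaluates an arithmetic or counting expression. */
    public static double number(String expression) {
        return (Double) eval(expression, XPathConstants.NUMBER);
    }

    /** Evaluates a comparison or boolean expression. */
    public static boolean bool(String expression) {
        return (Boolean) eval(expression, XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) {
        System.out.println("6 + 4   = " + number("6 + 4"));
        System.out.println("8 div 4 = " + number("8 div 4"));
        System.out.println("5 mod 2 = " + number("5 mod 2"));
        // The union of all book and cd elements holds two nodes here
        System.out.println("union   = " + number("count(//book | //cd)"));
        // Comparisons against the book's price of 9.80
        System.out.println("price=9.80  : " + bool("price=9.80"));
        System.out.println("price>9.80  : " + bool("price>9.80"));
        System.out.println("and         : " + bool("price>9.00 and price<9.90"));
        System.out.println("or          : " + bool("price=9.80 or price=9.70"));
    }
}
```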

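Using XPath with dom4j:

Out of the box, dom4j cannot evaluate XPath expressions. Its XPath support is provided by the Jaxen library, so the corresponding jar has to be added first:

1. Among the files downloaded from the dom4j site, locate jaxen-1.1-beta-6.jar, the jar that gives dom4j its XPath support:

https://dom4j.github.io/

2. Then add the jar to the project:

For details, see: https://www.cnblogs.com/0099-ymsml/p/16062244.html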

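Using XPath:

dom4j provides two methods for evaluating XPath expressions:

No.	Method	Parameter	Purpose
1	public List selectNodes(String xpathExpression)	xpathExpression: the XPath expression	Selects multiple nodes: evaluates the XPath expression and returns the results as a list of Node instances or String instances.
2	public Node selectSingleNode(String xpathExpression)	xpathExpression: the XPath expression	Selects one node: evaluates the XPath expression and returns the result as a single Node instance.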

Code example:

The XML document:

<?xml version="1.0" encoding="UTF-8"?>

<person> 
  <p1> 
    <name id="1">zs</name>  
    <age>100</age>  
    <sex>nv</sex> 
  </p1>  
  <p1> 
    <name>ls</name>  
    <age>11</age> 
  </p1> 
</person>

package XPathDemo1;

import java.util.List;

import org.dom4j.Document;
import org.dom4j.Node;

import cn.dom4jUtile.lm.Dom4jUtils;

public class XPathDemo1_1 {
    public static void main(String[] args) {
        System.out.print("selectAll: ");
        selectAll();
        System.out.print("\nselectSingle: ");
        selectSingle();
    }

    /**
     * Uses XPath to fetch a single element.
     */
    public static void selectSingle() {
        // Get the Document object produced by parsing the XML
        Document doc = Dom4jUtils.getDocument();
        // Select a single node matching the path
        Node name = doc.selectSingleNode("/person/p1/name");
        // Print the element name and its text
        System.out.println(name.getName() + ": " + name.getText());
    }

    /**
     * Uses XPath to fetch every name element.
     */
    public static void selectAll() {
        // Get the Document object produced by parsing the XML
        Document doc = Dom4jUtils.getDocument();
        // selectNodes() returns all nodes matching the expression
        List<Node> names = doc.selectNodes("//name");
        // Walk the list with an enhanced for loop and print each node
        for (Node node : names) {
            System.out.print("{" + node.getName() + ": " + node.getText() + "} ");
        }
    }
}

Note that the code above relies on a helper class (remember to adjust the path to the XML file):

package cn.dom4jUtile.lm;

import java.io.File;
import java.io.FileOutputStream;

import org.dom4j.Document;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;

public final class Dom4jUtils {
    // Path of the XML file, relative to the working directory
    public static final String PATH = "src" + File.separator + "XPathDemo1" + File.separator + "XPathDemo1_1.xml";

    /**
     * Wraps the write-back step: saves the document to the file at PATH.
     * @param doc the Document object holding the modified data
     */
    public static void ReWriteXml(Document doc) {
        // The stream is closed automatically, even if writing fails
        try (FileOutputStream out = new FileOutputStream(PATH)) {
            // Indented, human-readable output
            OutputFormat format = OutputFormat.createPrettyPrint();
            // Create the writer and write the document
            XMLWriter writer = new XMLWriter(out, format);
            writer.write(doc);
            writer.flush();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Wraps creating the parser and parsing the XML file at PATH.
     * @return the parsed Document, or null if parsing failed
     */
    public static Document getDocument() {
        try {
            // Create the parser
            SAXReader reader = new SAXReader();
            // Parse the XML file into a Document
            return reader.read(new File(PATH));
        } catch (Exception e) {
            e.printStackTrace();
        }

        return null;
    }
}
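
The demo above selects by element path only. Predicates in square brackets can narrow a selection further, for example by attribute value or by position. The following is a minimal sketch against the same person document; it is not from the original article, it uses the JDK's javax.xml.xpath API instead of dom4j so that it runs without the jars described earlier, and the class name PredicateSketch and helper text are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; uses the JDK XPath API, not dom4j.
public class PredicateSketch {
    // The person document from the article, written without whitespace.
    static final String XML =
        "<person>"
        + "<p1><name id=\"1\">zs</name><age>100</age><sex>nv</sex></p1>"
        + "<p1><name>ls</name><age>11</age></p1>"
        + "</person>";

    /** Returns the text of the first node selected by the expression. */
    public static String text(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The name element whose id attribute equals 1
        System.out.println(text("//name[@id='1']"));
        // The name inside the second p1 element (positions start at 1)
        System.out.println(text("/person/p1[2]/name"));
        // The name of each p1 whose age is greater than 50
        System.out.println(text("//p1[age > 50]/name"));
        // The name of each p1 that has no sex child element
        System.out.println(text("//p1[not(sex)]/name"));
    }
}
```

The expression strings themselves are plain XPath, so the same strings are the kind of argument that selectNodes and selectSingleNode accept in the dom4j demo.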