The Oracle (tm) Users' Co-Operative FAQ

What is a Parse Tree?

Author's name: Carel-Jan Engel

Author's Email: [email protected]

Date written: Mar 24, 2005

Oracle version(s): N/A

In documentation about tuning SQL, I see references to parse trees. What is a parse tree?

A parse-tree is an internal structure, created by the compiler or interpreter while parsing some language construction. Parsing is also known as 'syntax analysis'.

An example will illustrate a parse tree. It is a slightly adapted version of the example on page 6 of the famous 'Dragon Book': Compilers: Principles, Techniques, and Tools, by Alfred V. Aho, Ravi Sethi and Jeffrey D. Ullman, published by Addison-Wesley (my copy is from 1986). Rather than dealing with the complexities of a SQL statement, let's take a rather simple language construction, the assignment of the result of an expression to a variable:

result := salary + bonus * 1.10

When the compiler analyzes this statement, the resulting parse tree will look like this:

                     assignment statement
                   /          |           \
          identifier          :=           expression
              |                          /     |      \
            result             expression      +       expression
                                   |                 /     |     \
                               identifier  expression      *      expression
                                   |           |                      |
                                 salary    identifier               number
                                               |                      |
                                             bonus                   1.10

The picture is an upside-down representation of a tree: the root is at the top. The language elements in this small, simple assignment are identifiers (result, salary, bonus), operators (:=, +, *), and a number (1.10). An 'identifier' is the language element that names a variable, function or procedure. An 'operator' is the language element that represents some action to be taken upon the operands on either side of the operator. A 'number' is a constant, 1.10 in this statement. The syntax rules (the 'grammar') specify which 'sentences' are valid.
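
The split into identifiers, operators and numbers can be sketched in a few lines of Python. This is only an illustration and not how any real compiler, or Oracle, scans its input. The token names, the regular expressions and the `tokenize` function are assumptions made for this toy assignment language.

```python
import re

# Token categories for the toy assignment language (names are illustrative).
TOKEN_SPEC = [
    ("NUMBER",     r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r":=|[+*]"),
    ("SKIP",       r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Split a statement into (kind, text) pairs, rejecting unknown characters."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":          # whitespace separates tokens only
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("result := salary + bonus * 1.10")
for kind, value in tokens:
    print(f"{kind:<10} {value}")
```

For the example statement this yields three identifiers, three operators and one number, in the order in which they appear in the text.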

After successfully decomposing the statement into its internal representation, the compiler or interpreter can 'walk the tree'. A compiler does so to create the executable code for the construction. An interpreter does not generate code for execution, but invokes its own built-in executing functions. Let's take the interpreter for the rest of the explanation, because the execution of the steps is easier to explain than the code generation of a compiler. For the example I assume the bonus to be 100 and the salary to be 1000. The tree-walk starts at the root of the tree, the assignment statement. The rule for the assignment tells the interpreter that the right-hand side has to be evaluated first. This evaluation is also known as 'reduction'. The right-hand side of the assignment needs to be reduced to a value, the result of the expression, before it can be assigned to the variable on the left-hand side of the statement.
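
The decomposition of a statement into a tree could look roughly like the following recursive-descent sketch. It assumes a tiny grammar that is not taken from the Dragon Book or from Oracle: an assignment is `identifier := expression`, and '*' binds tighter than '+'. The function name `parse_statement` and the nested-tuple layout are hypothetical. The single-child 'expression' nodes from the picture are left out to keep the result short.

```python
import re


def parse_statement(text):
    """Parse 'identifier := expression' into a nested-tuple parse tree."""
    # Simplified scanner: characters that match no pattern are silently skipped.
    tokens = re.findall(r"\d+(?:\.\d+)?|[A-Za-z_]\w*|:=|[+*]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of statement")
        pos += 1
        return token

    def operand():
        token = take()
        if re.fullmatch(r"\d+(?:\.\d+)?", token):
            return ("number", token)
        if re.fullmatch(r"[A-Za-z_]\w*", token):
            return ("identifier", token)
        raise SyntaxError(f"expected identifier or number, got {token!r}")

    def product():
        # '*' binds tighter than '+', so products are parsed first.
        node = operand()
        while peek() == "*":
            take()
            node = ("expression", node, "*", operand())
        return node

    def expression():
        node = product()
        while peek() == "+":
            take()
            node = ("expression", node, "+", product())
        return node

    target = operand()
    if target[0] != "identifier":
        raise SyntaxError("left-hand side must be an identifier")
    if take() != ":=":
        raise SyntaxError("expected ':='")
    tree = ("assignment", target, ":=", expression())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse_statement("result := salary + bonus * 1.10")
print(tree)
```

Each tuple starts with the kind of node, followed by its children from left to right, so the root is the assignment and the multiplication ends up nested inside the addition.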

The first node on the right-hand side of the statement contains an expression with a '+' operator. The right-hand side of the '+' operator needs to be added to the left-hand side, so the walk goes on to the next node on the right-hand side. There the interpreter detects the expression with the '*' operator. The left-hand side of this operator needs to be multiplied by the right-hand side. The interpreter goes on to the right-hand side and detects an expression that consists of a single number: 1.10. This side is fully reduced, so the result can be stored. The interpreter walks back up the tree to the '*' operator and starts evaluating its left-hand side. This is an expression that consists of a single identifier, representing a variable, 'bonus'. The memory location represented by this variable is read and its contents (100) are multiplied by the right-hand side result, 1.10. This expression has now been fully reduced to the result 110.

The interpreter walks up to the '+' operator and starts evaluating its left-hand side. There it again detects an identifier, 'salary'. Its location is read and the expression is reduced to a number, 1000. The left-hand and right-hand sides are added, resulting in 1,110. Now the expression on the right-hand side of the assignment is fully reduced, and the interpreter walks up the tree and finds the assignment operator ':='. This instructs the interpreter to copy the result of the expression to the left-hand side. The left-hand side contains an identifier, 'result'. The memory location represented by 'result' is filled with the result of the expression, 1,110.
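
The walk described above can be mimicked with a small recursive function. This is a minimal sketch, not a real interpreter: the tree is written out by hand as nested tuples, the names `reduce_node` and `execute` are hypothetical, a plain dictionary stands in for the memory locations, and `Decimal` is used only to avoid binary floating-point rounding in 100 * 1.10.

```python
from decimal import Decimal

# Hand-built tree for: result := salary + bonus * 1.10
TREE = ("assignment",
        ("identifier", "result"),
        ":=",
        ("expression",
         ("identifier", "salary"),
         "+",
         ("expression",
          ("identifier", "bonus"),
          "*",
          ("number", "1.10"))))


def reduce_node(node, variables, trace):
    """Reduce an expression subtree to a value, right-hand side first."""
    kind = node[0]
    if kind == "number":
        value = Decimal(node[1])
    elif kind == "identifier":
        value = variables[node[1]]          # read the variable's "memory location"
    elif kind == "expression":
        _, left, operator, right = node
        right_value = reduce_node(right, variables, trace)   # right side first
        left_value = reduce_node(left, variables, trace)
        if operator == "+":
            value = left_value + right_value
        elif operator == "*":
            value = left_value * right_value
        else:
            raise ValueError(f"unknown operator {operator!r}")
    else:
        raise ValueError(f"unknown node kind {kind!r}")
    trace.append((kind, value))             # record each completed reduction
    return value


def execute(tree, variables):
    """Walk an assignment tree and store the reduced value in the target."""
    _, target, _, expression = tree
    trace = []
    variables[target[1]] = reduce_node(expression, variables, trace)
    return trace


variables = {"salary": Decimal("1000"), "bonus": Decimal("100")}
trace = execute(TREE, variables)
for kind, value in trace:
    print(kind, value)
print("result =", variables["result"])
```

The trace lists the reductions in the order of the walkthrough: the number 1.10, the identifier bonus, the product 110, the identifier salary, and finally the sum 1,110 that is stored in 'result'.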

This is just a simplified explanation of how an interpreter or compiler uses a parse tree. A complete introduction to compiler construction is beyond the scope of this answer. However, it should be clear that creating a parse tree consumes resources. Before the language elements can be recognized they must be read character by character, type checking and possibly conversion need to be done, identifiers (tables, columns etc.) need to be identified and checked against the data dictionary, and so on. This 'hard parse' produces the parse tree, which is a far cheaper form from which to execute a statement than repeating all this analysis over and over again. Therefore, storing the parse tree in the SQL area for future use can save quite some time when processing SQL statements that have been encountered before.
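
The idea of keeping the result of a hard parse for later reuse can be illustrated with a dictionary keyed by the statement text. This sketch is an analogy only and says nothing about how Oracle actually organizes its SQL area. The names `parse_cache`, `hard_parse` and `get_parsed` are hypothetical, and the 'parse' is a trivial stand-in for the real analysis.

```python
import re

parse_cache = {}        # statement text -> previously built parse result
hard_parses = 0         # how many times the expensive path was taken


def hard_parse(statement):
    """Stand-in for the expensive analysis; here it only splits into tokens."""
    global hard_parses
    hard_parses += 1
    return tuple(re.findall(r"\d+(?:\.\d+)?|[A-Za-z_]\w*|:=|[+*]", statement))


def get_parsed(statement):
    """Return the cached result for this exact text, parsing only on a miss."""
    if statement not in parse_cache:
        parse_cache[statement] = hard_parse(statement)
    return parse_cache[statement]


# The same statement three times, then one different statement.
for _ in range(3):
    get_parsed("result := salary + bonus * 1.10")
get_parsed("result := salary * 1.10")

print("hard parses:", hard_parses)
```

Four requests lead to only two hard parses here, because the repeated statement is found in the cache. In this sketch the key is the exact statement text, so a statement that differs by even one character counts as new.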
