Understanding Natural Language Queries over Relational Databases: Paper Study Notes

Research Background

  • Natural language interfaces to databases (NLIDBs) have many advantages over other widely accepted query interfaces (keyword-based search, form-based interfaces, and visual query builders).
  • Despite these advantages, NLIDBs have not been widely adopted. The fundamental problem is that understanding natural language is hard.

Query Mechanism

The query mechanism of the system NaLIR (Natural Language Interface to Relational databases) lets the system and the user collaborate in processing natural language queries. First, the system explains how it interprets a query, from each ambiguous word or phrase up to the meaning of the whole sentence. Second, for each ambiguous part, it offers multiple likely interpretations for the user to choose from.
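As a rough illustration of the second point, the sketch below ranks candidate interpretations for an ambiguous word and lets a user override the default. It is a minimal sketch, not the paper's method: the schema element names, the function names, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical schema element names, invented for this example.
SCHEMA_ELEMENTS = ["author", "paper", "journal", "conference", "citation", "publication"]


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a stand-in for whatever measure a real system uses."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_interpretations(phrase: str, elements=SCHEMA_ELEMENTS, k: int = 3) -> list:
    """Return the k schema elements most similar to the phrase, best first."""
    ranked = sorted(elements, key=lambda e: similarity(phrase, e), reverse=True)
    return ranked[:k]


def resolve(phrase: str, user_choice: str = None) -> str:
    """Use the user's pick if one is given, otherwise fall back to the top candidate."""
    candidates = candidate_interpretations(phrase)
    if user_choice is None:
        return candidates[0]
    if user_choice not in candidates:
        raise ValueError(f"{user_choice!r} is not among the offered interpretations")
    return user_choice


if __name__ == "__main__":
    print(candidate_interpretations("papers"))
    print(resolve("papers"))
```

The point of the design is that the system never silently commits to one reading: the ranked list is what gets shown to the user, and the top entry is only a default.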

System Architecture

The entire system consists of three main parts: the query interpretation part, the interactive communicator, and the query tree translator. The query interpretation part, which includes the parse tree node mapper and the structure adjustor, is responsible for interpreting the natural language query and representing the interpretation as a query tree. The interactive communicator is responsible for communicating with the user to ensure that the interpretation process is correct. The query tree, possibly verified by the user, is translated into a SQL statement by the query tree translator and then evaluated against an RDBMS.
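The three parts can be pictured as a small pipeline. The sketch below is only a toy under stated assumptions: `QueryTree`, `interpret`, `communicate`, `translate`, `answer`, and the `paper` table are hypothetical names invented here, the interpreter recognises a single hard-coded sentence pattern, and SQLite stands in for the RDBMS.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class QueryTree:
    """Toy stand-in for a query tree: one projected column, one table, one optional filter."""
    select: str
    table: str
    where_column: Optional[str] = None
    where_value: Optional[str] = None


def interpret(question: str) -> QueryTree:
    """Query interpretation part. A real system parses the sentence; this toy matches one pattern."""
    words = question.lower().rstrip("?.").split()
    if "papers" in words and "by" in words:
        author = " ".join(words[words.index("by") + 1:])
        return QueryTree("title", "paper", "author", author)
    return QueryTree("title", "paper")


def communicate(tree: QueryTree, confirm: Callable[[QueryTree], QueryTree]) -> QueryTree:
    """Interactive communicator: the user may accept the tree or return a corrected one."""
    return confirm(tree)


def translate(tree: QueryTree) -> Tuple[str, tuple]:
    """Query tree translator. Identifiers come from the toy interpreter, values are bound as parameters."""
    sql = f"SELECT {tree.select} FROM {tree.table}"
    if tree.where_column is None:
        return sql, ()
    return sql + f" WHERE lower({tree.where_column}) = ?", (tree.where_value,)


def answer(question: str, conn: sqlite3.Connection, confirm=lambda tree: tree) -> list:
    """Run the full pipeline and evaluate the resulting SQL against the database."""
    sql, params = translate(communicate(interpret(question), confirm))
    return [row[0] for row in conn.execute(sql, params)]


def demo_connection() -> sqlite3.Connection:
    """In-memory database with made-up rows, used only for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE paper (title TEXT, author TEXT)")
    conn.executemany(
        "INSERT INTO paper VALUES (?, ?)",
        [("Query Trees", "Ada Smith"), ("Parse Trees", "Bob Jones")],
    )
    return conn


if __name__ == "__main__":
    print(answer("Return papers by Ada Smith", demo_connection()))
```

Passing a different `confirm` callback models the user correcting the interpretation before any SQL is generated, which is where the communicator sits in the described architecture.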

  • The structure adjustor adjusts the parse tree in two steps. In the first step, it reformulates the nodes in the parse tree so that the tree falls within the system's syntactic coverage (a valid parse tree). If the query has multiple candidate valid parse trees, the best one is chosen as the default input for the second step, and the top k are reported to the interactive communicator. In the second step, the chosen (or default) valid parse tree is analyzed semantically, and implicit nodes are inserted to make it more semantically reasonable. This process is also supervised by the user.
  • Interactive communications are organized in three steps, which verify the intermediate results of parse tree node mapping, parse tree structure reformulation, and implicit node insertion, respectively.
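The two adjustment steps can be sketched as follows. This is a hedged toy, not the paper's algorithm: `Node`, `chain`, `top_k_valid_trees`, and `insert_implicit_nodes` are hypothetical names, the scoring function is supplied by the caller, and the implicit-node rule (copy the labels the right side of a comparison is missing from the left side) is a simplification assumed for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    """A parse tree node; `implicit` marks nodes the system added itself."""
    label: str
    children: List["Node"] = field(default_factory=list)
    implicit: bool = False


def chain(node: Node) -> List[str]:
    """Labels along the first-child path, from the given node down to a leaf."""
    labels = []
    while node is not None:
        labels.append(node.label)
        node = node.children[0] if node.children else None
    return labels


def top_k_valid_trees(candidates: List[Node], score: Callable[[Node], float],
                      k: int = 3) -> Tuple[Node, List[Node]]:
    """Step 1: pick the best candidate as the default and keep the top k to show the user."""
    if not candidates:
        raise ValueError("no valid parse tree candidates")
    ranked = sorted(candidates, key=score, reverse=True)
    return ranked[0], ranked[:k]


def insert_implicit_nodes(comparison: Node) -> Node:
    """Step 2 (toy rule): wrap the shorter right side with implicit copies of the left side's labels."""
    left, right = comparison.children
    left_labels, right_labels = chain(left), chain(right)
    gap = max(0, len(left_labels) - len(right_labels))
    wrapped = right
    for label in reversed(left_labels[:gap]):
        wrapped = Node(label, [wrapped], implicit=True)
    return Node(comparison.label, [left, wrapped])


if __name__ == "__main__":
    # "authors with more papers than Bob": the right side omits "number of papers".
    left = Node("number of", [Node("papers", [Node("author")])])
    tree = Node("more than", [left, Node("Bob")])
    print(chain(insert_implicit_nodes(tree).children[1]))
```

Marking inserted nodes with `implicit=True` keeps them distinguishable from what the user actually wrote, so a communicator step can show exactly which parts were added and ask for confirmation.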

Experimental Setup

  • Two crucial aspects must be evaluated: the quality of the returned results (effectiveness) and whether the system is easy for non-technical users to use (usability).
  • The experiment was a user study in which participants were asked to complete query tasks designed for them.
  • The data set was Microsoft Academic Search (MAS), and the system was compared with the faceted interface of the MAS website.