NALMO: A Natural Language Interface for Moving Objects Databases (Paper Study Notes)

Research Background

  • Natural language queries are still not supported in moving objects databases (MODs). Because most users are unfamiliar with structured query languages, it is important to bridge the gap between natural language and the commands of the underlying MOD system.

Research Method

  • We design a natural language interface for moving objects, named NALMO. NALMO translates moving objects queries into structured (executable) languages.
  • We use semantic parsing in combination with a location knowledge base and domain-specific rules to interpret natural language queries.
  • We design a corpus of moving objects queries for model training; the trained model is later used to determine the query type. Four kinds of queries are supported: time interval queries, range queries, nearest neighbor queries, and trajectory similarity queries.
  • Entities extracted during parsing are mapped through deterministic rules to compose the query.
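
The last step above, mapping extracted entities through deterministic rules, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Entities` fields, the `compose` function, the query-type keys, and in particular the SQL-like output syntax with its `PRESENT`, `PASSES`, `DISTANCE`, and `SIMILARITY` operators are hypothetical and are not the executable language that NALMO actually targets.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Entities:
    """Entities extracted from a natural language query (hypothetical fields)."""
    relation: str                    # name of the moving objects relation
    start: Optional[str] = None      # start of the time interval
    end: Optional[str] = None        # end of the time interval
    location: Optional[str] = None   # matched location name
    k: Optional[int] = None          # number of results wanted
    reference: Optional[str] = None  # id of a reference trajectory


# One template per query type. The syntax is invented for this sketch only.
TEMPLATES = {
    "time_interval": "SELECT * FROM {relation} WHERE PRESENT(trip, '{start}', '{end}')",
    "range": ("SELECT * FROM {relation} WHERE PASSES(trip, '{location}') "
              "AND PRESENT(trip, '{start}', '{end}')"),
    "nearest_neighbor": ("SELECT * FROM {relation} "
                         "ORDER BY DISTANCE(trip, '{location}') LIMIT {k}"),
    "trajectory_similarity": ("SELECT * FROM {relation} "
                              "ORDER BY SIMILARITY(trip, '{reference}') LIMIT {k}"),
}

# Entities each query type needs before its template can be filled (assumed).
REQUIRED = {
    "time_interval": ("start", "end"),
    "range": ("location", "start", "end"),
    "nearest_neighbor": ("location", "k"),
    "trajectory_similarity": ("reference", "k"),
}


def compose(query_type: str, entities: Entities) -> str:
    """Fill the template for query_type with the extracted entity values."""
    if query_type not in TEMPLATES:
        raise ValueError(f"unsupported query type: {query_type}")
    values = asdict(entities)
    missing = [name for name in REQUIRED[query_type] if values[name] is None]
    if missing:
        raise ValueError(f"missing entities for {query_type}: {missing}")
    return TEMPLATES[query_type].format(**values)


query = compose(
    "nearest_neighbor",
    Entities(relation="Taxis", location="Central Park", k=5),
)
print(query)
```

Because the rules are deterministic, the same query type and entities always produce the same structured query, and a missing entity is reported as an error rather than guessed.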

System Architecture

  • Natural Language Understanding
    • We retain or delete punctuation as appropriate, and use spaCy to perform tokenization and entity recognition.
    • Because spaCy recognizes entities only approximately, the key information for moving objects queries is not accurate enough, and the results are insufficient for constructing the structured language. To improve translation quality, we propose an algorithm that further parses the spatio-temporal data.
    • To determine the location, we generate a location knowledge base by extracting the objects whose attribute is a point or a region.
    • To improve query efficiency, we construct a location-based prefix index for matching against the location knowledge base. Many locations share the same prefix, so the index makes the extraction of location information more efficient.
  • Query Translation
    • The translation consists of two steps: (i) determining the query type and (ii) constructing the structured language.
    • Because there are different types of queries, a corpus is built to determine the query type. An LSTM neural network is trained on the corpus to produce a model that identifies the query type.
    • Extracted entities are mapped to the data relation and the corresponding values according to the query type. At the same time, operators are selected to form the query structure.
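
The location-based prefix index described in the list above can be sketched as a character trie with longest-match lookup. This is a minimal illustration under stated assumptions, not the paper's implementation: the class name `LocationPrefixIndex`, its methods, and the sample locations are all hypothetical, and the paper's notes do not specify the actual index structure.

```python
from typing import Dict, List, Optional, Tuple

Entry = Tuple[str, str]  # (location name, attribute kind: "point" or "region")


class _Node:
    """One trie node; entry is set when a complete location name ends here."""
    __slots__ = ("children", "entry")

    def __init__(self) -> None:
        self.children: Dict[str, "_Node"] = {}
        self.entry: Optional[Entry] = None


class LocationPrefixIndex:
    """Hypothetical prefix index over a location knowledge base."""

    def __init__(self) -> None:
        self._root = _Node()

    def add(self, name: str, kind: str) -> None:
        # Names sharing a prefix share the same path from the root.
        node = self._root
        for ch in name.lower():
            node = node.children.setdefault(ch, _Node())
        node.entry = (name, kind)

    def _longest_match(self, lowered: str, start: int) -> Optional[Tuple[Entry, int]]:
        """Longest indexed name starting at lowered[start], with its end offset."""
        node = self._root
        best = None
        for i in range(start, len(lowered)):
            node = node.children.get(lowered[i])
            if node is None:
                break
            # Accept a name only if it ends at a word boundary.
            at_boundary = i + 1 == len(lowered) or not lowered[i + 1].isalnum()
            if node.entry is not None and at_boundary:
                best = (node.entry, i + 1)
        return best

    def extract(self, text: str) -> List[Entry]:
        """Return all indexed locations found in text, scanning left to right."""
        lowered = text.lower()
        found: List[Entry] = []
        i = 0
        while i < len(lowered):
            starts_word = i == 0 or not lowered[i - 1].isalnum()
            hit = self._longest_match(lowered, i) if starts_word else None
            if hit is not None:
                entry, end = hit
                found.append(entry)
                i = end  # skip past the matched name
            else:
                i += 1
        return found


index = LocationPrefixIndex()
index.add("Central Park", "region")
index.add("Central Park Zoo", "point")
index.add("Central Station", "point")

matches = index.extract(
    "Which taxis passed Central Park Zoo and Central Station yesterday?"
)
print(matches)
```

The three sample names share the prefix "Central ", so they share one path in the trie, and a single left-to-right walk decides between them. Preferring the longest match is a design choice of this sketch: it keeps "Central Park Zoo" from being reported as "Central Park".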

Experimental Evaluation

  • We evaluate our approach using 240 natural language queries extracted from popular conference and journal papers in the domain of moving objects.
  • Experimental results show that:
    • NALMO achieves an accuracy of 98.1% and a precision of 88.1%.
    • The average time cost of translating a query is 1.47 s.