
Relation Extraction with Stanford NLP (Java)


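Relation extraction usually represents its output as triples of the form (entity, relation, entity), which makes it one of the key steps in building a knowledge graph. Many approaches exist today, including supervised, distantly supervised, and neural-network-based methods. This post focuses on implementing entity relation extraction in an engineering setting, mainly with the Stanford NLP library (see https://nlp.stanford.edu/software/relationExtractor.html). For the details of the relation extraction methods themselves, see the course slides: https://web.stanford.edu/class/cs224u/materials/cs224u-2016-relation-extraction.pdf.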
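To make the triple format concrete, here is a minimal sketch of how an (entity, relation, entity) triple could be held in Java. The names `TripleDemo`, `RelationType`, and `RelationTriple` are hypothetical and are not part of the Stanford NLP API. The example triple is written by hand for illustration and is not output from the extractor.

```java
import java.util.Objects;

class TripleDemo {
    /** Hypothetical enum mirroring the relation labels listed for the Stanford extractor. */
    enum RelationType { LIVE_IN, LOCATED_IN, ORG_BASED_IN, WORK_FOR, NONE }

    /** Hypothetical immutable holder for one (entity, relation, entity) triple. */
    static final class RelationTriple {
        final String subject;
        final RelationType relation;
        final String object;

        RelationTriple(String subject, RelationType relation, String object) {
            this.subject = Objects.requireNonNull(subject, "subject");
            this.relation = Objects.requireNonNull(relation, "relation");
            this.object = Objects.requireNonNull(object, "object");
        }

        @Override
        public String toString() {
            return "(" + subject + ", " + relation + ", " + object + ")";
        }
    }

    public static void main(String[] args) {
        // Hand-written illustration, not extractor output.
        RelationTriple t = new RelationTriple("Tim Cook", RelationType.WORK_FOR, "Apple");
        System.out.println(t); // prints (Tim Cook, WORK_FOR, Apple)
    }
}
```

A collection of such triples is what a knowledge graph builder would consume as edges between entity nodes.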

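Stanford NLP currently supports the following relations: Live_In, Located_In, OrgBased_In, Work_For, and None (the no-relation class, shown as _NR in the table). The reported precision, recall, and F1 for each are as follows: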

Label                           Correct Predict Actual  Precn   Recall  F       Roth/Yih F1
Live_In                         239.0   302.0   521.0   79.1    45.9    58.1    51.6
Located_In                      179.0   212.0   406.0   84.4    44.1    57.9    56.2
OrgBased_In                     169.0   252.0   452.0   67.1    37.4    48.0    51.7
Work_For                        185.0   247.0   401.0   74.9    46.1    57.1    52.0
_NR                             36176.0 37163.0 36396.0 97.3    99.4    98.4
Total                           772.0   1013.0  1780.0  76.2    43.4    55.3
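The Precn, Recall, and F columns follow from the three count columns: precision is Correct / Predict, recall is Correct / Actual, and F is their harmonic mean. The sketch below recomputes the Live_In row as a check. The class and method names are hypothetical helpers, not Stanford NLP code.

```java
import java.util.Locale;

class RelationMetrics {
    /** Precision as a percentage: correct predictions over all predictions. */
    static double precision(double correct, double predicted) {
        return predicted == 0 ? 0.0 : 100.0 * correct / predicted;
    }

    /** Recall as a percentage: correct predictions over all gold (actual) instances. */
    static double recall(double correct, double actual) {
        return actual == 0 ? 0.0 : 100.0 * correct / actual;
    }

    /** F1: harmonic mean of precision and recall (same scale as the inputs). */
    static double f1(double p, double r) {
        return (p + r) == 0 ? 0.0 : 2.0 * p * r / (p + r);
    }

    public static void main(String[] args) {
        // Live_In row from the table: Correct=239, Predict=302, Actual=521.
        double p = precision(239, 302);
        double r = recall(239, 521);
        // prints P=79.1 R=45.9 F1=58.1
        System.out.printf(Locale.ROOT, "P=%.1f R=%.1f F1=%.1f%n", p, r, f1(p, r));
    }
}
```

The same arithmetic reproduces the Total row (772, 1013, 1780), which aggregates only the four relation classes and leaves out _NR.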



(1) Tim Cook is the CEO of Apple, he replaced Steve Jobs, who died in 2011.


(2) Obama was born in Hawaii. He is our president.

(3) Xi Jinping delivers a report to the 19th National Congress of the Communist Party of China (CPC) at the Great Hall of the People in Beijing


(4) The aircraft, a Hainan Airlines flight with 22 Chinese passengers onboard, arrived at an Antarctic airport after a more than 20-hour journey starting from Hong Kong.