Adding data to ES on Windows (Part 2)
Importing data from MySQL into Elasticsearch on Windows (Part 2)
For environment setup, see my previous post:
https://blog.csdn.net/qq_24265945/article/details/81168158
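With the environment ready, let's move on to the hands-on part.
Download: mysql-connector-java-5.1.46.zip
This archive contains the JDBC driver that lets other platforms connect to MySQL. I noticed that many download resources require points and 0 points is not an option, so I set it to 1 point.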
https://dev.mysql.com/downloads/file/?id=476197
Create jdbc.conf in the bin directory
Adjust the connection string, user name, password, database name, port, and so on to match your setup.
input {
  stdin { }
  jdbc {
    # MySQL JDBC connection string to our backup database
    jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/zhangjiang?characterEncoding=UTF-8&useSSL=false"
    # the user we wish to execute our statement as
    jdbc_user => "root"
    jdbc_password => "111111"
    # the path to our downloaded JDBC driver
    jdbc_driver_library => "D:/logstash-6.3.1/mysql-connector-java-5.1.46/mysql-connector-java-5.1.46/mysql-connector-java-5.1.46.jar"
    # the name of the driver class for MySQL
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
    jdbc_default_timezone => "UTC"
    statement_filepath => "D:/logstash-6.3.1/logstash-6.3.1/bin/jdbc.sql"
    schedule => "* * * * *"
    type => "patent"
  }
}
filter {
  json {
    source => "message"
    remove_field => ["message"]
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "patent"
    document_id => "%{id}"
  }
  stdout {
    codec => json_lines
  }
}
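Since the connection string is the part most people have to change, here is a minimal Python sketch that assembles it from its pieces. The helper name jdbc_mysql_url is hypothetical and exists only for illustration; the host, port, database name and parameters are the ones used in the config above.

```python
from urllib.parse import urlencode


def jdbc_mysql_url(host, port, database, **params):
    """Assemble a MySQL JDBC URL (hypothetical helper, for illustration only)."""
    url = f"jdbc:mysql://{host}:{port}/{database}"
    if params:
        # Keyword arguments become the query string, in the order given.
        url += "?" + urlencode(params)
    return url


if __name__ == "__main__":
    print(jdbc_mysql_url("127.0.0.1", 3306, "zhangjiang",
                         characterEncoding="UTF-8", useSSL="false"))
```

Paste the printed value into jdbc_connection_string after changing the host, port or database name.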
Create jdbc.sql (adapt this to your own needs)
select
  Patent_ID as id,
  Patent_Num as pnum,
  Patent_Name as pname,
  Patent_Link as link,
  Patent_Applicant as applicant,
  Patent_Summary as summary,
  Patent_Date as pdate,
  Patent_ClassNum as classnum,
  Patent_Update as pupdate,
  Patent_Status as pstatus
from patentinfor
In the bin directory, run logstash -f jdbc.conf
The import then runs at full speed~~~
Here are a few errors I ran into, recorded for reference:
1.
D:\logstash-6.3.1\logstash-6.3.1\bin>logstash -f jdbc.conf
Sending Logstash's logs to D:/logstash-6.3.1/logstash-6.3.1/logs which is now configured via log4j2.properties
[2018-07-23T09:40:23,744][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
2018-07-23 09:40:23,822 LogStash::Runner ERROR Unable to move file D:\logstash-6.3.1\logstash-6.3.1\logs\logstash-plain.log to D:\logstash-6.3.1\logstash-6.3.1\logs\logstash-plain-2018-07-20-1.log: java.nio.file.FileSystemException D:\logstash-6.3.1\logstash-6.3.1\logs\logstash-plain.log -> D:\logstash-6.3.1\logstash-6.3.1\logs\logstash-plain-2018-07-20-1.log: The process cannot access the file because it is being used by another process.
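The log file logstash-plain.log is held open by another process, so Logstash cannot rotate it. Check whether another cmd window is still running logstash, and close it.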
2.
D:\logstash-6.3.1\logstash-6.3.1\bin>logstash -f jdbc.conf
Sending Logstash's logs to D:/logstash-6.3.1/logstash-6.3.1/logs which is now configured via log4j2.properties
[2018-07-23T09:44:41,964][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2018-07-23T09:44:42,995][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.3.1"}
[2018-07-23T09:44:46,715][ERROR][logstash.outputs.elasticsearch] Unknown setting 'host' for elasticsearch
[2018-07-23T09:44:46,715][ERROR][logstash.outputs.elasticsearch] Unknown setting 'port' for elasticsearch
[2018-07-23T09:44:46,715][ERROR][logstash.outputs.elasticsearch] Unknown setting 'protocol' for elasticsearch
[2018-07-23T09:44:46,715][ERROR][logstash.outputs.elasticsearch] Unknown setting 'cluster' for elasticsearch
[2018-07-23T09:44:46,746][ERROR][logstash.agent ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Something is wrong with your configuration.", :backtrace=>["D:/logstash-6.3.1/logstash-6.3.1/logstash-core/lib/logstash/config/mixin.rb:89:in `config_init'", "D:/logstash-6.3.1/logstash-6.3.1/logstash-core/lib/logstash/outputs/base.rb:62:in `initialize'", "org/logstash/config/ir/compiler/OutputStrategyExt.java:202:in `initialize'", "org/logstash/config/ir/compiler/OutputDelegatorExt.java:68:in `initialize'", "D:/logstash-6.3.1/logstash-6.3.1/logstash-core/lib/logstash/plugins/plugin_factory.rb:93:in `plugin'", "D:/logstash-6.3.1/logstash-6.3.1/logstash-core/lib/logstash/pipeline.rb:110:in `plugin'", "(eval):39:in `<eval>'", "org/jruby/RubyKernel.java:994:in `eval'", "D:/logstash-6.3.1/logstash-6.3.1/logstash-core/lib/logstash/pipeline.rb:82:in `initialize'", "D:/logstash-6.3.1/logstash-6.3.1/logstash-core/lib/logstash/pipeline.rb:167:in `initialize'", "D:/logstash-6.3.1/logstash-6.3.1/logstash-core/lib/logstash/pipeline_action/create.rb:40:in `execute'", "D:/logstash-6.3.1/logstash-6.3.1/logstash-core/lib/logstash/agent.rb:305:in `block in converge_state'"]}
[2018-07-23T09:44:47,215][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
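I had copied this config from another web page, and the output section originally also contained port, protocol and similar settings. Just delete them, and note that the setting is hosts, not host. Also check your SQL file: an error there makes the import fail as well, and the symptom is that you can type content into ES manually from cmd, but nothing from MySQL is written to ES. The corrected output section: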
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "patent"
    document_id => "%{id}"
  }
  stdout {
    codec => json_lines
  }
}
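The four settings named in the error log (host, port, protocol, cluster) are easy to miss in a config copied from an older tutorial. As a rough sketch, the hypothetical Python helper below scans a config for them inside elasticsearch blocks. It is not a real Logstash parser: it assumes one setting per line, strips comments naively, and does not tell an elasticsearch output apart from other plugins of the same name.

```python
import re

# Settings that the Logstash 6.3.1 log reported as unknown for the
# elasticsearch output.
REJECTED = {"host", "port", "protocol", "cluster"}


def find_rejected_settings(config_text):
    """Return (line_number, setting) pairs found inside elasticsearch { } blocks."""
    hits = []
    depth = 0        # current brace nesting depth
    es_depth = None  # depth at which the current elasticsearch block opened
    for lineno, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.split("#", 1)[0]  # naive: also cuts at '#' inside strings
        if es_depth is None and re.search(r"\belasticsearch\s*\{", line):
            es_depth = depth  # settings on this same line are not inspected
        elif es_depth is not None:
            m = re.match(r"\s*(\w+)\s*=>", line)
            if m and m.group(1) in REJECTED:
                hits.append((lineno, m.group(1)))
        depth += line.count("{") - line.count("}")
        if es_depth is not None and depth <= es_depth:
            es_depth = None  # the elasticsearch block has closed
    return hits


if __name__ == "__main__":
    sample = "\n".join([
        'output {',
        '  elasticsearch {',
        '    host => "localhost"',
        '    port => 9200',
        '    hosts => "localhost:9200"',
        '    document_id => "%{id}"',
        '  }',
        '  stdout { codec => json_lines }',
        '}',
    ])
    for lineno, name in find_rejected_settings(sample):
        print(f"line {lineno}: remove or replace '{name}'")
```

To check a real file, read jdbc.conf into a string and pass it to find_rejected_settings.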
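Finally, here is a screenshot from the visualization tool Kibana after a successful run:
A follow-up one week later: after adding a Chinese word-segmentation tool, I discovered a problem. The mapping had been created automatically by Elasticsearch with its defaults, and the mapping of an existing field cannot be changed once it has been created. I therefore recommend creating the mapping, with the analyzer configured, before inserting any data. For segmentation I use the IK Analysis plugin for Elasticsearch; I plan to try other analyzers later and compare the results. My follow-up on installing and using IK Analysis is here: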
https://blog.csdn.net/qq_24265945/article/details/81355504
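To make "create the mapping first" concrete, here is a minimal Python sketch that builds an index body for the patent index and, optionally, sends it to Elasticsearch. Several things in it are my assumptions rather than part of the original setup: the type name "doc" (which I believe is what the Logstash 6.x elasticsearch output uses by default against a 6.x cluster), the choice to analyze only pname and summary, and the pairing of ik_max_word for indexing with ik_smart for searching. The functions build_patent_mapping and create_index are hypothetical helpers. The IK plugin must already be installed and the index must not exist yet.

```python
import json
import urllib.request


def build_patent_mapping(doc_type="doc"):
    """Index body with IK analyzers on the free-text fields (assumed layout)."""
    text_ik = {
        "type": "text",
        "analyzer": "ik_max_word",     # fine-grained segmentation at index time
        "search_analyzer": "ik_smart",  # coarser segmentation at query time
    }
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    # Field names follow the aliases in jdbc.sql; the other
                    # fields are left to Elasticsearch's dynamic mapping.
                    "pname": dict(text_ik),
                    "summary": dict(text_ik),
                }
            }
        }
    }


def create_index(base_url="http://localhost:9200", index="patent"):
    """PUT the body to create the index; fails if the index already exists."""
    body = json.dumps(build_patent_mapping()).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/{index}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the body; call create_index() yourself once ES is running.
    print(json.dumps(build_patent_mapping(), indent=2))
```

If Logstash has already created the patent index with the default mapping, the index would have to be deleted and the data imported again for the new analyzers to take effect.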