
Synchronizing unstructured files with Flume and lftp

Synchronizing unstructured files to the local file system

lftptest.sh

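#!/bin/bash

# Mirror the remote directory /tmp/lftptest/in into the local directory
# /tmp/lftptest/out over SFTP.
# --delete removes local files that no longer exist on the remote side;
# --only-newer downloads only files that are newer than the local copy.
lftp sftp://192.168.1.102 << EOF
set net:timeout 5;
set net:max-retries 5;
set net:reconnect-interval-base 5;
set net:reconnect-interval-multiplier 1;
mirror --delete --only-newer --verbose /tmp/lftptest/in /tmp/lftptest/out
exit
EOF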
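
If more than one directory pair has to be mirrored, the lftp commands above could be generated by a small helper instead of being hard-coded. The following is only a sketch: `print_mirror_commands` is a hypothetical function name, and the settings are copied unchanged from the script above.

```shell
#!/bin/bash

# print_mirror_commands REMOTE_DIR LOCAL_DIR
# Hypothetical helper: prints the lftp commands for one mirror job to stdout.
# The paths are wrapped in double quotes so that lftp treats each one as a
# single argument even if it contains spaces.
print_mirror_commands() {
    local remote_dir=$1
    local local_dir=$2
    cat << EOF
set net:timeout 5
set net:max-retries 5
set net:reconnect-interval-base 5
set net:reconnect-interval-multiplier 1
mirror --delete --only-newer --verbose "$remote_dir" "$local_dir"
exit
EOF
}
```

The output can then be piped into lftp, which reads its commands from standard input just as it does with the here-document, for example: print_mirror_commands /tmp/lftptest/in /tmp/lftptest/out | lftp sftp://192.168.1.102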

vi /etc/crontab

* * * * * root bash /tmp/lftptest/lftptest.sh >> /tmp/lftptest/lftptest.log 2>&1
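
Because the crontab entry fires every minute, a mirror run that takes longer than a minute could overlap with the next one. One possible guard, sketched here under the assumption that the `flock` utility from util-linux is installed, is to wrap the job in a non-blocking lock. The function name `run_locked` and the lock file path in the usage note are hypothetical and not part of the original setup.

```shell
#!/bin/bash

# run_locked LOCK_FILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND only if no other process currently holds
# an exclusive lock on LOCK_FILE. Returns 1 without running COMMAND when the
# lock is busy; otherwise returns the exit status of COMMAND.
run_locked() {
    local lock_file=$1
    shift
    (
        # File descriptor 9 is opened on the lock file by the redirection
        # at the end of this subshell. -n makes flock fail immediately
        # instead of waiting for the lock.
        flock -n 9 || { echo "previous run still active, skipping" >&2; exit 1; }
        "$@"
    ) 9>"$lock_file"
}
```

A wrapper script called from cron could then use, for example: run_locked /tmp/lftptest/lftptest.lock bash /tmp/lftptest/lftptest.sh. The lock is released automatically when the subshell exits, so a crashed run does not leave a stale lock behind.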

Synchronizing unstructured files to HDFS

test.conf

a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.channels = c1
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /tmp/flumetest/in
a1.sources.r1.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder
a1.sources.r1.deserializer.maxBlobLength = 100000000
a1.sources.r1.basenameHeader = true
a1.sources.r1.basenameHeaderKey = fileName
a1.sources.r1.pollDelay = 1000

a1.channels.c1.type = memory

a1.sinks.k1.channel = c1
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://linux01:9000/test/flumetest/out/
a1.sinks.k1.hdfs.filePrefix = %{fileName}
a1.sinks.k1.hdfs.fileType = DataStream

/opt/app/apache-flume-1.9.0-bin/bin/flume-ng agent -n a1 -c conf -f /tmp/flumetest/conf/test.conf -Dflume.root.logger=DEBUG,console
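
The spooling directory source expects each file to be complete and unchanged once it appears in spoolDir, and it expects file names not to be reused. A file that is copied into /tmp/flumetest/in slowly can therefore be picked up half-written. The sketch below shows one way to avoid that: copy to a temporary file next to the spool directory and then rename it into place. `stage_file` is a hypothetical helper, and the sketch assumes that the parent of the spool directory is on the same file system (so that `mv` is a rename) and that the source's default `.COMPLETED` suffix is in use.

```shell
#!/bin/bash

# stage_file SRC SPOOL_DIR
# Hypothetical helper: places SRC into SPOOL_DIR in a single rename so that
# the Flume spooldir source never sees a partially written file.
# Returns 1 if a file with the same name is pending or was already processed.
stage_file() {
    local src=$1
    local spool_dir=$2
    local name tmp
    name=$(basename "$src")

    # Flume expects unique file names, so skip anything it has seen before.
    if [ -e "$spool_dir/$name" ] || [ -e "$spool_dir/$name.COMPLETED" ]; then
        echo "skip: $name is already in the spool directory" >&2
        return 1
    fi

    # Copy outside the spool directory first (assumed same file system).
    tmp=$(mktemp "$(dirname "$spool_dir")/.stage.XXXXXX") || return 1

    # -p keeps the mode and timestamps of the source file.
    if cp -p -- "$src" "$tmp" && mv -- "$tmp" "$spool_dir/$name"; then
        return 0
    fi
    rm -f -- "$tmp"
    return 1
}
```

For example, stage_file /tmp/lftptest/out/report.pdf /tmp/flumetest/in would hand one mirrored file to the agent; the file name report.pdf is made up for illustration.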

Source directory

Target directory