背景:需要實時讀取log日誌檔案中的記錄到kafka
1. The ZooKeeper service must be running. Check its status (ZooKeeper installation and startup are covered in a separate post):
[root@master kafka_2.11-0.11]# /opt/soft/zookeeper-3.4.13/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/soft/zookeeper-3.4.13/bin/../conf/zoo.cfg
Mode: follower
2. The Kafka service must be started:
/opt/soft/kafka_2.11-0.11/bin/kafka-server-start.sh /opt/soft/kafka_2.11-0.11/config/server.properties
3. Write the Flume configuration file and start the agent.
Start Flume: bin/flume-ng agent --conf conf --conf-file ./conf/job/file_to_hdfs.conf --name a1
4. Check whether the data has been synced into the corresponding Kafka topic.

The Flume configuration file:
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -f /opt/data/mall/16/mall.log # the file to monitor
# describe the sink
#a1.sinks.k1.type = logger
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = test
a1.sinks.k1.brokerList = master:9092
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.batchSize = 20
# use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
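The memory channel and the sink's batchSize together govern how events flow to Kafka: the channel buffers up to `capacity` events from the source, and the sink drains up to `batchSize` of them per delivery. A minimal Python sketch of that buffering behavior (a toy model for illustration only, not Flume's actual implementation):

```python
from collections import deque

class MemoryChannel:
    """Toy analogue of Flume's memory channel: a bounded buffer between source and sink."""
    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.events = deque()

    def put(self, event):
        # Flume raises a ChannelException when the channel is full
        if len(self.events) >= self.capacity:
            raise RuntimeError("channel full")
        self.events.append(event)

    def take(self, batch_size=20):
        # the sink drains at most batch_size events per delivery attempt
        batch = []
        while self.events and len(batch) < batch_size:
            batch.append(self.events.popleft())
        return batch

# source side: each new log line tailed from mall.log becomes an event
channel = MemoryChannel(capacity=1000)
for line in ["log line 1", "log line 2", "log line 3"]:
    channel.put(line)

# sink side: drain up to batchSize events (these would be sent to the Kafka topic)
batch = channel.take(batch_size=20)
```

Note that in Flume the sink's batch size must not exceed the channel's transactionCapacity, which is why batchSize = 20 sits below transactionCapacity = 100 in the config above.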