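Flume: Writing Data to HDFS

Prerequisite for this example: Hadoop is already set up (pseudo-distributed mode is fine).

Start Hadoop: start-all.sh

1. In the /home/software/flume-1.9.0/job/ directory, create hdfs.template.conf with the following configuration: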

a3.sources = r3
a3.sinks = k3
a3.channels = c3
# describe/configure the source
a3.sources.r3.type = spooldir
a3.sources.r3.spoolDir = /home/data
a3.sources.r3.fileSuffix = .completed
a3.sources.r3.fileHeader = true
a3.sources.r3.ignorePattern = ([^ ]*\.tmp)
# describe the sink
a3.sinks.k3.type = hdfs
a3.sinks.k3.hdfs.path = hdfs://test:9000/upload/%y%m%d/%H
a3.sinks.k3.hdfs.filePrefix = upload-
a3.sinks.k3.hdfs.round = true
a3.sinks.k3.hdfs.roundValue = 1
a3.sinks.k3.hdfs.roundUnit = hour
a3.sinks.k3.hdfs.useLocalTimeStamp = true
a3.sinks.k3.hdfs.batchSize = 100
a3.sinks.k3.hdfs.fileType = DataStream
a3.sinks.k3.hdfs.rollInterval = 600
a3.sinks.k3.hdfs.rollSize = 134217700
a3.sinks.k3.hdfs.rollCount = 0
a3.sinks.k3.hdfs.minBlockReplicas = 1
# use a channel which buffers events in memory
a3.channels.c3.type = memory
a3.channels.c3.capacity = 1000
a3.channels.c3.transactionCapacity = 100
# bind the source and sink to the channel
a3.sources.r3.channels = c3
a3.sinks.k3.channel = c3

Start Flume: ./bin/flume-ng agent --conf conf/ --name a3 --conf-file job/hdfs.template.conf

From another virtual machine, create the /home/data/ directory on the Flume host and put a test file in it.
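The spooling directory source expects a file to be complete and unchanged once it shows up in the watched directory, and it does not accept a file name that was already used. One way to respect that is to write under a .tmp name, which the configuration above ignores, and then rename. A minimal sketch, assuming the /home/data spool directory from this example; the helper name stage_file and the sample file name are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper, not part of Flume.
# SPOOL_DIR defaults to the spoolDir value used in hdfs.template.conf.
SPOOL_DIR="${SPOOL_DIR:-/home/data}"

# stage_file NAME: read content from stdin, write it to NAME.tmp inside the
# spool directory (skipped by the .tmp ignore pattern), then rename it to NAME
# so the source only ever sees the finished file.
stage_file() {
    name="$1"
    mkdir -p "$SPOOL_DIR" || return 1
    cat > "$SPOOL_DIR/$name.tmp" || return 1
    mv "$SPOOL_DIR/$name.tmp" "$SPOOL_DIR/$name"
}

# Usage (not run automatically):
#   printf 'hello flume\nsecond line\n' | stage_file test1.txt
```

After Flume has read the file it should rename it with the .completed suffix set by fileSuffix, so use a fresh name for every new test file.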

Check the result:
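One way to check from the command line is to list the directory the sink writes to for the current hour. This is a sketch under the assumptions of this example: the path pattern /upload/%y%m%d/%H with local timestamps and the file prefix upload-. The function names upload_dir and show_upload are hypothetical:

```shell
#!/bin/sh
# Print the HDFS directory for the current hour, e.g. /upload/<yymmdd>/<HH>,
# mirroring the escape sequences in hdfs.path.
upload_dir() {
    date '+/upload/%y%m%d/%H'
}

# List that directory and print the uploaded files.
# Does nothing useful unless the hdfs client is on PATH.
show_upload() {
    command -v hdfs >/dev/null 2>&1 || { echo "hdfs client not found" >&2; return 1; }
    dir="$(upload_dir)"
    # The glob is quoted so that HDFS, not the local shell, expands it.
    hdfs dfs -ls "$dir" && hdfs dfs -cat "$dir/upload-*"
}

# Usage (not run automatically):
#   show_upload
```

Files that the sink is still writing normally carry a .tmp suffix until they are rolled, which with rollInterval = 600 can take up to ten minutes.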

Configuration (job/hdfs2.template.conf, the file used in the start command further down):

# configure the components of agent a1
a1.sources=r1
a1.channels=c1
a1.sinks=s1
# configure the source of a1
a1.sources.r1.type=netcat
a1.sources.r1.bind=0.0.0.0
a1.sources.r1.port=1234
# configure the channel
a1.channels.c1.type=memory
a1.channels.c1.capacity=1000
a1.channels.c1.transactionCapacity=100
# configure the sink
a1.sinks.s1.type=hdfs
a1.sinks.s1.hdfs.path=hdfs://test:9000/flume
a1.sinks.s1.hdfs.fileType=DataStream
# bind the source and sink to the channel
a1.sources.r1.channels=c1
a1.sinks.s1.channel=c1

Start Flume: ./bin/flume-ng agent --conf conf/ --name a1 --conf-file ./job/hdfs2.template.conf

Open another virtual machine and log in remotely to the machine running Flume: ssh test

Test:
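The original does not show the test command. One possible way to feed the netcat source is to pipe a few lines into nc. A rough sketch, assuming the host name test and port 1234 from the configuration; make_lines and send_lines are hypothetical helper names, and the exact nc options can differ between netcat variants:

```shell
#!/bin/sh
# make_lines N: print N numbered test lines, one event per line.
make_lines() {
    n="$1"
    i=1
    while [ "$i" -le "$n" ]; do
        printf 'flume test line %d\n' "$i"
        i=$((i + 1))
    done
}

# send_lines [N]: send N lines (default 3) to the Flume netcat source.
# FLUME_HOST and FLUME_PORT default to the values used in this example.
# -w 2 sets a 2 second timeout so the command does not hang forever.
send_lines() {
    command -v nc >/dev/null 2>&1 || { echo "nc not found" >&2; return 1; }
    make_lines "${1:-3}" | nc -w 2 "${FLUME_HOST:-test}" "${FLUME_PORT:-1234}"
}

# Usage (not run automatically):
#   send_lines 5
```

With its default settings the netcat source should answer OK for each line it accepts.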

Check the result:
