HiBench Setup Manual for Testing Flink

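HiBench official site

This guide does not set up the full HiBench suite, only the components that the Flink benchmarks need.

A few things need to be prepared locally in advance

Unpack the ZooKeeper archive somewhere on the local machine

Configure and start

Run the following commands in the ZooKeeper directory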

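# Copy the sample configuration file to the file name ZooKeeper actually reads
cp conf/zoo_sample.cfg conf/zoo.cfg
# Start ZooKeeper
bin/zkServer.sh start
# When ZooKeeper is no longer needed, stop it with the following command
bin/zkServer.sh stop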

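Unpack the Kafka archive somewhere on the local machine

The ZooKeeper address that Kafka needs at startup is already set by default in config/server.properties, so Kafka can be started directly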

bin/kafka-server-start.sh config/server.properties

# List all current topics
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
# Show the data in the topic identity
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic identity --from-beginning
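The topic listing can also be used in a script to confirm that the identity topic exists before consuming from it. The following is a minimal sketch of that check; the helper name has_topic is hypothetical and is not part of Kafka or HiBench.

```shell
# has_topic NAME: read a topic list (one name per line) on stdin and
# succeed only if NAME appears as an exact, whole-line match.
# has_topic is a hypothetical helper, not part of Kafka or HiBench.
has_topic() {
  grep -Fxq -- "$1"
}

# Example use from the Kafka directory (assumes the broker on localhost:9092 is running):
#   bin/kafka-topics.sh --list --bootstrap-server localhost:9092 | has_topic identity \
#     && echo "topic identity exists"
```

The whole-line match matters here because HiBench later creates topics such as flink_identity_..., which a plain substring search for identity would also match.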

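See this document

Installing Hadoop on a Mac

Flink is deployed here as a standalone cluster

Official guide

Configure SSH

Passwordless login via ssh localhost is required. I have not verified whether the cluster works without it.

For the setup steps, see the Hadoop configuration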

Configure Flink

conf/flink-conf.yaml

# The default is 1, which is not enough in practice
taskmanager.numberOfTaskSlots: 30
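Editing the file by hand is enough, but the change can also be scripted. The sketch below is one possible way to do that; set_flink_option is a hypothetical helper name, and it assumes the option sits on a single top-level "key: value" line, as in the stock flink-conf.yaml.

```shell
# set_flink_option FILE KEY VALUE
# Rewrites FILE so that it contains exactly one active "KEY: VALUE" line.
# Hypothetical helper; Flink only needs the line to be present in the file.
set_flink_option() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  # Keep every line that does not start with "KEY:" (plain string match, no regex),
  # so commented-out copies of the option are left untouched.
  awk -v k="$key:" 'index($0, k) != 1' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  printf '%s: %s\n' "$key" "$value" >> "$tmp"
  # Write back through cat so the original file keeps its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example use from the Flink directory:
#   set_flink_option conf/flink-conf.yaml taskmanager.numberOfTaskSlots 30
```

The cluster has to be restarted for the new slot count to take effect.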

Start the cluster

# Start the cluster
bin/start-cluster.sh
# Stop the cluster
bin/stop-cluster.sh

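See the Flink configuration section on the HiBench official site

A local cluster deployment is used; see the official instructions

On top of the official configuration steps, add the following setting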

conf/flink-conf.yaml

# The default is 1, which is not enough in practice
taskmanager.numberOfTaskSlots: 30

git clone git@github.com:Intel-bigdata/HiBench.git

Build the project

Official instructions

Only the Flink benchmark is built here

mvn -Pflinkbench -Dspark=2.1 -Dscala=2.11 clean package

Configure Hadoop

Run the following command

cp conf/hadoop.conf.template conf/hadoop.conf

Change the contents of the configuration file as follows

# Hadoop home directory; fill in according to your own setup
hibench.hadoop.home /users//downloads/2020-03/hadoop-2.10.0
# The path of hadoop executable
hibench.hadoop.executable ${hibench.hadoop.home}/bin/hadoop
# Hadoop configuration directory
hibench.hadoop.configure.dir ${hibench.hadoop.home}/etc/hadoop
# The root HDFS path to store HiBench data. Specify a directory that already exists; if it does not exist, create it with the hadoop command
hibench.hdfs.master hdfs://localhost:9000/user/
# Hadoop release provider. Supported value: apache, cdh5, hdp
hibench.hadoop.release apache
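If the HDFS directory named in hibench.hdfs.master does not exist yet, it can be created with hadoop fs -mkdir -p. The sketch below derives the directory from the URI first; both function names are hypothetical, and the user name alice in the example is a placeholder, not a value from this setup.

```shell
# hdfs_path_of URI: print the path part of an hdfs://host:port/path URI.
# Hypothetical helper used only for this illustration.
hdfs_path_of() {
  printf '%s\n' "$1" | sed -E 's#^hdfs://[^/]+##'
}

# ensure_hdfs_dir HADOOP_HOME URI: create the directory behind URI if it is missing.
# Relies on "hadoop fs -mkdir -p", which does nothing if the directory already exists.
ensure_hdfs_dir() {
  "$1/bin/hadoop" fs -mkdir -p "$(hdfs_path_of "$2")"
}

# Example use ("alice" is a placeholder user name):
#   ensure_hdfs_dir /path/to/hadoop-2.10.0 hdfs://localhost:9000/user/alice
```

HDFS must already be running for the second function to succeed.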

Configure Kafka

In conf/hibench.conf, change the following settings

# Set this to your own local Kafka installation directory
hibench.streambench.kafka.home /users//tools/kafka_2.11-2.4.0
# zookeeper host:port of kafka cluster, host1:port1,host2:port2...
hibench.streambench.zkHost localhost:2181
# Kafka broker lists, written in mode host:port,host:port,..
hibench.streambench.kafka.brokerList localhost:9092
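Both address settings use the same comma-separated host:port format, and a typo there tends to show up only later as a connection error. As a rough sanity check, a value can be tested against that format before it goes into the file. The helper name below is hypothetical, and the pattern is a simplification that accepts host names and IPv4 addresses only.

```shell
# valid_host_port_list VALUE: succeed if VALUE looks like host:port[,host:port...].
# Hypothetical helper; the pattern is a simplification (no IPv6, no port range check).
valid_host_port_list() {
  printf '%s\n' "$1" |
    grep -Eq '^[A-Za-z0-9._-]+:[0-9]+(,[A-Za-z0-9._-]+:[0-9]+)*$'
}

# Example use:
#   valid_host_port_list "localhost:9092" || echo "broker list looks malformed"
```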

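Configure data generation

These are the settings in conf/hibench.conf that start with hibench.streambench.datagen.

They all have default values, so they can be left alone

Configure Flink in HiBench

Official configuration instructions

Run the command

cp conf/flink.conf.template conf/flink.conf

Set the contents of the configuration file as follows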

# Adjust according to where Flink is installed
hibench.streambench.flink.home /users//tools/flink-1.10.0
hibench.flink.master localhost:8081
# Default parallelism of flink job. This number must not exceed the number of task slots configured in Flink
hibench.streambench.flink.parallelism 20
hibench.streambench.flink.bufferTimeout 10
hibench.streambench.flink.checkpointDuration 1000
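The constraint between the HiBench parallelism and the Flink slot count can be checked before a job is submitted. The sketch below compares the two files directly. The function names are hypothetical, and it assumes a single TaskManager, so that taskmanager.numberOfTaskSlots is the total number of slots available.

```shell
# conf_value FILE KEY: print the first value set for KEY.
# Handles both "key: value" (flink-conf.yaml) and "key value" (HiBench conf files).
conf_value() {
  awk -v k="$2" '{ sub(/: /, " ") } $1 == k { print $2; exit }' "$1"
}

# check_parallelism FLINK_CONF_YAML HIBENCH_FLINK_CONF
# Succeeds if the HiBench parallelism fits into the configured task slots.
# Assumes one TaskManager, so the per-TaskManager slot count is the total.
check_parallelism() {
  slots=$(conf_value "$1" taskmanager.numberOfTaskSlots)
  par=$(conf_value "$2" hibench.streambench.flink.parallelism)
  if [ -z "$slots" ] || [ -z "$par" ]; then
    echo "could not read both settings" >&2
    return 2
  fi
  if [ "$par" -le "$slots" ]; then
    echo "ok: parallelism $par fits in $slots slots"
  else
    echo "too high: parallelism $par exceeds $slots slots" >&2
    return 1
  fi
}

# Example use (adjust the Flink path to your installation):
#   check_parallelism /path/to/flink-1.10.0/conf/flink-conf.yaml conf/flink.conf
```

With the values used in this guide the check reports that a parallelism of 20 fits in 30 slots.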

Generate the data

Run the commands below. Errors may occur; if they do, adjust the setup according to what the error message says

bin/workloads/streaming/identity/prepare/genSeedDataset.sh

bin/workloads/streaming/identity/prepare/dataGen.sh

Run the Flink job

bin/workloads/streaming/identity/flink/run.sh

Generate the report

# Run the report generation script
bin/workloads/streaming/identity/common/metrics_reader.sh
# The script lists topic names similar to the following
flink_identity_1_5_50_1583118115848
flink_identity_1_5_50_1583118729972
flink_identity_1_5_50_1583119730761
flink_identity_1_5_50_1583120900468
flink_identity_1_5_50_1583121043536
flink_identity_1_5_50_1583131260923
flink_identity_1_5_50_1583207113628
__consumer_offsets
identity
test
# At the prompt below, enter a topic that starts with flink_identity
please input the topic:flink_identity_1_5_50_1583118115848
collected 0 results for partition: 11
# The final console output includes the name of the report file
written out metrics to /users//projects/hibench/report/flink_identity_1_5_50_15831181...
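After several runs the topic list grows, and usually the most recent run is the one of interest. The sketch below picks it out of a topic list. The helper name latest_topic is hypothetical, and it assumes that the trailing number in each topic name is a creation timestamp, which is how the names above appear to be built but is not confirmed here.

```shell
# latest_topic PREFIX: read topic names on stdin (one per line) and print the
# one starting with "PREFIX_" whose trailing number is largest.
# Hypothetical helper; assumes the trailing number is a timestamp.
latest_topic() {
  awk -F_ -v p="${1}_" 'index($0, p) == 1 { print $NF, $0 }' |
    sort -n | tail -n 1 | cut -d' ' -f2
}

# Example use from the Kafka directory:
#   bin/kafka-topics.sh --list --bootstrap-server localhost:9092 | latest_topic flink_identity
```

The printed name can then be entered at the metrics_reader.sh prompt.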
