Hadoop Operations Notes, Part 9

2021-09-05 08:13:07

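Linux operating-system parameter and command tuning for Hadoop. Plenty has already been written about tuning Hadoop's own parameters, but much less about the operating-system side, so this is a record of the system parameters I use. It is a first pass; I will add more as they come to mind.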

1. Kernel parameter tuning: sysctl.conf

net.ipv4.ip_forward = 0

net.ipv4.conf.default.rp_filter = 1

net.ipv4.conf.default.accept_source_route = 0

kernel.sysrq = 0

kernel.core_uses_pid = 1

net.ipv4.tcp_syncookies = 1

kernel.msgmnb = 65536

kernel.msgmax = 65536

kernel.shmmax = 68719476736

kernel.shmall = 4294967296

net.ipv4.tcp_max_tw_buckets = 60000

net.ipv4.tcp_sack = 1

net.ipv4.tcp_window_scaling = 1

net.ipv4.tcp_rmem = 4096 87380 4194304

net.ipv4.tcp_wmem = 4096 16384 4194304

net.core.wmem_default = 8388608

net.core.rmem_default = 8388608

net.core.rmem_max = 16777216

net.core.wmem_max = 16777216

net.core.netdev_max_backlog = 262144

net.core.somaxconn = 262144

net.ipv4.tcp_max_orphans = 3276800

net.ipv4.tcp_max_syn_backlog = 262144

net.ipv4.tcp_timestamps = 0

net.ipv4.tcp_synack_retries = 1

net.ipv4.tcp_syn_retries = 1

net.ipv4.tcp_tw_recycle = 1

net.ipv4.tcp_tw_reuse = 1

net.ipv4.tcp_mem = 94500000 915000000 927000000

net.ipv4.tcp_fin_timeout = 1

net.ipv4.tcp_keepalive_time = 1200

net.ipv4.tcp_max_syn_backlog = 65536

net.ipv4.tcp_timestamps = 0

net.ipv4.tcp_synack_retries = 2

net.ipv4.tcp_syn_retries = 2

net.ipv4.tcp_tw_recycle = 1

#net.ipv4.tcp_tw_len = 1

net.ipv4.tcp_tw_reuse = 1

#net.ipv4.tcp_fin_timeout = 30

#net.ipv4.tcp_keepalive_time = 120

net.ipv4.ip_local_port_range = 1024 65535
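
The list above assigns several keys more than once, and when sysctl loads a file the last assignment for a key is the one that takes effect. The following is a minimal sketch, not part of the original notes, for auditing such a file before loading it. The function names are hypothetical, and the optional second argument of the second function is an assumption added only so the check can be pointed at a test directory instead of /proc/sys.

```shell
#!/bin/sh
# Hypothetical helpers for auditing a sysctl.conf-style file.

# Print each key that is assigned more than once, with its values in file order.
find_duplicate_sysctl_keys() {
    awk '
        /^[ \t]*$/ || /^[ \t]*[#;]/ { next }   # skip blank lines and comments
        index($0, "=") == 0 { next }           # skip lines without an assignment
        {
            eq  = index($0, "=")
            key = substr($0, 1, eq - 1)
            val = substr($0, eq + 1)
            gsub(/[ \t]/, "", key)
            sub(/^[ \t]+/, "", val); sub(/[ \t]+$/, "", val)
            if (!(key in count)) order[++n] = key
            count[key]++
            vals[key] = (count[key] > 1) ? vals[key] " -> " val : val
        }
        END {
            for (i = 1; i <= n; i++)
                if (count[order[i]] > 1) print order[i] ": " vals[order[i]]
        }
    ' "$1"
}

# Print each key in the file that has no matching entry under the sysctl tree.
# $2 defaults to /proc/sys; it is a parameter only to make the check testable.
find_missing_sysctl_keys() {
    root=${2:-/proc/sys}
    awk '
        /^[ \t]*$/ || /^[ \t]*[#;]/ { next }
        index($0, "=") {
            k = substr($0, 1, index($0, "=") - 1)
            gsub(/[ \t]/, "", k)
            print k
        }
    ' "$1" | sort -u | while read -r key; do
        [ -e "$root/$(printf '%s' "$key" | tr . /)" ] || printf '%s\n' "$key"
    done
}
```

Run against the settings listed above, the first function would report net.ipv4.tcp_max_syn_backlog, net.ipv4.tcp_timestamps, net.ipv4.tcp_synack_retries, net.ipv4.tcp_syn_retries, net.ipv4.tcp_tw_recycle and net.ipv4.tcp_tw_reuse, so the effective values are the later ones (for example a SYN backlog of 65536 and 2 retries). The second function is useful on newer kernels: net.ipv4.tcp_tw_recycle was removed in Linux 4.12, so it would be listed there.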

2. Command-line tuning

echo 512 > /proc/sys/net/ipv4/neigh/default/gc_thresh1

echo 2048 > /proc/sys/net/ipv4/neigh/default/gc_thresh2

echo 10240 > /proc/sys/net/ipv4/neigh/default/gc_thresh3
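
Values written into /proc/sys with echo take effect immediately but are lost on reboot. As a minimal sketch, assuming you would rather keep these three neighbour-table thresholds in sysctl.conf next to the settings from section 1, the path can be mapped to its dotted sysctl key. The helper names below are hypothetical.

```shell
#!/bin/sh
# Convert a /proc/sys path into the dotted key name used by sysctl.
proc_path_to_sysctl_key() {
    printf '%s\n' "${1#/proc/sys/}" | tr / .
}

# Emit a sysctl.conf line for a path/value pair.
sysctl_line() {
    printf '%s = %s\n' "$(proc_path_to_sysctl_key "$1")" "$2"
}

sysctl_line /proc/sys/net/ipv4/neigh/default/gc_thresh1 512
sysctl_line /proc/sys/net/ipv4/neigh/default/gc_thresh2 2048
sysctl_line /proc/sys/net/ipv4/neigh/default/gc_thresh3 10240
# prints:
# net.ipv4.neigh.default.gc_thresh1 = 512
# net.ipv4.neigh.default.gc_thresh2 = 2048
# net.ipv4.neigh.default.gc_thresh3 = 10240
```

The three printed lines can be appended to sysctl.conf so the thresholds are applied together with the other kernel parameters.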

ulimit -SHn 65535
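
ulimit only changes the limit for the current shell and the processes started from it, so it is worth confirming that the shell which launches the Hadoop daemons really has the raised limit. The following is a minimal sketch of such a check; the function name is hypothetical and the 65535 threshold simply mirrors the command above.

```shell
#!/bin/sh
# Succeed when both the soft and hard open-file limits of the current
# shell are at least the requested value.
nofile_at_least() {
    soft=$(ulimit -Sn)
    hard=$(ulimit -Hn)
    for cur in "$soft" "$hard"; do
        [ "$cur" = unlimited ] && continue
        [ "$cur" -ge "$1" ] || return 1
    done
    return 0
}

if nofile_at_least 65535; then
    echo "open-file limit is at least 65535"
else
    echo "open-file limit is below 65535"
fi
```

To keep the limit across logins, one common approach (an assumption here, not something the original notes state) is to add soft and hard nofile entries for the account that runs the Hadoop daemons to /etc/security/limits.conf.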
