Ceph PG inconsistency problem

Today a PG inconsistency problem showed up in our company environment. Checking with the ceph health detail command gives the following:

pg 19.211 is active+clean+inconsistent, acting [88,16]

pg 19.214 is active+clean+inconsistent, acting [59,36]

pg 19.2b7 is active+clean+inconsistent, acting [39,22]

pg 19.3e9 is active+clean+inconsistent, acting [13,39]

The environment uses BlueStore. As we know, a PG becomes inconsistent in Ceph when one or more objects in that PG differ between the primary and its replicas.
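As an aside, it is a (deep) scrub that surfaces such a mismatch in the first place; if needed, a deep scrub of a single PG can be triggered manually, for example against one of the PGs listed above (a sketch using the standard ceph CLI):

$ ceph pg deep-scrub 19.211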

From the ceph health detail output we can see that the inconsistent PGs belong to pool id 19. We can list all inconsistent PGs in that pool with the following command:

$ rados list-inconsistent-pg pool01

["19.214","19.3e9","19.546","19.781","19.80a","19.8a7","19.8b3","19.a26","19.c10","19.d91","19.e14","19.ffc"] 

Next, let's check what exactly is wrong with PG 19.214:

[root@host240 ~]# rados list-inconsistent-obj 19.214 --format=json-pretty

,"errors": ,

"union_shard_errors": [

"read_error"

],"selected_object_info": "19:284db047:::benchmark_data_host241_302999_object2530:head(430'1 client.24390.0:2531 dirty|data_digest|omap_digest s 4194304 uv 1 dd 9dc47f37 od ffffffff alloc_hint [4194304 4194304 53])",

"shards": [,]

},,"errors": ,

"union_shard_errors": [

"read_error"

],"selected_object_info": "19:28460362:::benchmark_data_host241_302999_object3346:head(430'2 client.24390.0:3347 dirty|data_digest|omap_digest s 4194304 uv 2 dd 94786cdf od ffffffff alloc_hint [4194304 4194304 53])",

"shards": [,]

}]}

We can see that two objects in this PG (benchmark_data_host241_302999_object2530 and benchmark_data_host241_302999_object3346) have problems, and in both cases the cause is a read_error on osd.41.
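To pull the failing OSD ids straight out of that JSON, something like the following works; a sketch assuming jq is installed and the usual list-inconsistent-obj schema (each shard entry carries an osd number and an errors list):

$ rados list-inconsistent-obj 19.214 --format=json | jq -r '.inconsistents[].shards[] | select(.errors | length > 0) | .osd' | sort -u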

At the same time, the OSD log contains errors like the following:

0000], logical extent 0x0~80000, object #19:0831133d:::benchmark_data_host241_302999_object2649:head#

2018-12-12 05:18:58.379434 7fe1d570d700 -1 log_channel(cluster) log [err] : 19.c10 shard 41: soid 19:0831133d:::benchmark_data_host241_302999_object2649:head candidate had a read error

2018-12-12 05:18:59.749579 7fe1d570d700 -1 bluestore(/var/lib/ceph/osd/ucsm-41) _verify_csum bad crc32c/0x80000 checksum at blob offset 0x0, got 0xd78b4b28, expected 0xb603665d, device location [0xee50000~80000], logical extent 0x0~80000, object #19:0831a5b5:::benchmark_data_host240_322007_object1050:head#

2018-12-12 05:18:59.932828 7fe1d570d700 -1 log_channel(cluster) log [err] : 19.c10 shard 41: soid 19:0831a5b5:::benchmark_data_host240_322007_object1050:head candidate had a read error

2018-12-12 05:19:00.975106 7fe1d570d700 -1 bluestore(/var/lib/ceph/osd/ucsm-41) _verify_csum bad crc32c/0x80000 checksum at blob offset 0x0, got 0x1913d068, expected 0x7acc9a9b, device location [0x4ee50000~80000], logical extent 0x0~80000, object #19:08322ef1:::benchmark_data_host241_302999_object6417:head#

2018-12-12 05:19:01.198230 7fe1d570d700 -1 log_channel(cluster) log [err] : 19.c10 shard 41: soid 19:08322ef1:::benchmark_data_host241_302999_object6417:head candidate had a read error

2018-12-12 05:19:21.593182 7fe1d570d700 -1 bluestore(/var/lib/ceph/osd/ucsm-41) _verify_csum bad crc32c/0x80000 checksum at blob offset 0x0, got 0xb1e5113a, expected 0xd9e604bb, device location [0x58a50000~80000], logical extent 0x0~80000, object #19:083b28ff:::benchmark_data_host241_302999_object7127:head#

This usually indicates a disk problem: bad blocks on the disk backing osd.41 make the data unreadable, so the comparison between the primary and its replicas reports an inconsistency.

However, checking the dmesg log showed no disk error messages at all, which was rather strange.
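For reference, these are the kinds of checks one would normally run against the suspect disk at this point (the device name is an assumption; substitute the disk actually backing osd.41):

$ dmesg -T | grep -iE 'i/o error|medium error'
$ smartctl -a /dev/sdX    # requires smartmontools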

Next, reading the disk behind osd.41 with the ceph-objectstore-tool command produced an error:

[root@host241 ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ucsm-41/ --type bluestore --op list-pgs

mount failed with '(2) no such file or directory'

I then looked for the volume backing osd.41 with ceph-volume lvm list | grep "osd\.41" and found nothing. At this point the problem was clear: the OSD's volume no longer exists, so of course reads fail. Further inspection of the system's PV and LV configuration confirmed this.
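For completeness, the follow-up checks and the eventual fix look roughly like this; a sketch only, since the exact device names and the OSD redeployment procedure depend on the environment:

$ ceph-volume lvm list    # LVM-backed OSDs that ceph-volume still knows about
$ pvs; lvs                # check whether the PV/LV behind osd.41 still exists
$ lsblk                   # confirm the underlying block device is present

Once osd.41 has been redeployed on a healthy volume, the inconsistent PGs can be re-checked and repaired, e.g.:

$ ceph pg repair 19.214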
