Calculating pg_num changes for a Ceph pool

Changing pg_num
Increasing pg_num sometimes fails with an error like this:

ceph osd pool set rbd pg_num 4096
 Error E2BIG: specified pg_num 3500 is too large (creating 4096 new PGs \
 on ~64 OSDs exceeds per-OSD max of 32)

The limit on PG splitting comes from the mon_osd_max_split_count option.
Query the monitor for its current value:

#ceph daemon mon.ipmi1151 config get mon_osd_max_split_count
{
    "mon_osd_max_split_count": "32"
}
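
The check behind this error can be sketched as a small shell function. This is only an illustration of the rule described above, not Ceph's own code: the function name can_set_pg_num and its argument order are made up for this example, and it assumes a request is accepted when the number of new PGs does not exceed mon_osd_max_split_count times the number of OSDs.

```shell
# Hypothetical helper, not part of the ceph CLI.
# usage: can_set_pg_num CURRENT REQUESTED MAX_SPLIT OSD_COUNT
# Prints "ok" if the request stays within the split limit, "too large" otherwise.
can_set_pg_num() (
    new_pgs=$(( $2 - $1 ))      # PGs that would have to be created
    limit=$(( $3 * $4 ))        # per-OSD limit times number of OSDs
    if [ "$new_pgs" -le "$limit" ]; then
        echo "ok"
    else
        echo "too large"
    fi
)

# 9 OSDs and a limit of 32 allow at most 288 new PGs per request
can_set_pg_num 512 800 32 9
can_set_pg_num 512 801 32 9
```

Under that assumption, the first call prints "ok" and the second prints "too large".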

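Calculation script:

# Per-OSD split limit reported by the monitor (mon.ipmi1151 here)
max_inc=$(ceph daemon mon.ipmi1151 config get mon_osd_max_split_count 2>&1 \
  | tr -d '\n ' | sed 's/.*"\([[:digit:]]\+\)".*/\1/')
# Current pg_num of the pool (wtest here)
pg_num=$(ceph osd pool get wtest pg_num | cut -f2 -d: | tr -d ' ')
echo "current pg_num value: $pg_num, max increment: $max_inc"
# Number of OSDs in the cluster
osd_num=$(ceph osd ls | wc -l)
# Largest pg_num that can be set in a single step
next_pg_num=$((pg_num + max_inc * osd_num))
echo "maximum allowed pg_num: $next_pg_num"

Output (this cluster has 9 OSDs, so 512 + 32 * 9 = 800):

current pg_num value: 512, max increment: 32
maximum allowed pg_num: 800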
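
Because each request can add at most max_inc * osd_num PGs, reaching a much larger pg_num takes several steps. The sketch below lists the intermediate values; the function name pg_num_steps and its arguments are hypothetical, and it only does the arithmetic without calling ceph.

```shell
# Hypothetical helper: print the pg_num values to set, one per line.
# usage: pg_num_steps CURRENT TARGET MAX_INC OSD_NUM
pg_num_steps() (
    current=$1
    target=$2
    step=$(( $3 * $4 ))         # largest increase allowed per request
    if [ "$step" -le 0 ]; then
        echo "step must be positive" >&2
        exit 1
    fi
    while [ "$current" -lt "$target" ]; do
        current=$(( current + step ))
        # Do not overshoot the final target
        if [ "$current" -gt "$target" ]; then
            current=$target
        fi
        echo "$current"
    done
)

# From 512 to 2048 on 9 OSDs with a limit of 32
pg_num_steps 512 2048 32 9
```

For these inputs the steps are 800, 1088, 1376, 1664, 1952 and 2048. Each value would then be applied with ceph osd pool set <pool> pg_num <value>, presumably after waiting for the cluster to finish creating the previous batch of PGs.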

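Calculating block_size

block_size = mon_osd_max_split_count * n_osds

This is the largest number of PGs that can be added in one step. The target pg_num for a pool is calculated as:

( Target PGs per OSD ) x ( OSD # ) x ( %Data )
-------------------------------------------------
                     ( Size )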

If the value of the above calculation is less than the value of ( OSD# ) / ( Size ), then the value is updated to the value of ( OSD# ) / ( Size ). This is to ensure even load / data distribution by allocating at least one Primary or Secondary PG to every OSD for every Pool.
The output value is then rounded to the nearest power of 2.
Tip: The nearest power of 2 provides a marginal improvement in efficiency of the CRUSH algorithm.
If the nearest power of 2 is more than 25% below the original value, the next higher power of 2 is used.
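
The formula and rounding rules above can be sketched in shell as follows. The function name suggest_pg_num is made up for this example, %Data is passed as an integer percentage (0 to 100), and integer division is used, so fractional parts are dropped. The two rounding rules are folded into one test: take the largest power of 2 not above the raw value, and move to the next higher power if that one is more than 25% below the raw value.

```shell
# Hypothetical helper implementing the calculation described above.
# usage: suggest_pg_num TARGET_PGS_PER_OSD OSD_COUNT PERCENT_DATA SIZE
suggest_pg_num() (
    target=$1 osds=$2 pct=$3 size=$4
    # (Target PGs per OSD) x (OSD #) x (%Data) / (Size), with %Data in percent
    raw=$(( target * osds * pct / (100 * size) ))
    # Never go below (OSD #) / (Size)
    minimum=$(( osds / size ))
    if [ "$raw" -lt "$minimum" ]; then
        raw=$minimum
    fi
    # Largest power of 2 that does not exceed raw
    low=1
    while [ $(( low * 2 )) -le "$raw" ]; do
        low=$(( low * 2 ))
    done
    # low is more than 25% below raw  <=>  4 * low < 3 * raw
    if [ $(( 4 * low )) -lt $(( 3 * raw )) ]; then
        echo $(( low * 2 ))
    else
        echo "$low"
    fi
)

# 100 PGs per OSD, 9 OSDs, pool holds 100% of the data, replica size 3
suggest_pg_num 100 9 100 3
```

With these inputs the raw value is 300, and since 256 is less than 25% below 300, the function prints 256. With 11 OSDs the raw value is 366, 256 is more than 25% below it, and the result is 512.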
