This section mainly covers Ceph PG-related operations and the meaning of the PG states. The current operating system environment is:

# lsb_release -a
LSB Version:    :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description:    CentOS Linux release 7.3.1611 (Core) 
Release:        7.3.1611
Codename:       Core

# uname -a
Linux server-oss21 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

The cluster has 21 OSDs.

1. Counting the number of PGs

Count the number of PGs in each pool with the following script:

# ceph osd pool ls
.rgw.root
default.rgw.control
default.rgw.data.root
default.rgw.gc
default.rgw.log
default.rgw.users.uid
default.rgw.users.keys
default.rgw.usage
default.rgw.buckets.index
default.rgw.buckets.non-ec
default.rgw.buckets.data

# ceph osd pool ls | while read line; do ceph pg ls-by-pool "$line" | awk -v pool="$line" 'BEGIN{cols=0} /^[0-9a-f]+\.[0-9a-f]+/ {cols++} END{printf "%-32s: %d\n", pool, cols}'; done
.rgw.root                       : 8
default.rgw.control             : 8
default.rgw.data.root           : 8
default.rgw.gc                  : 8
default.rgw.log                 : 8
default.rgw.users.uid           : 8
default.rgw.users.keys          : 8
default.rgw.usage               : 8
default.rgw.buckets.index       : 8
default.rgw.buckets.non-ec      : 8
default.rgw.buckets.data        : 8

# ceph osd pool get default.rgw.buckets.data pgp_num
pgp_num: 8

2. Adjusting the number of PGs

2.1 Calculating a suitable PG count

The pg_num and pgp_num values must be sized according to the number of OSDs, using the formula below, and the final result should be equal (or close) to a power of 2. Note that adjusting pgp_num does not split the objects inside a PG, but it does change how PGs are distributed across the OSDs.

Total PGs = (Total_number_of_OSD * 100) / max_replication_count

Round the result up to the nearest power of 2. For example, with 21 OSDs in total and a replica count of 3, the formula gives 700, which rounds up to 1024.
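
As a quick sanity check, the same arithmetic can be done in the shell (21 OSDs, replica count 3); the awk one-liner below simply rounds the value up to the next power of 2:

# echo $(( 21 * 100 / 3 ))
700
# awk 'BEGIN { n = 700; p = 1; while (p < n) p *= 2; print p }'
1024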

2.2 Make sure the cluster status is OK before adjusting

If ceph -s reports that the cluster status is OK, the PG count can be increased on the fly.

Note: increasing the PG count is a multi-step process and must be done gradually; never raise it by a large amount in one go. Be especially careful on production clusters.

Raising pg_num to a very large value in a single step triggers large-scale data rebalancing, which can cause problems and leave the cluster temporarily unavailable.
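
One simple way to gate the adjustment on cluster health is to poll ceph health before starting. This is only a small sketch, and the 30-second interval is arbitrary:

# until ceph health | grep -q HEALTH_OK; do echo "waiting for the cluster to become healthy ..."; sleep 30; done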

2.3 Tuning the data-rebalancing parameters

Changing pg_num/pgp_num triggers backfill in the Ceph cluster, and by default the data is rebalanced as fast as possible, which can make the cluster unstable. So before adjusting, turn the backfill and recovery parameters down to fairly small values with the following commands:

# ceph tell osd.* injectargs '--osd-max-backfills 1'
# ceph tell osd.* injectargs '--osd-recovery-max-active 1'
# ceph tell osd.* injectargs '--osd-recovery-max-single-start 1'

In addition, the following parameters can also be lowered:

# ceph tell osd.* injectargs '--osd-backfill-scan-min 2' 
# ceph tell osd.* injectargs '--osd-backfill-scan-max 4' 
# ceph tell osd.* injectargs '--osd-recovery-threads 1' 
# ceph tell osd.* injectargs '--osd-recovery-op-priority 1' 

Note: before changing these settings, it is best to record their original values as shown below, so that they can be restored once the rebalancing is finished (a one-liner for capturing them follows the output):

# ceph daemon osd.0 config show | grep backfill
    "osd_max_backfills": "2",
    "osd_backfill_full_ratio": "0.85",
    "osd_backfill_retry_interval": "10",
    "osd_backfill_scan_min": "2",
    "osd_backfill_scan_max": "4",
    "osd_kill_backfill_at": "0",
    "osd_debug_skip_full_check_in_backfill_reservation": "false",
    "osd_debug_reject_backfill_probability": "0",

# ceph daemon osd.0 config show | grep recovery
    "osd_min_recovery_priority": "0",
    "osd_allow_recovery_below_min_size": "true",
    "osd_recovery_threads": "1",
    "osd_recovery_thread_timeout": "30",
    "osd_recovery_thread_suicide_timeout": "300",
    "osd_recovery_sleep": "0",
    "osd_recovery_delay_start": "0",
    "osd_recovery_max_active": "2",
    "osd_recovery_max_single_start": "5",
    "osd_recovery_max_chunk": "33554432",
    "osd_recovery_max_omap_entries_per_chunk": "64000",
    "osd_recovery_forget_lost_objects": "false",
    "osd_scrub_during_recovery": "true",
    "osd_recovery_op_priority": "3",
    "osd_recovery_op_warn_multiple": "16",

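A one-liner along the following lines can capture the current values from every OSD daemon on the local node before making any changes. It assumes the default cluster name, so the admin sockets live under /var/run/ceph/ceph-osd.*.asok; the backup path is only an example, and it has to be run on each OSD node:

# for sock in /var/run/ceph/ceph-osd.*.asok; do echo "== $sock =="; ceph daemon "$sock" config show | grep -E 'backfill|recovery'; done > /tmp/osd-config-before-tuning.txt
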
After the adjustment is complete, run the following commands to restore the parameters:

# ceph tell osd.* injectargs '--osd-max-backfills 2'
# ceph tell osd.* injectargs '--osd-recovery-max-active 2'
# ceph tell osd.* injectargs '--osd-recovery-max-single-start 5'

# ceph tell osd.* injectargs '--osd-backfill-scan-min 2' 
# ceph tell osd.* injectargs '--osd-backfill-scan-max 4' 
# ceph tell osd.* injectargs '--osd-recovery-threads 1' 
# ceph tell osd.* injectargs '--osd-recovery-op-priority 3' 

2.4 Adjusting pg_num and pgp_num

Here we adjust pg_num first. The recommended growth step is a power of 2: for example, if a pool currently has 64 PGs, increase it to 128 in the first step. The commands are as follows:

# ceph osd pool get <poolname> pg_num

# ceph osd pool set <poolname> pg_num <new_pgnum>

You can watch the cluster change with ceph -w.

After pg_num has been increased, wait for the whole cluster to return to normal, then increase pgp_num with the commands below; pgp_num must match pg_num:

# ceph osd pool get <poolname> pgp_num

# ceph osd pool set <poolname> pgp_num <new_pgnum>

At this point ceph -w shows the detailed cluster status, and you can watch the data rebalancing take place.

The following is the adjustment performed on this cluster (note: after each step, check the overall cluster state before moving on; a scripted version of the stepwise procedure is sketched after these commands):

# ceph osd pool set default.rgw.gc pg_num 16
# ceph -w
# ceph osd pool set default.rgw.gc pgp_num 16
# ceph -w
# ceph osd pool set default.rgw.gc pg_num 32
# ceph osd pool set default.rgw.gc pgp_num 32

# ceph osd pool set default.rgw.users.uid pg_num 16
# ceph -w
# ceph osd pool set default.rgw.users.uid pgp_num 16
# ceph -w
# ceph osd pool set default.rgw.users.uid pg_num 32
# ceph osd pool set default.rgw.users.uid pgp_num 32

# ceph osd pool set default.rgw.data.root pg_num 16
# ceph -w
# ceph osd pool set default.rgw.data.root pgp_num 16
# ceph -w
# ceph osd pool set default.rgw.data.root pg_num 32
# ceph osd pool set default.rgw.data.root pgp_num 32

# ceph osd pool set default.rgw.buckets.data pg_num 16
# ceph -w
# ceph osd pool set default.rgw.buckets.data pgp_num 16
# ceph -w
# ceph osd pool set default.rgw.buckets.data pg_num 32
# ceph osd pool set default.rgw.buckets.data pgp_num 32
# ceph osd pool set default.rgw.buckets.data pg_num 64
# ceph osd pool set default.rgw.buckets.data pgp_num 64
# ceph osd pool set default.rgw.buckets.data pg_num 128
# ceph osd pool set default.rgw.buckets.data pgp_num 128
# ceph osd pool set default.rgw.buckets.data pg_num 256
# ceph osd pool set default.rgw.buckets.data pgp_num 256
# ceph osd pool set default.rgw.buckets.data pg_num 512
# ceph osd pool set default.rgw.buckets.data pgp_num 512
# ceph osd pool set default.rgw.buckets.data pg_num 1024
# ceph osd pool set default.rgw.buckets.data pgp_num 1024
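
If you prefer not to run each doubling by hand, the stepwise procedure above can be scripted. The following is only a rough sketch: it assumes the ceph CLI is available, that HEALTH_OK is the state the cluster returns to between steps, and the pool name, target value, and sleep intervals are merely examples:

pool=default.rgw.buckets.data      # example pool
target=1024                        # final pg_num, taken from the formula in 2.1

cur=$(ceph osd pool get $pool pg_num | awk '{print $2}')
while [ "$cur" -lt "$target" ]; do
    next=$(( cur * 2 ))
    [ "$next" -gt "$target" ] && next=$target

    ceph osd pool set $pool pg_num $next
    while ceph pg stat | grep -q creating; do sleep 5; done    # wait for the new PGs to be created

    ceph osd pool set $pool pgp_num $next
    until ceph health | grep -q HEALTH_OK; do sleep 60; done   # wait for rebalancing to finish

    cur=$next
done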

2.5 A PG adjustment example

Below is an example of the PG counts in a real production cluster (105 OSDs):

# ceph osd pool ls | while read line; do ceph pg ls-by-pool "$line" | awk -v pool="$line" 'BEGIN{cols=0} /^[0-9a-f]+\.[0-9a-f]+/ {cols++} END{printf "%-32s: %d\n", pool, cols}'; done
.rgw.root                     : 32
default.rgw.control           : 8
default.rgw.data.root         : 32
default.rgw.gc                : 16
default.rgw.log               : 8
default.rgw.users.uid         : 32
default.rgw.users.keys        : 32
default.rgw.users.email       : 8
default.rgw.users.swift       : 8
default.rgw.usage             : 16
default.rgw.buckets.index     : 256
default.rgw.buckets.data      : 2048
default.rgw.intent-log        : 32
default.rgw.meta              : 8
default.rgw.buckets.non-ec    : 128


