By making sensible use of server, network, and other resources, GBase 8c can be pushed to its best results in the TPC-C benchmark. This article walks through TPC-C performance tuning against a real environment, using GBase 8c V5 3.0.0 as the example.

I. Server Configuration

(1) Servers

Four servers are used, with node IPs 10.100.100.1~4.

(2) OS version

The domestic Kylin operating system is recommended. Run cat /etc/os-release to check the exact version:

[root@GBase-NODE4 ~]# cat /etc/os-release
NAME="Kylin Linux Advanced Server"
VERSION="V10 (Sword)"
ID="kylin"
VERSION_ID="V10"
PRETTY_NAME="Kylin Linux Advanced Server V10 (Sword)"
ANSI_COLOR="0;31"

(3) CPU information

Run lscpu to inspect the CPU:

[gbase@GBase-NODE1 ~]$  lscpu 
Architecture:                    aarch64
CPU op-mode(s):                  64-bit
Byte Order:                      Little Endian
CPU(s):                          96
On-line CPU(s) list:             0-95
Thread(s) per core:              1
Core(s) per socket:              48
Socket(s):                       2
NUMA node(s):                    4
Vendor ID:                       HiSilicon
Model:                           0
Model name:                      Kunpeng-920
Stepping:                        0x1
CPU max MHz:                     2600.0000
CPU min MHz:                     200.0000
BogoMIPS:                        200.00
L1d cache:                       6 MiB
L1i cache:                       6 MiB
L2 cache:                        48 MiB
L3 cache:                        96 MiB
NUMA node0 CPU(s):               0-23
NUMA node1 CPU(s):               24-47
NUMA node2 CPU(s):               48-71
NUMA node3 CPU(s):               72-95
……

Focus on the CPU(s) and NUMA node(s) fields. In this environment, each server has 96 CPU cores spread across 4 NUMA nodes (24 cores per node).
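As a small sketch (assuming the English-locale field names shown in the output above), the two fields can be pulled out of lscpu programmatically; parse_topology is a throwaway helper introduced here:

```shell
# Extract CPU count and NUMA node count from lscpu output.
# Reads stdin, so it works on live output or a saved capture.
parse_topology() {
  awk -F': *' '
    $1 == "CPU(s)"       { cpus = $2 }
    $1 == "NUMA node(s)" { nodes = $2 }
    END { printf "cpus=%s numa_nodes=%s cores_per_node=%d\n", cpus, nodes, cpus / nodes }
  '
}

command -v lscpu >/dev/null && lscpu | parse_topology || true
```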

(4) Memory

Run free -h to check free memory; at least 16 GB should be kept in reserve:

[root@GBase-NODE4 ~]# free -h
             total        used        free      shared  buff/cache   available
Mem:          509Gi        18Gi       414Gi       5.1Gi        77Gi       415Gi
Swap:         4.0Gi          0B       4.0Gi

(5) Storage

Check the disks:

[gbase@GBase-NODE1 ~]$ lsscsi 
[5:0:65:0]   enclosu HUAWEI   Expander 12Gx16  131   -        
[5:2:0:0]    disk    AVAGO    HW-SAS3508       5.06  /dev/sda 
[5:2:1:0]    disk    AVAGO    HW-SAS3508       5.06  /dev/sdb

There are two disks: /dev/sda holds the OS, while the database data in this environment lives on the larger /dev/sdb, mounted at /data.

Check the disk layout:

[root@GBase-NODE4 ~]# lsblk 
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda               8:0    0 557.9G  0 disk 
├─sda1            8:1    0   600M  0 part /boot/efi
├─sda2            8:2    0     1G  0 part /boot
└─sda3            8:3    0 556.3G  0 part 
 ├─rootvg-root 252:0    0     5G  0 lvm  /
 ├─rootvg-swap 252:1    0     4G  0 lvm  [SWAP]
 ├─rootvg-usr  252:2    0    10G  0 lvm  /usr
 ├─rootvg-tmp  252:4    0     4G  0 lvm  /tmp
 ├─rootvg-opt  252:5    0     5G  0 lvm  /opt
 ├─rootvg-var  252:6    0     5G  0 lvm  /var
 └─rootvg-home 252:7    0    10G  0 lvm  /home
sdb               8:16   0  17.5T  0 disk 
└─datavg-datalv 252:3    0  17.5T  0 lvm  /data

(6) Network

Check the network interfaces:

[gbase@GBase-NODE1 ~]$ ip addr 
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
   link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
   inet 127.0.0.1/8 scope host lo
      valid_lft forever preferred_lft forever
   inet6 ::1/128 scope host 
      valid_lft forever preferred_lft forever
2: enp125s0f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
   link/ether 44:a1:91:9f:0c:02 brd ff:ff:ff:ff:ff:ff
3: enp125s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
   link/ether 44:a1:91:9f:0c:03 brd ff:ff:ff:ff:ff:ff
4: enp125s0f2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
   link/ether 44:a1:91:9f:0c:04 brd ff:ff:ff:ff:ff:ff
5: enp125s0f3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
   link/ether 44:a1:91:9f:0c:05 brd ff:ff:ff:ff:ff:ff
6: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master mgmt state UP group default qlen 1000
   link/ether 44:a1:91:3a:5a:f9 brd ff:ff:ff:ff:ff:ff
7: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master mgmt state UP group default qlen 1000
   link/ether 44:a1:91:3a:5a:f9 brd ff:ff:ff:ff:ff:ff
8: enp5s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
   link/ether 44:a1:91:3a:5a:fb brd ff:ff:ff:ff:ff:ff
9: enp6s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
   link/ether 44:a1:91:3a:5a:fc brd ff:ff:ff:ff:ff:ff
10: enp131s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master app state UP group default qlen 1000
   link/ether 44:a1:91:3a:5c:49 brd ff:ff:ff:ff:ff:ff
11: enp132s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master app state UP group default qlen 1000
   link/ether 44:a1:91:3a:5c:49 brd ff:ff:ff:ff:ff:ff
12: enp133s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
   link/ether 44:a1:91:3a:5c:4b brd ff:ff:ff:ff:ff:ff
13: enp134s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
   link/ether 44:a1:91:3a:5c:4c brd ff:ff:ff:ff:ff:ff
14: enp137s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
   link/ether 44:a1:91:3a:76:d9 brd ff:ff:ff:ff:ff:ff
   inet 192.20.122.1/24 brd 192.20.122.255 scope global noprefixroute enp137s0
      valid_lft forever preferred_lft forever
   inet6 fe80::939f:da48:c828:9e05/64 scope link noprefixroute 
      valid_lft forever preferred_lft forever
15: enp138s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
   link/ether 44:a1:91:3a:76:da brd ff:ff:ff:ff:ff:ff
   inet 192.20.123.1/24 brd 192.20.123.255 scope global noprefixroute enp138s0
      valid_lft forever preferred_lft forever
   inet6 fe80::98c5:33ca:65e7:fb68/64 scope link noprefixroute 
      valid_lft forever preferred_lft forever
……

The 10 GbE NICs are enp137s0 and enp138s0 (the two standalone interfaces that are UP with addresses assigned). Inspect each with ethtool:

[gbase@GBase-NODE1 ~]$ ethtool enp137s0
Settings for enp137s0:
       Supported ports: [ FIBRE ]
       Supported link modes:   10000baseKR/Full 
                               25000baseCR/Full 
                               25000baseKR/Full 
       Supported pause frame use: Symmetric
       Supports auto-negotiation: No
       Supported FEC modes: Not reported
       Advertised link modes:  10000baseKR/Full 
                               25000baseCR/Full 
                               25000baseKR/Full 
       Advertised pause frame use: Symmetric
       Advertised auto-negotiation: No
       Advertised FEC modes: Not reported
       Speed: 10000Mb/s
       Duplex: Full
       Port: FIBRE
       PHYAD: 0
       Transceiver: internal
       Auto-negotiation: off
Cannot get wake-on-lan settings: Operation not permitted
       Current message level: 0x00000045 (69)
                              drv link rx_err
       Link detected: yes
[gbase@GBase-NODE1 ~]$ ethtool enp138s0
Settings for enp138s0:
       Supported ports: [ FIBRE ]
       Supported link modes:   10000baseKR/Full 
                               25000baseCR/Full 
                               25000baseKR/Full 
       Supported pause frame use: Symmetric
       Supports auto-negotiation: No
       Supported FEC modes: Not reported
       Advertised link modes:  10000baseKR/Full 
                               25000baseCR/Full 
                               25000baseKR/Full 
       Advertised pause frame use: Symmetric
       Advertised auto-negotiation: No
       Advertised FEC modes: Not reported
       Speed: 10000Mb/s
       Duplex: Full
       Port: FIBRE
       PHYAD: 0
       Transceiver: internal
       Auto-negotiation: off
Cannot get wake-on-lan settings: Operation not permitted
       Current message level: 0x00000045 (69)
                              drv link rx_err
       Link detected: yes
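To find the 10 GbE ports without reading each ethtool dump by hand, the Speed line can be grepped per interface. A small sketch; parse_speed is a helper introduced here, not part of any toolchain:

```shell
# Print the negotiated speed of every interface that reports one in Mb/s.
parse_speed() { awk -F': *' '$1 ~ /Speed/ && $2 ~ /Mb/ { print $2 }'; }

for dev in /sys/class/net/*; do
  i=$(basename "$dev")
  speed=$(ethtool "$i" 2>/dev/null | parse_speed)
  [ -n "$speed" ] && echo "$i: $speed"
done || true
```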

II. OS Tuning

(1) Disable IRQ balancing

service irqbalance stop
#service sysmonitor stop
service rsyslog stop

(2) Disable NUMA balancing

echo 0 > /proc/sys/kernel/numa_balancing

(3) Disable transparent huge pages

echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag

(4) Apply a tuned profile

tuned-adm profile throughput-performance
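The four tweaks above can be collected into one best-effort script (a sketch; run as root, and note that none of these settings survive a reboot unless persisted via sysctl.conf, rc.local, or the kernel cmdline). active_mode is a small helper introduced here for verifying which THP value is active:

```shell
# Apply the OS-level tuning steps in one pass; each step is best-effort.
for svc in irqbalance rsyslog; do
  service "$svc" stop 2>/dev/null || true      # skip if not installed
done

# Disable NUMA balancing
echo 0 2>/dev/null > /proc/sys/kernel/numa_balancing || true

# Disable transparent huge pages (both allocation and defrag)
for f in enabled defrag; do
  echo never 2>/dev/null > "/sys/kernel/mm/transparent_hugepage/$f" || true
done

# Throughput-oriented tuned profile
command -v tuned-adm >/dev/null && tuned-adm profile throughput-performance || true

# Verify: active_mode prints the bracketed (active) value of a THP control
# file, e.g. "always madvise [never]" -> "never".
active_mode() { sed -n 's/.*\[\(.*\)\].*/\1/p'; }
active_mode < /sys/kernel/mm/transparent_hugepage/enabled 2>/dev/null || true
```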

III. Network Tuning

(1) Cluster network layout

Each server has multiple NICs; it is recommended to separate the database's management traffic from its data traffic:

New plan:

  • MGMT — carries management traffic
  • SERVICE — carries data traffic

The adjusted yml deployment file:

gha_server:
 - gha_server3:
     host: 10.100.100.1
     port: 20001
 - gha_server4:
     host: 10.100.100.2
     port: 20001
dcs:
 - host: 10.100.100.1
   port: 2379
 - host: 10.100.100.3
   port: 2379
 - host: 10.100.100.4
   port: 2379
gtm:
 - gtm3:
     host: 10.185.103.1
     agent_host: 10.100.100.1
     role: primary
     port: 6666
     agent_port: 8003
     work_dir: /data/mpp/gbase/data/gtm3
 - gtm4:
     host: 10.185.103.2
     agent_host: 10.100.100.2
     role: standby
     port: 6666
     agent_port: 8004
     work_dir: /data/mpp/gbase/data/gtm4
 - gtm5:
     host: 10.185.103.4
     agent_host: 10.100.100.4
     role: standby
     port: 6666
     agent_port: 8005
     work_dir: /data/mpp/gbase/data/gtm5
coordinator:
 - cn3:
     host: 10.185.103.1
     agent_host: 10.100.100.1
     role: primary
     port: 5432
     agent_port: 8008
     work_dir: /data/gbase/mpp/data/coord/cn3
 - cn4:
     host: 10.185.103.2
     agent_host: 10.100.100.2
     role: primary
     port: 5432
     agent_port: 8009
     work_dir: /data/gbase/mpp/data/coord/cn4
datanode:
 - dn1:
     - dn1_3:
         host: 10.185.103.1
         agent_host: 10.100.100.1
         role: primary
         port: 20010
         agent_port: 8032
         work_dir: /data/mpp/gbase/data/dn1/dn1_3
     - dn1_4:
         host: 10.185.103.2
         agent_host: 10.100.100.2
         role: standby
         port: 20010
         agent_port: 8032
         work_dir: /data/mpp/gbase/data/dn1/dn1_4
 - dn2:
     - dn2_3:
         host: 10.185.103.2
         agent_host: 10.100.100.2
         role: primary
         port: 20020
         agent_port: 8033
         work_dir: /data/mpp/gbase/data/dn2/dn2_3
         # numa:
         #   cpu_node_bind: 0,1
         #   mem_node_bind: 0,1
     - dn2_4:
         host: 10.185.103.1
          agent_host: 10.100.100.1
         role: standby
         port: 20020
         agent_port: 8033
         work_dir: /data/mpp/gbase/data/dn2/dn2_4
         # numa:
         #   cpu_node_bind: 2
         #   mem_node_bind: 2
 - dn3:
     - dn3_3:
         host: 10.185.103.3
         agent_host: 10.100.100.3
         role: primary
         port: 20030
         agent_port: 8033
         work_dir: /data/mpp/gbase/data/dn3/dn3_3
     - dn3_4:
         host: 10.185.103.4
         agent_host: 10.100.100.4
         role: standby
         port: 20030
         agent_port: 8033
         work_dir: /data/mpp/gbase/data/dn3/dn3_4
 - dn4:
     - dn4_3:
         host: 10.185.103.4
         agent_host: 10.100.100.4
         role: primary
         port: 20040
         agent_port: 8034
         work_dir: /data/mpp/gbase/data/dn4/dn4_3
     - dn4_4:
         host: 10.185.103.3
         agent_host: 10.100.100.3
         role: standby
         port: 20040
         agent_port: 8034
         work_dir: /data/mpp/gbase/data/dn4/dn4_4
env:
 # cluster_type allowed values: multiple-nodes, single-inst, default is multiple-nodes
 cluster_type: multiple-nodes
 pkg_path: /data/mpp/gbase/gbase_pkg_b78
 prefix: /data/mpp/gbase/gbase_db
 version: V5_S3.0.0B78
 user: gbase
 port: 22
# constant:
# virtual_ip: 100.0.1.254/24
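One quick sanity check on a file like this: every agent_host should sit on the management subnet (10.100.100.1~4 here). A hedged sketch — check_agent_hosts is a throwaway helper and the file name gbase.yml is an assumption:

```shell
# Count agent_host lines whose address falls outside 10.100.100.1-4.
# Reads the yml on stdin; prints 0 when everything is on the mgmt subnet.
check_agent_hosts() {
  awk '/agent_host/ && $2 !~ /^10\.100\.100\.[1-4]$/ { n++ } END { print n+0 }'
}

check_agent_hosts 2>/dev/null < gbase.yml || true
```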

(2) Bind NIC interrupts to cores

The argument 16 means that, out of the 96 cores, the last 4 cores of each of the 4 NUMA nodes are selected for interrupt binding:

sudo bind_net_irq.sh 16
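The selection can be sketched as follows: with the NUMA ranges from the lscpu output earlier, the bound cores come out to 20-23, 44-47, 68-71, and 92-95 (tail_cores is an illustrative helper, not part of the vendor script):

```shell
# For each NUMA node's CPU range, take the last N cores.
tail_cores() {              # usage: tail_cores <first-last> <count>
  local last=${1#*-} n=$2
  echo "$((last - n + 1))-$last"
}

# NUMA node ranges as reported by lscpu on these servers.
for range in 0-23 24-47 48-71 72-95; do
  tail_cores "$range" 4
done
```

Notably, these are exactly the cores left out of the thread_pool_attr cpubind ranges used later (1-19,24-43,48-67,72-91), so database worker threads and NIC interrupts do not compete for the same cores.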

Interrupt binding requires SMMU to be disabled. If SMMU is enabled in the BIOS, the script must be adjusted by hand to comment out the corresponding check, in /data/mpp/install/gbase/app/bin/bind_net_irq.sh:

function check_os()
{
   echo "Platform number of cores:" $core_no
   if [ $numa_node -ne 4 ]
   then
        echo "Warning: Number of NUMA nodes is not matched, please Disable DIE interleave in BIOS"
   fi
#    smmu=$(ps aux|grep smmu)
#    if [[ $smmu == *"arm-smmu"* ]]
#    then
#        echo $smmu
#        echo "Error: SMMU enabled, please Disable SMMU in BIOS"
#        exit
#    fi
}

IV. Database Parameter Tuning

Tune the GUC parameters on the CN, GTM, and DN nodes separately. Note that fsync=off and synchronous_commit=off trade durability for throughput and are only appropriate for benchmarking:

-- parameter changes
cn:
gs_guc reload -N all -I all -Z coordinator -c "max_process_memory = 20GB"
gs_guc reload -N all -I all -Z coordinator -c "max_connections = 2048"
gs_guc reload -N all -I all -Z coordinator -c "max_prepared_transactions = 2048"
gs_guc reload -N all -I all -Z coordinator -c "shared_buffers=10GB"
gs_guc reload -N all -I all -Z coordinator -c "work_mem = 4MB"
gs_guc reload -N all -I all -Z coordinator -c "fsync=off"
gs_guc reload -N all -I all -Z coordinator -c "synchronous_commit=off"
gs_guc reload -N all -I all -Z coordinator -c "maintenance_work_mem = 64MB"
gs_guc reload -N all -I all -Z coordinator -c "checkpoint_segments=512"
-- two alternative thread_pool_attr settings; the value reloaded last wins.
-- The cpubind ranges leave out core 0 and the last 4 cores of each NUMA
-- node, which are reserved for NIC interrupts (see the binding step above)
gs_guc reload -N all -I all -Z coordinator -c "thread_pool_attr='96,2,(nodebind:1,2)'"
gs_guc reload -N all -I all -Z coordinator -c "thread_pool_attr='812,4,(cpubind:1-19,24-43,48-67,72-91)'"
gs_guc reload -N all -I all -Z coordinator -c "enable_memory_limit = on"
gs_guc reload -N all -I all -Z coordinator -c "cstore_buffers = 1GB"
gs_guc reload -N all -I all -Z coordinator -c "temp_buffers=64MB"
gs_guc reload -N all -I all -Z coordinator -c "checkpoint_timeout=30min"
gs_guc reload -N all -I all -Z coordinator -c "random_page_cost = 1.1"
gs_guc reload -N all -I all -Z coordinator -c "max_pool_size = 4096"

dn:
gs_guc reload -N all -I all -Z datanode -c "max_process_memory = 180GB"
gs_guc reload -N all -I all -Z datanode -c "max_connections = 4096"
gs_guc reload -N all -I all -Z datanode -c "max_prepared_transactions = 4096"
gs_guc reload -N all -I all -Z datanode -c "shared_buffers=80GB"
gs_guc reload -N all -I all -Z datanode -c "work_mem = 4MB"
gs_guc reload -N all -I all -Z datanode -c "fsync = off"
gs_guc reload -N all -I all -Z datanode -c "synchronous_commit = off"
gs_guc reload -N all -I all -Z datanode -c "maintenance_work_mem = 64MB"
gs_guc reload -N all -I all -Z datanode -c "checkpoint_segments=512"
-- two alternative thread_pool_attr settings; the value reloaded last wins
gs_guc reload -N all -I all -Z datanode -c "thread_pool_attr='96,2,(nodebind:1,2)'"
gs_guc reload -N all -I all -Z datanode -c "thread_pool_attr='812,4,(cpubind:1-19,24-43,48-67,72-91)'"
gs_guc reload -N all -I all -Z datanode -c "enable_memory_limit = on"
gs_guc reload -N all -I all -Z datanode -c "cstore_buffers = 1GB"
gs_guc reload -N all -I all -Z datanode -c "temp_buffers=64MB"
gs_guc reload -N all -I all -Z datanode -c "checkpoint_timeout=30min"
gs_guc reload -N all -I all -Z datanode -c "random_page_cost = 1.1"

gtm:
gs_guc reload -N all -I all -Z gtm -c "max_process_memory = 20GB"
gs_guc reload -N all -I all -Z gtm -c "max_connections=2048"
gs_guc reload -N all -I all -Z gtm -c "max_prepared_transactions=2048"
gs_guc reload -N all -I all -Z gtm -c "shared_buffers=2GB"
gs_guc reload -N all -I all -Z gtm -c "work_mem = 4MB"
gs_guc reload -N all -I all -Z gtm -c "fsync=off"
gs_guc reload -N all -I all -Z gtm -c "synchronous_commit=off"
gs_guc reload -N all -I all -Z gtm -c "maintenance_work_mem = 64MB"
gs_guc reload -N all -I all -Z gtm -c "checkpoint_segments=512"
gs_guc reload -N all -I all -Z gtm -c "enable_memory_limit = on"
gs_guc reload -N all -I all -Z gtm -c "cstore_buffers = 1GB"
gs_guc reload -N all -I all -Z gtm -c "temp_buffers=64MB"
gs_guc reload -N all -I all -Z gtm -c "checkpoint_timeout=30min"
gs_guc reload -N all -I all -Z gtm -c "random_page_cost = 1.1"
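After the reloads, the values can be spot-checked from any node by piping SHOW statements into gsql (a sketch; the database name and port below are assumptions for this environment):

```shell
# Emit one SHOW statement per parameter of interest; pipe the result
# into gsql, e.g.:  build_checks | gsql -d postgres -p 5432
build_checks() {
  for p in max_connections shared_buffers synchronous_commit fsync; do
    echo "SHOW $p;"
  done
}

build_checks
```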

With the steps above, GBase 8c can be tuned for its best TPC-C results. If you have other suggestions, feel free to share them in the comments.