1.1 GreenPlum Download Locations

1) GreenPlum official website 2) Documentation 3) Download page

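1.2 GreenPlum Installation and Deployment
1.2.0 Preparing the Template Virtual Machine
0) Install the template virtual machine: IP address 192.168.10.101, hostname greenplum101, 8 GB of memory, 4 cores, 50 GB disk.
1) The greenplum100 virtual machine is configured as follows (all Linux systems in this article use CentOS-7-x86_64 as the example; it can be downloaded from the official website).
(1) Edit the network parameters so that the virtual machine can reach the network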

vim /etc/sysconfig/network-scripts/ifcfg-ens33
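
The contents of the file are not listed in the text. As a rough sketch, a static-IP configuration for the template machine might contain the settings below. The address matches the greenplum100 entry in the /etc/hosts file shown later; the gateway and DNS values are assumptions for a typical VMware NAT network and need to be replaced with the values of your own virtual network.

```shell
# /etc/sysconfig/network-scripts/ifcfg-ens33 (sketch; adjust to your network)
TYPE=Ethernet
# Use a fixed address instead of DHCP
BOOTPROTO=static
NAME=ens33
DEVICE=ens33
# Bring the interface up at boot
ONBOOT=yes
IPADDR=192.168.10.100
PREFIX=24
# Assumption: gateway of a typical VMware NAT network
GATEWAY=192.168.10.2
# Assumption: the same address also answers DNS queries
DNS1=192.168.10.2
```

On CentOS 7, systemctl restart network applies the change after the file is saved.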

(2) Test whether the network connection works

[root@greenplum100 ~]# ping www.baidu.com

PING www.baidu.com (14.215.177.39) 56(84) bytes of data.
64 bytes from 14.215.177.39 (14.215.177.39): icmp_seq=1 ttl=128 time=8.60 ms
64 bytes from 14.215.177.39 (14.215.177.39): icmp_seq=2 ttl=128 time=7.72 ms

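(3) Install epel-release

Note: Extra Packages for Enterprise Linux (EPEL) provides additional software packages for Red Hat family operating systems, including RHEL, CentOS, and Scientific Linux. It acts as an extra software repository, because most of these rpm packages cannot be found in the official repositories.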

[root@greenplum100 ~]# yum install -y epel-release

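Note: if Linux was installed as a minimal system, the following tools also need to be installed; if the standard desktop edition was installed, this step is not needed.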

[root@greenplum100 ~]# yum install -y  vim net-tools psmisc  nc  rsync  lrzsz  ntp libzstd openssl-static tree iotop git

2) Remove the JDK that ships with the virtual machine

Note: skip this step if the virtual machine was installed as a minimal system.

[root@greenplum100 ~]# rpm -qa | grep -i java | xargs -n1 rpm -e --nodeps

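rpm -qa: lists all installed rpm packages
grep -i: ignores case
xargs -n1: passes only one argument per invocation
rpm -e --nodeps: removes the package without checking dependencies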
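
To see how xargs -n1 turns the package list into one rpm call per package, the pipeline can be dry-run with echo in front of rpm. This is only a sketch: the function name preview_removal and the three package names are hypothetical placeholders, not output from a real system.

```shell
# Dry run: print the command that xargs would execute for each input line.
# The arguments stand in for the output of `rpm -qa | grep -i java`.
preview_removal() {
    printf '%s\n' "$@" | xargs -n1 echo rpm -e --nodeps
}

# Hypothetical package names, used only for illustration.
preview_removal java-1.8.0-openjdk java-1.8.0-openjdk-headless javapackages-tools
```

Each input line produces its own rpm -e --nodeps command. With GNU xargs, adding -r avoids running rpm at all when grep finds no matching package.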

3) Set up the hosts mapping file on the template machine

vim /etc/hosts 

192.168.10.100 greenplum100
192.168.10.101 greenplum101
192.168.10.102 greenplum102
192.168.10.103 greenplum103
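
After editing, it can be useful to confirm that every cluster hostname really has an entry. The helper below is a minimal sketch; check_hosts is a hypothetical name, and it only inspects the file itself rather than querying DNS.

```shell
# Report which of the given hostnames have an entry in a hosts-format file.
check_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Look for the name as a whole field (after the IP) on a non-comment line.
        if awk -v n="$name" '
            !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$hosts_file"; then
            echo "OK       $name"
        else
            echo "MISSING  $name"
        fi
    done
}

# Typical use on the template machine:
# check_hosts /etc/hosts greenplum100 greenplum101 greenplum102 greenplum103
```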

4) Reboot the virtual machine

[root@greenplum100 ~]# reboot

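1.2.1 Environment Preparation

Hardware: 3 cloned virtual machines (each with 4 cores, 8 GB of memory, 50 GB of storage).
Operating system: CentOS-7.5-x86-1804.
GreenPlum version: open-source-greenplum-db-6.25.3-rhel7-x86_64
Node plan: 1 master node, no standby node, 2 segment nodes
Hostnames and node roles are as follows: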

Host IP          Hostname        Role
192.168.3.101    greenplum101    master node
192.168.3.102    greenplum102    segment1 node
192.168.3.103    greenplum103    segment2 node

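1.2.2 Pre-installation Configuration
1) Check the required dependency packages
GP 6.x requires a dependency check before installation; make sure all of the following packages are installed.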

yum install -y apr apr-util bash bzip2 curl krb5-libs libcurl libevent libxml2 libyaml zlib openldap openssh-clients openssl openssl-libs perl readline rsync R sed tar zip krb5-devel

2) Disable SELinux

SELinux must be disabled on all three nodes: edit /etc/selinux/config and set SELINUX=disabled.
[root@greenplum102 ~]# vim /etc/selinux/config

3) Disable the firewall

The firewall must be disabled on all three nodes.

[root@greenplum102 ~]# systemctl stop firewalld
[root@greenplum102 ~]# systemctl disable firewalld

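4) Operating system parameters
Modify the operating system parameters in /etc/sysctl.conf.
(1) Shared memory (must be calculated separately on each node)
kernel.shmall = _PHYS_PAGES / 2, that is, half of the system's total number of physical memory pages. getconf _PHYS_PAGES shows the total.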

[root@greenplum102 ~]# echo $(expr $(getconf _PHYS_PAGES) / 2)

482661

kernel.shmmax = kernel.shmall * PAGE_SIZE. getconf PAGE_SIZE returns the page size.

[root@greenplum102 ~]# echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))

1976979456
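
The two calculations can be combined into one short script that prints both lines in sysctl.conf form. This is a minimal sketch using shell arithmetic instead of expr; the variable names are illustrative.

```shell
# Compute the shared-memory settings for the current host.
phys_pages=$(getconf _PHYS_PAGES)   # total number of physical memory pages
page_size=$(getconf PAGE_SIZE)      # page size in bytes

shmall=$((phys_pages / 2))          # half of all pages
shmmax=$((shmall * page_size))      # the same amount expressed in bytes

echo "kernel.shmall = $shmall"
echo "kernel.shmmax = $shmmax"
```

On the sample host, shmall is 482661, and the two outputs imply a 4096-byte page: 482661 * 4096 = 1976979456.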

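(2) Host memory
vm.overcommit_memory: the system uses this parameter to determine how much memory can be allocated to processes. For the GP database, set it to 2.
vm.overcommit_ratio: the percentage of RAM that may be allocated to processes; the rest is reserved for the operating system. The default is 50; 95 is recommended.
(3) Port settings
To avoid port conflicts between Greenplum and other applications during initialization, note the port range specified by net.ipv4.ip_local_port_range. When initializing Greenplum with gpinitsystem, do not choose Greenplum database ports inside that range.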
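
A quick way to apply this rule is to test each planned Greenplum port against the range. The sketch below uses the range 10000 65535 from the sysctl.conf in this guide and the PGPORT value 5432 set later in .bashrc; the helper name port_in_range and the segment base port 6000 are assumptions made for illustration.

```shell
# Succeed (exit status 0) if a port lies inside the inclusive range lo..hi.
port_in_range() {
    p=$1
    lo=$2
    hi=$3
    [ "$p" -ge "$lo" ] && [ "$p" -le "$hi" ]
}

# Range taken from net.ipv4.ip_local_port_range in this guide.
range_low=10000
range_high=65535

# 5432 is the PGPORT used in this guide; 6000 is an assumed segment base port.
for port in 5432 6000; do
    if port_in_range "$port" "$range_low" "$range_high"; then
        echo "port $port conflicts with ip_local_port_range $range_low-$range_high"
    else
        echo "port $port is outside ip_local_port_range $range_low-$range_high"
    fi
done
```
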
(4) System memory
For systems with more than 64 GB of memory, the following settings are recommended:

vm.dirty_background_ratio = 0
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736 # 1.5GB
vm.dirty_bytes = 4294967296 # 4GB

For systems with 64 GB of memory or less, remove the vm.dirty_background_bytes and vm.dirty_bytes settings and set the following parameters instead.

vm.dirty_background_ratio = 3
vm.dirty_ratio = 10
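
The choice between the two variants can be scripted. The sketch below prints the settings that match a given amount of RAM; the function name dirty_settings is hypothetical, and reading "64 GB" as 64 * 1024^3 bytes is an assumption.

```shell
# Print the recommended dirty-page settings for a given amount of RAM in bytes.
dirty_settings() {
    mem_bytes=$1
    # Assumption: the 64 GB threshold is interpreted as 64 * 1024^3 bytes.
    limit=$((64 * 1024 * 1024 * 1024))
    if [ "$mem_bytes" -gt "$limit" ]; then
        echo "vm.dirty_background_ratio = 0"
        echo "vm.dirty_ratio = 0"
        echo "vm.dirty_background_bytes = 1610612736"
        echo "vm.dirty_bytes = 4294967296"
    else
        echo "vm.dirty_background_ratio = 3"
        echo "vm.dirty_ratio = 10"
    fi
}

# Total RAM of the current host = page count * page size.
total_bytes=$(( $(getconf _PHYS_PAGES) * $(getconf PAGE_SIZE) ))
dirty_settings "$total_bytes"
```

For the 8 GB virtual machines used here, the second variant (ratios 3 and 10) applies.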

The final system parameter configuration used here is shown below. Edit /etc/sysctl.conf with vim on all three machines.

# kernel.shmall and kernel.shmmax are the two values calculated for this node in step (1)
kernel.shmall = 482661
kernel.shmmax = 1976979456
# Maximum number of shared memory segments system-wide; the default is 4096
kernel.shmmni = 4096
# See Segment Host Memory
vm.overcommit_memory = 2
# See Segment Host Memory
vm.overcommit_ratio = 95
# See Port Settings
net.ipv4.ip_local_port_range = 10000 65535
kernel.sem = 500 2048000 200 40960
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048

net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
# See System Memory
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10

After editing, reload the parameters with sysctl -p.

5) System resource limits
Edit the resource limits configuration file (vim /etc/security/limits.conf) on all 3 hosts and add the following parameters:

*       soft    nofile  65536
*       hard    nofile  65536
*       soft    nproc   131072
*       hard    nproc   131072
  • The asterisk "*" stands for all users
  • nproc is the maximum number of processes
  • nofile is the maximum number of open files

In addition, on CentOS 7 the file /etc/security/limits.d/20-nproc.conf must also be edited (vim /etc/security/limits.d/20-nproc.conf).

* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072

Log out and log back in. ulimit -u shows the maximum number of processes available to each user (max user processes); verify that it returns 131072.
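
Both limits can be checked in one go after logging back in. The following is a small sketch; check_limit is a hypothetical helper, and the expected values are the ones configured above.

```shell
# Compare an actual limit with the expected one and report the result.
check_limit() {
    name=$1
    actual=$2
    expected=$3
    if [ "$actual" = "$expected" ]; then
        echo "OK        $name = $actual"
    else
        echo "MISMATCH  $name = $actual (expected $expected)"
    fi
}

check_limit "max user processes (ulimit -u)" "$(ulimit -u)" 131072
check_limit "open files (ulimit -n)" "$(ulimit -n)" 65536
```

A MISMATCH line usually means the session was started before the files were edited, or that another file under /etc/security/limits.d/ overrides the value.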

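6) SSH connection thresholds
The Greenplum management utilities gpexpand, gpinitsystem, and gpaddmirrors use SSH connections to perform their tasks. In larger Greenplum clusters, the number of SSH connections a utility opens can exceed the host's maximum threshold for unauthenticated connections. When this happens, the following error is reported: ssh_exchange_identification: Connection closed by remote host.
To avoid this, update the MaxStartups and MaxSessions parameters in /etc/ssh/sshd_config or /etc/sshd_config.
Log in to every server as root, edit /etc/ssh/sshd_config, and restart the sshd service afterwards so that the parameters take effect.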

[root@greenplum102 ~]# vi /etc/ssh/sshd_config

Note: find these two parameters and change them
MaxSessions 200
MaxStartups 100:30:1000
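
MaxStartups takes the form start:rate:full, so 100:30:1000 lets sshd begin refusing unauthenticated connections with a probability of 30% once 100 are pending, rising to 100% at 1000. Since the same change is needed on every server, it can be scripted instead of edited by hand. The helper below is a sketch, not part of Greenplum: set_sshd_option is a hypothetical name, it assumes GNU sed, and it does not handle directives inside Match blocks.

```shell
# Set a "Key value" directive in an sshd_config-style file.
# Replaces an existing (possibly commented-out) line, or appends one if absent.
set_sshd_option() {
    file=$1
    key=$2
    value=$3
    if grep -Eq "^[#[:space:]]*$key[[:space:]]" "$file"; then
        sed -i -E "s|^[#[:space:]]*$key[[:space:]].*|$key $value|" "$file"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}

# Typical use as root, followed by a restart of sshd:
# set_sshd_option /etc/ssh/sshd_config MaxSessions 200
# set_sshd_option /etc/ssh/sshd_config MaxStartups 100:30:1000
# systemctl restart sshd
```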

7) Set the character set
Check the host's character set; it must be en_US.UTF-8. Inspect the LANG environment variable or run the locale command.

[root@greenplum102 ~]# echo $LANG

en_US.UTF-8

If the character set is not en_US.UTF-8, set it as root, log out and back in, and check again that the setting has taken effect.

localectl set-locale LANG=en_US.UTF-8

8) Make sure the cluster clocks are synchronized (if the time differs between nodes, run the following command on every node)

ntpdate cn.pool.ntp.org

1.2.3 Installing GreenPlum

1) Create the gpadmin group and user (on all three nodes)

[root@greenplum102 ~]# groupadd gpadmin
[root@greenplum102 ~]# useradd gpadmin -r -m -g gpadmin
[root@greenplum102 ~]# passwd gpadmin
Changing password for user gpadmin.
New password: 
BAD PASSWORD: The password is shorter than 8 characters
Retype new password: 
passwd: all authentication tokens updated successfully.

2) Give the gpadmin user root privileges so that root-level commands can later be run with sudo

[root@greenplum102 ~]# vim /etc/sudoers
## Allow root to run any commands anywhere
root    ALL=(ALL)     ALL
gpadmin ALL=(ALL)     NOPASSWD:ALL

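3) Configure passwordless login between the nodes (same steps on all three machines; greenplum102 is used as the example below). Run this as the gpadmin user.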

[gpadmin@greenplum102 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/gpadmin/.ssh/id_rsa): 
Created directory '/home/gpadmin/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/gpadmin/.ssh/id_rsa.
Your public key has been saved in /home/gpadmin/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:FvcBSHQ0GD9WPRdWeUN1bigvRFcH7E5vtxUPLfUZknY gpadmin@greenplum102
The key's randomart image is:
+---[RSA 2048]----+
|       o+== oo=*@|
|        oo = ==E*|
|        . = =.+=O|
|         + + +=+o|
|        S   oo.+o|
|       .     .. *|
|               .+|
|               . |
|                 |
+----[SHA256]-----+

Distribute the generated key (every node needs to run the commands below).

For example, the current node runs ssh-copy-id greenplum102, ssh-copy-id greenplum103, and ssh-copy-id greenplum104 (greenplum103 and greenplum104 are the other two nodes).

[gpadmin@greenplum102 ~]$ ssh-copy-id greenplum102
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/gpadmin/.ssh/id_rsa.pub"
The authenticity of host 'greenplum102 (192.168.10.102)' can't be established.
ECDSA key fingerprint is SHA256:JZDFincZCdUJWBBwiwlF/7xp5ZHoCrbjWxqE30gh1vw.
ECDSA key fingerprint is MD5:39:83:38:be:32:2d:99:66:42:98:fa:01:8a:c5:98:4e.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
gpadmin@greenplum102's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'greenplum102'"
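
The three calls can be written as a loop. The sketch below only prints the commands so that it is safe to try; the function name print_copy_commands is hypothetical, and the host names are the ones used in this step.

```shell
# Hosts that should receive this node's public key (including the node itself).
HOSTS="greenplum102 greenplum103 greenplum104"

# Print one ssh-copy-id command per host (dry run).
print_copy_commands() {
    for host in $HOSTS; do
        echo "ssh-copy-id $host"
    done
}

print_copy_commands
```

To actually distribute the key, run the printed commands, or replace the echo line with ssh-copy-id "$host"; each host then asks for the gpadmin password once.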

4) Configure the GreenPlum ssh trust settings (on the greenplum102 node)
Create a conf directory under /home/gpadmin to hold the configuration files

[gpadmin@greenplum102 ~]$ mkdir -p /home/gpadmin/conf

Create two files in /home/gpadmin/conf: hostlist and seg_hosts

[gpadmin@greenplum102 ~]$ touch /home/gpadmin/conf/hostlist

[gpadmin@greenplum102 ~]$ touch /home/gpadmin/conf/seg_hosts

Edit hostlist and seg_hosts and enter the following content

[gpadmin@greenplum102 ~]$ vim /home/gpadmin/conf/hostlist
greenplum102
greenplum103
greenplum104
 
[gpadmin@greenplum102 ~]$ vim /home/gpadmin/conf/seg_hosts
greenplum103
greenplum104
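
Both files can also be written without an editor. The following is a minimal sketch; write_host_files is a hypothetical helper that takes the target directory as its argument and uses the host names shown above.

```shell
# Write hostlist (all hosts) and seg_hosts (segment hosts only) into a directory.
write_host_files() {
    conf_dir=$1
    mkdir -p "$conf_dir"
    printf '%s\n' greenplum102 greenplum103 greenplum104 > "$conf_dir/hostlist"
    printf '%s\n' greenplum103 greenplum104 > "$conf_dir/seg_hosts"
}

# Typical use as gpadmin on the master node:
# write_host_files /home/gpadmin/conf
```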

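5) Upload the GreenPlum installation package, then install it with rpm (on all three machines). If the installation reports missing dependencies, install those packages with yum.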

[gpadmin@greenplum102 software]# mkdir -p /home/gpadmin/software
[gpadmin@greenplum102 software]# sudo yum -y install ./open-source-greenplum-db-6.25.3-rhel7-x86_64.rpm

Note: after the installation, give the gpadmin user ownership of the installation directory

[gpadmin@greenplum102 local]$ sudo chown -R gpadmin:gpadmin /usr/local/greenplum-db*

6) Use gpssh-exkeys to exchange keys between all servers (run on the master node)

[gpadmin@greenplum102 ~]# cd /usr/local/greenplum-db-6.25.3/

Note: source must be followed by a space.

[gpadmin@greenplum102 greenplum-db-6.25.3]$ source /usr/local/greenplum-db-6.25.3/greenplum_path.sh

[gpadmin@greenplum102 greenplum-db-6.25.3]$ cd /home/gpadmin/conf

[gpadmin@greenplum102 conf]$ gpssh-exkeys -f hostlist
[STEP 1 of 5] create local ID and authorize on local host
  ... /home/gpadmin/.ssh/id_rsa file exists ... key generation skipped

[STEP 2 of 5] keyscan all hosts and update known_hosts file

[STEP 3 of 5] retrieving credentials from remote hosts
  ... send to greenplum103
  ... send to greenplum104

[STEP 4 of 5] determine common authentication file content

[STEP 5 of 5] copy authentication files to all remote hosts
  ... finished key exchange with greenplum103
  ... finished key exchange with greenplum104

[INFO] completed successfully

7) Configure the environment variables in .bashrc and GPHOME (on all nodes, as the gpadmin user). Start with .bashrc.

[gpadmin@greenplum102 software]# mkdir -p /home/gpadmin/data/master

Modify it as follows:

cat <<EOF>> /home/gpadmin/.bashrc

source /usr/local/greenplum-db/greenplum_path.sh

export PGPORT=5432

export PGUSER=gpadmin

export MASTER_DATA_DIRECTORY=/home/gpadmin/data/master/gpseg-1

export PGDATABASE=gp_sydb

export LD_PRELOAD=/lib64/libz.so.1 ps

EOF

[gpadmin@greenplum102 conf]$ source /home/gpadmin/.bashrc

8) Configure the GPHOME environment variable by editing the file directly

[gpadmin@greenplum102 ~]# vim /usr/local/greenplum-db/greenplum_path.sh

# Add the following path
GPHOME=/usr/local/greenplum-db

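9) Create the data directories (on the master node)

Create the data directories on each node. After this step, every machine listed in hostlist has a data directory, and each data directory contains master, primary, and mirror folders.

Steps: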

[gpadmin@greenplum102 ~]# gpssh -f /home/gpadmin/conf/hostlist

[gpadmin@greenplum102 conf]$ gpssh -f /home/gpadmin/conf/hostlist
=> mkdir data
[greenplum102]
[greenplum103] mkdir: cannot create directory ‘data’: File exists
[greenplum104] mkdir: cannot create directory ‘data’: File exists
=> cd data
[greenplum102]
[greenplum103]
[greenplum104]
=> mkdir master
[greenplum102]
[greenplum103] mkdir: cannot create directory ‘master’: File exists
[greenplum104] mkdir: cannot create directory ‘master’: File exists
=> mkdir primary
[greenplum102]
[greenplum103]
[greenplum104]
=> exit[greenplum102]
[greenplum103]
[greenplum104]
=> exit

Note: messages such as cannot create directory ‘master’: File exists can be ignored.
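
The File exists messages disappear when the directories are created with mkdir -p, which succeeds silently if a directory is already there. The helper below is a sketch (make_data_dirs is a hypothetical name); the commented gpssh line shows what a non-interactive equivalent across all hosts could look like, assuming gpssh is on the PATH after sourcing greenplum_path.sh.

```shell
# Create the data directory layout under a base directory.
# mkdir -p creates missing parents and does not fail on existing directories.
make_data_dirs() {
    base=$1
    mkdir -p "$base/master" "$base/primary" "$base/mirror"
}

# Typical use on one host:
# make_data_dirs /home/gpadmin/data
#
# Assumed equivalent for all hosts in hostlist, in a single gpssh call:
# gpssh -f /home/gpadmin/conf/hostlist -e 'mkdir -p /home/gpadmin/data/master /home/gpadmin/data/primary /home/gpadmin/data/mirror'
```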

10) Connectivity check (on the master node, as the gpadmin user)

[gpadmin@greenplum102 conf]$ gpcheckperf -f /home/gpadmin/conf/hostlist -r N -d /tmp

[INFO] --buffer-size value is not specified or invalid. Using default (32 kilobytes)
/usr/local/greenplum-db-6.25.3/bin/gpcheckperf -f /home/gpadmin/conf/hostlist -r N -d /tmp

-------------------
--  NETPERF TEST
-------------------

====================
==  RESULT 2023-12-01T02:35:41.263196
====================
Netperf bisection bandwidth test
greenplum102 -> greenplum103 = 333.010000
greenplum104 -> greenplum105 = 330.740000
greenplum103 -> greenplum102 = 337.730000

Summary:
sum = 1346.55 MB/sec
min = 330.74 MB/sec
max = 345.07 MB/sec
avg = 336.64 MB/sec
median = 337.73 MB/sec

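1.2.4 Cluster Initialization

1) Create the initialization configuration file (on the master node, as the gpadmin user)

Copy the template file gpinitsystem_config first, then modify the copy.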

mkdir /home/gpadmin/gpconfigs

cd /home/gpadmin/gpconfigs

cp /usr/local/greenplum-db/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/gpinitsystem_config 

vim /home/gpadmin/gpconfigs/gpinitsystem_config

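# ------------------------ Configuration content follows ------------------------

# Data directories of the primary segments. The number of entries in the parentheses is the number of PostgreSQL instances (segments) created on each host; this example uses 2.

# Note: replace the corresponding line in the template with the line below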
declare -a DATA_DIRECTORY=(/home/gpadmin/data/primary /home/gpadmin/data/primary)

# Hostname of the master node

MASTER_HOSTNAME=greenplum102

# Data directory of the master node

MASTER_DIRECTORY=/home/gpadmin/data/master

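# Data directories of the mirror segments. The rules are the same as for DATA_DIRECTORY, and the number of entries must match DATA_DIRECTORY.

# Greenplum segment data has primaries and mirrors; when a primary fails, its mirror takes over.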

# Note: replace the corresponding line in the template with the line below
declare -a MIRROR_DATA_DIRECTORY=(/home/gpadmin/data/primary /home/gpadmin/data/primary)

# Name of the default database. It must match the database name in the environment variables, otherwise initialization fails.

DATABASE_NAME=gp_sydb

2. Add a configuration file named hostfile_gpinitsystem in /home/gpadmin/gpconfigs
Steps:
cd /home/gpadmin/gpconfigs
vim hostfile_gpinitsystem, with the following content:
greenplum103
greenplum104
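
The two files together determine the size of the cluster: gpinitsystem creates one primary segment per DATA_DIRECTORY entry on every host listed in hostfile_gpinitsystem. The arithmetic for the values used here can be sketched as follows; count_items is a hypothetical helper that simply counts its arguments.

```shell
# Count the arguments passed in; used here to count list entries.
count_items() {
    echo $#
}

# Entries of DATA_DIRECTORY in gpinitsystem_config (one primary segment each, per host).
per_host=$(count_items /home/gpadmin/data/primary /home/gpadmin/data/primary)
# Hosts listed in hostfile_gpinitsystem.
hosts=$(count_items greenplum103 greenplum104)

total=$((per_host * hosts))
echo "primary segments per host: $per_host"
echo "segment hosts:             $hosts"
echo "total primary segments:    $total"
```

With 2 directory entries and 2 segment hosts, the cluster ends up with 4 primary segments.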

2) Initialize

[gpadmin@greenplum102 ~]$ gpinitsystem -c /home/gpadmin/gpconfigs/gpinitsystem_config -h /home/gpadmin/gpconfigs/hostfile_gpinitsystem
When cluster initialization completes successfully, the message Greenplum Database instance successfully created is displayed.

If initialization went wrong, use gpdeletesystem to delete the system and then initialize again:

gpdeletesystem -d /home/gpadmin/data/master/gpseg-1 -f

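The -d option is followed by MASTER_DATA_DIRECTORY (the master data directory); the command removes all master and segment data directories.
The -f option (force) terminates all processes and forces the deletion.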

3) Common GreenPlum start and stop commands

Stop: gpstop
Start: gpstart
Check status: gpstate
View all commands: help