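The customer's business was growing fast, and the existing system was increasingly unable to keep up. After discussing it with the customer, our company decided to re-architect the system around microservices, with server resources virtualized and managed through a container-cloud PaaS platform. I was responsible for building the base environment: virtualizing the existing hardware, and building, deploying, and configuring the PaaS platform for the app-facing services hosted on Alibaba Cloud. This time both the internal management system and the consumer-facing (C-end) app services were built and deployed on a single OpenShift container platform, giving us a hybrid cloud setup.

Under the old deployment approach, a memory leak in one Tomcat instance could starve the whole physical server of resources. We therefore used KVM to split the hardware into multiple virtual servers so that resource problems stay isolated, and then used OpenShift on top to manage PaaS deployments. There were many technical pieces involved, and planning and allocating resources was the biggest headache. Fortunately, the existing Postgres primary/replica databases stay on physical machines and are not being migrated for now; the most we did there was upgrade the database and operating system versions and split out the reporting service, ending up with one primary and three replicas. The existing Tomcat applications, Redis, MQ, NFS, DNS, and so on all had to be planned, configured, and deployed in the new container environment.

This was also the first DevOps-oriented PaaS platform our company has built, so it was new ground for us. A lot of the technical details had to be worked out by trial and error, and installing and configuring open-source tools did throw up quite a few problems. Many of these pitfalls took patience and careful analysis to track down.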
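Most of the problems came from configuring and deploying OpenShift. The layout was one dedicated master node, one dedicated etcd node, and several worker nodes. The worker nodes were required to have identical hardware resources, so after setting up the first KVM guest I simply cloned it to create the others and save time. However, an oversight in the IP configuration of one node caused the installation to fail. The error messages produced by open-source tools do not always point at the real cause, so the investigation took a long time and felt like treating the foot for a headache. The failure looked like this: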
INSTALLER STATUS *******************************************************************************************************************
Initialization : Complete (0:00:44)
Health Check : Complete (0:00:52)
Node Bootstrap Preparation : Complete (0:02:41)
etcd Install : Complete (0:00:39)
Master Install : Complete (0:03:39)
Master Additional Install : Complete (0:00:43)
Node Join : In Progress (0:03:17)
This phase can be restarted by running: playbooks/openshift-node/join.yml
Failure summary:
1. Hosts: paas-node3.mylike.okd
Play: Disable excluders
Task: Install docker excluder - yum
Message: http://mirrors.nwsuaf.edu.cn/centos/7.6.1810/os/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.nwsuaf.edu.cn; Unknown error"
Trying other mirror.
http://mirrors.cn99.com/centos/7.6.1810/os/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.cn99.com; Unknown error"
Trying other mirror.
http://ap.stykers.moe/centos/7.6.1810/os/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: ap.stykers.moe; Unknown error"
Trying other mirror.
http://mirrors.tuna.tsinghua.edu.cn/centos/7.6.1810/os/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.tuna.tsinghua.edu.cn; Unknown error"
Trying other mirror.
http://mirrors.huaweicloud.com/centos/7.6.1810/os/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.huaweicloud.com; Unknown error"
Trying other mirror.
http://mirrors.zju.edu.cn/centos/7.6.1810/os/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.zju.edu.cn; Unknown error"
Trying other mirror.
http://mirrors.163.com/centos/7.6.1810/os/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.163.com; Unknown error"
Trying other mirror.
http://mirrors.nju.edu.cn/centos/7.6.1810/os/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.nju.edu.cn; Unknown error"
Trying other mirror.
http://mirrors.aliyun.com/centos/7.6.1810/os/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.aliyun.com; Unknown error"
Trying other mirror.
http://ftp.sjtu.edu.cn/centos/7.6.1810/os/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: ftp.sjtu.edu.cn; Unknown error"
Trying other mirror.
http://mirror.centos.org/centos/7/paas/x86_64/openshift-origin311/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirror.centos.org; Unknown error"
Trying other mirror.
http://mirror.lzu.edu.cn/centos/7.6.1810/extras/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirror.lzu.edu.cn; Unknown error"
Trying other mirror.
http://centos.ustc.edu.cn/centos/7.6.1810/extras/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: centos.ustc.edu.cn; Unknown error"
Trying other mirror.
http://mirrors.njupt.edu.cn/centos/7.6.1810/extras/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.njupt.edu.cn; Unknown error"
Trying other mirror.
http://mirrors.nwsuaf.edu.cn/centos/7.6.1810/extras/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.nwsuaf.edu.cn; Unknown error"
Trying other mirror.
http://mirrors.cn99.com/centos/7.6.1810/extras/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.cn99.com; Unknown error"
Trying other mirror.
http://ap.stykers.moe/centos/7.6.1810/extras/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: ap.stykers.moe; Unknown error"
Trying other mirror.
http://mirrors.tuna.tsinghua.edu.cn/centos/7.6.1810/extras/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.tuna.tsinghua.edu.cn; Unknown error"
Trying other mirror.
http://mirrors.huaweicloud.com/centos/7.6.1810/extras/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.huaweicloud.com; Unknown error"
Trying other mirror.
http://mirrors.aliyun.com/centos/7.6.1810/extras/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.aliyun.com; Unknown error"
Trying other mirror.
http://ftp.sjtu.edu.cn/centos/7.6.1810/extras/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: ftp.sjtu.edu.cn; Unknown error"
Trying other mirror.
http://mirrors.neusoft.edu.cn/centos/7.6.1810/updates/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.neusoft.edu.cn; Unknown error"
Trying other mirror.
http://mirrors.163.com/centos/7.6.1810/updates/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.163.com; Unknown error"
Trying other mirror.
http://mirrors.nju.edu.cn/centos/7.6.1810/updates/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.nju.edu.cn; Unknown error"
Trying other mirror.
http://mirror.lzu.edu.cn/centos/7.6.1810/updates/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirror.lzu.edu.cn; Unknown error"
Trying other mirror.
http://centos.ustc.edu.cn/centos/7.6.1810/updates/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: centos.ustc.edu.cn; Unknown error"
Trying other mirror.
http://mirrors.njupt.edu.cn/centos/7.6.1810/updates/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.njupt.edu.cn; Unknown error"
Trying other mirror.
http://mirrors.nwsuaf.edu.cn/centos/7.6.1810/updates/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.nwsuaf.edu.cn; Unknown error"
Trying other mirror.
http://mirrors.cn99.com/centos/7.6.1810/updates/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.cn99.com; Unknown error"
Trying other mirror.
http://ap.stykers.moe/centos/7.6.1810/updates/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: ap.stykers.moe; Unknown error"
Trying other mirror.
http://mirrors.tuna.tsinghua.edu.cn/centos/7.6.1810/updates/x86_64/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: mirrors.tuna.tsinghua.edu.cn; Unknown error"
Trying other mirror.
http://mirror.centos.org/centos/7/paas/x86_64/openshift-origin311/origin-docker-excluder-3.11.0-1.el7.git.0.62803d0.noarch.rpm: [Errno 14] curl#6 - "Could not resolve host: mirror.centos.org; Unknown error"
Trying other mirror.
Error downloading packages:
origin-docker-excluder-3.11.0-1.el7.git.0.62803d0.noarch: [Errno 256] No more mirrors to try.
2. Hosts: paas-master.mylike.okd
Play: Approve any pending CSR requests from inventory nodes
Task: Approve node certificates when bootstrapping
Message: Could not find csr for nodes: paas-node3.mylike.okd
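When I first hit this, I assumed it was a transient network failure: during deployment OpenShift pulls a number of Docker images over the network via the master node and then deploys them on each worker node as well. After several retries the result was always the same, so I began to suspect the yum.conf on the master or on node3. I compared and edited the files several times, and the problem persisted. Finally I ran yum makecache directly on node3 and it failed there too. A ping to www.baidu.com then showed that this server could not reach the external network at all. I deleted the network interface and configured it again, external access came back, and after rerunning the OpenShift installation it completed successfully.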
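The lesson here is that when every mirror fails with "Could not resolve host", the fault is more likely on the node than on the mirrors. Below is a minimal Python sketch of a pre-flight check built on that idea. It is not part of openshift-ansible; the function names `unresolved_hosts_in_log` and `failing_lookups` are made up for illustration, and the regular expression assumes the curl#6 message format shown in the log above.

```python
from __future__ import annotations

import re
import socket
from typing import Callable, Iterable

# Matches the curl#6 message that yum printed in the log above, e.g.
#   "Could not resolve host: mirrors.aliyun.com; Unknown error"
_UNRESOLVED = re.compile(r"Could not resolve host: ([^;\s]+);")


def unresolved_hosts_in_log(log_text: str) -> set[str]:
    """Return the distinct hostnames the log reports as unresolvable.

    Many unrelated hostnames failing at once suggests a DNS or network
    problem on the node itself rather than a broken mirror.
    """
    return set(_UNRESOLVED.findall(log_text))


def failing_lookups(
    hostnames: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> list[str]:
    """Return the hostnames that `resolve` cannot look up, in input order.

    `resolve` is injectable (hypothetical design choice) so the check can be
    exercised without real network access.
    """
    failed = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed
```

Running something like `failing_lookups(["mirror.centos.org", "mirrors.aliyun.com"])` on each freshly cloned node before starting the playbook would probably have exposed the broken network configuration on node3 much earlier than the installer did.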