Building a Kubernetes Cluster with RKE and Deploying NebulaGraph

smallmeteror
Published on 2023-08-31 11:34

  • This article is for reference only and is not recommended for production use.

1. Cluster Environment Preparation

2. Docker Installation

  • Configure the Docker YUM repository
  • Install Docker CE
  • Start the Docker service
  • Configure a Docker registry mirror

3. Docker Compose Installation

4. Add a Rancher User

5. Generate SSH Keys for Cluster Deployment

  • Generate the SSH key
  • Copy the key to all hosts in the cluster
  • Verify the SSH key works

6. Download the RKE Tool

7. Initialize the RKE Configuration File

8. Cluster Deployment

9. Install the kubectl Client

  • Install kubectl
  • Configure the kubeconfig for kubectl and verify the cluster

10. Cluster Web Management with Rancher

  • Start a Rancher instance with docker run
  • Access Rancher (ports 1080 and 1443 are mapped here)

11. Set Up an NFS Server

12. Dynamic Storage Provisioning with NFS (Storage Class Installation)

  • Download and create the Storage Class
  • Download and create the RBAC resources
  • Create the provisioner deployment

13. Install and Deploy NebulaGraph Operator

  • Uninstall NebulaGraph Operator (if you need to reinstall)

14. Deploy NebulaGraph (kubectl)

15. Connect to NebulaGraph

References


Installing NebulaGraph on a K8s Cluster


In this article I record, in detail, how I deployed the distributed graph database NebulaGraph on Kubernetes. The plan for this walkthrough:

  1. Sections 1 to 10 cover building the K8s cluster with RKE;
  2. Sections 11 to 15 cover installing and deploying NebulaGraph by following the official NebulaGraph documentation.

Everything in this article was done on three local CentOS virtual machines.


1. Cluster Environment Preparation


All the clusters in this article follow the layout below; it is provided for reference only.

First, the cluster plan:

# replace the hostnames and IPs below with your own as needed
192.168.222.141 node1
192.168.222.142 node2
192.168.222.143 node3

Configure static IPs

# vi /etc/sysconfig/network-scripts/ifcfg-ens33

IPADDR="192.168.222.XXX" # XXX is the IP you planned for this host
PREFIX="24"  # a /24 netmask (255.255.255.0)
GATEWAY="192.168.222.XXX" # set the gateway for your network
DNS1="114.114.114.114" # any working DNS server is fine

Configure hostnames

hostnamectl set-hostname xxx   # set each host's name (node1/node2/node3)

# add the following entries to /etc/hosts on every node
192.168.222.141 node1
192.168.222.142 node2
192.168.222.143 node3

Configure ip_forward and bridge filtering

# vim /etc/sysctl.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
modprobe br_netfilter
sysctl -p /etc/sysctl.conf

Firewall settings

# systemctl stop firewalld
# systemctl disable firewalld

SELinux settings

# disable permanently; takes effect only after the OS is rebooted
sed -ri 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

Disable the swap partition on all hosts

sed -ri 's/.*swap.*/#&/' /etc/fstab
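The sed command only comments out the swap entry in /etc/fstab, so it takes effect after a reboot. To turn swap off immediately as well (a small addition, not shown in the original), run:

swapoff -a
free -h   # the Swap line should now read 0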

Cluster time synchronization

# yum -y install ntpdate
# crontab -e
0 */1 * * *  ntpdate time1.aliyun.com
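The cron entry syncs the clock once an hour; to bring the clock in line right away after setup (not part of the original steps), run ntpdate once by hand:

# ntpdate time1.aliyun.com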


2. Docker Installation


⚠️ All hosts must be configured as described below.

Configure the Docker YUM repository

# wget -O /etc/yum.repos.d/docker-ce.repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

Install Docker CE

Install the Docker community edition with the following command:

#  yum install docker-ce-18.09.8-3.el7.x86_64 -y

⚠️ The Docker version must be pinned here; otherwise RKE will later report a Docker version incompatibility.

Start the Docker service

# systemctl enable docker
# systemctl start docker

Configure a Docker registry mirror

Log in to your personal Alibaba Cloud account, look up the mirror accelerator settings, and replace xxxxx with your own ID:

# vim /etc/docker/daemon.json
# cat /etc/docker/daemon.json
{
  "registry-mirrors": ["https://xxxxx.mirror.aliyuncs.com"]
}
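For the mirror configuration to take effect, Docker needs to be restarted after editing daemon.json (this step is implied but not shown in the original):

# systemctl daemon-reload
# systemctl restart docker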


3. Docker Compose Installation


Install Docker Compose as follows:

# curl -L "https://github.com/docker/compose/releases/download/1.28.5/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
# chmod +x /usr/local/bin/docker-compose
# ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
# docker-compose --version


4. Add a Rancher User


On CentOS, RKE cannot use the root account as the SSH user, so we add a dedicated account for the Docker-related operations. As before, run the following on every machine:

# useradd rancher
# usermod -aG docker rancher
# echo 123 | passwd --stdin rancher


5. Generate SSH Keys for Cluster Deployment


Create the key pair on the host where the RKE binary is installed (the control host); it will be used to deploy the cluster.

Generate the SSH key

# ssh-keygen

Copy the key to all hosts in the cluster

# change the owner and group of the rancher home directory
chown -R rancher:rancher /home/rancher
# ssh-copy-id rancher@node1
# ssh-copy-id rancher@node2
# ssh-copy-id rancher@node3

Verify the SSH key works

In this walkthrough the RKE binary lives on node1. From that host, SSH to each of the other cluster hosts as the rancher user and run docker ps to confirm that Docker is reachable:

# ssh rancher@<hostname>
remote-host# docker ps


6. Download the RKE Tool


Again, this is done on node1, where the RKE binary is installed.

# wget https://github.com/rancher/rke/releases/download/v1.3.7/rke_linux-amd64
# mv rke_linux-amd64 /usr/local/bin/rke
# chmod +x /usr/local/bin/rke
# rke --version
rke version v1.3.7


7. Initialize the RKE Configuration File


# mkdir -p /app/rancher
# cd /app/rancher
# rke config --name cluster.yml
[+] Cluster Level SSH Private Key Path [~/.ssh/id_rsa]: cluster-wide private key path
[+] Number of Hosts [1]: 3 the cluster has 3 nodes
[+] SSH Address of host (1) [none]: 192.168.10.10 IP address of the first node
[+] SSH Port of host (1) [22]: 22 SSH port of the first node
[+] SSH Private Key Path of host (192.168.10.10) [none]: ~/.ssh/id_rsa private key path for the first node
[+] SSH User of host (192.168.10.10) [ubuntu]: rancher remote user name
[+] Is host (192.168.10.10) a Control Plane host (y/n)? [y]: y this is a K8s control plane node
[+] Is host (192.168.10.10) a Worker host (y/n)? [n]: n not a worker node
[+] Is host (192.168.10.10) an etcd host (y/n)? [n]: n not an etcd node
[+] Override Hostname of host (192.168.10.10) [none]: do not override the existing hostname
[+] Internal IP of host (192.168.10.10) [none]: the host's LAN IP address
[+] Docker socket path on host (192.168.10.10) [/var/run/docker.sock]: docker.sock path on the host
[+] SSH Address of host (2) [none]: 192.168.10.12 the second node
[+] SSH Port of host (2) [22]: 22 remote port
[+] SSH Private Key Path of host (192.168.10.12) [none]: ~/.ssh/id_rsa private key path
[+] SSH User of host (192.168.10.12) [ubuntu]: rancher remote user
[+] Is host (192.168.10.12) a Control Plane host (y/n)? [y]: n not a control plane node
[+] Is host (192.168.10.12) a Worker host (y/n)? [n]: y this is a worker node
[+] Is host (192.168.10.12) an etcd host (y/n)? [n]: n not an etcd node
[+] Override Hostname of host (192.168.10.12) [none]: do not override the existing hostname
[+] Internal IP of host (192.168.10.12) [none]: the host's LAN IP address
[+] Docker socket path on host (192.168.10.12) [/var/run/docker.sock]: docker.sock path on the host
[+] SSH Address of host (3) [none]: 192.168.10.14 the third node
[+] SSH Port of host (3) [22]: 22 remote port
[+] SSH Private Key Path of host (192.168.10.14) [none]: ~/.ssh/id_rsa private key path
[+] SSH User of host (192.168.10.14) [ubuntu]: rancher remote user
[+] Is host (192.168.10.14) a Control Plane host (y/n)? [y]: n not a control plane node
[+] Is host (192.168.10.14) a Worker host (y/n)? [n]: n not a worker node
[+] Is host (192.168.10.14) an etcd host (y/n)? [n]: y this is an etcd node
[+] Override Hostname of host (192.168.10.14) [none]: do not override the existing hostname
[+] Internal IP of host (192.168.10.14) [none]: the host's LAN IP address
[+] Docker socket path on host (192.168.10.14) [/var/run/docker.sock]: docker.sock path on the host
[+] Network Plugin Type (flannel, calico, weave, canal, aci) [canal]: network plugin to use
[+] Authentication Strategy [x509]: authentication strategy
[+] Authorization Mode (rbac, none) [rbac]: authorization mode
[+] Kubernetes Docker image [rancher/hyperkube:v1.21.9-rancher1]: cluster container image
[+] Cluster domain [cluster.local]: cluster domain
[+] Service Cluster IP Range [10.43.0.0/16]: Service IP range for the cluster
[+] Enable PodSecurityPolicy [n]: whether to enable PodSecurityPolicy
[+] Cluster Network CIDR [10.42.0.0/16]: Pod network CIDR for the cluster
[+] Cluster DNS Service IP [10.43.0.10]: cluster DNS Service IP address
[+] Add addon manifest URLs or YAML files [no]: whether to add addon manifest URLs or YAML files

While answering the prompts, pay particular attention to the Kubernetes image setting; here it is set to rancher/hyperkube:v1.21.9-rancher1.

[root@node1 rancher]# ls
cluster.yml

In addition, modify the following configuration in cluster.yml (it sits under the services: section):

kube-controller:
  image: ""
  extra_args:
    # these two parameters are required if you later plan to deploy kubeflow or istio
    cluster-signing-cert-file: "/etc/kubernetes/ssl/kube-ca.pem"
    cluster-signing-key-file: "/etc/kubernetes/ssl/kube-ca-key.pem"


8. Cluster Deployment


# pwd
/app/rancher
# rke up
Output:
INFO[0000] Running RKE version: v1.3.7
INFO[0000] Initiating Kubernetes cluster
INFO[0000] [dialer] Setup tunnel for host [192.168.10.14]
INFO[0000] [dialer] Setup tunnel for host [192.168.10.10]
INFO[0000] [dialer] Setup tunnel for host [192.168.10.12]
INFO[0000] Checking if container [cluster-state-deployer] is running on host [192.168.10.14], try #1
INFO[0000] Checking if container [cluster-state-deployer] is running on host [192.168.10.10], try #1
INFO[0000] Checking if container [cluster-state-deployer] is running on host [192.168.10.12], try #1
INFO[0000] [certificates] Generating CA kubernetes certificates
INFO[0000] [certificates] Generating Kubernetes API server aggregation layer requestheader client CA certificates
INFO[0000] [certificates] GenerateServingCertificate is disabled, checking if there are unused kubelet certificates
INFO[0000] [certificates] Generating Kubernetes API server certificates
INFO[0000] [certificates] Generating Service account token key
INFO[0000] [certificates] Generating Kube Controller certificates
INFO[0000] [certificates] Generating Kube Scheduler certificates
INFO[0000] [certificates] Generating Kube Proxy certificates
INFO[0001] [certificates] Generating Node certificate
INFO[0001] [certificates] Generating admin certificates and kubeconfig
INFO[0001] [certificates] Generating Kubernetes API server proxy client certificates
INFO[0001] [certificates] Generating kube-etcd-192-168-10-14 certificate and key
INFO[0001] Successfully Deployed state file at [./cluster.rkestate]
INFO[0001] Building Kubernetes cluster
INFO[0001] [dialer] Setup tunnel for host [192.168.10.12]
INFO[0001] [dialer] Setup tunnel for host [192.168.10.14]
INFO[0001] [dialer] Setup tunnel for host [192.168.10.10]
INFO[0001] [network] Deploying port listener containers
INFO[0001] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.14]
INFO[0001] Starting container [rke-etcd-port-listener] on host [192.168.10.14], try #1
INFO[0001] [network] Successfully started [rke-etcd-port-listener] container on host [192.168.10.14]
INFO[0001] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.10]
INFO[0001] Starting container [rke-cp-port-listener] on host [192.168.10.10], try #1
INFO[0002] [network] Successfully started [rke-cp-port-listener] container on host [192.168.10.10]
INFO[0002] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.12]
INFO[0002] Starting container [rke-worker-port-listener] on host [192.168.10.12], try #1
INFO[0002] [network] Successfully started [rke-worker-port-listener] container on host [192.168.10.12]
INFO[0002] [network] Port listener containers deployed successfully
INFO[0002] [network] Running control plane -> etcd port checks
INFO[0002] [network] Checking if host [192.168.10.10] can connect to host(s) [192.168.10.14] on port(s) [2379], try #1
INFO[0002] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.10]
INFO[0002] Starting container [rke-port-checker] on host [192.168.10.10], try #1
INFO[0002] [network] Successfully started [rke-port-checker] container on host [192.168.10.10]
INFO[0002] Removing container [rke-port-checker] on host [192.168.10.10], try #1
INFO[0002] [network] Running control plane -> worker port checks
INFO[0002] [network] Checking if host [192.168.10.10] can connect to host(s) [192.168.10.12] on port(s) [10250], try #1
INFO[0002] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.10]
INFO[0003] Starting container [rke-port-checker] on host [192.168.10.10], try #1
INFO[0003] [network] Successfully started [rke-port-checker] container on host [192.168.10.10]
INFO[0003] Removing container [rke-port-checker] on host [192.168.10.10], try #1
INFO[0003] [network] Running workers -> control plane port checks
INFO[0003] [network] Checking if host [192.168.10.12] can connect to host(s) [192.168.10.10] on port(s) [6443], try #1
INFO[0003] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.12]
INFO[0003] Starting container [rke-port-checker] on host [192.168.10.12], try #1
INFO[0003] [network] Successfully started [rke-port-checker] container on host [192.168.10.12]
INFO[0003] Removing container [rke-port-checker] on host [192.168.10.12], try #1
INFO[0003] [network] Checking KubeAPI port Control Plane hosts
INFO[0003] [network] Removing port listener containers
INFO[0003] Removing container [rke-etcd-port-listener] on host [192.168.10.14], try #1
INFO[0003] [remove/rke-etcd-port-listener] Successfully removed container on host [192.168.10.14]
INFO[0003] Removing container [rke-cp-port-listener] on host [192.168.10.10], try #1
INFO[0003] [remove/rke-cp-port-listener] Successfully removed container on host [192.168.10.10]
INFO[0003] Removing container [rke-worker-port-listener] on host [192.168.10.12], try #1
INFO[0003] [remove/rke-worker-port-listener] Successfully removed container on host [192.168.10.12]
INFO[0003] [network] Port listener containers removed successfully
INFO[0003] [certificates] Deploying kubernetes certificates to Cluster nodes
INFO[0003] Checking if container [cert-deployer] is running on host [192.168.10.14], try #1
INFO[0003] Checking if container [cert-deployer] is running on host [192.168.10.10], try #1
INFO[0003] Checking if container [cert-deployer] is running on host [192.168.10.12], try #1
INFO[0003] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.14]
INFO[0003] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.12]
INFO[0003] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.10]
INFO[0004] Starting container [cert-deployer] on host [192.168.10.14], try #1
INFO[0004] Starting container [cert-deployer] on host [192.168.10.12], try #1
INFO[0004] Starting container [cert-deployer] on host [192.168.10.10], try #1
INFO[0004] Checking if container [cert-deployer] is running on host [192.168.10.14], try #1
INFO[0004] Checking if container [cert-deployer] is running on host [192.168.10.12], try #1
INFO[0004] Checking if container [cert-deployer] is running on host [192.168.10.10], try #1
INFO[0009] Checking if container [cert-deployer] is running on host [192.168.10.14], try #1
INFO[0009] Removing container [cert-deployer] on host [192.168.10.14], try #1
INFO[0009] Checking if container [cert-deployer] is running on host [192.168.10.12], try #1
INFO[0009] Removing container [cert-deployer] on host [192.168.10.12], try #1
INFO[0009] Checking if container [cert-deployer] is running on host [192.168.10.10], try #1
INFO[0009] Removing container [cert-deployer] on host [192.168.10.10], try #1
INFO[0009] [reconcile] Rebuilding and updating local kube config
INFO[0009] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
WARN[0009] [reconcile] host [192.168.10.10] is a control plane node without reachable Kubernetes API endpoint in the cluster
WARN[0009] [reconcile] no control plane node with reachable Kubernetes API endpoint in the cluster found
INFO[0009] [certificates] Successfully deployed kubernetes certificates to Cluster nodes
INFO[0009] [file-deploy] Deploying file [/etc/kubernetes/audit-policy.yaml] to node [192.168.10.10]
INFO[0009] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.10]
INFO[0009] Starting container [file-deployer] on host [192.168.10.10], try #1
INFO[0009] Successfully started [file-deployer] container on host [192.168.10.10]
INFO[0009] Waiting for [file-deployer] container to exit on host [192.168.10.10]
INFO[0009] Waiting for [file-deployer] container to exit on host [192.168.10.10]
INFO[0009] Container [file-deployer] is still running on host [192.168.10.10]: stderr: [], stdout: []
INFO[0010] Waiting for [file-deployer] container to exit on host [192.168.10.10]
INFO[0010] Removing container [file-deployer] on host [192.168.10.10], try #1
INFO[0010] [remove/file-deployer] Successfully removed container on host [192.168.10.10]
INFO[0010] [/etc/kubernetes/audit-policy.yaml] Successfully deployed audit policy file to Cluster control nodes
INFO[0010] [reconcile] Reconciling cluster state
INFO[0010] [reconcile] This is newly generated cluster
INFO[0010] Pre-pulling kubernetes images
INFO[0010] Pulling image [rancher/hyperkube:v1.21.9-rancher1] on host [192.168.10.10], try #1
INFO[0010] Pulling image [rancher/hyperkube:v1.21.9-rancher1] on host [192.168.10.14], try #1
INFO[0010] Pulling image [rancher/hyperkube:v1.21.9-rancher1] on host [192.168.10.12], try #1
INFO[0087] Image [rancher/hyperkube:v1.21.9-rancher1] exists on host [192.168.10.10]
INFO[0090] Image [rancher/hyperkube:v1.21.9-rancher1] exists on host [192.168.10.12]
INFO[0092] Image [rancher/hyperkube:v1.21.9-rancher1] exists on host [192.168.10.14]
INFO[0092] Kubernetes images pulled successfully
INFO[0092] [etcd] Building up etcd plane..
INFO[0092] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.14]
INFO[0092] Starting container [etcd-fix-perm] on host [192.168.10.14], try #1
INFO[0092] Successfully started [etcd-fix-perm] container on host [192.168.10.14]
INFO[0092] Waiting for [etcd-fix-perm] container to exit on host [192.168.10.14]
INFO[0092] Waiting for [etcd-fix-perm] container to exit on host [192.168.10.14]
INFO[0092] Container [etcd-fix-perm] is still running on host [192.168.10.14]: stderr: [], stdout: []
INFO[0093] Waiting for [etcd-fix-perm] container to exit on host [192.168.10.14]
INFO[0093] Removing container [etcd-fix-perm] on host [192.168.10.14], try #1
INFO[0093] [remove/etcd-fix-perm] Successfully removed container on host [192.168.10.14]
INFO[0093] Image [rancher/mirrored-coreos-etcd:v3.5.0] exists on host [192.168.10.14]
INFO[0093] Starting container [etcd] on host [192.168.10.14], try #1
INFO[0093] [etcd] Successfully started [etcd] container on host [192.168.10.14]
INFO[0093] [etcd] Running rolling snapshot container [etcd-snapshot-once] on host [192.168.10.14]
INFO[0093] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.14]
INFO[0094] Starting container [etcd-rolling-snapshots] on host [192.168.10.14], try #1
INFO[0094] [etcd] Successfully started [etcd-rolling-snapshots] container on host [192.168.10.14]
INFO[0099] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.14]
INFO[0099] Starting container [rke-bundle-cert] on host [192.168.10.14], try #1
INFO[0099] [certificates] Successfully started [rke-bundle-cert] container on host [192.168.10.14]
INFO[0099] Waiting for [rke-bundle-cert] container to exit on host [192.168.10.14]
INFO[0099] Container [rke-bundle-cert] is still running on host [192.168.10.14]: stderr: [], stdout: []
INFO[0100] Waiting for [rke-bundle-cert] container to exit on host [192.168.10.14]
INFO[0100] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [192.168.10.14]
INFO[0100] Removing container [rke-bundle-cert] on host [192.168.10.14], try #1
INFO[0100] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.14]
INFO[0100] Starting container [rke-log-linker] on host [192.168.10.14], try #1
INFO[0100] [etcd] Successfully started [rke-log-linker] container on host [192.168.10.14]
INFO[0100] Removing container [rke-log-linker] on host [192.168.10.14], try #1
INFO[0100] [remove/rke-log-linker] Successfully removed container on host [192.168.10.14]
INFO[0100] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.14]
INFO[0101] Starting container [rke-log-linker] on host [192.168.10.14], try #1
INFO[0101] [etcd] Successfully started [rke-log-linker] container on host [192.168.10.14]
INFO[0101] Removing container [rke-log-linker] on host [192.168.10.14], try #1
INFO[0101] [remove/rke-log-linker] Successfully removed container on host [192.168.10.14]
INFO[0101] [etcd] Successfully started etcd plane.. Checking etcd cluster health
INFO[0101] [etcd] etcd host [192.168.10.14] reported healthy=true
INFO[0101] [controlplane] Building up Controller Plane..
INFO[0101] Checking if container [service-sidekick] is running on host [192.168.10.10], try #1
INFO[0101] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.10]
INFO[0101] Image [rancher/hyperkube:v1.21.9-rancher1] exists on host [192.168.10.10]
INFO[0101] Starting container [kube-apiserver] on host [192.168.10.10], try #1
INFO[0101] [controlplane] Successfully started [kube-apiserver] container on host [192.168.10.10]
INFO[0101] [healthcheck] Start Healthcheck on service [kube-apiserver] on host [192.168.10.10]
INFO[0106] [healthcheck] service [kube-apiserver] on host [192.168.10.10] is healthy
INFO[0106] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.10]
INFO[0107] Starting container [rke-log-linker] on host [192.168.10.10], try #1
INFO[0107] [controlplane] Successfully started [rke-log-linker] container on host [192.168.10.10]
INFO[0107] Removing container [rke-log-linker] on host [192.168.10.10], try #1
INFO[0107] [remove/rke-log-linker] Successfully removed container on host [192.168.10.10]
INFO[0107] Image [rancher/hyperkube:v1.21.9-rancher1] exists on host [192.168.10.10]
INFO[0107] Starting container [kube-controller-manager] on host [192.168.10.10], try #1
INFO[0107] [controlplane] Successfully started [kube-controller-manager] container on host [192.168.10.10]
INFO[0107] [healthcheck] Start Healthcheck on service [kube-controller-manager] on host [192.168.10.10]
INFO[0112] [healthcheck] service [kube-controller-manager] on host [192.168.10.10] is healthy
INFO[0112] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.10]
INFO[0113] Starting container [rke-log-linker] on host [192.168.10.10], try #1
INFO[0113] [controlplane] Successfully started [rke-log-linker] container on host [192.168.10.10]
INFO[0113] Removing container [rke-log-linker] on host [192.168.10.10], try #1
INFO[0113] [remove/rke-log-linker] Successfully removed container on host [192.168.10.10]
INFO[0113] Image [rancher/hyperkube:v1.21.9-rancher1] exists on host [192.168.10.10]
INFO[0113] Starting container [kube-scheduler] on host [192.168.10.10], try #1
INFO[0113] [controlplane] Successfully started [kube-scheduler] container on host [192.168.10.10]
INFO[0113] [healthcheck] Start Healthcheck on service [kube-scheduler] on host [192.168.10.10]
INFO[0118] [healthcheck] service [kube-scheduler] on host [192.168.10.10] is healthy
INFO[0118] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.10]
INFO[0119] Starting container [rke-log-linker] on host [192.168.10.10], try #1
INFO[0119] [controlplane] Successfully started [rke-log-linker] container on host [192.168.10.10]
INFO[0119] Removing container [rke-log-linker] on host [192.168.10.10], try #1
INFO[0119] [remove/rke-log-linker] Successfully removed container on host [192.168.10.10]
INFO[0119] [controlplane] Successfully started Controller Plane..
INFO[0119] [authz] Creating rke-job-deployer ServiceAccount
INFO[0119] [authz] rke-job-deployer ServiceAccount created successfully
INFO[0119] [authz] Creating system:node ClusterRoleBinding
INFO[0119] [authz] system:node ClusterRoleBinding created successfully
INFO[0119] [authz] Creating kube-apiserver proxy ClusterRole and ClusterRoleBinding
INFO[0119] [authz] kube-apiserver proxy ClusterRole and ClusterRoleBinding created successfully
INFO[0119] Successfully Deployed state file at [./cluster.rkestate]
INFO[0119] [state] Saving full cluster state to Kubernetes
INFO[0119] [state] Successfully Saved full cluster state to Kubernetes ConfigMap: full-cluster-state
INFO[0119] [worker] Building up Worker Plane..
INFO[0119] Checking if container [service-sidekick] is running on host [192.168.10.10], try #1
INFO[0119] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.12]
INFO[0119] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.14]
INFO[0119] [sidekick] Sidekick container already created on host [192.168.10.10]
INFO[0119] Image [rancher/hyperkube:v1.21.9-rancher1] exists on host [192.168.10.10]
INFO[0119] Starting container [kubelet] on host [192.168.10.10], try #1
INFO[0119] [worker] Successfully started [kubelet] container on host [192.168.10.10]
INFO[0119] [healthcheck] Start Healthcheck on service [kubelet] on host [192.168.10.10]
INFO[0119] Starting container [nginx-proxy] on host [192.168.10.14], try #1
INFO[0119] [worker] Successfully started [nginx-proxy] container on host [192.168.10.14]
INFO[0119] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.14]
INFO[0119] Starting container [nginx-proxy] on host [192.168.10.12], try #1
INFO[0119] [worker] Successfully started [nginx-proxy] container on host [192.168.10.12]
INFO[0119] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.12]
INFO[0119] Starting container [rke-log-linker] on host [192.168.10.14], try #1
INFO[0120] Starting container [rke-log-linker] on host [192.168.10.12], try #1
INFO[0120] [worker] Successfully started [rke-log-linker] container on host [192.168.10.14]
INFO[0120] Removing container [rke-log-linker] on host [192.168.10.14], try #1
INFO[0120] [remove/rke-log-linker] Successfully removed container on host [192.168.10.14]
INFO[0120] Checking if container [service-sidekick] is running on host [192.168.10.14], try #1
INFO[0120] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.14]
INFO[0120] [worker] Successfully started [rke-log-linker] container on host [192.168.10.12]
INFO[0120] Removing container [rke-log-linker] on host [192.168.10.12], try #1
INFO[0120] Image [rancher/hyperkube:v1.21.9-rancher1] exists on host [192.168.10.14]
INFO[0120] [remove/rke-log-linker] Successfully removed container on host [192.168.10.12]
INFO[0120] Checking if container [service-sidekick] is running on host [192.168.10.12], try #1
INFO[0120] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.12]
INFO[0120] Starting container [kubelet] on host [192.168.10.14], try #1
INFO[0120] [worker] Successfully started [kubelet] container on host [192.168.10.14]
INFO[0120] [healthcheck] Start Healthcheck on service [kubelet] on host [192.168.10.14]
INFO[0120] Image [rancher/hyperkube:v1.21.9-rancher1] exists on host [192.168.10.12]
INFO[0120] Starting container [kubelet] on host [192.168.10.12], try #1
INFO[0120] [worker] Successfully started [kubelet] container on host [192.168.10.12]
INFO[0120] [healthcheck] Start Healthcheck on service [kubelet] on host [192.168.10.12]
INFO[0124] [healthcheck] service [kubelet] on host [192.168.10.10] is healthy
INFO[0124] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.10]
INFO[0124] Starting container [rke-log-linker] on host [192.168.10.10], try #1
INFO[0125] [worker] Successfully started [rke-log-linker] container on host [192.168.10.10]
INFO[0125] Removing container [rke-log-linker] on host [192.168.10.10], try #1
INFO[0125] [remove/rke-log-linker] Successfully removed container on host [192.168.10.10]
INFO[0125] Image [rancher/hyperkube:v1.21.9-rancher1] exists on host [192.168.10.10]
INFO[0125] Starting container [kube-proxy] on host [192.168.10.10], try #1
INFO[0125] [worker] Successfully started [kube-proxy] container on host [192.168.10.10]
INFO[0125] [healthcheck] Start Healthcheck on service [kube-proxy] on host [192.168.10.10]
INFO[0125] [healthcheck] service [kubelet] on host [192.168.10.14] is healthy
INFO[0125] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.14]
INFO[0125] Starting container [rke-log-linker] on host [192.168.10.14], try #1
INFO[0125] [healthcheck] service [kubelet] on host [192.168.10.12] is healthy
INFO[0125] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.12]
INFO[0125] [worker] Successfully started [rke-log-linker] container on host [192.168.10.14]
INFO[0125] Starting container [rke-log-linker] on host [192.168.10.12], try #1
INFO[0125] Removing container [rke-log-linker] on host [192.168.10.14], try #1
INFO[0126] [remove/rke-log-linker] Successfully removed container on host [192.168.10.14]
INFO[0126] Image [rancher/hyperkube:v1.21.9-rancher1] exists on host [192.168.10.14]
INFO[0126] Starting container [kube-proxy] on host [192.168.10.14], try #1
INFO[0126] [worker] Successfully started [rke-log-linker] container on host [192.168.10.12]
INFO[0126] Removing container [rke-log-linker] on host [192.168.10.12], try #1
INFO[0126] [worker] Successfully started [kube-proxy] container on host [192.168.10.14]
INFO[0126] [healthcheck] Start Healthcheck on service [kube-proxy] on host [192.168.10.14]
INFO[0126] [remove/rke-log-linker] Successfully removed container on host [192.168.10.12]
INFO[0126] Image [rancher/hyperkube:v1.21.9-rancher1] exists on host [192.168.10.12]
INFO[0126] Starting container [kube-proxy] on host [192.168.10.12], try #1
INFO[0126] [worker] Successfully started [kube-proxy] container on host [192.168.10.12]
INFO[0126] [healthcheck] Start Healthcheck on service [kube-proxy] on host [192.168.10.12]
INFO[0130] [healthcheck] service [kube-proxy] on host [192.168.10.10] is healthy
INFO[0130] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.10]
INFO[0130] Starting container [rke-log-linker] on host [192.168.10.10], try #1
INFO[0130] [worker] Successfully started [rke-log-linker] container on host [192.168.10.10]
INFO[0130] Removing container [rke-log-linker] on host [192.168.10.10], try #1
INFO[0130] [remove/rke-log-linker] Successfully removed container on host [192.168.10.10]
INFO[0131] [healthcheck] service [kube-proxy] on host [192.168.10.14] is healthy
INFO[0131] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.14]
INFO[0131] Starting container [rke-log-linker] on host [192.168.10.14], try #1
INFO[0131] [healthcheck] service [kube-proxy] on host [192.168.10.12] is healthy
INFO[0131] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.12]
INFO[0131] [worker] Successfully started [rke-log-linker] container on host [192.168.10.14]
INFO[0131] Removing container [rke-log-linker] on host [192.168.10.14], try #1
INFO[0131] Starting container [rke-log-linker] on host [192.168.10.12], try #1
INFO[0131] [remove/rke-log-linker] Successfully removed container on host [192.168.10.14]
INFO[0131] [worker] Successfully started [rke-log-linker] container on host [192.168.10.12]
INFO[0131] Removing container [rke-log-linker] on host [192.168.10.12], try #1
INFO[0131] [remove/rke-log-linker] Successfully removed container on host [192.168.10.12]
INFO[0131] [worker] Successfully started Worker Plane..
INFO[0131] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.12]
INFO[0131] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.14]
INFO[0131] Image [rancher/rke-tools:v0.1.78] exists on host [192.168.10.10]
INFO[0132] Starting container [rke-log-cleaner] on host [192.168.10.14], try #1
INFO[0132] Starting container [rke-log-cleaner] on host [192.168.10.12], try #1
INFO[0132] Starting container [rke-log-cleaner] on host [192.168.10.10], try #1
INFO[0132] [cleanup] Successfully started [rke-log-cleaner] container on host [192.168.10.14]
INFO[0132] Removing container [rke-log-cleaner] on host [192.168.10.14], try #1
INFO[0132] [cleanup] Successfully started [rke-log-cleaner] container on host [192.168.10.12]
INFO[0132] Removing container [rke-log-cleaner] on host [192.168.10.12], try #1
INFO[0132] [cleanup] Successfully started [rke-log-cleaner] container on host [192.168.10.10]
INFO[0132] Removing container [rke-log-cleaner] on host [192.168.10.10], try #1
INFO[0132] [remove/rke-log-cleaner] Successfully removed container on host [192.168.10.14]
INFO[0132] [remove/rke-log-cleaner] Successfully removed container on host [192.168.10.12]
INFO[0132] [remove/rke-log-cleaner] Successfully removed container on host [192.168.10.10]
INFO[0132] [sync] Syncing nodes Labels and Taints
INFO[0132] [sync] Successfully synced nodes Labels and Taints
INFO[0132] [network] Setting up network plugin: canal
INFO[0132] [addons] Saving ConfigMap for addon rke-network-plugin to Kubernetes
INFO[0132] [addons] Successfully saved ConfigMap for addon rke-network-plugin to Kubernetes
INFO[0132] [addons] Executing deploy job rke-network-plugin
INFO[0137] [addons] Setting up coredns
INFO[0137] [addons] Saving ConfigMap for addon rke-coredns-addon to Kubernetes
INFO[0137] [addons] Successfully saved ConfigMap for addon rke-coredns-addon to Kubernetes
INFO[0137] [addons] Executing deploy job rke-coredns-addon
INFO[0142] [addons] CoreDNS deployed successfully
INFO[0142] [dns] DNS provider coredns deployed successfully
INFO[0142] [addons] Setting up Metrics Server
INFO[0142] [addons] Saving ConfigMap for addon rke-metrics-addon to Kubernetes
INFO[0142] [addons] Successfully saved ConfigMap for addon rke-metrics-addon to Kubernetes
INFO[0142] [addons] Executing deploy job rke-metrics-addon
INFO[0147] [addons] Metrics Server deployed successfully
INFO[0147] [ingress] Setting up nginx ingress controller
INFO[0147] [ingress] removing admission batch jobs if they exist
INFO[0147] [addons] Saving ConfigMap for addon rke-ingress-controller to Kubernetes
INFO[0147] [addons] Successfully saved ConfigMap for addon rke-ingress-controller to Kubernetes
INFO[0147] [addons] Executing deploy job rke-ingress-controller
INFO[0152] [ingress] removing default backend service and deployment if they exist
INFO[0152] [ingress] ingress controller nginx deployed successfully
INFO[0152] [addons] Setting up user addons
INFO[0152] [addons] no user addons defined
INFO[0152] Finished building Kubernetes cluster successfully

If the deployment fails with an error like the following:

WARN[0114] [etcd] host [xxxxxxx] failed to check etcd health: failed to get /health for host [xxxxxxx]: Get "https://xxxxxxxx:2379/health": remote error: tls: bad certificate
FATA[0114] [etcd] Failed to bring up Etcd Plane: etcd cluster is unhealthy: hosts [xxxxxxxx] failed to report healthy. Check etcd container logs on each host for more information

you can run the following on all nodes (if the directories cannot be deleted, reboot the machine first) and then re-run rke up: rm -rf /etc/kubernetes/ /var/lib/kubelet/ /var/lib/etcd/
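If it is convenient, the same cleanup can be driven from node1 over SSH. This is only a minimal sketch (not from the original) and assumes passwordless SSH as root to the three nodes:

for h in node1 node2 node3; do
  ssh root@$h "rm -rf /etc/kubernetes/ /var/lib/kubelet/ /var/lib/etcd/"
done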


9. Install the kubectl Client


These steps are again performed on node1 (the control-plane host).

Install kubectl

# wget https://storage.googleapis.com/kubernetes-release/release/v1.21.9/bin/linux/amd64/kubectl
# chmod +x kubectl 
# mv kubectl /usr/local/bin/kubectl
# kubectl version --client
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.9", GitCommit:"f59f5c2fda36e4036b49ec027e556a15456108f0", GitTreeState:"clean", BuildDate:"2022-01-19T17:33:06Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}

Configure the kubeconfig for kubectl and verify the cluster

[root@node1 ~]# ls /app/rancher/
cluster.rkestate  cluster.yml  kube_config_cluster.yml
[root@node1 ~]# mkdir ./.kube
[root@node1 ~]# cp /app/rancher/kube_config_cluster.yml /root/.kube/config
[root@node1 ~]# kubectl get nodes
NAME            STATUS   ROLES          AGE     VERSION
192.168.10.10   Ready    controlplane   9m13s   v1.21.9
192.168.10.12   Ready    worker         9m12s   v1.21.9
192.168.10.14   Ready    etcd           9m12s   v1.21.9
[root@node1 ~]# kubectl get pods -n kube-system
NAME                                         READY   STATUS      RESTARTS   AGE
calico-kube-controllers-5685fbd9f7-gcwj7     1/1     Running     0          9m36s
canal-fz2bg                                  2/2     Running     0          9m36s
canal-qzw4n                                  2/2     Running     0          9m36s
canal-sstjn                                  2/2     Running     0          9m36s
coredns-8578b6dbdd-ftnf6                     1/1     Running     0          9m30s
coredns-autoscaler-f7b68ccb7-fzdgc           1/1     Running     0          9m30s
metrics-server-6bc7854fb5-kwppz              1/1     Running     0          9m25s
rke-coredns-addon-deploy-job--1-x56w2        0/1     Completed   0          9m31s
rke-ingress-controller-deploy-job--1-wzp2b   0/1     Completed   0          9m21s
rke-metrics-addon-deploy-job--1-ltlgn        0/1     Completed   0          9m26s
rke-network-plugin-deploy-job--1-nsbfn       0/1     Completed   0          9m41s


10. Cluster Web Management with Rancher


The Rancher dashboard makes it convenient to manage the K8s cluster: viewing cluster status, editing the cluster, and so on.

Still on node1 (the control node), run the following commands.

Start a Rancher instance with docker run

# note: the mapped ports are changed here (1080 and 1443 instead of the default 80 and 443)
[root@node1 ~]# docker run -d --restart=unless-stopped --privileged --name rancher -p 1080:80 -p 1443:443 rancher/rancher:v2.5.9
[root@node1 ~]# docker ps
CONTAINER ID   IMAGE                                COMMAND                  CREATED          STATUS          PORTS                                                                      NAMES
0fd46ee77655   rancher/rancher:v2.5.9               "entrypoint.sh"          5 seconds ago    Up 3 seconds    0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp   rancher

Access Rancher (ports 1080 and 1443 are mapped here)

[root@node1 ~]# ss -anput | grep ":1080"
tcp    LISTEN     0      128       *:1080                    *:*                   users:(("docker-proxy",pid=29564,fd=4))
tcp    LISTEN     0      128    [::]:1080                 [::]:*                   users:(("docker-proxy",pid=29570,fd=4))
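Before opening a browser you can quickly confirm that Rancher answers on the mapped HTTPS port. This check is not in the original and assumes node1's IP from the plan in Section 1 (192.168.222.141):

# curl -k -I https://192.168.222.141:1443

Then browse to the same URL, accept the self-signed certificate, and set the admin password.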

That completes building a K8s cluster with RKE.


Next, let's walk through deploying NebulaGraph.


11. Set Up an NFS Server


[root@nfsserver ~]# mkdir -p /data/nfs
[root@nfsserver ~]# vim /etc/exports
/data/nfs       *(rw,no_root_squash,sync)
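The export file alone is not enough; on the NFS server the NFS packages also need to be installed, the service started, and the export published. A minimal sketch of those steps on CentOS 7 (they are not shown in the original):

[root@nfsserver ~]# yum install nfs-utils -y
[root@nfsserver ~]# systemctl enable nfs-server
[root@nfsserver ~]# systemctl start nfs-server
[root@nfsserver ~]# exportfs -r   # re-read /etc/exports
[root@nfsserver ~]# exportfs -v   # confirm /data/nfs is exported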

Install the NFS client on all nodes:

yum install nfs-utils -y

Then, on a worker node, verify that the service is reachable:

showmount -e 192.168.222.143 # the NFS server's address


12. Dynamic Storage Provisioning with NFS (Storage Class Installation)


PV support for a storage backend is implemented through plugins; the plugin types Kubernetes currently supports are listed in the official documentation.

Official docs: https://kubernetes.io/docs/concepts/storage/storage-classes/

However, the built-in plugins do not support dynamic NFS provisioning, so we use a third-party provisioner instead: https://github.com/kubernetes-retired/external-storage

Download and create the Storage Class

[root@k8s-master1 ~]# wget https://raw.githubusercontent.com/kubernetes-sigs/nfs-subdir-external-provisioner/master/deploy/class.yaml

[root@k8s-master1 ~]# mv class.yaml storageclass-nfs.yml
[root@k8s-master1 ~]# cat storageclass-nfs.yml
apiVersion: storage.k8s.io/v1
kind: StorageClass                                # resource type
metadata:
  name: nfs-client                # name; PVCs reference the storage class by this name
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner         # dynamic provisioner plugin
parameters:
  archiveOnDelete: "false"                # whether to archive data on delete: false = do not archive, true = archive
[root@k8s-master1 ~]# kubectl apply -f storageclass-nfs.yml
storageclass.storage.k8s.io/managed-nfs-storage created
[root@k8s-master1 ~]# kubectl get storageclass
NAME         PROVISIONER                                   RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
nfs-client   k8s-sigs.io/nfs-subdir-external-provisioner   Delete          Immediate           false                  10s
# RECLAIMPOLICY: the PV reclaim policy - whether a PV is deleted or retained after its Pod/PVC is deleted.
# VOLUMEBINDINGMODE: Immediate binds the PVC to a PV right away, without waiting for a consuming Pod to be scheduled or caring which node it runs on; WaitForFirstConsumer binds only after the Pod is scheduled.
# ALLOWVOLUMEEXPANSION: whether PVCs can be expanded.

Download and create the RBAC resources

Because the provisioner creates PVs through the kube-apiserver, it needs the appropriate authorization.

[root@k8s-master1 ~]# wget https://raw.githubusercontent.com/kubernetes-sigs/nfs-subdir-external-provisioner/master/deploy/rbac.yaml

[root@k8s-master1 ~]# mv rbac.yaml storageclass-nfs-rbac.yaml
[root@k8s-master1 ~]# cat storageclass-nfs-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: Role
  name: leader-locking-nfs-client-provisioner
  apiGroup: rbac.authorization.k8s.io
[root@k8s-master1 ~]# kubectl apply -f storageclass-nfs-rbac.yaml
serviceaccount/nfs-client-provisioner created
clusterrole.rbac.authorization.k8s.io/nfs-client-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/run-nfs-client-provisioner created
role.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
rolebinding.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created

Create the provisioner deployment

A dedicated deployment takes care of creating PVs for PVCs automatically. Point NFS_SERVER and the nfs volume's server/path at your own NFS server:

[root@k8s-master1 ~]# vim deploy-nfs-client-provisioner.yml
apiVersion: apps/v1
kind: Deployment
metadata:
 name: nfs-client-provisioner
spec:
 replicas: 1
 strategy:
   type: Recreate
 selector:
   matchLabels:
     app: nfs-client-provisioner
 template:
   metadata:
     labels:
       app: nfs-client-provisioner
   spec:
     serviceAccount: nfs-client-provisioner
     containers:
       - name: nfs-client-provisioner
         image: registry.cn-beijing.aliyuncs.com/pylixm/nfs-subdir-external-provisioner:v4.0.0
         volumeMounts:
           - name: nfs-client-root
             mountPath: /persistentvolumes
         env:
           - name: PROVISIONER_NAME
             value: k8s-sigs.io/nfs-subdir-external-provisioner
           - name: NFS_SERVER
             value: 192.168.10.129
           - name: NFS_PATH
             value: /data/nfs
     volumes:
       - name: nfs-client-root
         nfs:
           server: 192.168.10.129
           path: /data/nfs
[root@k8s-master1 ~]# kubectl apply -f deploy-nfs-client-provisioner.yml
deployment.apps/nfs-client-provisioner created
[root@k8s-master1 ~]# kubectl get pods |grep nfs-client-provisioner
nfs-client-provisioner-5b5ddcd6c8-b6zbq   1/1     Running   0          34s
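Before moving on, it is worth confirming that dynamic provisioning really works by creating a throwaway PVC against the nfs-client storage class. This check is not part of the original article, and the name test-claim is arbitrary:

[root@k8s-master1 ~]# cat test-claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-claim
spec:
  storageClassName: nfs-client
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
[root@k8s-master1 ~]# kubectl apply -f test-claim.yaml
[root@k8s-master1 ~]# kubectl get pvc test-claim   # STATUS should become Bound
[root@k8s-master1 ~]# kubectl delete -f test-claim.yaml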


13. Install and Deploy NebulaGraph Operator


This follows the official documentation. Before installing NebulaGraph Operator, the following software must already be installed in compatible versions (NebulaGraph Operator does not handle problems encountered while installing them):

Software     Version requirement
Kubernetes   >= 1.16
Helm         >= 3.2.0
CoreDNS      >= 1.6.0

Helm needs to be installed here:

[root@a ~]# curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                Dload  Upload   Total   Spent    Left  Speed
100 11345  100 11345    0     0  18830      0 --:--:-- --:--:-- --:--:-- 18845
[WARNING] Could not find git. It is required for plugin installation.
Downloading https://get.helm.sh/helm-v3.12.1-linux-amd64.tar.gz
Verifying checksum... Done.
Preparing to install helm into /usr/local/bin
helm installed into /usr/local/bin/helm

As for CoreDNS, installation guides are easy to find online, but the RKE installation above has in fact already deployed CoreDNS. Now, the concrete steps to install NebulaGraph Operator:

Step 1: add the NebulaGraph Operator Helm repository.

helm repo add nebula-operator https://vesoft-inc.github.io/nebula-operator/charts

Step 2: update the Helm repositories to pull the latest Operator chart.

helm repo update

See the Helm documentation on repositories for more information about helm repo.

Step 3: create a namespace in which to install NebulaGraph Operator.

kubectl create namespace <namespace_name>

For example, create a namespace named operator:

kubectl create namespace operator

  • All resources of the nebula-operator chart are installed in this namespace.
  • You can also create and use a different namespace of your own.

Step 4: install NebulaGraph Operator.

helm install nebula-operator nebula-operator/nebula-operator --namespace=<namespace_name> --version=${chart_version}

At the time of writing, version 1.5.0 could not be installed, so the version is left unspecified and the latest version is installed by default. Since the image on gcr.io cannot be pulled, kubesphere/kube-rbac-proxy:v0.8.0 is used instead:

helm install nebula-operator nebula-operator/nebula-operator --namespace=nebula-operator-system --set image.kubeRBACProxy.image=kubesphere/kube-rbac-proxy:v0.8.0
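To check that the operator actually came up, list its pods in the namespace used above (a verification step not shown in the original):

kubectl get pods --namespace=nebula-operator-system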

Uninstall NebulaGraph Operator (if you need to reinstall)

1. Uninstall the NebulaGraph Operator chart:

helm uninstall nebula-operator --namespace=operator

2. Delete the CRD:

kubectl delete crd nebulaclusters.apps.nebula-graph.io


14. Deploy NebulaGraph (kubectl)


Create the cluster configuration file: create a file named apps_v1alpha1_nebulacluster.yaml. The official site provides a template that can be downloaded directly:

apiVersion: apps.nebula-graph.io/v1alpha1
kind: NebulaCluster
metadata:
  name: nebula
spec:
  graphd:
    resources:
      requests:
        cpu: "500m"
        memory: "500Mi"
      limits:
        cpu: "1"
        memory: "1Gi"
    replicas: 1
    image: vesoft/nebula-graphd
    version: v3.5.0
    service:
      type: NodePort
      externalTrafficPolicy: Local
    logVolumeClaim:
      resources:
        requests:
          storage: 1Gi
      storageClassName: managed-nfs-storage
  metad:
    # license:
    #   secretName: "nebula-license"
    #   licenseKey: "nebula.license"
    resources:
      requests:
        cpu: "500m"
        memory: "500Mi"
      limits:
        cpu: "1"
        memory: "1Gi"
    replicas: 1
    image: vesoft/nebula-metad
    version: v3.5.0
    dataVolumeClaim:
      resources:
        requests:
          storage: 5Gi
      storageClassName:  managed-nfs-storage
    logVolumeClaim:
      resources:
        requests:
          storage: 1Gi
      storageClassName:  managed-nfs-storage
  storaged:
    resources:
      requests:
        cpu: "500m"
        memory: "500Mi"
      limits:
        cpu: "1"

Note: every storageClassName in the file must match the name of the storage class installed earlier, i.e. metadata.name: nfs-client in storageclass-nfs.yml, so adjust the template's values accordingly.
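The original does not show the apply step. Assuming the manifest is saved as apps_v1alpha1_nebulacluster.yaml (and the truncated storaged section is completed along the lines of the official template), it would be created and watched roughly like this:

kubectl create -f apps_v1alpha1_nebulacluster.yaml
kubectl get pods -w   # wait for the nebula-* pods to reach Running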

Check the service status:

nebula-exporter-66457984-w6zpn            1/1     Running   0          31m
nebula-graphd-0                           2/2     Running   0          31m
nebula-metad-0                            2/2     Running   0          35m
nebula-storaged-0                         2/2     Running   0          35m
nebula-storaged-1                         0/2     Pending   0          35m
nebula-storaged-2                         0/2     Pending   0          35m
nfs-client-provisioner-6bd7f48698-m4v94   1/1     Running   0          57m


15. Connect to NebulaGraph


Modify the template provided by the official docs:

apiVersion: v1
kind: Service
metadata:
  labels:
    # modify the cluster name
    app.kubernetes.io/cluster: "nebula"
    app.kubernetes.io/component: graphd
    app.kubernetes.io/managed-by: nebula-operator
    app.kubernetes.io/name: nebula-graph
  name: nebula-graphd-nodeport-svc
  namespace: default
spec:
  externalTrafficPolicy: Cluster # with Local the database could not be reached, so Cluster is used here
  ports:
  - name: thrift
    port: 9669
    protocol: TCP
    targetPort: 9669
  - name: http
    port: 19669
    protocol: TCP
    targetPort: 19669
  selector:
    # modify the cluster name
    app.kubernetes.io/cluster: "nebula"
    app.kubernetes.io/component: graphd
    app.kubernetes.io/managed-by: nebula-operator
    app.kubernetes.io/name: nebula-graph
  type: NodePort

Save the manifest above (for example as graphd-nodeport-service.yaml) and run the following command to create the Service in the cluster:

kubectl create -f graphd-nodeport-service.yaml

Check which node port the Service maps NebulaGraph to:

kubectl get services -l app.kubernetes.io/cluster=<nebula>  # <nebula> is a placeholder; replace it with your actual cluster name

Output:

NAME                           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                                          AGE
nebula-graphd-svc-nodeport     NodePort    10.107.153.129 <none>        9669:32236/TCP,19669:31674/TCP,19670:31057/TCP   24h
...

In this NodePort Service, the port mapped onto the cluster nodes is 32236.

Use a node IP and the mapped node port above to connect to NebulaGraph:

kubectl run -ti --image vesoft/nebula-console:v3.5.0 --restart=Never -- <nebula_console_name> -addr <node_ip> -port <node_port> -u <username> -p <password>

For example:

kubectl run -ti --image vesoft/nebula-console:v3.5.0 --restart=Never -- nebula-console -addr 192.168.8.24 -port 32236 -u root -p vesoft
If you don't see a command prompt, try pressing enter.

(root@nebula) [(none)]>
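Once you reach the console prompt, a quick sanity check (not shown in the original) is to list the storage hosts and confirm they are ONLINE:

(root@nebula) [(none)]> SHOW HOSTS;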

  • --image: the image of NebulaGraph Console, the tool used to connect to NebulaGraph.
  • <nebula_console_name>: a custom Pod name; nebula-console in this example.
  • -addr: the IP address of any node in the NebulaGraph cluster; 192.168.8.24 in this example.
  • -port: the port NebulaGraph is mapped to on the node; 32236 in this example.
  • -u: the username of a NebulaGraph account. When authentication is not enabled, any existing username works (root by default).
  • -p: the password for that username. When authentication is not enabled, the password can be any string.

That wraps up this walkthrough. If you have any questions, feel free to leave a comment in the forum.


References



This article was originally published by the WeChat official account: NebulaGraph 技术社区.
