[My Story with openGauss] Adding a VIP to a Cluster

老老老JR老北
Published 2023-9-8 14:53

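Since its release, openGauss has natively supported one primary with multiple standbys and an RTO under 10 s, which greatly strengthens high availability. Starting with openGauss 3.0, the cluster management suite CM was updated and usability improved as well. For clients, however, a switchover on the database side still has to be handled manually.

Once a VIP is added to openGauss, clients connect much as they would to an Oracle RAC SCAN VIP and do not notice switchovers on the server side.

A VIP can be planned before installation and specified in the configuration file, or added manually to a cluster that is already installed. This post tests the manual method.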

1. Information about the installed cluster

Database version

gsql -V
gsql (openGauss 5.0.0 build a07d57c3) compiled at 2023-03-29 03:37:13 commit 0 last mr

Cluster status

[omm@db1 srv]$ cm_ctl query -Cv
[  CMServer State   ]

node   instance state
-----------------------
1  db1 1        Primary
2  db2 2        Standby
3  db3 3        Standby

[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
balanced        : Yes
current_az      : AZ_ALL

[  Datanode State   ]

node   instance state            | node   instance state            | node   instance state
---------------------------------------------------------------------------------------------------------
1  db1 6001     P Primary Normal | 2  db2 6002     S Standby Normal | 3  db3 6003     S Standby Normal


[omm@db1 srv]$ gs_om -t status --detail
[  CMServer State   ]

node   node_ip         instance                                 state
-----------------------------------------------------------------------
1  db1 192.168.56.11   1    /opt/huawei/data/cmserver/cm_server Primary
2  db2 192.168.56.12   2    /opt/huawei/data/cmserver/cm_server Standby
3  db3 192.168.56.13   3    /opt/huawei/data/cmserver/cm_server Standby

[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
balanced        : Yes
current_az      : AZ_ALL

[  Datanode State   ]

node   node_ip         instance                         state
-------------------------------------------------------------------------
1  db1 192.168.56.11   6001 /opt/huawei/install/data/dn P Primary Normal
2  db2 192.168.56.12   6002 /opt/huawei/install/data/dn S Standby Normal
3  db3 192.168.56.13   6003 /opt/huawei/install/data/dn S Standby Normal
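
Before changing anything, it can help to confirm from a script that the cluster is healthy and to note which node is currently primary. The following is a minimal sketch, not part of CM: the function names cluster_is_normal and dn_primary_node are hypothetical helpers, and the parsing assumes the `gs_om -t status --detail` layout shown above (role in the seventh column of the Datanode State rows).

```shell
#!/bin/sh
# Succeed only if the status text on stdin reports cluster_state "Normal".
cluster_is_normal() {
    awk -F: '
        /^cluster_state/ { gsub(/[ \t]/, "", $2); state = $2 }
        END { exit !(state == "Normal") }
    '
}

# Print the host name of the datanode whose current role is Primary,
# given `gs_om -t status --detail` output on stdin.
# Assumption: rows look like "1  db1 <ip> 6001 <path> P Primary Normal".
dn_primary_node() {
    awk '
        /^\[/ { in_dn = ($0 ~ /Datanode State/); next }  # section header line
        in_dn && $7 == "Primary" { print $2 }
    '
}

# Example, run as omm on any node:
#   gs_om -t status --detail | cluster_is_normal && echo "safe to proceed"
#   gs_om -t status --detail | dn_primary_node
```

Reading from stdin keeps the helpers usable against saved output as well as a live cluster.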

2. Grant the omm user sudo privileges (run as root on all three machines)

echo "omm ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
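
Appending straight to /etc/sudoers works, but a typo there can lock sudo out entirely. A somewhat safer variant, sketched below, writes the same rule to a drop-in file and syntax-checks it with `visudo -cf` first. It assumes the distribution's /etc/sudoers already includes the /etc/sudoers.d directory (the default on most systems, but worth checking); the function names sudoers_rule and install_sudoers_rule are hypothetical.

```shell
#!/bin/sh
# Print the sudoers rule used in this post for the given user.
sudoers_rule() {
    printf '%s ALL=(ALL) NOPASSWD:ALL\n' "$1"
}

# Install the rule as /etc/sudoers.d/<user> after visudo accepts its syntax.
# Must be run as root. Assumes /etc/sudoers includes /etc/sudoers.d.
install_sudoers_rule() {
    user=$1
    tmp=$(mktemp) || return 1
    sudoers_rule "$user" > "$tmp"
    if visudo -cf "$tmp"; then
        install -m 0440 -o root -g root "$tmp" "/etc/sudoers.d/$user"
        rc=$?
    else
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}

# Example (as root, on each of db1, db2 and db3):
#   install_sudoers_rule omm
```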

3. Add the VIP on the primary node

Before adding

[omm@db1 cm_agent]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
       valid_lft 74572sec preferred_lft 74572sec
    inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:50:47:12 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s8
       valid_lft forever preferred_lft forever
    inet6 fe80::890a:d968:b65d:f59a/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

ifconfig enp0s8:15400 192.168.56.10 netmask 255.255.255.0 up

After adding

[omm@db1 cm_agent]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
       valid_lft 74572sec preferred_lft 74572sec
    inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:50:47:12 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s8
       valid_lft forever preferred_lft forever
    inet 192.168.56.10/24 brd 192.168.56.255 scope global secondary enp0s8:15400
    inet6 fe80::890a:d968:b65d:f59a/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
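
On systems where ifconfig is not installed, the same address can be added with iproute2, and the before/after comparison can be scripted instead of read by eye. This is only a sketch: has_ipv4 is a hypothetical helper, and the interface name, label and address are the ones from this test environment.

```shell
#!/bin/sh
# Succeed if the given IPv4 address appears as an "inet <addr>/<prefix>"
# entry in `ip a` output read from stdin.
has_ipv4() {
    grep -qF -- "inet $1/"
}

# iproute2 equivalent of the ifconfig command above (needs root privileges):
#   sudo ip addr add 192.168.56.10/24 dev enp0s8 label enp0s8:15400
#
# Check on the node:
#   ip a | has_ipv4 192.168.56.10 && echo "VIP present"
```

Matching on the trailing slash keeps 192.168.56.10 from matching longer addresses such as 192.168.56.101.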

4. Add the VIP resource to the cluster (the VIP is managed as an openGauss resource)

[omm@db2 cm_agent]$ cm_ctl res --add --res_name="VIP_az1" --res_attr="resources_type=VIP,float_ip=192.168.56.10"
cm_ctl: add res(VIP_az1) success.

Add each instance to the resource

[omm@db2 cm_agent]$ cm_ctl res --edit --res_name="VIP_az1" --add_inst="node_id=1,res_instance_id=6001" --inst_attr=base_ip=192.168.56.11
cm_ctl: edit res(VIP_az1) success.
  
[omm@db2 cm_agent]$ cm_ctl res --edit --res_name="VIP_az1" --add_inst="node_id=2,res_instance_id=6002" --inst_attr=base_ip=192.168.56.12
cm_ctl: edit res(VIP_az1) success.

[omm@db2 cm_agent]$ cm_ctl res --edit --res_name="VIP_az1" --add_inst="node_id=3,res_instance_id=6003" --inst_attr=base_ip=192.168.56.13
cm_ctl: edit res(VIP_az1) success.
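
The three --edit calls differ only in node ID, instance ID and base IP, so they are easy to mistype when entered by hand. One way to reduce that risk is to generate them from a list and review the result before running it. The sketch below uses only the cm_ctl options shown above; gen_add_inst_cmds and its "node_id:instance_id:base_ip" argument format are made up for this example.

```shell
#!/bin/sh
# Print one `cm_ctl res --edit` command per "node_id:instance_id:base_ip"
# argument. Nothing is executed, so the output can be reviewed first.
gen_add_inst_cmds() {
    res_name=$1
    shift
    for triple in "$@"; do
        node_id=${triple%%:*}      # text before the first colon
        rest=${triple#*:}          # text after the first colon
        inst_id=${rest%%:*}
        base_ip=${rest#*:}
        printf 'cm_ctl res --edit --res_name="%s" --add_inst="node_id=%s,res_instance_id=%s" --inst_attr=base_ip=%s\n' \
            "$res_name" "$node_id" "$inst_id" "$base_ip"
    done
}

# Review the generated commands:
#   gen_add_inst_cmds VIP_az1 1:6001:192.168.56.11 2:6002:192.168.56.12 3:6003:192.168.56.13
# Then, once they look right, pipe the same call to sh as omm:
#   gen_add_inst_cmds VIP_az1 1:6001:192.168.56.11 2:6002:192.168.56.12 3:6003:192.168.56.13 | sh
```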

Check which node holds the VIP

[omm@db3 ~]$ cm_ctl show

[  Network Connect State  ]

Network timeout:       6s
Current CMServer time: 2023-08-03 06:18:42
Network stat('Y' means connected, otherwise 'N'):
|  \  |  Y  |  Y  |
|  Y  |  \  |  Y  |
|  Y  |  Y  |  \  |


[  Node Disk HB State  ]

Node disk hb timeout:    200s
Current CMServer time: 2023-08-03 06:18:43
Node disk hb stat('Y' means connected, otherwise 'N'):
|  N  |  N  |  N  |

[  FloatIp Network State  ]

node   instance base_ip       float_ip_name float_ip
----------------------------------------------------------
1  db1 6001     192.168.56.11 VIP_az1       192.168.56.10
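
For monitoring or scripted failover tests, the holder of the VIP can be pulled out of this output rather than read manually. A minimal sketch, assuming the FloatIp Network State layout shown above (host name in the second column, floating IP in the last); vip_holder is a hypothetical helper name.

```shell
#!/bin/sh
# Print "<node_name> <float_ip>" for each row of the FloatIp Network State
# section of `cm_ctl show` output read from stdin.
vip_holder() {
    awk '
        /^\[/ { in_vip = ($0 ~ /FloatIp Network State/); next }  # section header
        in_vip && $1 ~ /^[0-9]+$/ { print $2, $NF }              # data rows start with a node id
    '
}

# Example, run as omm on any node:
#   cm_ctl show | vip_holder
```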

Simulate a failure of the primary node

[omm@db3 ~]$ cm_ctl stop -n 1
cm_ctl: stop the node: 1.
cm_ctl: stop node, nodeid: 1
...........
cm_ctl: stop node successfully.

The primary role switched to node 2, and the VIP moved to node 2 as well

[omm@db1 cm_agent]$ cm_ctl show

[  Network Connect State  ]

Network timeout:       6s
Current CMServer time: 2023-08-03 06:19:40
Network stat('Y' means connected, otherwise 'N'):
|  \  |  N  |  N  |
|  N  |  \  |  Y  |
|  N  |  Y  |  \  |


[  Node Disk HB State  ]

Node disk hb timeout:    200s
Current CMServer time: 2023-08-03 06:19:41
Node disk hb stat('Y' means connected, otherwise 'N'):
|  N  |  N  |  N  |

[  FloatIp Network State  ]

node   instance base_ip       float_ip_name float_ip
----------------------------------------------------------
2  db2 6002     192.168.56.12 VIP_az1       192.168.56.10

Node 1 no longer has 192.168.56.10 among its addresses

[omm@db1 cm_agent]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
       valid_lft 74572sec preferred_lft 74572sec
    inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:50:47:12 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s8
       valid_lft forever preferred_lft forever
    inet6 fe80::890a:d968:b65d:f59a/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

Node 2 now has 192.168.56.10

[omm@db2 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
       valid_lft 74550sec preferred_lft 74550sec
    inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:41:73:29 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.12/24 brd 192.168.56.255 scope global noprefixroute enp0s8
       valid_lft forever preferred_lft forever
    inet 192.168.56.10/24 brd 192.168.56.255 scope global secondary enp0s8:15400
       valid_lft forever preferred_lft forever
    inet6 fe80::5373:d66d:7a39:ddc2/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

Resource configuration file

[omm@db1 cm_agent]$ cat cm_resource.json
{
        "resources":    [{
                        "name": "VIP_az1",
                        "resources_type":       "VIP",
                        "instances":    [{
                                        "node_id":      1,
                                        "res_instance_id":      6001,
                                        "inst_attr":    "base_ip=192.168.56.11"
                                }, {
                                        "node_id":      2,
                                        "res_instance_id":      6002,
                                        "inst_attr":    "base_ip=192.168.56.12"
                                }, {
                                        "node_id":      3,
                                        "res_instance_id":      6003,
                                        "inst_attr":    "base_ip=192.168.56.13"
                                }],
                        "float_ip":     "192.168.56.10"
                }]
}
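
A quick way to confirm that every instance made it into the file is to list the recorded addresses. The sketch below is text matching only, tied to the layout cm_ctl wrote in this test, and a real JSON parser would be more robust; resource_base_ips and resource_float_ip are hypothetical helper names.

```shell
#!/bin/sh
# List the base_ip values recorded in a cm_resource.json read from stdin,
# one per line, in file order.
resource_base_ips() {
    grep -o 'base_ip=[0-9.]*' | cut -d= -f2
}

# Print the float_ip value(s) recorded in the file.
resource_float_ip() {
    sed -n 's/.*"float_ip":[[:space:]]*"\([0-9.]*\)".*/\1/p'
}

# Example, run in the cm_agent directory:
#   resource_base_ips < cm_resource.json
#   resource_float_ip < cm_resource.json
```

Running the same two commands on each node gives a simple way to compare the copies.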

Sync the configuration file to the other nodes

scp cm_resource.json db2:/opt/huawei/data/cmserver/cm_agent
scp cm_resource.json db3:/opt/huawei/data/cmserver/cm_agent

Start node 1

[omm@db3 ~]$ cm_ctl start -n 1
cm_ctl: start the node: 1.
cm_ctl: start node, nodeid: 1
...........
cm_ctl: start node successfully.
		
[omm@db1 cm_agent]$ gs_om -t status --detail
[  CMServer State   ]

node   node_ip         instance                                 state
-----------------------------------------------------------------------
1  db1 192.168.56.11   1    /opt/huawei/data/cmserver/cm_server Standby
2  db2 192.168.56.12   2    /opt/huawei/data/cmserver/cm_server Primary
3  db3 192.168.56.13   3    /opt/huawei/data/cmserver/cm_server Standby

[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
balanced        : No
current_az      : AZ_ALL

[  Datanode State   ]

node   node_ip         instance                         state
-------------------------------------------------------------------------
1  db1 192.168.56.11   6001 /opt/huawei/install/data/dn P Standby Normal
2  db2 192.168.56.12   6002 /opt/huawei/install/data/dn S Primary Normal
3  db3 192.168.56.13   6003 /opt/huawei/install/data/dn S Standby Normal


The CM primary and the database primary are now on the same machine.
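
The same observation can be checked from the status output instead of by reading two tables. A minimal sketch under the column layout shown above (role in column 6 of the CMServer State rows and column 7 of the Datanode State rows); primaries_colocated is a hypothetical helper name.

```shell
#!/bin/sh
# Read `gs_om -t status --detail` output from stdin, print the hosts holding
# the CM server primary and the datanode primary, and succeed only if they
# are the same host.
primaries_colocated() {
    awk '
        /^\[/ { sect = $0; next }                                # remember current section
        sect ~ /CMServer State/ && $6 == "Primary" { cm = $2 }
        sect ~ /Datanode State/ && $7 == "Primary" { dn = $2 }
        END {
            printf "cm_primary=%s dn_primary=%s\n", cm, dn
            exit !(cm != "" && cm == dn)
        }
    '
}

# Example, run as omm on any node:
#   gs_om -t status --detail | primaries_colocated
```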




Reposted from the WeChat official account: openGauss

Last modified 2023-9-8 14:53:04