[My Story with openGauss] Adding a VIP to a Cluster

2023/08/04 18:00

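Since its release, openGauss has natively supported one primary with multiple standbys and an RTO under 10 s, which greatly improves high availability. Starting with openGauss 3.0, the cluster management suite CM was updated and usability improved as well. For clients, however, following a switchover on the database side still has to be done manually.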

Once a VIP is added to openGauss, clients connect much as they would to the SCAN VIP of Oracle RAC and do not notice a switchover on the server side.

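A VIP can be planned before installation and specified in the configuration file, or it can be added manually to a cluster that is already installed. The manual method is tested below.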

1. Information about the installed cluster

Database version
 gsql -V
gsql (openGauss 5.0.0 build a07d57c3) compiled at 2023-03-29 03:37:13 commit 0 last mr

Cluster status

[omm@db1 srv]$ cm_ctl query -Cv
[ CMServer State ]

node instance state
-----------------------
1 db1 1 Primary
2 db2 2 Standby
3 db3 3 Standby

[ Cluster State ]

cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL

[ Datanode State ]

node instance state | node instance state | node instance state
---------------------------------------------------------------------------------------------------------
1 db1 6001 P Primary Normal | 2 db2 6002 S Standby Normal | 3 db3 6003 S Standby Normal


[omm@db1 srv]$ gs_om -t status --detail
[ CMServer State ]

node node_ip instance state
-----------------------------------------------------------------------
1 db1 192.168.56.11 1 /opt/huawei/data/cmserver/cm_server Primary
2 db2 192.168.56.12 2 /opt/huawei/data/cmserver/cm_server Standby
3 db3 192.168.56.13 3 /opt/huawei/data/cmserver/cm_server Standby

[ Cluster State ]

cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL

[ Datanode State ]

node node_ip instance state
-------------------------------------------------------------------------
1 db1 192.168.56.11 6001 /opt/huawei/install/data/dn P Primary Normal
2 db2 192.168.56.12 6002 /opt/huawei/install/data/dn S Standby Normal
3 db3 192.168.56.13 6003 /opt/huawei/install/data/dn S Standby Normal

2. Grant the omm user sudo privileges; run this as root on all three machines

echo "omm ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
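
Appending to /etc/sudoers directly leaves no chance to catch a syntax error before it takes effect. As a hedged alternative, the same rule can be written to a drop-in file and checked first. The sketch below assumes the distribution reads /etc/sudoers.d (the default on most Linux systems); the helper name write_sudoers_rule and the file name omm-vip are made up for illustration and are not part of openGauss.

```bash
#!/usr/bin/env bash
# write_sudoers_rule is a hypothetical helper, not an openGauss tool.
# It writes the article's sudo rule for one user into a separate file.
write_sudoers_rule() {
    local user=$1 target=$2
    # Same rule as in the article: passwordless sudo for every command.
    printf '%s ALL=(ALL) NOPASSWD:ALL\n' "$user" > "$target"
    # sudoers files should not be writable by group or others; 0440 is the usual mode.
    chmod 0440 "$target"
}

# Example (as root, on each of the three machines):
#   write_sudoers_rule omm /etc/sudoers.d/omm-vip
#   visudo -cf /etc/sudoers.d/omm-vip   # syntax check of that file only
```

Keeping the rule in its own file also makes it easy to remove later without editing the main sudoers file.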

3. Add the VIP on the primary

Before adding

[omm@db1 cm_agent]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
valid_lft 74572sec preferred_lft 74572sec
inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:50:47:12 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s8
valid_lft forever preferred_lft forever
inet6 fe80::890a:d968:b65d:f59a/64 scope link noprefixroute
valid_lft forever preferred_lft forever
 ifconfig enp0s8:15400 192.168.56.10 netmask 255.255.255.0 up
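
A malformed address is easy to miss when the command is typed by hand. The following is a minimal sketch of a check that could run before the ifconfig call; is_ipv4 is a hypothetical helper name, and the script only prints the command instead of executing it.

```bash
#!/usr/bin/env bash
# is_ipv4 is a hypothetical helper: it returns 0 only for a dotted-quad
# address whose four octets are each in the range 0..255.
is_ipv4() {
    local ip=$1 octet
    local -a octets
    local re='^([0-9]{1,3}\.){3}[0-9]{1,3}$'
    [[ $ip =~ $re ]] || return 1
    IFS=. read -r -a octets <<< "$ip"
    for octet in "${octets[@]}"; do
        # 10# forces base 10 so that a leading zero is not read as octal.
        (( 10#$octet <= 255 )) || return 1
    done
    return 0
}

vip=192.168.56.10
if is_ipv4 "$vip"; then
    # Dry run: print the command from the article rather than running it.
    echo "ifconfig enp0s8:15400 $vip netmask 255.255.255.0 up"
else
    echo "invalid IPv4 address: $vip" >&2
fi
```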

After adding

[omm@db1 cm_agent]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
valid_lft 74572sec preferred_lft 74572sec
inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:50:47:12 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s8
valid_lft forever preferred_lft forever
inet 192.168.56.10/24 brd 192.168.56.255 scope global secondary enp0s8:15400
inet6 fe80::890a:d968:b65d:f59a/64 scope link noprefixroute
valid_lft forever preferred_lft forever

4. Add the VIP resource to the cluster; the VIP is managed as an openGauss resource

[omm@db2 cm_agent]$ cm_ctl res --add --res_name="VIP_az1" --res_attr="resources_type=VIP,float_ip=192.168.56.10"
cm_ctl: add res(VIP_az1) success.

Add each instance to the resource

[omm@db2 cm_agent]$ cm_ctl res --edit --res_name="VIP_az1" --add_inst="node_id=1,res_instance_id=6001" --inst_attr=base_ip=192.168.56.11
cm_ctl: edit res(VIP_az1) success.

[omm@db2 cm_agent]$ cm_ctl res --edit --res_name="VIP_az1" --add_inst="node_id=2,res_instance_id=6002" --inst_attr=base_ip=192.168.56.12
cm_ctl: edit res(VIP_az1) success.

[omm@db2 cm_agent]$ cm_ctl res --edit --res_name="VIP_az1" --add_inst="node_id=3,res_instance_id=6003" --inst_attr=base_ip=192.168.56.13
cm_ctl: edit res(VIP_az1) success.
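
Typing three nearly identical commands by hand invites copy-and-paste slips in node_id or res_instance_id. One way to keep the triples consistent is to generate the commands from a small table. This is only a sketch: print_add_inst_cmds is a hypothetical helper, it uses just the cm_ctl options shown in this article, and it prints the commands without running them.

```bash
#!/usr/bin/env bash
# print_add_inst_cmds is a hypothetical helper (dry run, nothing is executed).
# It reads lines of the form "node_id res_instance_id base_ip" from stdin
# and prints one "cm_ctl res --edit" command per line.
print_add_inst_cmds() {
    local res_name=$1 node_id inst_id base_ip
    while read -r node_id inst_id base_ip; do
        [[ -z $node_id ]] && continue   # skip blank lines
        printf 'cm_ctl res --edit --res_name="%s" --add_inst="node_id=%s,res_instance_id=%s" --inst_attr=base_ip=%s\n' \
            "$res_name" "$node_id" "$inst_id" "$base_ip"
    done
}

# Node table for the cluster in this article.
print_add_inst_cmds VIP_az1 <<'EOF'
1 6001 192.168.56.11
2 6002 192.168.56.12
3 6003 192.168.56.13
EOF
```

After reviewing the printed commands, they can be pasted into the omm shell one by one.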

Check which node currently holds the VIP

[omm@db3 ~]$ cm_ctl show

[ Network Connect State ]

Network timeout: 6s
Current CMServer time: 2023-08-03 06:18:42
Network stat('Y' means connected, otherwise 'N'):
| \ | Y | Y |
| Y | \ | Y |
| Y | Y | \ |


[ Node Disk HB State ]

Node disk hb timeout: 200s
Current CMServer time: 2023-08-03 06:18:43
Node disk hb stat('Y' means connected, otherwise 'N'):
| N | N | N |

[ FloatIp Network State ]

node instance base_ip float_ip_name float_ip
----------------------------------------------------------
1 db1 6001 192.168.56.11 VIP_az1 192.168.56.10
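
For scripting, the holder of the VIP can be pulled out of this output. The sketch below assumes the [ FloatIp Network State ] section keeps the column order shown above (node id, node name, instance, base_ip, float_ip_name, float_ip); vip_holder is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# vip_holder is a hypothetical helper. It reads "cm_ctl show" output on stdin
# and prints the name of the node whose float_ip column equals the given VIP.
vip_holder() {
    local vip=$1
    awk -v vip="$vip" '
        /\[ FloatIp Network State \]/ { in_section = 1; next }  # section starts
        /^[[:space:]]*\[/             { in_section = 0 }        # another section starts
        in_section && $6 == vip       { print $2; exit }        # column 2 is the node name
    '
}

# Example on a cluster node:
#   cm_ctl show | vip_holder 192.168.56.10
```

If no row matches, the function prints nothing, which a calling script can treat as "VIP not bound anywhere".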

Simulate a failure of the primary node

[omm@db3 ~]$ cm_ctl stop -n 1
cm_ctl: stop the node: 1.
cm_ctl: stop node, nodeid: 1
...........
cm_ctl: stop node successfully.

The primary role has switched to node 2, and the VIP has moved to node 2 as well

[omm@db1 cm_agent]$ cm_ctl show

[ Network Connect State ]

Network timeout: 6s
Current CMServer time: 2023-08-03 06:19:40
Network stat('Y' means connected, otherwise 'N'):
| \ | N | N |
| N | \ | Y |
| N | Y | \ |


[ Node Disk HB State ]

Node disk hb timeout: 200s
Current CMServer time: 2023-08-03 06:19:41
Node disk hb stat('Y' means connected, otherwise 'N'):
| N | N | N |

[ FloatIp Network State ]

node instance base_ip float_ip_name float_ip
----------------------------------------------------------
2 db2 6002 192.168.56.12 VIP_az1 192.168.56.10
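
From the client side, the effect of the switchover can be observed by polling the VIP until the database port answers again. The following is a rough sketch under stated assumptions: the article never states the listen port, so 15400 here is a guess suggested by the interface alias enp0s8:15400; wait_for_port is a hypothetical helper; and /dev/tcp is a bash feature, so the script needs bash rather than plain sh.

```bash
#!/usr/bin/env bash
# wait_for_port is a hypothetical helper. It tries to open a TCP connection
# to host:port once per second and returns 0 on the first success, or 1
# after the given number of failed attempts (default 10).
wait_for_port() {
    local host=$1 port=$2 tries=${3:-10} i
    for (( i = 0; i < tries; i++ )); do
        # timeout(1) bounds each attempt; the child shell opens and drops the socket.
        if timeout 1 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            return 0
        fi
        sleep 1
    done
    return 1
}

# Example from a client machine (the port is an assumption, see above):
#   wait_for_port 192.168.56.10 15400 30 && echo "VIP is accepting connections"
```

A successful TCP connect only shows that something is listening on the VIP; it does not prove that the instance behind it has finished promotion.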

Node 1 no longer has the address 192.168.56.10

[omm@db1 cm_agent]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
valid_lft 74572sec preferred_lft 74572sec
inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:50:47:12 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s8
valid_lft forever preferred_lft forever
inet6 fe80::890a:d968:b65d:f59a/64 scope link noprefixroute
valid_lft forever preferred_lft forever

Node 2 now has the address 192.168.56.10

[omm@db2 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
valid_lft 74550sec preferred_lft 74550sec
inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:41:73:29 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.12/24 brd 192.168.56.255 scope global noprefixroute enp0s8
valid_lft forever preferred_lft forever
inet 192.168.56.10/24 brd 192.168.56.255 scope global secondary enp0s8:15400
valid_lft forever preferred_lft forever
inet6 fe80::5373:d66d:7a39:ddc2/64 scope link noprefixroute
valid_lft forever preferred_lft forever

Resource configuration file

[omm@db1 cm_agent]$ cat cm_resource.json
{
"resources": [{
"name": "VIP_az1",
"resources_type": "VIP",
"instances": [{
"node_id": 1,
"res_instance_id": 6001,
"inst_attr": "base_ip=192.168.56.11"
}, {
"node_id": 2,
"res_instance_id": 6002,
"inst_attr": "base_ip=192.168.56.12"
}, {
"node_id": 3,
"res_instance_id": 6003,
"inst_attr": "base_ip=192.168.56.13"
}],
"float_ip": "192.168.56.10"
}]
}

Sync the configuration file to the other nodes

scp cm_resource.json db2:/opt/huawei/data/cmserver/cm_agent
scp cm_resource.json db3:/opt/huawei/data/cmserver/cm_agent

Start node 1

[omm@db3 ~]$ cm_ctl start -n 1
cm_ctl: start the node: 1.
cm_ctl: start node, nodeid: 1
...........
cm_ctl: start node successfully.

[omm@db1 cm_agent]$ gs_om -t status --detail
[ CMServer State ]

node node_ip instance state
-----------------------------------------------------------------------
1 db1 192.168.56.11 1 /opt/huawei/data/cmserver/cm_server Standby
2 db2 192.168.56.12 2 /opt/huawei/data/cmserver/cm_server Primary
3 db3 192.168.56.13 3 /opt/huawei/data/cmserver/cm_server Standby

[ Cluster State ]

cluster_state : Normal
redistributing : No
balanced : No
current_az : AZ_ALL

[ Datanode State ]

node node_ip instance state
-------------------------------------------------------------------------
1 db1 192.168.56.11 6001 /opt/huawei/install/data/dn P Standby Normal
2 db2 192.168.56.12 6002 /opt/huawei/install/data/dn S Primary Normal
3 db3 192.168.56.13 6003 /opt/huawei/install/data/dn S Standby Normal


The CM primary and the database primary are now on the same machine (db2).

This article is shared from the WeChat official account openGauss (openGauss).