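Since its release, openGauss has natively supported a one-primary, multiple-standby topology with RTO < 10 s, which greatly improves high availability. Starting with openGauss 3.0, the cluster management suite CM was updated, and usability improved as well. For clients, however, a switchover on the database side still had to be handled manually.
Once a VIP is added to openGauss, clients connect much as they would to the SCAN VIP of Oracle RAC and are unaware of switchovers on the server side.
A VIP can be planned before installation and specified in the configuration file, or it can be added manually to a cluster that is already installed. This article tests the manual method.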
1. Information about the installed cluster
Database version
gsql -V
gsql (openGauss 5.0.0 build a07d57c3) compiled at 2023-03-29 03:37:13 commit 0 last mr
Cluster status
[omm@db1 srv]$ cm_ctl query -Cv
[ CMServer State ]
node instance state
-----------------------
1 db1 1 Primary
2 db2 2 Standby
3 db3 3 Standby
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL
[ Datanode State ]
node instance state | node instance state | node instance state
---------------------------------------------------------------------------------------------------------
1 db1 6001 P Primary Normal | 2 db2 6002 S Standby Normal | 3 db3 6003 S Standby Normal
[omm@db1 srv]$ gs_om -t status --detail
[ CMServer State ]
node node_ip instance state
-----------------------------------------------------------------------
1 db1 192.168.56.11 1 /opt/huawei/data/cmserver/cm_server Primary
2 db2 192.168.56.12 2 /opt/huawei/data/cmserver/cm_server Standby
3 db3 192.168.56.13 3 /opt/huawei/data/cmserver/cm_server Standby
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL
[ Datanode State ]
node node_ip instance state
-------------------------------------------------------------------------
1 db1 192.168.56.11 6001 /opt/huawei/install/data/dn P Primary Normal
2 db2 192.168.56.12 6002 /opt/huawei/install/data/dn S Standby Normal
3 db3 192.168.56.13 6003 /opt/huawei/install/data/dn S Standby Normal
2. Grant sudo privileges to the omm user (run on all three machines, as root)
echo "omm ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
3. Add the VIP on the primary node
Before adding
[omm@db1 cm_agent]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
valid_lft 74572sec preferred_lft 74572sec
inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:50:47:12 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s8
valid_lft forever preferred_lft forever
inet6 fe80::890a:d968:b65d:f59a/64 scope link noprefixroute
valid_lft forever preferred_lft forever
ifconfig enp0s8:15400 192.168.56.10 netmask 255.255.255.0 up
After adding
[omm@db1 cm_agent]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
valid_lft 74572sec preferred_lft 74572sec
inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:50:47:12 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s8
valid_lft forever preferred_lft forever
inet 192.168.56.10/24 brd 192.168.56.255 scope global secondary enp0s8:15400
inet6 fe80::890a:d968:b65d:f59a/64 scope link noprefixroute
valid_lft forever preferred_lft forever
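A mistyped address is easy to miss in a command like the ifconfig call above, so it can help to validate the VIP string before binding it. The following is a minimal sketch in POSIX shell; is_ipv4 is a hypothetical helper name and not part of openGauss or CM.

```shell
# is_ipv4 (hypothetical helper): succeeds only for a dotted-quad IPv4 string.
is_ipv4() {
    # Reject empty input, non-digit/non-dot characters, and misplaced dots.
    case $1 in
        ''|*[!0-9.]*|*..*|.*|*.) return 1 ;;
    esac
    # Split the address on dots into the positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    # Exactly four groups are required.
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        # Each group must be at most three digits and no larger than 255.
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

vip=192.168.56.10
if is_ipv4 "$vip"; then
    echo "VIP $vip looks well formed"
else
    echo "VIP $vip is not a valid IPv4 address" >&2
fi
```

The check only looks at the syntax of the address. It does not verify that the address is free or that it is on the same subnet as the base IP.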
4. Add the VIP resource to the cluster
The VIP is managed as an openGauss resource.
[omm@db2 cm_agent]$ cm_ctl res --add --res_name="VIP_az1" --res_attr="resources_type=VIP,float_ip=192.168.56.10"
cm_ctl: add res(VIP_az1) success.
Add each instance to the resource
[omm@db2 cm_agent]$ cm_ctl res --edit --res_name="VIP_az1" --add_inst="node_id=1,res_instance_id=6001" --inst_attr=base_ip=192.168.56.11
cm_ctl: edit res(VIP_az1) success.
[omm@db2 cm_agent]$ cm_ctl res --edit --res_name="VIP_az1" --add_inst="node_id=2,res_instance_id=6002" --inst_attr=base_ip=192.168.56.12
cm_ctl: edit res(VIP_az1) success.
[omm@db2 cm_agent]$ cm_ctl res --edit --res_name="VIP_az1" --add_inst="node_id=3,res_instance_id=6003" --inst_attr=base_ip=192.168.56.13
cm_ctl: edit res(VIP_az1) success.
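Because the three edit commands differ only in node ID, instance ID, and base IP, they can be generated from a short list, which reduces the chance of a copy-and-paste slip. The sketch below only prints the commands so they can be reviewed first; print_add_inst_cmds is a hypothetical helper, and the node-to-instance mapping is assumed to match the one in cm_resource.json (node 1 with 6001, node 2 with 6002, node 3 with 6003).

```shell
# print_add_inst_cmds (hypothetical helper): prints one "cm_ctl res --edit"
# command per node_id:instance_id:base_ip triple. It does not run anything.
print_add_inst_cmds() {
    res_name=$1
    shift
    for triple in "$@"; do
        node_id=${triple%%:*}      # text before the first colon
        rest=${triple#*:}          # text after the first colon
        inst_id=${rest%%:*}
        base_ip=${rest#*:}
        printf 'cm_ctl res --edit --res_name="%s" --add_inst="node_id=%s,res_instance_id=%s" --inst_attr=base_ip=%s\n' \
            "$res_name" "$node_id" "$inst_id" "$base_ip"
    done
}

print_add_inst_cmds VIP_az1 \
    1:6001:192.168.56.11 \
    2:6002:192.168.56.12 \
    3:6003:192.168.56.13
```

Once the printed lines look right, they can be pasted into the omm shell or piped to sh.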
Check which node holds the VIP
[omm@db3 ~]$ cm_ctl show
[ Network Connect State ]
Network timeout: 6s
Current CMServer time: 2023-08-03 06:18:42
Network stat('Y' means connected, otherwise 'N'):
| \ | Y | Y |
| Y | \ | Y |
| Y | Y | \ |
[ Node Disk HB State ]
Node disk hb timeout: 200s
Current CMServer time: 2023-08-03 06:18:43
Node disk hb stat('Y' means connected, otherwise 'N'):
| N | N | N |
[ FloatIp Network State ]
node instance base_ip float_ip_name float_ip
----------------------------------------------------------
1 db1 6001 192.168.56.11 VIP_az1 192.168.56.10
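For scripting or monitoring, the node that currently holds the VIP can be pulled out of the cm_ctl show output. This is a rough sketch that assumes the [ FloatIp Network State ] section keeps the column layout shown above; vip_holder is a hypothetical helper name, and it reads text from standard input so it can be tried on saved output.

```shell
# vip_holder (hypothetical helper): reads `cm_ctl show` output on stdin and
# prints "<node name> <float ip>" for each row of the FloatIp section.
vip_holder() {
    awk '
        /^\[ FloatIp Network State \]/ { in_sec = 1; next }
        /^\[/                          { in_sec = 0 }
        # Data rows start with a numeric node id; the header and the dashed
        # separator line do not, so they are skipped.
        in_sec && $1 ~ /^[0-9]+$/      { print $2, $NF }
    '
}

# Demo on a saved sample; on a live cluster: cm_ctl show | vip_holder
vip_holder <<'EOF'
[ Node Disk HB State ]
| N | N | N |
[ FloatIp Network State ]
node instance base_ip float_ip_name float_ip
----------------------------------------------------------
1 db1 6001 192.168.56.11 VIP_az1 192.168.56.10
EOF
```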
Simulate a primary node failure
[omm@db3 ~]$ cm_ctl stop -n 1
cm_ctl: stop the node: 1.
cm_ctl: stop node, nodeid: 1
...........
cm_ctl: stop node successfully.
The primary role switches to node 2, and the VIP moves to node 2 as well
[omm@db1 cm_agent]$ cm_ctl show
[ Network Connect State ]
Network timeout: 6s
Current CMServer time: 2023-08-03 06:19:40
Network stat('Y' means connected, otherwise 'N'):
| \ | N | N |
| N | \ | Y |
| N | Y | \ |
[ Node Disk HB State ]
Node disk hb timeout: 200s
Current CMServer time: 2023-08-03 06:19:41
Node disk hb stat('Y' means connected, otherwise 'N'):
| N | N | N |
[ FloatIp Network State ]
node instance base_ip float_ip_name float_ip
----------------------------------------------------------
2 db2 6002 192.168.56.12 VIP_az1 192.168.56.10
Node 1 no longer has 192.168.56.10
[omm@db1 cm_agent]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
valid_lft 74572sec preferred_lft 74572sec
inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:50:47:12 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s8
valid_lft forever preferred_lft forever
inet6 fe80::890a:d968:b65d:f59a/64 scope link noprefixroute
valid_lft forever preferred_lft forever
Node 2 now has 192.168.56.10
[omm@db2 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
valid_lft 74550sec preferred_lft 74550sec
inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:41:73:29 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.12/24 brd 192.168.56.255 scope global noprefixroute enp0s8
valid_lft forever preferred_lft forever
inet 192.168.56.10/24 brd 192.168.56.255 scope global secondary enp0s8:15400
valid_lft forever preferred_lft forever
inet6 fe80::5373:d66d:7a39:ddc2/64 scope link noprefixroute
valid_lft forever preferred_lft forever
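Comparing two ip a listings by eye works for a one-off test, but the same check can be scripted. The sketch below assumes the standard ip a output format, where each IPv4 address appears on an "inet ADDRESS/PREFIX" line; has_addr is a hypothetical helper name.

```shell
# has_addr (hypothetical helper): reads `ip a` style text on stdin and
# succeeds if the given IPv4 address is bound to any interface.
has_addr() {
    awk -v want="$1" '
        # IPv4 lines look like: inet 192.168.56.10/24 brd ... scope ...
        $1 == "inet" { split($2, parts, "/"); if (parts[1] == want) found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Demo on sample lines; on a live node: ip a | has_addr 192.168.56.10
sample='    inet 192.168.56.12/24 brd 192.168.56.255 scope global noprefixroute enp0s8
    inet 192.168.56.10/24 brd 192.168.56.255 scope global secondary enp0s8:15400'
if printf '%s\n' "$sample" | has_addr 192.168.56.10; then
    echo "VIP is bound on this node"
else
    echo "VIP is not bound on this node"
fi
```

The address is compared as a whole string, so looking for 192.168.56.1 does not match 192.168.56.10.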
Resource configuration file
[omm@db1 cm_agent]$ cat cm_resource.json
{
"resources": [{
"name": "VIP_az1",
"resources_type": "VIP",
"instances": [{
"node_id": 1,
"res_instance_id": 6001,
"inst_attr": "base_ip=192.168.56.11"
}, {
"node_id": 2,
"res_instance_id": 6002,
"inst_attr": "base_ip=192.168.56.12"
}, {
"node_id": 3,
"res_instance_id": 6003,
"inst_attr": "base_ip=192.168.56.13"
}],
"float_ip": "192.168.56.10"
}]
}
Sync the configuration file to the other nodes
scp cm_resource.json db2:/opt/huawei/data/cmserver/cm_agent
scp cm_resource.json db3:/opt/huawei/data/cmserver/cm_agent
Start node 1
[omm@db3 ~]$ cm_ctl start -n 1
cm_ctl: start the node: 1.
cm_ctl: start node, nodeid: 1
...........
cm_ctl: start node successfully.
[omm@db1 cm_agent]$ gs_om -t status --detail
[ CMServer State ]
node node_ip instance state
-----------------------------------------------------------------------
1 db1 192.168.56.11 1 /opt/huawei/data/cmserver/cm_server Standby
2 db2 192.168.56.12 2 /opt/huawei/data/cmserver/cm_server Primary
3 db3 192.168.56.13 3 /opt/huawei/data/cmserver/cm_server Standby
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip instance state
-------------------------------------------------------------------------
1 db1 192.168.56.11 6001 /opt/huawei/install/data/dn P Standby Normal
2 db2 192.168.56.12 6002 /opt/huawei/install/data/dn S Primary Normal
3 db3 192.168.56.13 6003 /opt/huawei/install/data/dn S Standby Normal
The CM primary and the database primary are now on the same machine.
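The same conclusion can be read from the status output with a script instead of by eye. The sketch below assumes the gs_om -t status --detail layout shown above, where CMServer rows end with the role and datanode rows end with the role followed by the status; show_primaries is a hypothetical helper name.

```shell
# show_primaries (hypothetical helper): reads `gs_om -t status --detail`
# output on stdin and prints the CM primary and the datanode primary.
show_primaries() {
    awk '
        /^\[ CMServer State \]/ { sec = "cm"; next }
        /^\[ Datanode State \]/ { sec = "dn"; next }
        /^\[/                   { sec = "" }
        # CMServer rows: the last field is the role.
        sec == "cm" && $1 ~ /^[0-9]+$/ && $NF == "Primary"     { cm = $2 }
        # Datanode rows: the role is the second-to-last field.
        sec == "dn" && $1 ~ /^[0-9]+$/ && $(NF-1) == "Primary" { dn = $2 }
        END { print "cm_primary=" cm, "dn_primary=" dn }
    '
}

# Demo on a saved sample; on a live cluster:
#   gs_om -t status --detail | show_primaries
show_primaries <<'EOF'
[ CMServer State ]
1 db1 192.168.56.11 1 /opt/huawei/data/cmserver/cm_server Standby
2 db2 192.168.56.12 2 /opt/huawei/data/cmserver/cm_server Primary
[ Cluster State ]
cluster_state : Normal
[ Datanode State ]
1 db1 192.168.56.11 6001 /opt/huawei/install/data/dn P Standby Normal
2 db2 192.168.56.12 6002 /opt/huawei/install/data/dn S Primary Normal
EOF
```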
This article is shared from the WeChat official account openGauss (openGauss).