MySQL MHA

2018/07/13 16:39
阅读数 1.3K
MySQL MHA
 
架构介绍:MHA由两部分组成MHA Manager(管理节点)和MHA Node(数据节点),MHA Node运行在每台MySQL服务器上,MHA Manager会定时探测集群中的master节点,当master出现故障时,它可以自动将最新数据的slave提升为新的master,然后将所有其他的slave重新指向新的master
 
MHA的隐患:在MHA自动故障切换的过程中,MHA试图从宕掉的主服务器上保存二进制日志,最大程度保证数据的不丢失,存在的问题是,如果主服务器硬件故障宕机或无法通过SSH访问,MHA没有办法保存二进制日志,只能进行故障转移而可能丢失最新数据
 
工作原理总结为以下几条:
 
    1.从宕机崩溃的master保存二进制日志事件(binlog events);
    2.识别含有最新更新的slave;
    3.应用差异的中继日志(relay log) 到其他slave;
    4.应用从master保存的二进制日志事件(binlog events);
    5.提升一个slave为新master;
    6.使用其他的slave连接新的master进行复制。
 
 
1、安装mysql:
    
    1.1 添加环境变量
            vim /etc/profile
                export PATH=$PATH:/usr/local/mysql/bin
            source /etc/profile
 
    1.2    解压tar包
            tar -xf mysql-5.7.22-linux-glibc2.12-x86_64.tar.gz
            mv mysql-5.7.22-linux-glibc2.12-x86_64 mysql
            scp -r /usr/local/mysql slave1:/usr/local/mysql
            scp -r /usr/local/mysql slave2:/usr/local/mysql
            scp /etc/my.cnf slave1:/etc/
            scp /etc/my.cnf slave2:/etc/
            所有节点my.cnf的server-id必须唯一
 
    1.3    创建用户,目录,授权,初始化,启动(3台执行)
            useradd mysql
            mkdir -p /home/mysql3306/{mysql3306,logs}
            chown mysql:mysql /home/mysql3306 -R
            chown mysql:mysql /usr/local/mysql -R
            mysqld --defaults-file=/etc/my.cnf --initialize-insecure --datadir=/home/mysql3306/mysql3306 --basedir=/usr/local/mysql --user=mysql
            mysqld_safe --user=mysql &
 
 
2、配置主从
 
    2.1 在master上建立帐户并授权slave:
            grant REPLICATION CLIENT,REPLICATION SLAVE on *.* to rep@'192.168.111.129' identified by '123456';
            grant REPLICATION CLIENT,REPLICATION SLAVE on *.* to rep@'192.168.111.130' identified by '123456';
            flush privileges;
 
    2.2 查看master状态,获取binlog文件和pos点
            mysql> show master status;
 
 
    2.3 slave1、slave2设置需要同步的主库
            change master to master_host='192.168.111.128',master_user='rep',master_password='123456', master_log_file='mysql-bin.000002',master_log_pos=1229,MASTER_PORT=3306;
            flush privileges;
            start slave;
 
    2.4 查看从服务器复制状态
            show slave status\G
            
    2.5    两台slave服务器设置read_only(从库对外提供读服务,之所以没有写进配置文件,是因为随时slave会提升为master)
            mysql -uroot -e "set global read_only=1"
            
    2.6    所有节点创建manager所需的监控用户
            grant all privileges on *.* to 'rep'@'192.168.111.%' identified  by '123456';
 
3、搭建MHA
 
    3.1    配置集群内时间同步、ssh免密码登陆
 
    3.2 MHA    node节点安装
    
            yum install -y perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-DBD-MySQL perl-devel perl-CPAN
            mkdir -p /etc/mha  ##创建安装目录
            tar -xf mha4mysql-node-0.57.tar.gz -C /etc/mha/
            mv /etc/mha/mha4mysql-node-0.57 /etc/mha/node
            cd /etc/mha/node
            perl Makefile.PL
            make && make install
            
    3.3 MHA manager节点安装    
 
            yum install -y perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
            tar -xf mha4mysql-manager-0.57.tar.gz -C /etc/mha/
            mv /etc/mha/mha4mysql-manager-0.57 /etc/mha/manager
            cd /etc/mha/manager
            perl Makefile.PL
            make && make install
 
    3.4 配置MHA
    
            修改manager配置文件
            mkdir /etc/mha/app1  ##创建manager工作目录
            cp /etc/mha/manager/samples/conf/app1.cnf /etc/mha/
            vim /etc/mha/app1.cnf
                [server default]
                manager_workdir=/etc/mha/app1            #MHA工作路径
                manager_log=/etc/mha/app1/manager.log        #MHA日志路径
                master_binlog_dir="/home/mysql3306/mysql3306"        #MHA node端的binlog路径,也就是mysql的数据目录
                remote_workdir=/etc/mha/app1        #远端mysql在发生切换时binlog的保存位置
                master_ip_failover_script=/etc/mha/master_ip_failover        #自动failover时候的切换脚本
                master_ip_online_change_script=/etc/mha/master_ip_online_change        #手动切换脚本
                report_script=/etc/mha/send_report        #发生切换后报警脚本
                user=rep        #监控用户
                password=123456        #监控用户密码
                repl_user=rep        #复制用户
                repl_password=123456        #复制用户密码
                ping_interval=1            #MHA manager的检测时间间隔(1秒)
                secondary_check_script= masterha_secondary_check -s slave1 -s mastre --user=rep --master_host=master --master_ip=192.168.111.128 --master_port=3306        #MHA检测到master出现问题,Manager会尝试从slave1登陆到master
 
                [server1]
                hostname=192.168.111.128
                port=3306
                ssh_port=22
 
                [server2]
                hostname=192.168.111.129
                port=3306
                candidate_master=1        #备用主,如果主库出问题,此库将提升为主库,即使这个库不是集群中事件最新的slave
                ssh_port=22
                check_repl_delay=0        #默认情况下,一个slave落后于master 100M的relay log,MHA将不会选择该slave为一个新的master,设置为0,MHA触发切换在选择一个新的master的时候将会忽略复制延时,这个参数对于设置了candidate_master=1的主机非常有用,因为这个候选主在切换的过程中一定是新的master
 
                [server3]
                hostname=192.168.111.130
                port=3306
                no_master=1
                ssh_port=22
                
    3.5    设置slave节点relay log清除方式;建立硬连接    
    
            MHA发生切换工程中,从库恢复依赖于relay log,mysql默认情况下,从库应用完就会自动清除relay log,因此将其设置为OFF,采用手动清理方式。
            
                mysql -uroot -p123456 -e "set global relay_log_purge=0"            
            
            定期删除relay log可能会出现复制延迟的问题,所以建立relay log日志硬连接,因为linux系统中通过硬连接删除大文件速度快。
            
                mkdir /home/mysql3306/logs1
                ln /home/mysql3306/logs/mysql-relay* /home/mysql3306/logs1
            
                
    3.6    编写定期清理relay log脚本,结合定时任务清理(slave1、slave2操作)    
            vim /etc/mha/purge_relay_log.sh
                #!/bin/bash
                user=root
                passwd=123456
                port=3306
                log_dir='/home/mysql3306/logs/'
                work_dir='/home/mysql3306/logs1'
                purge='/usr/local/bin/purge_relay_logs'
 
                if [ ! -d $log_dir ]
                then
                   mkdir $log_dir -p
                fi
 
                $purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --host=localhost --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1
                
                
                
            参数说明:
                --work_dir:指定创建relay log的硬链接的位置
                --disable_relay_log_purge :默认情况下,如果relay_log_purge=1,脚本会什么都不清理,自动退出,通过设定这个参数,当relay_log_purge=1的情况下会将relay_log_purge设置为0。清理relay log之后,最后将参数设置为OFF。
            
            此处有几个小细节
                purge_relay_logs脚本中定义了的sock文件位置/var/lib/mysql/mysql.sock,可以做个软链
                    ln -s /tmp/mysql3306.sock /var/lib/mysql/mysql.sock
                purge_relay_logs需要--user=root --host=localhost 没有权限的,需要设置
            
            没问题了,可以先测试下:
                
                purge_relay_logs --user=root --host=localhost  --port=3306 --password=123456 -disable_relay_log_purge --workdir=/home/mysql3306/logs/
                
            出现这个说明测试通过:2018-07-04 05:22:21: All relay log purging operations succeeded.
            
            添加定时任务
                crontab -e
                0 0 */3 * * sh /etc/auto_clean_relay_log.sh    
                
    3.7    创建自动切换脚本
            vim /etc/mha/master_ip_failover
                #!/usr/bin/env perl
                use strict;
                use warnings FATAL => 'all';
                use Getopt::Long;
                my (
                $command, $ssh_user, $orig_master_host, $orig_master_ip,
                $orig_master_port, $new_master_host, $new_master_ip, $new_master_port
                );
                my $vip = '192.168.111.111/24';
                my $key = '0';
                my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
                my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";
                GetOptions(
                'command=s' => \$command,
                'ssh_user=s' => \$ssh_user,
                'orig_master_host=s' => \$orig_master_host,
                'orig_master_ip=s' => \$orig_master_ip,
                'orig_master_port=i' => \$orig_master_port,
                'new_master_host=s' => \$new_master_host,
                'new_master_ip=s' => \$new_master_ip,
                'new_master_port=i' => \$new_master_port,
                );
                exit &main();
                sub main {
                print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
                if ( $command eq "stop" || $command eq "stopssh" ) {
                my $exit_code = 1;
                eval {
                print "Disabling the VIP on old master: $orig_master_host \n";
                &stop_vip();
                $exit_code = 0;
                };
                if ($@) {
                warn "Got Error: $@\n";
                exit $exit_code;
                }
                exit $exit_code;
                }
                elsif ( $command eq "start" ) {
                my $exit_code = 10;
                eval {
                print "Enabling the VIP - $vip on the new master - $new_master_host \n";
                &start_vip();
                $exit_code = 0;
                };
                if ($@) {
                warn $@;
                exit $exit_code;
                }
                exit $exit_code;
                }
                elsif ( $command eq "status" ) {
                print "Checking the Status of the script.. OK \n";
                exit 0;
                }
                else {
                &usage();
                exit 1;
                }
                }
                sub start_vip() {
                `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
                }
                sub stop_vip() {
                return 0 unless ($ssh_user);
                `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
                }
                sub usage {
                print
                "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip
                --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
                }
    
    3.8    创建手动切换脚本
            vim /etc/mha/master_ip_online_change
                #!/usr/bin/env perl
                use strict;
                use warnings FATAL =>'all';
                use Getopt::Long;
                my $vip = '192.168.111.111/24'; # Virtual IP
                my $key = "0";
                my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
                my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";
                my $exit_code = 0;
                my (
                $command, $orig_master_is_new_slave, $orig_master_host,
                $orig_master_ip, $orig_master_port, $orig_master_user,
                $orig_master_password, $orig_master_ssh_user, $new_master_host,
                $new_master_ip, $new_master_port, $new_master_user,
                $new_master_password, $new_master_ssh_user,
                );
                GetOptions(
                'command=s' => \$command,
                'orig_master_is_new_slave' => \$orig_master_is_new_slave,
                'orig_master_host=s' => \$orig_master_host,
                'orig_master_ip=s' => \$orig_master_ip,
                'orig_master_port=i' => \$orig_master_port,
                'orig_master_user=s' => \$orig_master_user,
                'orig_master_password=s' => \$orig_master_password,
                'orig_master_ssh_user=s' => \$orig_master_ssh_user,
                'new_master_host=s' => \$new_master_host,
                'new_master_ip=s' => \$new_master_ip,
                'new_master_port=i' => \$new_master_port,
                'new_master_user=s' => \$new_master_user,
                'new_master_password=s' => \$new_master_password,
                'new_master_ssh_user=s' => \$new_master_ssh_user,
                );
                exit &main();
                sub main {
                #print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
                if ( $command eq "stop" || $command eq "stopssh" ) {
                # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
                # If you manage master ip address at global catalog database,
                # invalidate orig_master_ip here.
                my $exit_code = 1;
                eval {
                print "\n\n\n***************************************************************\n";
                print "Disabling the VIP - $vip on old master: $orig_master_host\n";
                print "***************************************************************\n\n\n\n";
                &stop_vip();
                $exit_code = 0;
                };
                if ($@) {
                warn "Got Error: $@\n";
                exit $exit_code;
                }
                exit $exit_code;
                }
                elsif ( $command eq "start" ) {
                # all arguments are passed.
                # If you manage master ip address at global catalog database,
                # activate new_master_ip here.
                # You can also grant write access (create user, set read_only=0, etc) here.
                my $exit_code = 10;
                eval {
                print "\n\n\n***************************************************************\n";
                print "Enabling the VIP - $vip on new master: $new_master_host \n";
                print "***************************************************************\n\n\n\n";
                &start_vip();
                $exit_code = 0;
                };
                if ($@) {
                warn $@;
                exit $exit_code;
                }
                exit $exit_code;
                }
                elsif ( $command eq "status" ) {
                print "Checking the Status of the script.. OK \n";
                `ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
                exit 0;
                }
                else {
                &usage();
                exit 1;
                }
                }
                # A simple system call that enable the VIP on the new master
                sub start_vip() {
                `ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
                }
                # A simple system call that disable the VIP on the old_master
                sub stop_vip() {
                `ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
                }
                sub usage {
                print
                "Usage: master_ip_failover -command=start|stop|stopssh|status -orig_master_host=host -orig_master_ip=ip -
                orig_master_port=po
                rt -new_master_host=host -new_master_ip=ip -new_master_port=port\n";
                }    
            
    3.9编写切换节点监控报警脚本
            vim /etc/mha/send_report
                #!/usr/bin/perl
                use strict;
                use warnings FATAL => 'all';
                use Mail::Sender;
                use Getopt::Long;
 
                #new_master_host and new_slave_hosts are set only when recovering master succeeded
                my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
                my $smtp=' smtp.163.com';
                my $mail_from=' xxxxx@163.com';
                my $mail_user=' xxxxx@163.com';
                my $mail_pass='xxxxx';
                my $mail_to=[' xxxxx@139.com'];
                GetOptions(
                  'orig_master_host=s' => \$dead_master_host,
                  'new_master_host=s'  => \$new_master_host,
                  'new_slave_hosts=s'  => \$new_slave_hosts,
                  'subject=s'          => \$subject,
                  'body=s'             => \$body,
                );
 
                mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);
 
                sub mailToContacts {
                    my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;
                    open my $DEBUG, "> /tmp/monitormail.log"
                        or die "Can't open the debug      file:$!\n";
                    my $sender = new Mail::Sender {
                        ctype       => 'text/plain; charset=utf-8',
                        encoding    => 'utf-8',
                        smtp        => $smtp,
                        from        => $mail_from,
                        auth        => 'LOGIN',
                        TLS_allowed => '0',
                        authid      => $user,
                        authpwd     => $passwd,
                        to          => $mail_to,
                        subject     => $subject,
                        debug       => $DEBUG
                    };
 
                    $sender->MailMsg(
                        {   msg   => $msg,
                            debug => $DEBUG
                        }
                    ) or print $Mail::Sender::Error;
                    return 1;
                }
                # Do whatever you want here
                exit 0;    
                
                脚本需要修改的地方
                    my $smtp=' smtp.163.com';                   ## 提供smtp服务的服务商地址,通常为smtp.(qq.163.139.)com
                    my $mail_from=' xxxxx@163.com';                 ## 发送邮件的邮箱
                    my $mail_user=' xxxxx@163.com';               ## 同上
                    my $mail_pass='xxxxx';                       ## 邮箱授权码,邮箱开启pop3/smtp时,一般会让你设置密码
                    my $mail_to=[' xxxxx@139.com'];               ## 接收邮件的邮箱,139为移动的短信邮箱,很方便,直接短信接收信息
        
    
            给其执行权限
                chmod +x /etc/mha/master_ip_failover
                chmod +x /etc/mha/master_ip_online_change
                chmod +x /etc/mha/send_report
                    
    3.10manager检查ssh是否成功        
        
            /etc/mha/manager/bin/masterha_check_ssh --conf=/etc/mha/app1.cnf
            
    3.11manager检查复制状态
            所有节点创建软链
                ln -s /usr/local/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog
                ln -s /usr/local/mysql/bin/mysql /usr/bin/mysql
                
        /etc/mha/manager/bin/masterha_check_repl --conf=/etc/mha/app1.cnf
            
    3.12为master添加vip
            ifconfig ens33:0 192.168.111.111
            
    3.13manager节点启动mha
            nohup /etc/mha/manager/bin/masterha_manager --conf=/etc/mha/app1.cnf --ignore_last_failover >/tmp/mha_manager.log < /dev/null 2>&1 &
            
    3.14检查mha状态
        /etc/mha/manager/bin/masterha_check_status --conf=/etc/mha/app1.cnf
            
    3.15测试    
        
        实验一:测试自动Failover
        
            1.在slave1 上我先停掉IO线程,模拟主从延迟
                stop slave io_thread;
                
            2.master库导入一张表(数据量尽量大点,建议10W+以上数据)
                这时候slave2一直在同步数据
                
            3.slave1开启IO线程
                start slave io_thread;
                
            4.停掉master mysql
                实验使用pkill mysql(生产禁用)
                
            5.查看manager日志,可以看出master已经换了
                tail -300f /etc/mha/app1/manager.log
                
            6.在新的master上可以看到落后的数据也已经同步过来了
            
            7.查看Vip飘逸情况,vip是否到了slave1这台主机
            
        实验二:手动Failover测试
            
            注意,执行手动Failover时,MHA manager必须没有运行,否则,manager会挂掉
            
            1.停止manager和master的mysql
                /etc/mha/manager/bin/masterha_stop --conf=/etc/mha/app1.cnf
                实验使用pkill mysql(生产禁用)
                        
            2.执行manager上的脚本master_ip_online_change
展开阅读全文
打赏
0
0 收藏
分享
加载中
更多评论
打赏
0 评论
0 收藏
0
分享
返回顶部
顶部