zhu7929443 2016-07-13 02:36 (question closed)

mysql+keepalived: after failover the backup does not take over the VIP, but the reverse direction works

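Today I set up six CentOS 6.5 machines as three mutually redundant pairs running mysql+keepalived, using master-master replication with both keepalived nodes in BACKUP state. The keepalived.conf on the primary, 107, is as follows: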
! Configuration File for keepalived

global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id mysql_ha
vrrp_skip_check_adv_addr
vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}

vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 117
priority 100
advert_int 1
nopreempt
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.40.6.117
}
}

virtual_server 10.40.6.117 3306 {
delay_loop 2
#lb_algo wrr
#lb_kind DR
persistence_timeout 60
protocol TCP

real_server 10.40.6.107 3306 {
    weight 3
    notify_down /usr/local/etc/keepalived/mysql.sh
    TCP_CHECK {
        connect_timeout 3
        nb_get_retry 3
        delay_before_retry 3
        connect_port 3306
    }
}

}
The keepalived.conf on the backup, 108, is as follows:
! Configuration File for keepalived

global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id mysql_ha
vrrp_skip_check_adv_addr
vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}

vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 117
priority 90
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.40.6.117
}
}

virtual_server 10.40.6.117 3306 {
delay_loop 2
#lb_algo wrr
#lb_kind DR
persistence_timeout 60
protocol TCP

real_server 10.40.6.108 3306 {
    weight 3
    notify_down /usr/local/etc/keepalived/mysql.sh
    TCP_CHECK {
        connect_timeout 3
        nb_get_retry 3
        delay_before_retry 3
        connect_port 3306
    }
}

}

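One pair of machines behaves strangely. Its real IPs are 107 and 108, and the VIP is 117. Once the MySQL and keepalived services have both started, everything looks normal and 107 holds the virtual IP 117. To test failover I then stop the mysql service. In theory, when the health check on port 3306 fails, keepalived should run my mysql.sh script, which simply does pkill keepalived so that the backup takes over the VIP. In practice the VIP 117 never moves to the backup and stays on the primary. The messages log keeps reporting these errors: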

Jul 12 16:09:12 hs-10-40-6-107 Keepalived_healthcheckers[9204]: TCP connection to [10.40.6.107]:3306 failed.
Jul 12 16:09:15 hs-10-40-6-107 Keepalived_healthcheckers[9204]: TCP connection to [10.40.6.107]:3306 failed.
Jul 12 16:09:15 hs-10-40-6-107 Keepalived_healthcheckers[9204]: Check on service [10.40.6.107]:3306 failed after 1 retry.
Jul 12 16:09:15 hs-10-40-6-107 Keepalived_healthcheckers[9204]: Removing service [10.40.6.107]:3306 from VS [10.40.6.117]:3306
Jul 12 16:09:15 hs-10-40-6-107 Keepalived_healthcheckers[9204]: IPVS: Service not defined
Jul 12 16:09:15 hs-10-40-6-107 Keepalived_healthcheckers[9204]: SMTP connection ERROR to [127.0.0.1]:25.
Jul 12 16:09:17 hs-10-40-6-107 Keepalived_healthcheckers[9204]: TCP connection to [10.40.6.107]:3306 failed.
Jul 12 16:09:20 hs-10-40-6-107 Keepalived_healthcheckers[9204]: TCP connection to [10.40.6.107]:3306 failed.
Jul 12 16:09:20 hs-10-40-6-107 Keepalived_healthcheckers[9204]: Check on service [10.40.6.107]:3306 failed after 1 retry.
Jul 12 16:09:20 hs-10-40-6-107 Keepalived_healthcheckers[9204]: Removing service [10.40.6.107]:3306 from VS [10.40.6.117]:3306
Jul 12 16:09:20 hs-10-40-6-107 Keepalived_healthcheckers[9204]: IPVS: Service not defined
Jul 12 16:09:20 hs-10-40-6-107 Keepalived_healthcheckers[9204]: SMTP connection ERROR to [127.0.0.1]:25.
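
To confirm which node currently holds the VIP during such a test, the interface addresses can be inspected on each machine. The following is a minimal sketch, not part of the original setup: the helper name has_addr is hypothetical, and it assumes the usual one-line output format of `ip -4 -o addr show`, where each address appears as `inet ADDRESS/PREFIX`.

```shell
#!/bin/sh
# has_addr is a hypothetical helper: it reads `ip -4 -o addr show` output
# on stdin and succeeds if the given IPv4 address is configured.
# The trailing "/" keeps 10.40.6.11 from matching 10.40.6.117.
has_addr() {
    grep -qF "inet $1/"
}

# Example, to be run on 107 and on 108 (interface and VIP taken from the post):
#   ip -4 -o addr show dev eth0 | has_addr 10.40.6.117 && echo "VIP is on this node"
```

Running the example on both nodes before and after stopping mysql shows whether the address actually moved.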

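I then restored all services and tested failover from 108 to 107, which works fine: after mysql on 108 is stopped, the notify_down script runs and kills the keepalived process, so the VIP 117 previously held by 108 is taken over by 107. The system log on 108 is as follows: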
Jul 12 14:18:40 hs-10-40-6-108 Keepalived_healthcheckers[6258]: TCP connection to [10.40.6.108]:3306 failed.
Jul 12 14:18:43 hs-10-40-6-108 Keepalived_healthcheckers[6258]: TCP connection to [10.40.6.108]:3306 failed.
Jul 12 14:18:43 hs-10-40-6-108 Keepalived_healthcheckers[6258]: Check on service [10.40.6.108]:3306 failed after 1 retry.
Jul 12 14:18:43 hs-10-40-6-108 Keepalived_healthcheckers[6258]: Removing service [10.40.6.108]:3306 from VS [10.40.6.117]:3306
Jul 12 14:18:43 hs-10-40-6-108 Keepalived_healthcheckers[6258]: IPVS: No such destination
Jul 12 14:18:43 hs-10-40-6-108 Keepalived_healthcheckers[6258]: Executing [/usr/local/etc/keepalived/mysql.sh] for service [10.40.6.108]:3306 in VS [10.40.6.117]:3306
Jul 12 14:18:43 hs-10-40-6-108 Keepalived_healthcheckers[6258]: Lost quorum 1-0=1 > 0 for VS [10.40.6.117]:3306
Jul 12 14:18:43 hs-10-40-6-108 Keepalived_healthcheckers[6258]: SMTP connection ERROR to [127.0.0.1]:25.
Jul 12 14:18:43 hs-10-40-6-108 Keepalived_vrrp[6259]: VRRP_Instance(VI_1) sent 0 priority
Jul 12 14:18:43 hs-10-40-6-108 Keepalived[6257]: Stopping
Jul 12 14:18:43 hs-10-40-6-108 Keepalived_vrrp[6259]: VRRP_Instance(VI_1) removing protocol VIPs.
Jul 12 14:18:43 hs-10-40-6-108 Keepalived_healthcheckers[6258]: Netlink reflector reports IP 10.40.6.117 removed
Jul 12 14:18:43 hs-10-40-6-108 Keepalived_healthcheckers[6258]: IPVS: No such file or directory
Jul 12 14:18:43 hs-10-40-6-108 Keepalived_healthcheckers[6258]: Stopped
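
The visible difference between the two excerpts is that the log on 108 contains an "Executing [...]" line after "Removing service", while the log on 107 does not. As a rough way to check this on a whole log file, the sketch below counts both kinds of line. The function name count_events is hypothetical, and the match patterns are copied from the excerpts above, so they may need adjusting for other keepalived versions.

```shell
#!/bin/sh
# count_events is a hypothetical helper: it counts how often the health
# checker removed a real server and how often it ran a notify script.
count_events() {
    log_file=$1
    # grep -c prints the number of matching lines (0 if none).
    removed=$(grep -c 'Keepalived_healthcheckers.*Removing service' "$log_file")
    executed=$(grep -c 'Keepalived_healthcheckers.*Executing \[' "$log_file")
    echo "removed=$removed executed=$executed"
}

# Example (log path as used on CentOS 6):
#   count_events /var/log/messages
```

Applied to the two excerpts, this reports removed=2 executed=0 for 107 and removed=1 executed=1 for 108.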

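Of the six machines I installed today, only this pair has a problem when failing over from primary to backup: the notify_down script is never executed and errors are logged. Does anyone know the cause?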
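
The mysql.sh script itself is not shown in the post, only described as running pkill keepalived. The following is a hedged sketch of what such a script could look like, with one addition that is not in the original: a syslog entry, so that a missing entry indicates keepalived never invoked the script. The tag mysql_notify_down, the KILL_CMD override, and the function name are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of a notify_down script; not the poster's actual file.
LOG_TAG=mysql_notify_down   # assumed syslog tag

notify_down() {
    # Leave a trace in syslog before doing anything else.
    logger -t "$LOG_TAG" "MySQL check failed on $(hostname), stopping keepalived" 2>/dev/null
    # KILL_CMD is a hypothetical override for dry runs; the default is the
    # command described in the post.
    ${KILL_CMD:-pkill keepalived}
}

# Run only when this file is executed under its real name, so that
# sourcing it for a dry run has no side effects.
case "$0" in
    *mysql.sh) notify_down ;;
esac
```

For a dry run, the file can be sourced and called as `KILL_CMD='echo would stop keepalived' notify_down`. The script also has to be executable by the user keepalived runs as.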


1 answer

  • Anonymous user 2016-10-04 11:00

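    keepalived: after failover the backup does not take over the VIP, but the reverse direction works
    Today I set up six CentOS 6.5 machines as three mutually redundant pairs running mysql+keepalived, using master-master replication with both keepalived nodes in BACKUP state. The keepalived.conf on the primary, 107, is as follows: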
    ! Configuration File for keepalived
    global_defs {
    notification_email {
    acassen@firewall.loc
    failover@firewall.loc
    sysadmin@firewall.loc
    }
    notification_email_from Alexandre.Cassen@firewall.loc
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id mysql_ha
    vrrp_skip_check_adv_addr
    vrrp_strict
