Symptoms and background
Redis 7.0 cluster: after a master is shut down, its replica is not promoted.
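Output and error messages
I. Build a cluster of 4 masters and 2 replicas with the steps below. Two of the masters (M3, M4) hold no slots and are meant only to take part in elections.
7001-M1 0~9999
7002-M2 10000~16383
7003-M3 no slots
7004-M4 no slots
7005-S1 replica of M1
7006-S2 replica of M2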
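1 Start nodes M1, M2, M3, M4, S1, S2.
2 On M1, run cluster meet for m2, m3, m4, s1, s2 in turn. At this point every node is a master with no slots.
3 On S1, run cluster replicate m1-node-id to make it a replica of M1. Do the same on S2 for M2.
4 On M1, run cluster addslotsrange 0 9999.
5 On M2, run cluster addslotsrange 10000 16383.
6 Run redis-cli --cluster check 127.0.0.1:7001 to see whether the cluster configuration reports any ERR and whether all slots are covered.
7 Test get, set, and automatic master/replica failover.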
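The manual steps above can be collected into a small script so that the order is easy to review. The sketch below is a dry run: it only builds the redis-cli command lines and prints them. It assumes all six nodes listen on 127.0.0.1, and it leaves the node ID needed by cluster replicate as a placeholder (it can be read on each master with CLUSTER MYID). The names HOST and build_plan are made up for illustration.

```python
HOST = "127.0.0.1"  # assumption: every node runs on the local machine

MASTERS = {"M1": 7001, "M2": 7002, "M3": 7003, "M4": 7004}
REPLICAS = {"S1": (7005, "M1"), "S2": (7006, "M2")}  # replica -> (port, master)
SLOTS = {"M1": (0, 9999), "M2": (10000, 16383)}      # M3 and M4 stay empty


def build_plan():
    """Return the redis-cli command lines for the manual setup, in order."""
    plan = []
    first = MASTERS["M1"]

    # From M1, introduce every other node with CLUSTER MEET.
    others = [port for name, port in MASTERS.items() if name != "M1"]
    others += [port for port, _ in REPLICAS.values()]
    for port in others:
        plan.append(f"redis-cli -p {first} cluster meet {HOST} {port}")

    # Turn S1 and S2 into replicas. The node ID is a placeholder here.
    for port, master in REPLICAS.values():
        plan.append(f"redis-cli -p {port} cluster replicate <node-id of {master}>")

    # Assign slot ranges to the two slot-owning masters.
    for name, (start, end) in SLOTS.items():
        plan.append(
            f"redis-cli -p {MASTERS[name]} cluster addslotsrange {start} {end}"
        )

    # Finally, verify the configuration and slot coverage.
    plan.append(f"redis-cli --cluster check {HOST}:{first}")
    return plan


if __name__ == "__main__":
    for command in build_plan():
        print(command)
```

Printing the plan first makes slips such as a mistyped node name or slot range visible before anything is sent to the cluster.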
II. Check the cluster state: healthy, and 7005 is a replica of 7001.
# redis-cli --cluster check 127.0.0.1:7001
127.0.0.1:7001 (2511fb43...) -> 0 keys | 10000 slots | 1 slaves.
127.0.0.1:7004 (c9c4edea...) -> 0 keys | 0 slots | 0 slaves.
127.0.0.1:7003 (107ef5ad...) -> 0 keys | 0 slots | 0 slaves.
127.0.0.1:7002 (c17c15f5...) -> 0 keys | 6384 slots | 1 slaves.
[OK] 0 keys in 4 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 127.0.0.1:7001)
M: 2511fb43ad6750b41a01bd23f4d3baa1a694085c 127.0.0.1:7001
slots:[0-9999] (10000 slots) master
1 additional replica(s)
M: c9c4edea90e7a857c1f83230c18d4c2ab467d266 127.0.0.1:7004
slots: (0 slots) master
S: 56f62db0c9a4f0b5d2f5500b62acecb31a885e74 127.0.0.1:7006
slots: (0 slots) slave
replicates c17c15f5332124597e7d6f39498806e6f38b8721
M: 107ef5ad62c70dfe0a5ae8889d0bd7e43d9c2d09 127.0.0.1:7003
slots: (0 slots) master
S: dbd59395320ed88f49e42fb05c60b3d30c45cff5 127.0.0.1:7005
slots: (0 slots) slave
replicates 2511fb43ad6750b41a01bd23f4d3baa1a694085c
M: c17c15f5332124597e7d6f39498806e6f38b8721 127.0.0.1:7002
slots:[10000-16383] (6384 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
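For reference, the coverage line at the end of this output amounts to checking that the slot ranges of all reachable masters add up to 0..16383 with no gaps. A minimal sketch of that tally, using the two ranges from this cluster (the function name uncovered_slots is made up, and this is not how redis-cli itself is implemented):

```python
TOTAL_SLOTS = 16384  # Redis Cluster always uses slots 0..16383


def uncovered_slots(ranges):
    """Return the sorted list of slots that no (start, end) range covers."""
    covered = set()
    for start, end in ranges:
        covered.update(range(start, end + 1))  # both ends are inclusive
    return [slot for slot in range(TOTAL_SLOTS) if slot not in covered]


# Slot ranges as assigned to 7001 and 7002 in this setup.
layout = {"7001": (0, 9999), "7002": (10000, 16383)}
missing = uncovered_slots(layout.values())
print(len(missing))  # 0
```

Passing only the second range returns 10000 slots (0 through 9999), which is the situation a check would report if the node holding the first range were unreachable and nothing had taken over its slots.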
III. Kill M1
/ # ps -ef|grep redis
60 root 0:00 redis-server *:7001 [cluster]
61 root 0:00 redis-server *:7002 [cluster]
62 root 0:00 redis-server *:7003 [cluster]
63 root 0:00 redis-server *:7004 [cluster]
64 root 0:00 redis-server *:7005 [cluster]
65 root 0:00 redis-server *:7006 [cluster]
66 root 0:00 redis-server *:7007 [cluster]
67 root 0:00 redis-server *:7008 [cluster]
109 root 0:00 grep redis
/ # kill -9 60
IV. Check the cluster state: 7005 has not become a master
# redis-cli --cluster check 127.0.0.1:7002
Could not connect to Redis at 127.0.0.1:7001: Connection refused
*** WARNING: 127.0.0.1:7005 claims to be slave of unknown node ID 2511fb43ad6750b41a01bd23f4d3baa1a694085c.
127.0.0.1:7002 (c17c15f5...) -> 1 keys | 6384 slots | 1 slaves.
127.0.0.1:7003 (107ef5ad...) -> 0 keys | 0 slots | 0 slaves.
127.0.0.1:7004 (c9c4edea...) -> 0 keys | 0 slots | 0 slaves.
[OK] 1 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 127.0.0.1:7002)
M: c17c15f5332124597e7d6f39498806e6f38b8721 127.0.0.1:7002
slots:[10000-16383] (6384 slots) master
1 additional replica(s)
S: 56f62db0c9a4f0b5d2f5500b62acecb31a885e74 127.0.0.1:7006
slots: (0 slots) slave
replicates c17c15f5332124597e7d6f39498806e6f38b8721
M: 107ef5ad62c70dfe0a5ae8889d0bd7e43d9c2d09 127.0.0.1:7003
slots: (0 slots) master
S: dbd59395320ed88f49e42fb05c60b3d30c45cff5 127.0.0.1:7005
slots: (0 slots) slave
replicates 2511fb43ad6750b41a01bd23f4d3baa1a694085c
M: c9c4edea90e7a857c1f83230c18d4c2ab467d266 127.0.0.1:7004
slots: (0 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[ERR] Not all 16384 slots are covered by nodes.
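The WARNING about 7005 can also be spotted mechanically. The sketch below parses the M:/S:/replicates lines of the check output above and lists replicas whose master ID does not appear among the listed masters. It assumes the line layout shown in this post (one M: or S: line per node, followed by its slots and replicates lines); the function name orphaned_replicas is made up.

```python
def orphaned_replicas(check_output):
    """Return {replica address: master ID} for replicas whose master is not listed."""
    masters = set()
    replicas = {}            # replica address -> master node ID
    current_replica = None   # address of the S: entry being read
    for raw in check_output.splitlines():
        parts = raw.split()
        if len(parts) >= 3 and parts[0] == "M:":
            masters.add(parts[1])
            current_replica = None
        elif len(parts) >= 3 and parts[0] == "S:":
            current_replica = parts[2]
        elif len(parts) == 2 and parts[0] == "replicates" and current_replica:
            replicas[current_replica] = parts[1]
            current_replica = None
    # Filter only after reading everything: a master may be listed after its replica.
    return {addr: mid for addr, mid in replicas.items() if mid not in masters}


# Node entries copied from the check output after M1 was killed.
SAMPLE = """\
M: c17c15f5332124597e7d6f39498806e6f38b8721 127.0.0.1:7002
slots:[10000-16383] (6384 slots) master
1 additional replica(s)
S: 56f62db0c9a4f0b5d2f5500b62acecb31a885e74 127.0.0.1:7006
slots: (0 slots) slave
replicates c17c15f5332124597e7d6f39498806e6f38b8721
M: 107ef5ad62c70dfe0a5ae8889d0bd7e43d9c2d09 127.0.0.1:7003
slots: (0 slots) master
S: dbd59395320ed88f49e42fb05c60b3d30c45cff5 127.0.0.1:7005
slots: (0 slots) slave
replicates 2511fb43ad6750b41a01bd23f4d3baa1a694085c
M: c9c4edea90e7a857c1f83230c18d4c2ab467d266 127.0.0.1:7004
slots: (0 slots) master
"""

print(orphaned_replicas(SAMPLE))
```

For this output the only entry returned is 127.0.0.1:7005, still pointing at the node ID of the killed 7001, which matches the WARNING line.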
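What I have tried
I tried several ways of building a cluster with 4 masters and 2 replicas (2 of the masters holding 0 slots); none of them would fail over.
However, when I build 4 masters and 4 replicas in one go with redis-cli --cluster create, failover works normally.
What I want to achieve
- Why does cluster check report a healthy cluster, yet no failover takes place?
- Is some parameter not set correctly?
- Is there another way to build this kind of cluster with 4 masters and 2 replicas (2 of the masters holding 0 slots)?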
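One thing worth checking here is the voting rule itself. As I read the Redis Cluster specification, a replica needs votes from a majority of the masters that serve at least one slot, and masters without slots neither count toward that majority nor vote. The sketch below only does this arithmetic for the two setups described in this post; the function name failover_possible is made up, and the rule is stated as an assumption to verify against the specification and the replica's log.

```python
def failover_possible(slot_masters_total, slot_masters_alive):
    """Apply the majority rule to masters that own slots.

    Assumption: only masters serving at least one slot count toward the
    cluster size and are allowed to vote; empty masters are ignored.
    Returns (can_win_election, votes_needed).
    """
    needed = slot_masters_total // 2 + 1  # strict majority of slot-owning masters
    return slot_masters_alive >= needed, needed


# 4 masters, 2 of them empty: only M1 and M2 own slots, and M1 is down.
print(failover_possible(2, 1))  # (False, 2)

# 4 masters that all own slots (redis-cli --cluster create), one is down.
print(failover_possible(4, 3))  # (True, 3)
```

If this rule is what applies, the 4-master, 2-replica layout leaves S1 with at most one vote out of the two it needs, while the layout produced by redis-cli --cluster create leaves three voters for a quorum of three. Under the same assumption, giving M3 and M4 at least one slot each would turn them into voters.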