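Creating a Redis Cluster on K8S and Migrating Data

Our production environment had always run a single-node Redis, version 6.2.6. The developers needed the PEXPIRETIME command, which is only available from 7.0 onward, so we took the opportunity to upgrade Redis and convert the single node into a cluster.

Overview

Deploying a Redis cluster on k8s

  • Version 7.0+ is recommended (7.4.0 is used here), because a redis-cli older than 7.0 can only build a cluster from IP:port pairs and cannot use pod names
  • A pod's IP changes when it restarts, which breaks the cluster, so in k8s the announce address cluster-announce-ip must be configured; otherwise the cluster fails in some scenarios

Many online articles about deploying a Redis cluster on k8s have problems in certain scenarios: upgrades or unclean restarts, for example, can break the cluster, so it is easy to get burned

This example deploys a 6-node cluster with 3 masters and 3 replicas. Upgrades, restarts, and high availability were all tested and worked correctly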
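How much failure a 3-master/3-replica layout tolerates can be reasoned about with a small model. The sketch below is a simplification, not Redis code: it uses the master/replica pairing that redis-cli reports when the cluster is created later in this article, and it assumes only two rules, namely that a shard survives while its master or its replica is alive, and that promoting a replica needs a majority of masters to be reachable. The function names are hypothetical.

```python
# Master -> replica pairs as reported by redis-cli later in this article.
REPLICA_OF_MASTER = {
    "redis-cluster-sts-0": "redis-cluster-sts-4",
    "redis-cluster-sts-1": "redis-cluster-sts-5",
    "redis-cluster-sts-2": "redis-cluster-sts-3",
}


def shards_without_live_node(down: set) -> list:
    """Return the masters whose slots would have no live owner left."""
    return sorted(
        master
        for master, replica in REPLICA_OF_MASTER.items()
        if master in down and replica in down
    )


def can_stay_available(down: set) -> bool:
    """Simplified availability check for the given set of failed pods."""
    live_masters = [m for m in REPLICA_OF_MASTER if m not in down]
    # A replica can only be promoted if most masters are still there to vote.
    has_majority = len(live_masters) > len(REPLICA_OF_MASTER) // 2
    return has_majority and not shards_without_live_node(down)


if __name__ == "__main__":
    for failed in ({"redis-cluster-sts-5"}, {"redis-cluster-sts-1", "redis-cluster-sts-5"}):
        print(sorted(failed), "->", can_stay_available(failed))
```

Losing a single pod, master or replica, keeps every slot served in this model; losing both members of one pair, or two masters at once, does not.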

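Requirements

  • NFS is used as the storage for Redis, so a storage class must be ready; this example uses nfs-client
  • Everything in this example runs in the test namespace; change it to your own namespace
  • The Redis password here is hello_redis@234; change it to your own

Deployment

Prepare redis-cluster-sts.yaml with the following content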

# 1. Headless service for Redis
---
apiVersion: v1
kind: Service
metadata:
  name: redis-cluster-headless
spec:
  clusterIP: None
  selector:
    app: redis-cluster
  ports:
    - port: 6379
      protocol: TCP
      targetPort: 6379
      name: redis
    - port: 16379
      protocol: TCP
      targetPort: 16379
      name: election

# 2. Redis cluster configuration; change the password to your own
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-cluster-cm
data:
  redis-cluster.conf: |
    bind 0.0.0.0
    port 6379
    daemonize no
    protected-mode no
    dir /data
    cluster-announce-bus-port 16379
    cluster-enabled yes
    cluster-node-timeout 15000
    cluster-config-file /data/nodes.conf
    requirepass hello_redis@234
    masterauth hello_redis@234


# 3. StatefulSet that runs the Redis cluster
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster-sts
spec:
  selector:
    matchLabels:
      app: redis-cluster
  serviceName: redis-cluster-headless
  replicas: 6
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      affinity:
        # Soft anti-affinity: try not to place the pods on the same node
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - redis-cluster
                topologyKey: kubernetes.io/hostname
              weight: 100
      containers:
        - name: redis-cluster
          image: docker-0.unsee.tech/redis:7.4.0
          imagePullPolicy: IfNotPresent
          env:
            - name: TZ
              value: Asia/Shanghai
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            # Name of the service defined above
            - name: POD_SERVICE_NAME
              value: "redis-cluster-headless"
          args:
            - /etc/redis/redis-cluster.conf
            # Announce address of the node. The pod IP changes after a restart, so a Redis cluster must set this
            # It may be the pod IP, but the pod's short or long domain name is strongly recommended, as long as it does not exceed 46 characters
            # The short domain name is used here
            - --cluster-announce-ip "$(POD_NAME).$(POD_SERVICE_NAME)"
            # - --cluster-announce-ip $(POD_IP)
          ports:
            - name: redis
              containerPort: 6379
              protocol: TCP
            - name: election
              containerPort: 16379
              protocol: TCP
          resources:
            requests:
              cpu: "0.5"
              memory: "1Gi"
            limits:
              cpu: "1"
              memory: "2Gi"
          # Liveness probe
          livenessProbe:
            failureThreshold: 2
            tcpSocket:
              port: redis
            initialDelaySeconds: 16
            periodSeconds: 3
            successThreshold: 1
            timeoutSeconds: 1
          # Readiness probe
          readinessProbe:
            failureThreshold: 2
            tcpSocket:
              port: redis
            initialDelaySeconds: 16
            periodSeconds: 3
            successThreshold: 1
            timeoutSeconds: 1
          volumeMounts:
            - name: redis-conf
              mountPath: /etc/redis
            - name: claim
              mountPath: /data
      volumes:
        - name: redis-conf
          configMap:
            name: redis-cluster-cm
            items:
              - key: redis-cluster.conf
                path: redis-cluster.conf
  volumeClaimTemplates:
    - metadata:
        name: claim
      spec:
        accessModes: ["ReadWriteMany"]
        storageClassName: nfs-client
        volumeMode: Filesystem
        resources:
          requests:
            storage: 10Gi

Create the resources above

kubectl -n test apply -f redis-cluster-sts.yaml

Check that the pods are up

# kubectl -n test get pod
NAME READY STATUS RESTARTS AGE
redis-cluster-sts-0 1/1 Running 0 125m
redis-cluster-sts-1 1/1 Running 0 125m
redis-cluster-sts-2 1/1 Running 0 125m
redis-cluster-sts-3 1/1 Running 0 126m
redis-cluster-sts-4 1/1 Running 0 126m
redis-cluster-sts-5 1/1 Running 0 111m

Now create the cluster with redis-cli. It forms a cluster of 3 masters and 3 replicas and then assigns the slots. When prompted, type yes

redis-cli builds the cluster here from the short domain names of the 6 pods, i.e. pod name.svc name

# kubectl -n test exec -it \
redis-cluster-sts-0 \
-- redis-cli -a hello_redis@234 \
--cluster create \
--cluster-replicas 1 \
redis-cluster-sts-0.redis-cluster-headless:6379 \
redis-cluster-sts-1.redis-cluster-headless:6379 \
redis-cluster-sts-2.redis-cluster-headless:6379 \
redis-cluster-sts-3.redis-cluster-headless:6379 \
redis-cluster-sts-4.redis-cluster-headless:6379 \
redis-cluster-sts-5.redis-cluster-headless:6379

Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica redis-cluster-sts-4.redis-cluster-headless:6379 to redis-cluster-sts-0.redis-cluster-headless:6379
Adding replica redis-cluster-sts-5.redis-cluster-headless:6379 to redis-cluster-sts-1.redis-cluster-headless:6379
Adding replica redis-cluster-sts-3.redis-cluster-headless:6379 to redis-cluster-sts-2.redis-cluster-headless:6379
M: 37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f redis-cluster-sts-0.redis-cluster-headless:6379
slots:[0-5460] (5461 slots) master
M: 21d7a21ef992f52c36ce4c1ed2be0d1a20102800 redis-cluster-sts-1.redis-cluster-headless:6379
slots:[5461-10922] (5462 slots) master
M: 1f0478896e6f5164bbd87e89f38b0ee5d3db0560 redis-cluster-sts-2.redis-cluster-headless:6379
slots:[10923-16383] (5461 slots) master
S: ed6237e68494d834a33420453e58fa47de9eeeaf redis-cluster-sts-3.redis-cluster-headless:6379
replicates 1f0478896e6f5164bbd87e89f38b0ee5d3db0560
S: 5c5fa8090096d12bb3c9616893405d027c84c1d5 redis-cluster-sts-4.redis-cluster-headless:6379
replicates 37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f
S: e6bfa3ce7ddf65252e3d07a82fa13d3a18f3af7b redis-cluster-sts-5.redis-cluster-headless:6379
replicates 21d7a21ef992f52c36ce4c1ed2be0d1a20102800
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join

>>> Performing Cluster Check (using node redis-cluster-sts-0.redis-cluster-headless:6379)
M: 37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f redis-cluster-sts-0.redis-cluster-headless:6379
slots:[0-5460] (5461 slots) master
1 additional replica(s)
M: 21d7a21ef992f52c36ce4c1ed2be0d1a20102800 redis-cluster-sts-1.redis-cluster-headless:6379
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: ed6237e68494d834a33420453e58fa47de9eeeaf redis-cluster-sts-3.redis-cluster-headless:6379
slots: (0 slots) slave
replicates 1f0478896e6f5164bbd87e89f38b0ee5d3db0560
S: 5c5fa8090096d12bb3c9616893405d027c84c1d5 redis-cluster-sts-4.redis-cluster-headless:6379
slots: (0 slots) slave
replicates 37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f
M: 1f0478896e6f5164bbd87e89f38b0ee5d3db0560 redis-cluster-sts-2.redis-cluster-headless:6379
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: e6bfa3ce7ddf65252e3d07a82fa13d3a18f3af7b redis-cluster-sts-5.redis-cluster-headless:6379
slots: (0 slots) slave
replicates 21d7a21ef992f52c36ce4c1ed2be0d1a20102800
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
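
The three slot ranges in the output above come from splitting the 16384 hash slots as evenly as possible across the masters, and a key is then routed to a slot by CRC16(key) mod 16384, with an optional {hash tag} limiting what is hashed. The sketch below reproduces both calculations. It is an illustration written for this article, not code taken from redis-cli; the function names are hypothetical, and the rounding rule is only assumed to match because it yields the boundaries shown above.

```python
import math

TOTAL_SLOTS = 16384  # fixed number of hash slots in a Redis Cluster


def slot_ranges(master_count: int) -> list:
    """Split the slot space into contiguous, near-equal ranges, one per master."""
    if master_count < 1:
        raise ValueError("at least one master is required")
    per_master = TOTAL_SLOTS / master_count
    ranges = []
    first = 0
    cursor = 0.0
    for index in range(master_count):
        # Round half up so the boundaries match the redis-cli output.
        last = math.floor(cursor + per_master - 1 + 0.5)
        if index == master_count - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1  # the last master absorbs any remainder
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0, as used for key hashing."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to its hash slot, honouring a non-empty {hash tag} if present."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]  # hash only the text inside the braces
    return crc16_xmodem(raw) % TOTAL_SLOTS


if __name__ == "__main__":
    ranges = slot_ranges(3)
    for number, (first, last) in enumerate(ranges):
        print(f"Master[{number}] -> Slots {first} - {last}")
    slot = key_slot("boo")
    owner = next(i for i, (first, last) in enumerate(ranges) if first <= slot <= last)
    print(f"key 'boo' -> slot {slot} -> Master[{owner}]")
```

Keys that share a hash tag, such as {user1}.name and {user1}.age, land in the same slot, which is what makes multi-key commands on them possible in cluster mode.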

Check the cluster state and test reads and writes; run the following inside any pod

# kubectl -n test exec -it redis-cluster-sts-0 -- bash
root@redis-cluster-sts-0:/data# redis-cli -a hello_redis@234 -c

Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.

# CLUSTER NODES lists the cluster nodes; everything is normal
127.0.0.1:6379> CLUSTER NODES
ed6237e68494d834a33420453e58fa47de9eeeaf redis-cluster-sts-3.redis-cluster-headless:6379@16379 slave 1f0478896e6f5164bbd87e89f38b0ee5d3db0560 0 1733304225719 3 connected
37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f redis-cluster-sts-0.redis-cluster-headless:6379@16379 myself,master - 0 1733294745577 1 connected 0-5460
5c5fa8090096d12bb3c9616893405d027c84c1d5 redis-cluster-sts-4.redis-cluster-headless:6379@16379 slave 37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f 0 1733304227000 1 connected
21d7a21ef992f52c36ce4c1ed2be0d1a20102800 redis-cluster-sts-1.redis-cluster-headless:6379@16379 master - 0 1733304226000 2 connected 5461-10922
1f0478896e6f5164bbd87e89f38b0ee5d3db0560 redis-cluster-sts-2.redis-cluster-headless:6379@16379 master - 0 1733304227000 3 connected 10923-16383
e6bfa3ce7ddf65252e3d07a82fa13d3a18f3af7b redis-cluster-sts-5.redis-cluster-headless:6379@16379 slave 21d7a21ef992f52c36ce4c1ed2be0d1a20102800 0 1733304227729 2 connected


# CLUSTER INFO shows the cluster information: cluster_state:ok means the cluster is healthy
127.0.0.1:6379> CLUSTER INFO
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:7
cluster_my_epoch:1
cluster_stats_messages_ping_sent:9439
cluster_stats_messages_pong_sent:9440
cluster_stats_messages_sent:18879
cluster_stats_messages_ping_received:9440
cluster_stats_messages_pong_received:9439
cluster_stats_messages_fail_received:1
cluster_stats_messages_received:18880
total_cluster_links_buffer_limit_exceeded:0

# Write and read a key: works
127.0.0.1:6379> set boo foo
127.0.0.1:6379> get boo
"foo"

Every node has a nodes.conf file that records the master/replica relationships of the cluster nodes. Redis maintains it itself, and it normally does not need to be edited. The contents should be consistent across nodes (the ordering may differ); if they are not, the cluster has a problem that needs investigating

# kubectl -n test exec -it redis-cluster-sts-0 -- cat /data/nodes.conf
ed6237e68494d834a33420453e58fa47de9eeeaf redis-cluster-sts-3.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=a8ae7e1cf170794d0b089d9cf94230d7d9adef17 slave 1f0478896e6f5164bbd87e89f38b0ee5d3db0560 0 1733308120285 3 connected
21d7a21ef992f52c36ce4c1ed2be0d1a20102800 redis-cluster-sts-1.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=fa67a2b4a6644b7d71c0b3d7dfea54bb814fbf6f master - 0 1733308121000 2 connected 5461-10922
5c5fa8090096d12bb3c9616893405d027c84c1d5 redis-cluster-sts-4.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=10133e70314181c4d23d1558381740014c590c22 slave 37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f 0 1733308120000 1 connected
1f0478896e6f5164bbd87e89f38b0ee5d3db0560 redis-cluster-sts-2.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=a8ae7e1cf170794d0b089d9cf94230d7d9adef17 master - 0 1733308121290 3 connected 10923-16383
37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f redis-cluster-sts-0.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=10133e70314181c4d23d1558381740014c590c22 myself,master - 0 1733308028173 1 connected 0-5460
e6bfa3ce7ddf65252e3d07a82fa13d3a18f3af7b redis-cluster-sts-5.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=fa67a2b4a6644b7d71c0b3d7dfea54bb814fbf6f slave 21d7a21ef992f52c36ce4c1ed2be0d1a20102800 0 1733308122296 2 connected
vars currentEpoch 8 lastVoteEpoch 7

# kubectl -n test exec -it redis-cluster-sts-1 -- cat /data/nodes.conf
5c5fa8090096d12bb3c9616893405d027c84c1d5 redis-cluster-sts-4.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=10133e70314181c4d23d1558381740014c590c22 slave 37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f 0 1733308120871 1 connected
21d7a21ef992f52c36ce4c1ed2be0d1a20102800 redis-cluster-sts-1.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=fa67a2b4a6644b7d71c0b3d7dfea54bb814fbf6f myself,master - 0 1733308043534 2 connected 5461-10922
37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f redis-cluster-sts-0.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=10133e70314181c4d23d1558381740014c590c22 master - 0 1733308121877 1 connected 0-5460
ed6237e68494d834a33420453e58fa47de9eeeaf redis-cluster-sts-3.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=a8ae7e1cf170794d0b089d9cf94230d7d9adef17 slave 1f0478896e6f5164bbd87e89f38b0ee5d3db0560 0 1733308119865 3 connected
1f0478896e6f5164bbd87e89f38b0ee5d3db0560 redis-cluster-sts-2.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=a8ae7e1cf170794d0b089d9cf94230d7d9adef17 master - 0 1733308121000 3 connected 10923-16383
e6bfa3ce7ddf65252e3d07a82fa13d3a18f3af7b redis-cluster-sts-5.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=fa67a2b4a6644b7d71c0b3d7dfea54bb814fbf6f slave 21d7a21ef992f52c36ce4c1ed2be0d1a20102800 0 1733308122884 2 connected
vars currentEpoch 8 lastVoteEpoch 0

# kubectl -n test exec -it redis-cluster-sts-2 -- cat /data/nodes.conf
1f0478896e6f5164bbd87e89f38b0ee5d3db0560 redis-cluster-sts-2.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=a8ae7e1cf170794d0b089d9cf94230d7d9adef17 myself,master - 0 1733308059003 3 connected 10923-16383
e6bfa3ce7ddf65252e3d07a82fa13d3a18f3af7b redis-cluster-sts-5.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=fa67a2b4a6644b7d71c0b3d7dfea54bb814fbf6f slave 21d7a21ef992f52c36ce4c1ed2be0d1a20102800 0 1733308125906 2 connected
37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f redis-cluster-sts-0.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=10133e70314181c4d23d1558381740014c590c22 master - 0 1733308124000 1 connected 0-5460
ed6237e68494d834a33420453e58fa47de9eeeaf redis-cluster-sts-3.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=a8ae7e1cf170794d0b089d9cf94230d7d9adef17 slave 1f0478896e6f5164bbd87e89f38b0ee5d3db0560 0 1733308124901 3 connected
21d7a21ef992f52c36ce4c1ed2be0d1a20102800 redis-cluster-sts-1.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=fa67a2b4a6644b7d71c0b3d7dfea54bb814fbf6f master - 0 1733308123894 2 connected 5461-10922
5c5fa8090096d12bb3c9616893405d027c84c1d5 redis-cluster-sts-4.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=10133e70314181c4d23d1558381740014c590c22 slave 37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f 0 1733308123000 1 connected
vars currentEpoch 8 lastVoteEpoch 0

# kubectl -n test exec -it redis-cluster-sts-3 -- cat /data/nodes.conf
e6bfa3ce7ddf65252e3d07a82fa13d3a18f3af7b redis-cluster-sts-5.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=fa67a2b4a6644b7d71c0b3d7dfea54bb814fbf6f slave 21d7a21ef992f52c36ce4c1ed2be0d1a20102800 0 1733308124088 2 connected
21d7a21ef992f52c36ce4c1ed2be0d1a20102800 redis-cluster-sts-1.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=fa67a2b4a6644b7d71c0b3d7dfea54bb814fbf6f master - 0 1733308122000 2 connected 5461-10922
ed6237e68494d834a33420453e58fa47de9eeeaf redis-cluster-sts-3.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=a8ae7e1cf170794d0b089d9cf94230d7d9adef17 myself,slave 1f0478896e6f5164bbd87e89f38b0ee5d3db0560 0 1733308069358 3 connected
5c5fa8090096d12bb3c9616893405d027c84c1d5 redis-cluster-sts-4.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=10133e70314181c4d23d1558381740014c590c22 slave 37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f 0 1733308123000 1 connected
1f0478896e6f5164bbd87e89f38b0ee5d3db0560 redis-cluster-sts-2.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=a8ae7e1cf170794d0b089d9cf94230d7d9adef17 master - 0 1733308123082 3 connected 10923-16383
37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f redis-cluster-sts-0.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=10133e70314181c4d23d1558381740014c590c22 master - 0 1733308122077 1 connected 0-5460
vars currentEpoch 8 lastVoteEpoch 0

# kubectl -n test exec -it redis-cluster-sts-4 -- cat /data/nodes.conf
e6bfa3ce7ddf65252e3d07a82fa13d3a18f3af7b redis-cluster-sts-5.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=fa67a2b4a6644b7d71c0b3d7dfea54bb814fbf6f slave 21d7a21ef992f52c36ce4c1ed2be0d1a20102800 0 1733308119478 2 connected
21d7a21ef992f52c36ce4c1ed2be0d1a20102800 redis-cluster-sts-1.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=fa67a2b4a6644b7d71c0b3d7dfea54bb814fbf6f master - 0 1733308118760 2 connected 5461-10922
5c5fa8090096d12bb3c9616893405d027c84c1d5 redis-cluster-sts-4.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=10133e70314181c4d23d1558381740014c590c22 myself,slave 37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f 0 1733308084722 1 connected
37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f redis-cluster-sts-0.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=10133e70314181c4d23d1558381740014c590c22 master - 0 1733308118000 1 connected 0-5460
ed6237e68494d834a33420453e58fa47de9eeeaf redis-cluster-sts-3.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=a8ae7e1cf170794d0b089d9cf94230d7d9adef17 slave 1f0478896e6f5164bbd87e89f38b0ee5d3db0560 0 1733308118000 3 connected
1f0478896e6f5164bbd87e89f38b0ee5d3db0560 redis-cluster-sts-2.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=a8ae7e1cf170794d0b089d9cf94230d7d9adef17 master - 0 1733308118000 3 connected 10923-16383
vars currentEpoch 8 lastVoteEpoch 0

# kubectl -n test exec -it redis-cluster-sts-5 -- cat /data/nodes.conf
21d7a21ef992f52c36ce4c1ed2be0d1a20102800 redis-cluster-sts-1.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=fa67a2b4a6644b7d71c0b3d7dfea54bb814fbf6f master - 0 1733308116847 2 connected 5461-10922
1f0478896e6f5164bbd87e89f38b0ee5d3db0560 redis-cluster-sts-2.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=a8ae7e1cf170794d0b089d9cf94230d7d9adef17 master - 0 1733308118000 3 connected 10923-16383
e6bfa3ce7ddf65252e3d07a82fa13d3a18f3af7b redis-cluster-sts-5.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=fa67a2b4a6644b7d71c0b3d7dfea54bb814fbf6f myself,slave 21d7a21ef992f52c36ce4c1ed2be0d1a20102800 0 1733308095410 2 connected
ed6237e68494d834a33420453e58fa47de9eeeaf redis-cluster-sts-3.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=a8ae7e1cf170794d0b089d9cf94230d7d9adef17 slave 1f0478896e6f5164bbd87e89f38b0ee5d3db0560 0 1733308117871 3 connected
5c5fa8090096d12bb3c9616893405d027c84c1d5 redis-cluster-sts-4.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=10133e70314181c4d23d1558381740014c590c22 slave 37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f 0 1733308118395 1 connected
37818a6e8e5a31235e72c77e4daf5dba5bbc9e1f redis-cluster-sts-0.redis-cluster-headless:6379@16379,,tls-port=0,shard-id=10133e70314181c4d23d1558381740014c590c22 master - 0 1733308117000 1 connected 0-5460
vars currentEpoch 8 lastVoteEpoch 0
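
Comparing six dumps by eye is error-prone, because the line order, the myself flag, and the ping/pong timestamps legitimately differ from node to node. The sketch below reduces each dump to the facts that should match everywhere (node ID, host:port, role, master ID, slots) and compares those. parse_topology and topologies_agree are hypothetical helpers, and the field positions are assumed from the dumps above rather than from a formal specification.

```python
def parse_topology(nodes_conf: str) -> frozenset:
    """Reduce one nodes.conf dump to the facts that must match on every node."""
    entries = set()
    for line in nodes_conf.splitlines():
        fields = line.split()
        if len(fields) < 8 or fields[0] == "vars":
            continue  # skip blank lines and the trailing "vars ..." line
        node_id, address, flags, master_id = fields[:4]
        # "myself" only marks the node that wrote the file, so drop it.
        roles = frozenset(flags.split(",")) - {"myself"}
        endpoint = address.split("@", 1)[0]  # keep host:port, drop bus port and extras
        slots = tuple(fields[8:])  # empty for replicas
        entries.add((node_id, endpoint, roles, master_id, slots))
    return frozenset(entries)


def topologies_agree(dumps: dict) -> bool:
    """Return True when every node's nodes.conf describes the same cluster."""
    return len({parse_topology(text) for text in dumps.values()}) <= 1


if __name__ == "__main__":
    # Shortened, made-up dumps; in practice feed in the output of
    # "kubectl exec ... -- cat /data/nodes.conf" for each pod.
    dumps = {
        "pod-0": "aaa r0.svc:6379@16379 myself,master - 0 10 1 connected 0-8191\n"
                 "bbb r1.svc:6379@16379 master - 0 11 2 connected 8192-16383\n"
                 "vars currentEpoch 2 lastVoteEpoch 0",
        "pod-1": "bbb r1.svc:6379@16379 myself,master - 0 20 2 connected 8192-16383\n"
                 "aaa r0.svc:6379@16379 master - 0 21 1 connected 0-8191\n"
                 "vars currentEpoch 2 lastVoteEpoch 0",
    }
    print("consistent:", topologies_agree(dumps))
```

A transient flag such as fail? on one node would make the dumps disagree for a moment, so a mismatch is a prompt to look closer rather than proof of a broken cluster.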

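How clients connect

Other pods in the k8s cluster can reach Redis through the svc or through each pod's FQDN

  • Through the svc
redis-cluster-headless:6379                        # same namespace, short domain name
redis-cluster-headless.test.svc.cluster.local:6379 # different namespace, long domain name
  • Through the pods' FQDNs, 6 nodes
# Same namespace, short domain names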
redis-cluster-sts-0.redis-cluster-headless:6379
redis-cluster-sts-1.redis-cluster-headless:6379
redis-cluster-sts-2.redis-cluster-headless:6379
redis-cluster-sts-3.redis-cluster-headless:6379
redis-cluster-sts-4.redis-cluster-headless:6379
redis-cluster-sts-5.redis-cluster-headless:6379

# Different namespace, long domain names
redis-cluster-sts-0.redis-cluster-headless.test.svc.cluster.local:6379
redis-cluster-sts-1.redis-cluster-headless.test.svc.cluster.local:6379
redis-cluster-sts-2.redis-cluster-headless.test.svc.cluster.local:6379
redis-cluster-sts-3.redis-cluster-headless.test.svc.cluster.local:6379
redis-cluster-sts-4.redis-cluster-headless.test.svc.cluster.local:6379
redis-cluster-sts-5.redis-cluster-headless.test.svc.cluster.local:6379

The second option is recommended: list the hostname and port of all 6 nodes and let the client library handle high availability
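
Because StatefulSet pod names are predictable, the node list does not have to be typed by hand. The following sketch generates it; cluster_nodes is a hypothetical helper, and the cluster.local suffix is assumed from the long domain names listed above (a cluster with a different DNS domain would need another value).

```python
from typing import Optional


def cluster_nodes(statefulset: str, service: str, replicas: int,
                  namespace: Optional[str] = None, port: int = 6379) -> list:
    """Build host:port entries for every pod of a StatefulSet behind a headless service."""
    suffix = f".{namespace}.svc.cluster.local" if namespace else ""
    return [
        f"{statefulset}-{index}.{service}{suffix}:{port}"
        for index in range(replicas)
    ]


if __name__ == "__main__":
    # Long names, usable from any namespace.
    for node in cluster_nodes("redis-cluster-sts", "redis-cluster-headless", 6, "test"):
        print(node)
```

Omitting the namespace argument yields the short names, which only resolve from pods in the same namespace.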

For example, the nacos configuration can be changed to the following (the test namespace is used here to match the rest of this article):

redis:
  timeout: 30000
  lettuce:
    pool:
      max-active: 10
      max-idle: 5
      max-wait: 3000
      min-idle: 2
  password: hello_redis@234
  cluster:
    nodes:
      - redis-cluster-sts-0.redis-cluster-headless.test.svc.cluster.local:6379
      - redis-cluster-sts-1.redis-cluster-headless.test.svc.cluster.local:6379
      - redis-cluster-sts-2.redis-cluster-headless.test.svc.cluster.local:6379
      - redis-cluster-sts-3.redis-cluster-headless.test.svc.cluster.local:6379
      - redis-cluster-sts-4.redis-cluster-headless.test.svc.cluster.local:6379
      - redis-cluster-sts-5.redis-cluster-headless.test.svc.cluster.local:6379
    max-redirects: 3
Tests

Scale down, upgrade, and restart the Redis cluster, and verify that it stays healthy each time
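
Each test in this section ends with the same two checks: CLUSTER INFO reports a healthy state, and a key can be written and read back. The first check is easy to script. The sketch below parses the text that CLUSTER INFO prints; parse_cluster_info and is_healthy are illustrative names, and the expected values (16384 slots, 6 known nodes) are taken from the outputs shown in this article.

```python
def parse_cluster_info(text: str) -> dict:
    """Turn "key:value" lines from CLUSTER INFO into a dictionary."""
    info = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing annotations
        if ":" in line:
            key, value = line.split(":", 1)
            info[key] = value
    return info


def is_healthy(info: dict, expected_nodes: int = 6) -> bool:
    """Check the fields this article looks at after every test."""
    return (
        info.get("cluster_state") == "ok"
        and info.get("cluster_slots_assigned") == "16384"
        and info.get("cluster_slots_ok") == "16384"
        and info.get("cluster_slots_fail") == "0"
        and info.get("cluster_known_nodes") == str(expected_nodes)
    )


if __name__ == "__main__":
    sample = """cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3"""
    print("healthy:", is_healthy(parse_cluster_info(sample)))
```

The text can come from running redis-cli with the CLUSTER INFO command through kubectl exec, as in the transcripts that follow.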

Restart

Restart the Redis cluster. As shown below, the cluster stays healthy

# kubectl -n test rollout restart sts redis-cluster-sts
statefulset.apps/redis-cluster-sts restarted

# kubectl -n test exec -it redis-cluster-sts-0 -- redis-cli -a hello_redis@234 -c
127.0.0.1:6379> CLUSTER INFO
cluster_state:ok # cluster is healthy
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:7
cluster_my_epoch:1
cluster_stats_messages_ping_sent:135
cluster_stats_messages_pong_sent:85
cluster_stats_messages_sent:220
cluster_stats_messages_ping_received:85
cluster_stats_messages_pong_received:135
cluster_stats_messages_received:220
total_cluster_links_buffer_limit_exceeded:0

# Write and read a key: works
127.0.0.1:6379> set boo foo
127.0.0.1:6379> get boo
"foo"

Another kind of restart: scale down to 0 and then back up to 6. The cluster is still healthy

kubectl -n test scale statefulset redis-cluster-sts --replicas 0 && sleep 30
kubectl -n test scale statefulset redis-cluster-sts --replicas 6

# Wait until all pods have started, then check the state

# kubectl -n test exec -it redis-cluster-sts-0 -- redis-cli -a hello_redis@234 -c
127.0.0.1:6379> CLUSTER INFO
cluster_state:ok # cluster is healthy
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:7
cluster_my_epoch:1
cluster_stats_messages_ping_sent:135
cluster_stats_messages_pong_sent:85
cluster_stats_messages_sent:220
cluster_stats_messages_ping_received:85
cluster_stats_messages_pong_received:135
cluster_stats_messages_received:220
total_cluster_links_buffer_limit_exceeded:0

# Write and read a key: works
127.0.0.1:6379> set boo foo
127.0.0.1:6379> get boo
"foo"

Scaling down

  • There are 6 nodes now; scaling down to 5 simulates the failure of one node. The cluster stays healthy

# kubectl -n test scale statefulset redis-cluster-sts --replicas 5

# kubectl -n test exec -it redis-cluster-sts-0 -- redis-cli -a hello_redis@234 -c
127.0.0.1:6379> CLUSTER INFO
cluster_state:ok # cluster is healthy
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:7
cluster_my_epoch:1
cluster_stats_messages_ping_sent:135
cluster_stats_messages_pong_sent:85
cluster_stats_messages_sent:220
cluster_stats_messages_ping_received:85
cluster_stats_messages_pong_received:135
cluster_stats_messages_received:220
total_cluster_links_buffer_limit_exceeded:0

# Write and read a key: works
127.0.0.1:6379> set boo foo
127.0.0.1:6379> get boo
"foo"
Upgrade

Does the cluster survive a Redis version upgrade? This example upgrades 7.4.0 to 7.4.1

kubectl -n test set image sts redis-cluster-sts redis-cluster=docker-0.unsee.tech/redis:7.4.1

Check the cluster state in the same way; if data can be read and written, the cluster is healthy

127.0.0.1:6379> CLUSTER INFO   # state is healthy
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:7
cluster_my_epoch:1
cluster_stats_messages_ping_sent:9439
cluster_stats_messages_pong_sent:9440
cluster_stats_messages_sent:18879
cluster_stats_messages_ping_received:9440
cluster_stats_messages_pong_received:9439
cluster_stats_messages_fail_received:1
cluster_stats_messages_received:18880
total_cluster_links_buffer_limit_exceeded:0

# Write and read a key: works
127.0.0.1:6379> set boo foo
127.0.0.1:6379> get boo
"foo"
Other notes

The most important part of deploying a Redis cluster in k8s is configuring the --cluster-announce-ip announce address: pod IPs are not stable, and once a pod's address changes after a restart the cluster fails

  • Configuring --cluster-announce-ip as the pod's FQDN is strongly recommended, as follows
env:
  - name: TZ
    value: Asia/Shanghai
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: POD_SERVICE_NAME
    value: "redis-cluster-headless"
args:
  - /etc/redis/redis-cluster.conf
  # Use the short domain name
  - --cluster-announce-ip "$(POD_NAME).$(POD_SERVICE_NAME)"
  • Using the pod's IP is not recommended
env:
  - name: TZ
    value: Asia/Shanghai
  - name: POD_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
args:
  - /etc/redis/redis-cluster.conf
  - --cluster-announce-ip $(POD_IP)

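Many online guides configure it this way. With this setup the cluster breaks when Redis is upgraded (image change) or when the StatefulSet is scaled down to 0 and back up to 6, so take particular care

So the first approach is recommended, but it requires version 7.0 or later.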
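
The manifest comments earlier note that the announce address must not exceed 46 characters, which is why the short domain name is used. A quick length check before deploying can catch a name that is too long. This is a minimal sketch: announce_address is a hypothetical helper, the limit of 46 is simply the figure quoted in this article, and the cluster.local suffix is an assumption about the cluster's DNS domain.

```python
MAX_ANNOUNCE_LENGTH = 46  # limit quoted in the manifest comments of this article


def announce_address(pod_name: str, service_name: str, namespace: str = "") -> str:
    """Build the name passed to --cluster-announce-ip and reject names that are too long."""
    address = f"{pod_name}.{service_name}"
    if namespace:
        address += f".{namespace}.svc.cluster.local"
    if len(address) > MAX_ANNOUNCE_LENGTH:
        raise ValueError(
            f"{address!r} is {len(address)} characters, limit is {MAX_ANNOUNCE_LENGTH}"
        )
    return address


if __name__ == "__main__":
    short = announce_address("redis-cluster-sts-0", "redis-cluster-headless")
    print(short, len(short))
    try:
        announce_address("redis-cluster-sts-0", "redis-cluster-headless", "test")
    except ValueError as error:
        print("rejected:", error)
```

With the names used in this article the short form fits, while the fully qualified form does not, so longer StatefulSet or service names leave even less room.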

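At this point the Redis cluster with three masters and three replicas is fully deployed.

Data migration

Migrating with redis-shake

Introduction

redis-shake is an open-source tool from the Alibaba Cloud Redis & MongoDB team for synchronizing data between two Redis instances. It is flexible enough to cover a wide range of sync and migration needs.

Basic features

redis-shake is Alibaba's improved product built on top of redis-port. It supports decoding, restoring, dumping, and syncing, plus a scan-based rump mode. The focus here is sync.

  • restore: restores an RDB file into the target Redis.

  • dump: backs up the full data set of the source Redis as an RDB file.

  • decode: reads an RDB file and stores the parsed result as JSON.

  • sync: synchronizes data between a source and a target Redis. It supports full and incremental migration, from on-premises to Alibaba Cloud as well as between different on-premises environments, and between standalone, master/replica, and cluster deployments in any combination. Note that if the source is a cluster, a single RedisShake can be started to pull from the different db nodes, and slots must not be moved on the source during the sync; if the target is a cluster, writes can go to one or more db nodes.

  • rump: synchronizes data between a source and a target Redis, full migration only. It migrates with the scan and restore commands and supports migration across cloud vendors and Redis versions.

Basic principle

redis-shake works by posing as a replica that joins the source Redis: it first pulls and replays the full data set, then pulls the incremental changes (through the psync command). The original post illustrates this with a figure.

If the source is in cluster mode, only one redis-shake needs to be started to pull the data, and move slot operations must not be run on the source. If the target is in cluster mode, data can be written to a single node followed by slot migration, or written many-to-many.

At present redis-shake writes to the target over a single link. Normally this is not a bottleneck, but in extreme cases with high QPS it can become one, and the maintainers have said they may optimize it later. In addition, redis-shake syncs to the target asynchronously, with reading and writing handled by 2 separate threads, which reduces the slowdown caused by network latency.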

Migration

Installing redis-shake

[root]# wget https://github.com/alibaba/RedisShake/releases/download/release-v2.1.1-20210903/release-v2.1.1-20210903.tar.gz
[root]# tar -zxvf release-v2.1.1-20210903.tar.gz

Configuration file

Edit redis-shake.conf (only the modified settings are shown)

[root]# cd release-v2.1.1-20210903
[root]# vi redis-shake.conf
source.type = standalone #source architecture type
source.address = 127.0.0.1:6379 #source IP:PORT
source.password_raw = Passwd@123 #source password
target.type = cluster #target architecture type
target.address = 10.150.57.13:6381;10.150.57.13:6382;10.150.57.13:6383 #target IP:PORT (masters or replicas of the Redis Cluster)
target.password_raw = Passwd@123 #target password
key_exists = rewrite #overwrite if the same key already exists on the target

Running the migration

Start redis-shake

[root]# ./redis-shake.linux -type=sync -conf=redis-shake.conf

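redis-full-check verification tool

Introduction

redis-full-check is an open-source tool from the Alibaba Cloud Redis & MongoDB team for checking whether the data in two Redis instances is consistent. It is typically used to verify correctness after a Redis data migration (redis-shake).

It supports homogeneous and heterogeneous comparison between standalone, master/replica, cluster, and proxy-based cloud cluster (Alibaba Cloud) deployments, for versions 2.x to 5.x.

Basic principle

The original post shows the basic comparison logic in a figure here.

redis-full-check verifies data by comparing the full contents of the source and target Redis over several rounds: each round fetches data from both ends, compares it, and records the inconsistent entries for the next round. Repeating the comparison lets the result converge and filters out differences that exist only because incremental sync is still catching up. Whatever remains in sqlite at the end is the final set of differences.

The comparison is one-directional: it fetches the data of source A and checks whether it is present in B, but not the reverse. In other words, it checks whether the source is a subset of the target. To compare in both directions, run it twice: first with A as source and B as target, then with B as source and A as target.

The original post also shows the basic data flow in a figure. Internally redis-full-check runs several comparison rounds (the part marked by the yellow box in that figure). Each round first fetches the keys to compare: the first round takes them from the source, later rounds take them from the sqlite3 db. After fetching the keys it fetches the fields and values for each key from both ends, compares them, and stores the differences in the sqlite3 db for the next round.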

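Inconsistency types

redis-full-check classifies inconsistencies into 2 groups: key inconsistencies and value inconsistencies.

Key inconsistencies

Key inconsistencies fall into the following cases:

  • lack_target: the key exists in the source but not in the target.
  • type: the key exists in both source and target, but the types differ.
  • value: the key exists in both source and target with the same type, but the values differ.

Value inconsistencies

Each data type has its own comparison rule:

  • string: the values differ.
  • hash: some field satisfies one of the following 3 conditions:
    • the field exists in the source but not in the target.
    • the field exists in the target but not in the source.
    • the field exists in both source and target, but the values differ.
  • set/zset: similar to hash.
  • list: similar to hash.

Field conflicts have the following types (they apply only to hash, set, zset, and list keys):

  • lack_source: the field is missing from the source key but exists in the target key.
  • lack_target: the field exists in the source key but is missing from the target key.
  • value: the field exists in both the source key and the target key, but its values differ.

Comparison principle

There are three compare modes (comparemode):

  • KeyOutline: only compares whether the keys are equal.
  • ValueOutline: only compares whether the value lengths are equal.
  • FullValue: compares keys, value lengths, and values.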
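
To make the difference between the three modes concrete, the sketch below applies each of them to a pair of string values. It is an illustration only, not code from redis-full-check: the enum and strings_match are hypothetical, and the numbers are the --comparemode values listed in the tool's help output (mode 4 is left out).

```python
from enum import Enum
from typing import Optional


class CompareMode(Enum):
    FULL_VALUE = 1     # --comparemode=1: compare the full value
    VALUE_OUTLINE = 2  # --comparemode=2: compare the value length only
    KEY_OUTLINE = 3    # --comparemode=3: compare key existence only


def strings_match(mode: CompareMode, source: str, target: Optional[str]) -> bool:
    """Compare one string key; target is None when the key is missing there."""
    if target is None:
        return False  # a key missing from the target is a mismatch in every mode
    if mode is CompareMode.KEY_OUTLINE:
        return True
    if mode is CompareMode.VALUE_OUTLINE:
        return len(source) == len(target)
    return source == target


if __name__ == "__main__":
    for mode in CompareMode:
        print(mode.name, strings_match(mode, "abc", "abd"))
```

Only FULL_VALUE notices that "abc" and "abd" differ, which is why the verification command later in this article passes --comparemode=1.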

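The comparison runs for comparetimes rounds (default comparetimes=3):

  • Round one finds all keys in the source, then fetches them from both source and target and compares them.
  • From round two on the comparison is iterative: only the keys and fields that were still inconsistent at the end of the previous round are compared.
    • For key inconsistencies (lack_source, lack_target, and type), the key and value are fetched again from source and target and compared.
    • For a string whose value differs, the key is compared again: key and value are fetched from source and target and compared.
    • For a hash, set, or zset whose value differs, only the inconsistent fields are compared again; fields already compared and found equal are skipped. This prevents a big key that is updated frequently from never passing verification.
    • For a list whose value differs, the key is compared again: key and value are fetched from source and target and compared.
  • Each round is followed by a pause (Interval).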
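
The rounds described above can be sketched with two dictionaries standing in for the source and target Redis. This is a toy model under stated assumptions, not the tool's implementation: values are compared whole, the sqlite3 store is replaced by an in-memory dict, and the between_rounds callback is a hypothetical stand-in for the interval during which replication catches up.

```python
from typing import Callable, Optional


def compare_keys(source: dict, target: dict, keys: list) -> dict:
    """One-directional check: is each listed source key present and equal in the target?"""
    conflicts = {}
    for key in keys:
        if key not in source:
            continue  # deleted from the source since the last round: nothing to verify
        if key not in target:
            conflicts[key] = "lack_target"
        elif type(source[key]) is not type(target[key]):
            conflicts[key] = "type"
        elif source[key] != target[key]:
            conflicts[key] = "value"
    return conflicts


def multi_round_check(source: dict, target: dict, compare_times: int = 3,
                      between_rounds: Optional[Callable[[int], None]] = None) -> dict:
    """Round one scans every source key; later rounds only re-check earlier conflicts."""
    keys = list(source)
    conflicts = {}
    for round_number in range(1, compare_times + 1):
        conflicts = compare_keys(source, target, keys)
        keys = list(conflicts)
        if not keys:
            break
        if between_rounds is not None and round_number < compare_times:
            between_rounds(round_number)  # stands in for the pause between rounds
    return conflicts


if __name__ == "__main__":
    source = {"a": "1", "b": "2", "c": ["x"]}
    target = {"a": "1", "c": "x", "only_in_target": "9"}

    def replication_catches_up(round_number: int) -> None:
        target["b"] = source["b"]  # a late write arrives during the pause

    print(multi_round_check(source, target, between_rounds=replication_catches_up))
```

In this run the missing key b disappears from the result once the late write arrives, the type mismatch on c survives all rounds, and the extra key that exists only in the target is never reported because the check is one-directional.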

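Big keys of type hash, set, zset, and list are handled as follows:

  • len <= 5192: all fields and values are fetched at once and compared, using the commands hgetall, smembers, zrange 0 -1 withscores, and lrange 0 -1.
  • len > 5192: fields and values are fetched in batches with hscan, sscan, zscan, and lrange.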
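
The rule above amounts to a lookup on key type and element count. A minimal sketch follows; fetch_command and list_batches are hypothetical helpers, the threshold is the figure quoted above, and the batch size is left as a parameter because the text does not say which value the tool uses for these scans.

```python
BIG_KEY_THRESHOLD = 5192  # element count quoted in the text above

WHOLE_KEY_COMMANDS = {
    "hash": "hgetall",
    "set": "smembers",
    "zset": "zrange 0 -1 withscores",
    "list": "lrange 0 -1",
}
BATCHED_COMMANDS = {"hash": "hscan", "set": "sscan", "zset": "zscan", "list": "lrange"}


def fetch_command(key_type: str, length: int) -> str:
    """Pick the command family used to read a key of the given type and size."""
    table = WHOLE_KEY_COMMANDS if length <= BIG_KEY_THRESHOLD else BATCHED_COMMANDS
    return table[key_type]


def list_batches(length: int, batch_size: int):
    """Yield inclusive (start, stop) index pairs that cover a list in LRANGE batches."""
    for start in range(0, length, batch_size):
        yield start, min(start + batch_size, length) - 1


if __name__ == "__main__":
    print(fetch_command("hash", 100))
    print(fetch_command("zset", 100000))
    print(list(list_batches(10, 4)))
```

Reading a big key in batches keeps any single command from blocking the server for long, at the cost of more round trips.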
Verification

Installation

GitHub: https://github.com/alibaba/RedisFullCheck

Install redis-full-check

[root]# wget https://github.com/alibaba/RedisFullCheck/releases/download/release-v1.4.8-20200212/redis-full-check-1.4.8.tar.gz
[root]# tar -zxvf redis-full-check-1.4.8.tar.gz
redis-full-check-1.4.8/
redis-full-check-1.4.8/redis-full-check
redis-full-check-1.4.8/ChangeLog
[root]# cd redis-full-check-1.4.8
[root]# ./redis-full-check --help
Usage:
redis-full-check [OPTIONS]

Application Options:
-s, --source=SOURCE Set host:port of source redis. If db type is cluster, split by semicolon(;'), e.g.,
10.1.1.1:1000;10.2.2.2:2000;10.3.3.3:3000. We also support auto-detection, so "master@10.1.1.1:1000" or
"slave@10.1.1.1:1000" means choose master or slave. Only need to give a role in the master or slave.
-p, --sourcepassword=Password Set source redis password
--sourceauthtype=AUTH-TYPE useless for opensource redis, valid value:auth/adminauth (default: auth)
--sourcedbtype= 0: db, 1: cluster 2: aliyun proxy, 3: tencent proxy (default: 0)
--sourcedbfilterlist= db white list that need to be compared, -1 means fetch all, "0;5;15" means fetch db 0, 5, and 15 (default: -1)
-t, --target=TARGET Set host:port of target redis. If db type is cluster, split by semicolon(;'), e.g.,
10.1.1.1:1000;10.2.2.2:2000;10.3.3.3:3000. We also support auto-detection, so "master@10.1.1.1:1000" or
"slave@10.1.1.1:1000" means choose master or slave. Only need to give a role in the master or slave.
-a, --targetpassword=Password Set target redis password
--targetauthtype=AUTH-TYPE useless for opensource redis, valid value:auth/adminauth (default: auth)
--targetdbtype= 0: db, 1: cluster 2: aliyun proxy 3: tencent proxy (default: 0)
--targetdbfilterlist= db white list that need to be compared, -1 means fetch all, "0;5;15" means fetch db 0, 5, and 15 (default: -1)
-d, --db=Sqlite3-DB-FILE sqlite3 db file for store result. If exist, it will be removed and a new file is created. (default: result.db)
--result=FILE store all diff result into the file, format is 'db diff-type key field'
--comparetimes=COUNT Total compare count, at least 1. In the first round, all keys will be compared. The subsequent rounds of the
comparison will be done on the previous results. (default: 3)
-m, --comparemode= compare mode, 1: compare full value, 2: only compare value length, 3: only compare keys outline, 4: compare full
value, but only compare value length when meets big key (default: 2)
--id= used in metric, run id, useless for open source (default: unknown)
--jobid= used in metric, job id, useless for open source (default: unknown)
--taskid= used in metric, task id, useless for open source (default: unknown)
-q, --qps= max batch qps limit: e.g., if qps is 10, full-check fetches 10 * $batch keys every second (default: 15000)
--interval=Second The time interval for each round of comparison(Second) (default: 5)
--batchcount=COUNT the count of key/field per batch compare, valid value [1, 10000] (default: 256)
--parallel=COUNT concurrent goroutine number for comparison, valid value [1, 100] (default: 5)
--log=FILE log file, if not specified, log is put to console
--loglevel=LEVEL log level: 'debug', 'info', 'warn', 'error', default is 'info'
--metric print metric in log
--bigkeythreshold=COUNT
-f, --filterlist=FILTER if the filter list isn't empty, all elements in list will be synced. The input should be split by '|'. The end of
the string is followed by a * to indicate a prefix match, otherwise it is a full match. e.g.: 'abc*|efg|m*'
matches 'abc', 'abc1', 'efg', 'm', 'mxyz', but 'efgh', 'p' aren't'
--systemprofile=SYSTEM-PROFILE port that used to print golang inner head and stack message (default: 20445)
-v, --version

Help Options:
-h, --help Show this help message

参数解释:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
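-s, --source=SOURCE               Source Redis address (ip:port). For a cluster, separate the node addresses with semicolons (;); listing either the master or the replica of each shard is enough. Example: 10.1.1.1:1000;10.2.2.2:2000;10.3.3.3:3000
-p, --sourcepassword=Password Source Redis password
--sourceauthtype=AUTH-TYPE Source auth type; has no effect with open-source Redis.
--sourcedbtype= Source type, 0: db (standalone or master-replica), 1: cluster, 2: Alibaba Cloud
--sourcedbfilterlist= Whitelist of source logical DBs to fetch, separated by semicolons (;). For example, 0;5;15 means db0, db5 and db15 are fetched
-t, --target=TARGET Target Redis address (ip:port)
-a, --targetpassword=Password Target Redis password
--targetauthtype=AUTH-TYPE Target auth type; has no effect with open-source Redis.
--targetdbtype= Same as sourcedbtype
--targetdbfilterlist= Same as sourcedbfilterlist
-d, --db=Sqlite3-DB-FILE Location of the sqlite3 db that stores the differing keys, default result.db
--comparetimes=COUNT Number of comparison rounds
-m, --comparemode= Comparison mode: 1 compares full values, 2 compares only value lengths, 3 compares only whether the key exists, 4 compares full values but only the value length for big keys
--id= Used for metrics
--jobid= Used for metrics
--taskid= Used for metrics
-q, --qps= QPS rate-limit threshold
--interval=Second Time interval between rounds
--batchcount=COUNT Number of keys/fields aggregated per batch
--parallel=COUNT Number of concurrent goroutines for comparison, default 5
--log=FILE Log file
--result=FILE Inconsistent results are written to the result file, format: 'db diff-type key field'
--metric=FILE Metric file
--bigkeythreshold=COUNT Threshold for splitting big keys, used with comparemode=4
-f, --filterlist=FILTER List of keys to compare, separated by '|'. For example, "abc*|efg|m*" compares 'abc', 'abc1', 'efg', 'm', 'mxyz' and does not compare 'efgh', 'p'
-v, --version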
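
As a rough illustration of how these options fit together, the following Python sketch assembles a redis-full-check argument list. The helper name `build_full_check_args` and its keyword arguments are hypothetical; only the flag names are taken from the help text above, and the addresses and password are placeholders.

```python
def build_full_check_args(source, target, source_password="", target_password="",
                          source_db_type=0, target_db_type=1,
                          compare_mode=1, qps=10, batch_count=1000, parallel=10):
    """Assemble a redis-full-check argument list (hypothetical helper)."""

    def join_addresses(addresses):
        # A single address may be passed as a string; a cluster is passed as
        # a list of ip:port strings and joined with ';'.
        if isinstance(addresses, str):
            return addresses
        return ";".join(addresses)

    return [
        "./redis-full-check",
        f"--source={join_addresses(source)}",
        f"--sourcepassword={source_password}",
        f"--sourcedbtype={source_db_type}",
        f"--target={join_addresses(target)}",
        f"--targetpassword={target_password}",
        f"--targetdbtype={target_db_type}",
        f"--comparemode={compare_mode}",
        f"--qps={qps}",
        f"--batchcount={batch_count}",
        f"--parallel={parallel}",
    ]


# Placeholder values: a standalone source and a two-node cluster target.
args = build_full_check_args(
    source="10.1.1.1:1000",
    target=["10.2.2.2:2000", "10.3.3.3:3000"],
    target_password="your_password",
)
print(" ".join(args))
```

Passing the list to `subprocess.run(args)` rather than a shell string avoids having to quote the `;` separators.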
Verification

Verify the key-value pairs between the source and the target

[root]# ./redis-full-check --source=10.150.57.9:6379 --sourcepassword= --sourcedbtype=0 --target="10.150.57.13:6381;10.150.57.13:6382;10.150.57.13:6383" --targetpassword=Gaoyu@029 --targetdbtype=1 --comparemode=1 --qps=10 --batchcount=1000 --parallel=10
[root@guizhou_hp-pop-10-150-57-9 redis-full-check-1.4.8]# ./redis-full-check --source=10.150.57.9:6379 --sourcepassword= --sourcedbtype=0 --target="10.150.57.13:6381;10.150.57.13:6382;10.150.57.13:6383" --targetpassword=Gaoyu@029 --targetdbtype=1 --comparemode=1 --qps=10 --batchcount=1000 --parallel=10
[INFO 2021-12-03-10:47:25 main.go:65]: init log success
[INFO 2021-12-03-10:47:25 main.go:168]: configuration: {10.150.57.9:6379 auth 0 -1 10.150.57.13:6381;10.150.57.13:6382;10.150.57.13:6383 Gaoyu@029 auth 1 -1 result.db 3 1 unknown unknown unknown 10 5 1000 10 false 16384 20445 false}
[INFO 2021-12-03-10:47:25 main.go:170]: ---------
[INFO 2021-12-03-10:47:25 full_check.go:238]: sourceDbType=0, p.sourcePhysicalDBList=[meaningless]
[INFO 2021-12-03-10:47:25 full_check.go:243]: db=0:keys=9
[INFO 2021-12-03-10:47:25 full_check.go:253]: ---------------- start 1th time compare
[INFO 2021-12-03-10:47:25 full_check.go:278]: start compare db 0
[INFO 2021-12-03-10:47:25 scan.go:20]: build connection[source redis addr: [10.150.57.9:6379]]
[INFO 2021-12-03-10:47:26 full_check.go:203]: stat:
times:1, db:0, dbkeys:9, finish:33%, finished:true
KeyScan:{9 9 0}
KeyEqualInProcess|string|equal|{9 9 0}

[INFO 2021-12-03-10:47:26 full_check.go:250]: wait 5 seconds before start
[INFO 2021-12-03-10:47:31 full_check.go:253]: ---------------- start 2th time compare
[INFO 2021-12-03-10:47:31 full_check.go:278]: start compare db 0
[INFO 2021-12-03-10:47:31 full_check.go:203]: stat:
times:2, db:0, finished:true
KeyScan:{0 0 0}

[INFO 2021-12-03-10:47:31 full_check.go:250]: wait 5 seconds before start
[INFO 2021-12-03-10:47:36 full_check.go:253]: ---------------- start 3th time compare
[INFO 2021-12-03-10:47:36 full_check.go:278]: start compare db 0
[INFO 2021-12-03-10:47:36 full_check.go:203]: stat:
times:3, db:0, finished:true
KeyScan:{0 0 0}

[INFO 2021-12-03-10:47:36 full_check.go:328]: --------------- finished! ----------------
all finish successfully, totally 0 key(s) and 0 field(s) conflict

Verification finished with no key conflicts.
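
If the check is run from a script, the final summary line can be parsed instead of read by eye. The sketch below is a minimal example that assumes the summary keeps the wording shown in the log above; the function name `parse_conflicts` is hypothetical.

```python
import re

# Matches the final summary line printed by redis-full-check in the log above.
SUMMARY_RE = re.compile(r"totally (\d+) key\(s\) and (\d+) field\(s\) conflict")


def parse_conflicts(log_text):
    """Return (key_conflicts, field_conflicts), or None if no summary line is found."""
    match = SUMMARY_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "all finish successfully, totally 0 key(s) and 0 field(s) conflict"
result = parse_conflicts(sample)
print(result)
```

A result of `None` means the summary line was not found, which is worth treating as a failure rather than as zero conflicts.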

Redis cluster monitoring

Installing redis-exporter

redis-cluster-exporter.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-cluster-exporter
  namespace: kubesphere-monitoring-system
  labels:
    k8s-app: redis-cluster-exporter
spec:
  selector:
    matchLabels:
      k8s-app: redis-cluster-exporter
  template:
    metadata:
      labels:
        k8s-app: redis-cluster-exporter
    spec:
      containers:
        - name: redis-cluster-exporter
          image: oliver006/redis_exporter:latest
          args:
            - '-redis.addr'
            - 'redis-cluster-headless.test.svc.cluster.local:6379'
            - '-redis.password'
            - 'Passwd@123'
          ports:
            - containerPort: 9121
              name: http
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: redis-cluster-exporter
  name: redis-cluster-exporter
  namespace: kubesphere-monitoring-system
spec:
  ports:
    - name: http
      port: 9121
      targetPort: http
  selector:
    k8s-app: redis-cluster-exporter

Check the redis-exporter container logs:

Add the Prometheus configuration

- job_name: 'redis-cluster-exporter-uat_target'
  static_configs:
    - targets:
        - redis://redis-cluster-sts-0.redis-cluster-headless.test.svc.cluster.local:6379
        - redis://redis-cluster-sts-1.redis-cluster-headless.test.svc.cluster.local:6379
        - redis://redis-cluster-sts-2.redis-cluster-headless.test.svc.cluster.local:6379
        - redis://redis-cluster-sts-3.redis-cluster-headless.test.svc.cluster.local:6379
        - redis://redis-cluster-sts-4.redis-cluster-headless.test.svc.cluster.local:6379
        - redis://redis-cluster-sts-5.redis-cluster-headless.test.svc.cluster.local:6379
      labels:
        env: uat
        cluster: uat-redis-cluster
  metrics_path: /scrape
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: redis-cluster-exporter.kubesphere-monitoring-system.svc.cluster.local:9121
- job_name: 'redis-cluster-exporter-uat'
  static_configs:
    - targets:
        - redis-cluster-exporter.kubesphere-monitoring-system.svc.cluster.local:9121
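
The three relabel rules in the first job can be hard to follow at a glance. The Python sketch below mimics their effect on a single target so the resulting labels and scrape URL are visible; it is a simplified model and not Prometheus code, the function names `apply_relabel` and `scrape_url` are hypothetical, and the short host names are placeholders.

```python
from urllib.parse import urlencode


def apply_relabel(target, exporter_addr):
    """Mimic the three relabel_configs rules for one static target."""
    labels = {"__address__": target}
    # Rule 1: copy the Redis address into the ?target= URL parameter.
    labels["__param_target"] = labels["__address__"]
    # Rule 2: expose the Redis address as the instance label.
    labels["instance"] = labels["__param_target"]
    # Rule 3: point the actual scrape at the exporter service.
    labels["__address__"] = exporter_addr
    return labels


def scrape_url(labels, metrics_path="/scrape"):
    """Build the URL that would be requested for these labels."""
    query = urlencode({"target": labels["__param_target"]})
    return f"http://{labels['__address__']}{metrics_path}?{query}"


labels = apply_relabel("redis://redis-node-0:6379", "redis-exporter:9121")
url = scrape_url(labels)
print(labels["instance"])
print(url)
```

In effect a single exporter is asked for each Redis node in turn, while the `instance` label on the collected metrics still identifies the node and not the exporter.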

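Check in Prometheus whether data is being collected:

Visualize the data in Grafana

Import the dashboard with ID 763:

At this point the dashboard cannot yet filter data by cluster and role, so the variables need to be modified:

Save once the changes are done:

Alternatively, create a dashboard from the following JSON: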

{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"id": 66,
"iteration": 1606455383511,
"links": [],
"panels": [
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": true,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "prometheus",
"description": "Number of clusters",
"fieldConfig": {
"defaults": {
"custom": {}
},
"overrides": []
},
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
"h": 3,
"w": 5,
"x": 0,
"y": 0
},
"id": 8,
"interval": null,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false,
"ymax": null,
"ymin": null
},
"tableColumn": "",
"targets": [
{
"expr": "count(count(redis_up{cluster=~\"$cluster\"}>0) by (cluster))",
"interval": "",
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"timeFrom": null,
"timeShift": null,
"title": "Number of clusters",
"type": "singlestat",
"valueFontSize": "100%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": true,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "prometheus",
"description": "Number of cluster nodes",
"fieldConfig": {
"defaults": {
"custom": {}
},
"overrides": []
},
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
"h": 3,
"w": 5,
"x": 5,
"y": 0
},
"id": 4,
"interval": null,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false,
"ymax": null,
"ymin": null
},
"tableColumn": "",
"targets": [
{
"expr": "count(redis_up{cluster=~\"$cluster\"})",
"interval": "",
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"timeFrom": null,
"timeShift": null,
"title": "Cluster nodes",
"type": "singlestat",
"valueFontSize": "100%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": true,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "prometheus",
"description": "Number of active cluster nodes",
"fieldConfig": {
"defaults": {
"custom": {}
},
"overrides": []
},
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
"h": 3,
"w": 4,
"x": 10,
"y": 0
},
"id": 6,
"interval": null,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false,
"ymax": null,
"ymin": null
},
"tableColumn": "",
"targets": [
{
"expr": "count(redis_up{cluster=~\"$cluster\"}>0)",
"interval": "",
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"timeFrom": null,
"timeShift": null,
"title": "Active cluster nodes",
"type": "singlestat",
"valueFontSize": "100%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorPrefix": false,
"colorValue": true,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "prometheus",
"description": "Number of master nodes",
"fieldConfig": {
"defaults": {
"custom": {}
},
"overrides": []
},
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
"h": 3,
"w": 5,
"x": 14,
"y": 0
},
"id": 10,
"interval": null,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false,
"ymax": null,
"ymin": null
},
"tableColumn": "",
"targets": [
{
"expr": "count(redis_instance_info{cluster=~\"$cluster\", role=\"master\"})",
"interval": "",
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"timeFrom": null,
"timeShift": null,
"title": "Master nodes",
"type": "singlestat",
"valueFontSize": "100%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": true,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "prometheus",
"description": "Number of replica nodes",
"fieldConfig": {
"defaults": {
"custom": {}
},
"overrides": []
},
"format": "none",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": false,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
"h": 3,
"w": 5,
"x": 19,
"y": 0
},
"id": 12,
"interval": null,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false,
"ymax": null,
"ymin": null
},
"tableColumn": "",
"targets": [
{
"expr": "count(redis_instance_info{cluster=~\"$cluster\", role=\"slave\"})",
"interval": "",
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "",
"timeFrom": null,
"timeShift": null,
"title": "Replica nodes",
"type": "singlestat",
"valueFontSize": "100%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"description": "Redis Cluster OPS metric",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 3
},
"hiddenSeries": false,
"id": 16,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"percentage": false,
"pluginVersion": "7.1.0",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "ceil(sum(rate(redis_commands_processed_total{cluster=~\"$cluster\",instance=~\"$master\"}[$interval])) by (cluster))",
"interval": "5s",
"legendFormat": "{{cluster}}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Redis Cluster OPS",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"description": "Redis memory usage in bytes",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 3
},
"hiddenSeries": false,
"id": 18,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"percentage": false,
"pluginVersion": "7.1.0",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "redis_memory_used_bytes{instance=~\"$instance\"} ",
"interval": "5s",
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Redis memory usage",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"description": "Redis server CPU utilization",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 11
},
"hiddenSeries": false,
"id": 14,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"percentage": false,
"pluginVersion": "7.1.0",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "100* (rate(redis_cpu_sys_seconds_total{instance=~\"$instance\"}[$interval]) + rate(redis_cpu_user_seconds_total{instance=~\"$instance\"}[$interval]))",
"interval": "5s",
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Redis CPU utilization",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "percent",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"description": "Redis memory utilization",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 11
},
"hiddenSeries": false,
"id": 20,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"percentage": false,
"pluginVersion": "7.1.0",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "(redis_memory_used_bytes{instance=~\"$instance\"} / redis_config_maxmemory{instance=~\"$instance\"}) * 100",
"interval": "5s",
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Redis memory utilization",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "percent",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"description": "System CPU utilization of each node",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 19
},
"hiddenSeries": false,
"id": 24,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"percentage": false,
"pluginVersion": "7.1.0",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "topk(3,clamp_max((avg by (hostname,mode) ((clamp_max(rate(node_cpu_seconds_total{hostname=~\"$host\",mode!=\"idle\"}[$interval]),1)) or (clamp_max(irate(node_cpu_seconds_total{hostname=~\"$host\",mode!=\"idle\"}[5m]),1)) ))*100,100))",
"hide": true,
"interval": "",
"legendFormat": "{{mode}}",
"refId": "A"
},
{
"expr": "1 - avg by (hostname)(rate(node_cpu_seconds_total{hostname=~\"$host\", mode=\"idle\"}[$interval]))",
"interval": "",
"legendFormat": "{{hostname}}",
"refId": "B"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "System CPU utilization",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"description": "System physical memory utilization of each node",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 19
},
"hiddenSeries": false,
"id": 30,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"percentage": false,
"pluginVersion": "7.1.0",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "(node_memory_Cached_bytes{hostname=~\"$host\"} + node_memory_Buffers_bytes{hostname=~\"$host\"} + node_memory_MemFree_bytes{hostname=~\"$host\"})/node_memory_MemTotal_bytes{hostname=~\"$host\"}",
"hide": true,
"interval": "",
"legendFormat": "{{hostname}}",
"refId": "A"
},
{
"expr": "((node_memory_MemTotal_bytes{hostname=~\"$host\"} - (node_memory_MemAvailable_bytes{hostname=~\"$host\"} or (node_memory_MemFree_bytes{hostname=~\"$host\"} + node_memory_Buffers_bytes{hostname=~\"$host\"} + node_memory_Cached_bytes{hostname=~\"$host\"})))*100 / node_memory_MemTotal_bytes{hostname=~\"$host\"})",
"hide": false,
"interval": "",
"legendFormat": "{{hostname}}",
"refId": "B"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "System physical memory utilization",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "percent",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": true,
"cacheTimeout": null,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"description": "System disk utilization of each node",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 27
},
"hiddenSeries": false,
"id": 32,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": false,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"percentage": false,
"pluginVersion": "7.1.0",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "topk(10,max((1 - node_filesystem_avail_bytes{hostname=~\"$host\", fstype!~\"rootfs|selinuxfs|autofs|rpc_pipefs|tmpfs\"} / node_filesystem_size_bytes{hostname=~\"$host\", fstype!~\"rootfs|selinuxfs|autofs|rpc_pipefs|tmpfs\"})*100) by (hostname) > 0)",
"interval": "",
"legendFormat": "{{hostname}}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "System disk utilization",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "series",
"name": null,
"show": true,
"values": [
"current"
]
},
"yaxes": [
{
"format": "percent",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"description": "Total number of keys per node and for the whole cluster",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 27
},
"hiddenSeries": false,
"id": 22,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"percentage": false,
"pluginVersion": "7.1.0",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(redis_db_keys{instance=~\"$master\"}) by (instance,db)",
"interval": "",
"legendFormat": "{{instance}}",
"refId": "A"
},
{
"expr": "sum(redis_db_keys{instance=~\"$master\"}) by (cluster)",
"interval": "",
"legendFormat": "{{cluster}}_total",
"refId": "B"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Total keys",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"description": "各个节点活跃连接数",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 35
},
"hiddenSeries": false,
"id": 26,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"percentage": false,
"pluginVersion": "7.1.0",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "redis_connected_clients{instance=~\"$instance\"}",
"interval": "",
"legendFormat": "{{hostip}}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "活跃连接数",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"description": "各个节点内存碎片率",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 35
},
"hiddenSeries": false,
"id": 28,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"percentage": false,
"pluginVersion": "7.1.0",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "redis_mem_fragmentation_ratio{instance=~\"$instance\"}",
"interval": "",
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "内存碎片率",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"description": "各个节点 的负载",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 43
},
"hiddenSeries": false,
"id": 34,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"percentage": false,
"pluginVersion": "7.1.0",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "node_load1{hostname=~\"$host\"}",
"interval": "",
"legendFormat": "{{hostname}}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "负载(Load)",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
}
],
"refresh": "5s",
"schemaVersion": 26,
"style": "dark",
"tags": [],
"templating": {
"list": [
{
"auto": false,
"auto_count": 100,
"auto_min": "1s",
"current": {
"selected": false,
"text": "1m",
"value": "1m"
},
"hide": 0,
"label": "interval",
"name": "interval",
"options": [
{
"selected": true,
"text": "1m",
"value": "1m"
},
{
"selected": false,
"text": "10m",
"value": "10m"
},
{
"selected": false,
"text": "30m",
"value": "30m"
},
{
"selected": false,
"text": "1h",
"value": "1h"
},
{
"selected": false,
"text": "6h",
"value": "6h"
},
{
"selected": false,
"text": "12h",
"value": "12h"
},
{
"selected": false,
"text": "1d",
"value": "1d"
}
],
"query": "1m,10m,30m,1h,6h,12h,1d",
"queryValue": "",
"refresh": 2,
"skipUrlSync": false,
"type": "interval"
},
{
"allValue": null,
"current": {
"selected": true,
"text": "All",
"value": [
"$__all"
]
},
"datasource": "prometheus",
"definition": "label_values(redis_up,cluster)",
"hide": 0,
"includeAll": true,
"label": "Cluster",
"multi": true,
"name": "cluster",
"options": [],
"query": "label_values(redis_up,cluster)",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"selected": true,
"text": "All",
"value": [
"$__all"
]
},
"datasource": "prometheus",
"definition": "label_values(redis_instance_info,role)",
"hide": 0,
"includeAll": true,
"label": "Role",
"multi": true,
"name": "role",
"options": [],
"query": "label_values(redis_instance_info,role)",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"selected": true,
"text": "All",
"value": [
"$__all"
]
},
"datasource": "prometheus",
"definition": "label_values(redis_instance_info{cluster=~\"$cluster\",role=~\"$role\"},instance)",
"hide": 2,
"includeAll": true,
"label": "instance",
"multi": true,
"name": "instance",
"options": [],
"query": "label_values(redis_instance_info{cluster=~\"$cluster\",role=~\"$role\"},instance)",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"selected": true,
"text": "All",
"value": [
"$__all"
]
},
"datasource": "prometheus",
"definition": "label_values(redis_instance_info{cluster=~\"$cluster\",role=\"master\"},instance)",
"hide": 2,
"includeAll": true,
"label": "master",
"multi": true,
"name": "master",
"options": [],
"query": "label_values(redis_instance_info{cluster=~\"$cluster\",role=\"master\"},instance)",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"selected": true,
"text": "All",
"value": [
"$__all"
]
},
"datasource": "prometheus",
"definition": "label_values(redis_up{cluster=~\"$cluster\"}, hostip)",
"hide": 2,
"includeAll": true,
"label": "host",
"multi": true,
"name": "host",
"options": [],
"query": "label_values(redis_up{cluster=~\"$cluster\"}, hostip)",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-5m",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
]
},
"timezone": "",
"title": "Redis_Cluster监控",
"uid": "5FfBHG3Zzddsds",
"version": 1
}
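
Every panel query in the dashboard above relies on template variables such as `$host`, `$master` and `$instance`, and a panel shows no data if the variable it references is missing from `templating.list`. Before importing an edited copy, a quick check can catch that. The following is a minimal sketch, not part of the original setup: the function names, the sample data and the file name in the usage note are all hypothetical, and it assumes the panels sit in a flat top-level `panels` array, as they appear to in this dashboard.

```python
import json
import re

# Matches Grafana-style variable references such as $host or ${host}.
VAR_PATTERN = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")


def load_dashboard(path: str) -> dict:
    """Read an exported dashboard JSON file (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def undefined_variables(dashboard: dict) -> dict:
    """Map each panel title to the variables its queries use but the dashboard does not define."""
    defined = {v["name"] for v in dashboard.get("templating", {}).get("list", [])}
    problems = {}
    for panel in dashboard.get("panels", []):
        used = set()
        for target in panel.get("targets", []):
            used.update(VAR_PATTERN.findall(target.get("expr", "")))
        # Built-in variables such as $__interval start with a double underscore; skip them.
        missing = {name for name in used if not name.startswith("__")} - defined
        if missing:
            problems[panel.get("title", "<untitled>")] = sorted(missing)
    return problems


# Hypothetical, trimmed-down sample: two panels, but only "host" is defined.
sample = {
    "panels": [
        {"title": "Load", "targets": [{"expr": 'node_load1{hostname=~"$host"}'}]},
        {"title": "Keys", "targets": [{"expr": 'sum(redis_db_keys{instance=~"$master"}) by (cluster)'}]},
    ],
    "templating": {"list": [{"name": "host"}]},
}

print(undefined_variables(sample))  # {'Keys': ['master']}
```

To run it against a real export, something like `undefined_variables(load_dashboard("redis-cluster-dashboard.json"))` should work, where the file name is only a placeholder; an empty result means every referenced variable is defined.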