Cluster Scaling: Adding and Removing Nodes
How to remove a node from a Cozystack cluster
When a cluster node fails, Cozystack automatically handles high availability by recreating replicated PVCs and workloads on other nodes. However, some issues can only be resolved by removing the failed node from the cluster:
Local storage PVs may remain bound to the failed node, preventing new pods that reference them from being scheduled. These PVs and their PVCs need to be cleaned up manually.
The failed node remains registered in the cluster, which can lead to an inconsistent cluster state and affect pod scheduling.
Step 1: Remove the Node from the Cluster
Run the following command to remove the failed node (replace mynode with the actual node name):
kubectl delete node mynode
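Before running this, it can be useful to confirm the node's state; a quick check, assuming standard kubectl access to the cluster:
kubectl get nodes -o wide
The failed node typically shows a NotReady status. After deletion, it should no longer appear in this list.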
If the failed node is a control-plane node, you must also remove its member from the etcd cluster. First, list the current members by running the command against one of the remaining healthy control-plane nodes:
talm -f nodes/node1.yaml etcd member list
Example output:
NODE          ID                 HOSTNAME   PEER URLS                     CLIENT URLS                   LEARNER
37.27.60.28   2ba6e48b8cf1a0c1   node1      https://192.168.100.11:2380   https://192.168.100.11:2379   false
37.27.60.28   b82e2194fb76ee42   node2      https://192.168.100.12:2380   https://192.168.100.12:2379   false
37.27.60.28   f24f4de3d01e5e88   node3      https://192.168.100.13:2380   https://192.168.100.13:2379   false
Then remove the corresponding member (replace the ID with the one for your failed node):
talm -f nodes/node1.yaml etcd remove-member f24f4de3d01e5e88
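To confirm that the member was removed, list the members again; the failed node should no longer appear in the output:
talm -f nodes/node1.yaml etcd member list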
Step 2: Remove PVCs and Pods Bound to the Failed Node
Here are a few commands to help you clean up resources left behind by the failed node (a preview variant is shown after the list):
Delete PVCs bound to the failed node (replace mynode with the name of your failed node):
kubectl get pv -o json | jq -r '.items[] | select(.spec.nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values[0] == "mynode").spec.claimRef | "kubectl delete pvc -n \(.namespace) \(.name)"' | sh -x
Delete pods stuck in the Pending state across all namespaces:
kubectl get pod -A | awk '/Pending/ {print "kubectl delete pod -n " $1 " " $2}' | sh -x
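Both pipelines above execute the generated commands immediately via sh -x. If you prefer to review them first, drop the trailing | sh -x so the pipeline only prints the kubectl delete commands, for example:
kubectl get pod -A | awk '/Pending/ {print "kubectl delete pod -n " $1 " " $2}'
Once the output looks correct, re-add | sh -x to execute it.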
Step 3: Check Resource Status
After cleanup, check for any remaining resource issues using linstor advise:
# linstor advise resource
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Resource ┊ Issue ┊ Possible fix ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ pvc-02b0c0a1-e0b6-4e98-9384-60ff24f3b3b6 ┊ Resource expected to have 3 replicas, got only 2. ┊ linstor rd ap --place-count 3 pvc-02b0c0a1-e0b6-4e98-9384-60ff24f3b3b6 ┊
┊ pvc-06e3b406-23f0-4f10-8b03-84063c1b2a12 ┊ Resource expected to have 3 replicas, got only 2. ┊ linstor rd ap --place-count 3 pvc-06e3b406-23f0-4f10-8b03-84063c1b2a12 ┊
┊ pvc-a0b8aeaf-076e-4bd9-93ed-c4db09c04d0b ┊ Resource expected to have 3 replicas, got only 2. ┊ linstor rd ap --place-count 3 pvc-a0b8aeaf-076e-4bd9-93ed-c4db09c04d0b ┊
┊ pvc-a523ebeb-c3b6-468d-abe5-f6afbbf31081 ┊ Resource expected to have 3 replicas, got only 2. ┊ linstor rd ap --place-count 3 pvc-a523ebeb-c3b6-468d-abe5-f6afbbf31081 ┊
┊ pvc-cf7e87b5-3e6d-4034-903d-4625830fb5b4 ┊ Resource expected to have 1 replicas, got only 0. ┊ linstor rd ap --place-count 1 pvc-cf7e87b5-3e6d-4034-903d-4625830fb5b4 ┊
┊ pvc-d344bc83-97fd-4489-bbe7-5399eea57165 ┊ Resource expected to have 3 replicas, got only 2. ┊ linstor rd ap --place-count 3 pvc-d344bc83-97fd-4489-bbe7-5399eea57165 ┊
┊ pvc-d39345a9-5446-4c64-a5ba-957ff7c7a31f ┊ Resource expected to have 3 replicas, got only 2. ┊ linstor rd ap --place-count 3 pvc-d39345a9-5446-4c64-a5ba-957ff7c7a31f ┊
┊ pvc-db6d4236-93bd-4268-9dcc-0ed275b17067 ┊ Resource expected to have 1 replicas, got only 0. ┊ linstor rd ap --place-count 1 pvc-db6d4236-93bd-4268-9dcc-0ed275b17067 ┊
┊ pvc-ebb412c3-083c-4eee-93dc-70917ea6d87e ┊ Resource expected to have 1 replicas, got only 0. ┊ linstor rd ap --place-count 1 pvc-ebb412c3-083c-4eee-93dc-70917ea6d87e ┊
┊ pvc-f107aacb-78d7-4ac6-97f8-8ed529a9c292 ┊ Resource expected to have 3 replicas, got only 2. ┊ linstor rd ap --place-count 3 pvc-f107aacb-78d7-4ac6-97f8-8ed529a9c292 ┊
┊ pvc-f347d71a-b646-45e5-a717-f0a745061beb ┊ Resource expected to have 1 replicas, got only 0. ┊ linstor rd ap --place-count 1 pvc-f347d71a-b646-45e5-a717-f0a745061beb ┊
┊ pvc-f6e96c83-6144-4510-b0ab-61936db52391 ┊ Resource expected to have 3 replicas, got only 2. ┊ linstor rd ap --place-count 3 pvc-f6e96c83-6144-4510-b0ab-61936db52391 ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Run the linstor rd ap commands suggested in the “Possible fix” column to restore the desired replica count.
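After applying the suggested fixes, you can verify that the replica counts have been restored, for example with:
linstor resource list
Re-running linstor advise resource should then report no remaining issues.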