Introduction
Sometimes a node in your Elastifile cluster is reported as having a faulty drive while the node's status is still Active.
To resolve this issue, follow the steps below.
1. SSH into EMS.
2. Get the IP address and ID of each node with a faulty drive by running the command below.
Make a note of the IDs and their availability zones.
3. Add an equal number of nodes using ECFS. For example, if you have 2 nodes with faulty drives, add 2 new nodes to the system.
Then check that the newly created nodes are deployed in the same zones as the faulty ones.
4. Wait for ownership recovery and data rebuild to finish.
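The zone comparison from step 3 can be sketched as a small shell helper. This is only an illustration: `zones_match` is a hypothetical function, not part of any Elastifile tool, and the zone names are placeholders that you would replace with the values noted by hand in steps 2 and 3.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an Elastifile command): compares two
# space-separated zone lists, one entry per node, ignoring order.
zones_match() {
  local faulty new
  # Put each zone on its own line and sort, so order does not matter
  # but the number of nodes per zone does.
  faulty=$(printf '%s\n' "$1" | tr -s ' ' '\n' | sort)
  new=$(printf '%s\n' "$2" | tr -s ' ' '\n' | sort)
  if [ "$faulty" = "$new" ]; then
    echo "match"
  else
    echo "mismatch"
  fi
}

# Placeholder values, assumed to be copied by hand from steps 2 and 3.
faulty_zones="us-east1-b us-east1-c"
new_zones="us-east1-c us-east1-b"
zones_match "$faulty_zones" "$new_zones"
```

A "mismatch" result suggests that at least one new node was deployed in a different zone than the faulty node it is meant to replace.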
Remove nodes
Use the command below to remove the old nodes from the system. Use the same IDs that you identified in step 2.
** If you created redundant nodes while adding the new ones, remove each such node by its node ID, which you can identify from its IP address.
elfs-cli enode delete_multi --ids <node_ids> --async true
The node status will change from Active to Pending removal.
Post Checks
1. Check the status of the node removal task by running the command below.
2. To check whether the nodes were removed successfully, run the commands below and confirm that the node count matches between ELFS and ECS.
elfs-cli enode list -t
ecs-cli nodes
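Because the output format of the two commands is not shown here, the count comparison is sketched below with the counts entered by hand. `counts_match` is a hypothetical helper, and the numbers are placeholders for the node counts you read from each command's output.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an Elastifile command): compares the node
# count read from the ELFS listing with the one read from the ECS listing.
counts_match() {
  local elfs_count=$1 ecs_count=$2
  if [ "$elfs_count" -eq "$ecs_count" ]; then
    echo "node counts match: $elfs_count"
  else
    echo "node counts differ: ELFS=$elfs_count ECS=$ecs_count"
    return 1
  fi
}

# Placeholder counts, assumed to be read by hand from the two outputs.
counts_match 4 4
```

A non-zero return status means the two views disagree, which may indicate that a removal is still in progress.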
Please contact Google Elastifile support at https://support.elastifile.com if you need any further help.