ECFS Version 3.2.1.x has known issue that can cause the file system to move to “Protocols Write Disable Mode” before the system has reached 100% (95% for single replica).
In version 3.2.1.51, in order to mitigate the issue, before ECFS goes into “Protocols Write Disable Mode” due to PStoreDirectBucketOverflowed (or mitigate the issue faster for systems already in PStoreDirectBucketOverflowed condition) , Elastifile introduced 2 new events (both events alert the Google Elastifile support) -
PStoreDirectBucketOverflowed - System has an over utilized bucket (ECFS internal metadata object) and Google support needs to manual workaround the issue.
Until the issue is resolved the system will stay in “Protocols Write Disable Mode” (reads and deletes are ok).
PStore direct bucket approach overflow - System status is OK but it's approaching “Protocols Write Disable Mode” due to PStoreDirectBucketOverflowed.
When this event is fired, system status is still OK, but Google Google support needs to implement a manual workaround the issue.
Elastifile created this monitoring script to help storage admins identify the current status of the Elastifile Cloud File System, and to speed up the mitigation in case system is already in one of the states described above.
The script can be used to check current status, in case any of the events described above, or if requested by Google Elastifile support.
How-to use the script -
ssh into the EMS instance.
cd /tmp
wget https://storage.googleapis.com/elastifile-software-repo/ECFS/pstore_monitoring/bucket_overflow_monitoring.tar.gz
tar zxvf bucket_overflow_monitoring.tar.gz
python bucket_overflow_monitoring.py > /tmp/output.txt
The tar contains 2 files -
- bucket_overflow_monitoring.py
- README