ECS Consuming Hugely Disproportionate Data Disk Space in Overheads, and Consuming for No Apparent Reason #561
Comments
Metadata usage is very high in the example above, which is unexpected for ECS code. We reserve many copies of btree pages; I'm not sure whether that is the cause, or whether it is the btree/journal garbage collection that happens after 30/15 days. Does the capacity usage stay the same, or does it drop? I'm currently working on 3.8 and will test this, and will also optimize some GC parameters for smaller systems.
We left the node running for about 2 weeks, but the usage kept increasing day after day. As this is a single-node installation, I'm not sure whether this also occurs in multi-node setups. Even if garbage collection did clean it out after 15/30 days, I don't think that would help systems with smaller disks; for example, I have seen the garbage build up enough to crash a machine with a 500 GB disk in a day or two. Thanks for the response.
Please test 3.8. I have made some optimizations to reduce metadata on small systems; please try it and let me know.
I used the OVA installation.
I can test 3.8 from source, as it appears the OVA still has the issue. Hopefully I will have some results in the coming days. As mentioned, I installed 3.7 from the OVA and it showed this issue. Update: I tried to install from source but encountered a few errors in step 1, which I unfortunately don't have time to debug at the moment. I think it's a reasonable assumption that the issue exists there as well. I have the ECS 3.8 OVA set up, so if there is anything you would like me to try, let me know.
Manual deployment will not change the behavior. We are investigating and will let you know the outcome.
Is it known whether this problem is absent from earlier builds? Is it worth installing 3.6.2.0 to avoid the disk space consumption? It seems that versions 3.7 and 3.8 both suffer from this. We have installed 3.8 with both the OVA and the manual method and observed the same symptoms as described here.
We are releasing a new OVA image for 3.8 next week and will post an update once it is available.
To answer tihopa's question: I installed multiple versions and the problem was present on all of them, including 3.6 and 3.5.
Is there any way to remove all the excess data that is being stored, or to prevent any more of it from being stored? I use my server for testing, so I don't need any of the data afterwards, but it's becoming difficult to set up a whole new server every time we need to test because ECS has used up all of the storage on my machine.
Expected Behavior
This concerns the ECS OVA v3.7 installation. The expectation is that ECS storage uses a reasonable amount of disk space when a file is uploaded. For example, if a 1 GB file is uploaded, the user expects about 1 GB of disk space to be consumed; understandably, this could be something like 1.5 GB with various overheads. Deleting the file is expected to release all of the consumed disk space so that it can be used for additional files.
Actual Behavior
ECS consumes massively more space on the data disk than the uploaded files should occupy (7 to 10 times more). Beyond this, disk space also continues to be consumed passively after an upload. I have not been able to observe when this passive consumption stops; it appears to keep growing as time goes on. The system was not rebooted during this time. All of the excess consumed space falls under metadata and protection overhead. I have a single-node installation, and neither metadata nor protection is enabled in my deploy.yml or bucket files. This machine has a large disk, but the same issue crashed another ECS deployment of mine that had a smaller disk.
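To make the "7 to 10 times" figure concrete, here is a minimal sketch of how the multiplier can be computed from a usage breakdown. The function name and the byte counts are hypothetical and chosen only for illustration; they are not ECS output.

```python
def amplification(user_bytes: int, metadata_bytes: int, protection_bytes: int) -> float:
    """Return total consumed space as a multiple of the user data size."""
    if user_bytes <= 0:
        raise ValueError("user_bytes must be positive")
    return (user_bytes + metadata_bytes + protection_bytes) / user_bytes


GB = 1000 ** 3

# Hypothetical breakdown: 1 GB of user data, 2 GB of metadata overhead,
# 4 GB of protection overhead.
print(amplification(1 * GB, 2 * GB, 4 * GB))  # 7.0
```

By the same calculation, the 1.5 GB expectation mentioned under Expected Behavior corresponds to a multiplier of 1.5.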
Examples of what happened: in my experience, the disk space consumed by user files is dwarfed by metadata overhead (over 2x the size of the user files) and protection overhead (over 4x the size of the user files). More worrying is that ECS appears to consume extra disk space passively, for no apparent reason. For example, when I logged off on a Friday evening approximately 700 GB was consumed; by Monday morning it was over 1 TB, and all of the additional consumption appeared to be metadata and protection overhead. I can watch the consumption rising in real time: in the last few hours I have added no files, but an additional 30 GB was used. This ECS node has a sizeable amount of disk space, so I have been able to monitor the consumption. On another ECS node that I set up some weeks ago with a 100 GB data disk, things went worse: when I checked it, it had crashed with 96% of the disk used by ECS. I cannot check its logs specifically, but it shows the same symptoms as my main node. That smaller test node only ever had a single 6.8 MB file uploaded to it. Neither node has been rebooted since it was set up, as I'm aware of the known issue that rebooting ECS can tie up disk space.
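For anyone who wants to quantify the passive growth on their own node, below is a rough sketch using only the Python standard library. It is not an ECS tool: the helper names are hypothetical, and the path passed to `take_sample` is whatever mount point holds the ECS data disk on your system, which I am not assuming here.

```python
import shutil
import time
from typing import List, Tuple

# One sample is (unix timestamp in seconds, bytes used on the filesystem).
Sample = Tuple[float, int]


def take_sample(path: str) -> Sample:
    """Record the current time and the used bytes of the filesystem at path."""
    return (time.time(), shutil.disk_usage(path).used)


def growth_bytes_per_hour(samples: List[Sample]) -> float:
    """Average growth between the first and last sample, in bytes per hour."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, used0), (t1, used1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (used1 - used0) / ((t1 - t0) / 3600.0)


def hours_until_full(used: int, total: int, rate_per_hour: float) -> float:
    """Estimate hours until the disk is full at a constant growth rate."""
    if rate_per_hour <= 0:
        return float("inf")  # usage is flat or shrinking
    return (total - used) / rate_per_hour
```

Calling `take_sample` a few times with no uploads in between and passing the results to `growth_bytes_per_hour` gives the idle growth rate, which is the number that matters for small data disks.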
I would greatly appreciate assistance with this issue. As you can see, it is quite serious: it effectively renders ECS Community Edition unusable for any length of time and certainly prevents evaluation of the utility. See screenshots below.
Steps to Reproduce Behavior
Relevant Output and Logs
Notifies: @nikhil-vr