Bug description
A user (not me) reports frustration at being unable to restore a backup, per the following message:
Notice that the error reported is "no space left on device", and the trigger appears to be the attempt to restore multiple core dump files.
Core dump files can become very large, taking up gigabytes, without the user being aware of them at all. It's happened to me, filling up a physical device (not on Gitpod), and I know what core dumps are. I've seen multiple reports on the Discord forum of people asking "what are these core files?" where the user doesn't even know what core dumps are, and it turned out they were dropped by failed start tasks. And that means there can be a lot of them: if they are dropped during a `before` or `command` phase, a new one appears on every workspace startup.
It is very bad behavior to fail to restore a workspace just because it contains multiple useless core dump files. It leads to user frustration and, without Gitpod operator intervention, to loss of valid user data (uncommitted/unpushed/unsaved work in the workspace). Even with Gitpod operator intervention it leads to user delays and frustration, and Gitpod $$$$ spent on support.
Something should be done about this! It needs to be considered properly, but as examples:

- A periodic task notices a bunch of core dump files in `/home/gitpod` (or elsewhere) and notifies the user, perhaps via a VS Code/JetBrains plugin.
- If the workspace backup doesn't restore, the restore script checks for core dump files in the tar archive and, if present, retries the restore using tar's command-line option to exclude files matching a pattern (see the sketch after this list).
- They're not backed up at all: workspace backup uses tar's command-line option to exclude files matching a pattern from the archive.
- They're just nuked on workspace backup if their total size exceeds some threshold, perhaps oldest first, until back below the threshold (see the cleanup sketch below).
- They're just nuked on workspace backup, unconditionally.
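A minimal sketch of the tar-based options above, assuming GNU tar and a plain tar archive (the archive name, path, and pattern here are illustrative, not Gitpod's actual backup mechanism):

```bash
# Skip core dumps when creating the backup archive (GNU tar glob exclude).
tar -cpf backup.tar --exclude='core.[0-9]*' -C /home/gitpod .

# Or, if a full restore fails with "no space left on device",
# retry the extraction while skipping core dumps.
tar -xpf backup.tar --exclude='core.[0-9]*' -C /home/gitpod
```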
Extra credit:
Can also look for other things, such as `*.log` files and other files known to contain metrics/telemetry, and don't back up (or don't restore) anything larger than some size limit.
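For the threshold-based option above, a rough sketch, assuming GNU findutils/coreutils and the conventional core.<pid> naming; the 1 GiB threshold and the `/home/gitpod` location are just placeholders:

```bash
#!/usr/bin/env bash
# Keep the newest core dumps up to ~1 GiB total; delete the oldest beyond that.
THRESHOLD_BYTES=$((1 * 1024 * 1024 * 1024))
total=0
find /home/gitpod -maxdepth 1 -type f -name 'core.[0-9]*' -printf '%T@ %s %p\n' \
  | sort -rn \
  | while read -r _mtime size path; do
      total=$((total + size))
      if [ "$total" -gt "$THRESHOLD_BYTES" ]; then
        rm -f -- "$path"   # once over the threshold, the oldest files get removed
      fi
    done
```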
Steps to reproduce
Run a whole bunch of programs that crash (or the same one repeatedly). Check for the existence of core dumps. When you have ~30 GB of them or whatever, stop your workspace and try to restart it.
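A rough way to produce the dumps, assuming the kernel's core_pattern writes core.<pid> files into the working directory and the core size limit isn't zero (sizes and iteration count are illustrative):

```bash
ulimit -c unlimited            # allow core dumps in this shell
cd /home/gitpod
for i in $(seq 1 50); do
  # Allocate a few hundred MB, then crash; each SIGSEGV should leave a core.<pid> file.
  python3 -c 'buf = b"A" * (500 * 1024 * 1024); import os, signal; os.kill(os.getpid(), signal.SIGSEGV)' || true
done
du -ch core.* 2>/dev/null | tail -1   # check how much space the dumps take
```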
Workspace affected
No response
Expected behavior
The workspace should restore (without the offending large core dump files).
Example repository
No response
Anything else?
Core dump files are almost always useless, especially when the user doesn't know what they are and doesn't expect them. They're only useful in certain debugging/troubleshooting scenarios. It could, for example, be documented behavior that all core files matching the typical pattern `core\.[0-9]+` are nuked. Then, if a user is working with a core file and wants it persisted, they can rename it to something else. See also #12453 and #12814.
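If that documented behavior were adopted, the cleanup itself could be as simple as this sketch (GNU find assumed; the path is illustrative):

```bash
# Before taking the backup, remove anything matching the conventional core.<pid> name.
find /home/gitpod -type f -regextype posix-extended -regex '.*/core\.[0-9]+' -delete
```

The rename escape hatch then falls out naturally: a renamed core file no longer matches the pattern and survives the cleanup.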
david-bakin changed the title from "Cannot restore backup" where root cause is too large archive caused by multiple core files. to "Cannot restore backup" due to multiple very large core files. on Sep 19, 2022
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.