"Cannot restore backup" due to multiple very large `core` files. #13102

david-bakin · 2022-09-19T21:44:18Z

Bug description

A user (not me) reports frustration due to not being able to restore a backup, as per the following message:

Note that the reported error is "no space left on device", and the cause appears to be the restore attempting to bring back multiple core dump files.

Core dump files can become very large, taking up gigabytes, without the user being aware of it at all. It has happened to me, filling up a physical device (not on Gitpod), and I know what core dumps are. I've seen multiple reports on the Discord forum of people asking "what are these core files?" where the user doesn't even know what core dumps are, and it turns out they were dropped by failed start tasks. When they are dropped during a `before` or `command` phase, a new one appears on each workspace startup, so there can be a lot of them.

It is very bad behavior to fail to restore a workspace just because it has multiple useless core dump files. It leads to user frustration and, without Gitpod operator intervention, loss of valid user data (uncommitted, unpushed, or unsaved work in their workspace). Even with Gitpod operator intervention it leads to user delays and frustration, and Gitpod $$$$ spent on support.

Something should be done about this! It needs to be considered properly, but as examples:

- A periodic task notices a bunch of core dump files in /home/gitpod (or elsewhere) and notifies the user, perhaps as part of a VS Code/JetBrains plugin.
- If the workspace backup doesn't restore, the restore script checks for core dump files in the tar archive and, if it finds any, retries the restore using tar's command-line option to exclude files matching a pattern.
- They're not backed up at all: the workspace backup uses tar's command-line option to exclude files matching a pattern.
- They're just nuked on workspace backup if they exceed some threshold, perhaps oldest first, until below the threshold.
- They're just nuked on workspace backup, unconditionally.
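To make the "oldest first, until below threshold" option concrete, here is a minimal sketch in Python. It is not Gitpod's backup code: the function name `prune_core_dumps`, the `max_total_bytes` parameter, and the assumption that core dumps are named `core` or `core.<pid>` are all hypothetical, chosen only for illustration.

```python
import re
from pathlib import Path

# Assumption: core dumps are named "core" or "core.<pid>"; the real names
# depend on the kernel's core_pattern setting.
CORE_NAME = re.compile(r"core(\.[0-9]+)?")


def prune_core_dumps(root: str, max_total_bytes: int) -> list[Path]:
    """Delete core dumps under root, oldest first, until their combined
    size is at or below max_total_bytes. Returns the deleted paths."""
    cores = [
        p for p in Path(root).rglob("core*")
        if p.is_file() and not p.is_symlink() and CORE_NAME.fullmatch(p.name)
    ]
    cores.sort(key=lambda p: p.stat().st_mtime)  # oldest modification time first
    total = sum(p.stat().st_size for p in cores)

    deleted = []
    for p in cores:
        if total <= max_total_bytes:
            break  # already under the threshold, keep the newer dumps
        total -= p.stat().st_size
        p.unlink()
        deleted.append(p)
    return deleted
```

Calling `prune_core_dumps("/home/gitpod", 0)` would correspond to the unconditional variant, since every matching file is removed before the total can reach zero.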

Extra credit:

The same check could also cover other things, such as *.log files and other files known to contain metrics/telemetry: don't back them up (or don't restore them) if they are larger than some size limit.
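As a rough illustration of the size-limit idea, the sketch below skips oversized `*.log` files while building an archive. It uses Python's standard `tarfile` module rather than the tar command line mentioned above, and the 50 MiB limit, the suffix list, and the names `make_size_filter` and `backup` are assumptions made up for this example, not anything Gitpod is known to use.

```python
import tarfile

# Hypothetical policy values, chosen only for illustration.
SIZE_LIMIT_BYTES = 50 * 1024 * 1024
LIMITED_SUFFIXES = (".log",)


def make_size_filter(limit=SIZE_LIMIT_BYTES, suffixes=LIMITED_SUFFIXES):
    """Build a tarfile filter that drops matching regular files larger than limit."""
    def size_filter(member: tarfile.TarInfo):
        if member.isfile() and member.name.endswith(suffixes) and member.size > limit:
            return None  # returning None tells TarFile.add to skip this entry
        return member
    return size_filter


def backup(src_dir: str, archive_path: str, limit: int = SIZE_LIMIT_BYTES) -> None:
    """Archive src_dir, leaving out oversized files with a limited suffix."""
    with tarfile.open(archive_path, "w") as tar:
        tar.add(src_dir, arcname=".", filter=make_size_filter(limit))
```

Filtering at backup time keeps the oversized files out of the archive entirely, so the restore never has to find space for them.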

Steps to reproduce

Run a whole bunch of programs that crash (or the same one repeatedly). Check for the existence of core dumps. When you have ~30 GB of them or so, stop your workspace and try to restart it.

Workspace affected

No response

Expected behavior

The workspace should restore (without the offending large coredump files).

Example repository

No response

Anything else?

Core dump files are almost always useless, especially when the user doesn't know what they are and doesn't expect them. They're only useful in certain debugging/troubleshooting scenarios. It could, for example, be documented behavior that all core files matching the typical pattern `core\.[0-9]+` are nuked. Then, if a user is working with a core file and wants it persisted, they can rename it to something else.
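A small sketch of how that naming rule would behave, assuming the pattern is applied to the whole file name (the helper `would_be_deleted` is hypothetical):

```python
import re

# The pattern quoted above, matched against the entire file name.
CORE_PATTERN = re.compile(r"core\.[0-9]+")


def would_be_deleted(filename: str) -> bool:
    """True if the name matches the documented core-dump pattern exactly."""
    return CORE_PATTERN.fullmatch(filename) is not None


for name in ["core.4242", "core", "core.4242.keep", "mycore.17", "score.py"]:
    print(f"{name}: {'delete' if would_be_deleted(name) else 'keep'}")
```

Under this rule, renaming `core.4242` to something like `core.4242.keep` is enough to persist it. Note that a bare `core` with no numeric suffix is not covered by the pattern as written.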

See also #12453 and #12814.


stale · 2022-12-20T20:57:28Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

david-bakin added the type: bug Something isn't working label Sep 19, 2022

david-bakin changed the title ~~"Cannot restore backup" where root cause is too large archive caused by multiple core files.~~ "Cannot restore backup" due to multiple very large core files. Sep 19, 2022

david-bakin mentioned this issue Sep 19, 2022

Disable core dumps for workspaces #12814

Closed

1 task

kylos101 mentioned this issue Sep 20, 2022

Workspace Cluttered with core.* files #12453

Closed

stale bot added the meta: stale This issue/PR is stale and will be closed soon label Dec 20, 2022

stale bot closed this as completed Jan 3, 2023
