Deleted tasks and dataset exports use up Docker space #2859

Closed
Kpraetori opened this issue Feb 24, 2021 · 6 comments

@Kpraetori

My actions before raising this issue

The issue relates to disk space used by CVAT operations. When I delete a task, the amount of space used by Docker does not change. Likewise, when an export attempt fails, the space consumed during the attempt is not released. When an export succeeds, the Docker space used during the process matches the size of the zip file and is never freed afterwards. Overall, it seems that CVAT's files aren't actually being removed.

Expected Behaviour

I expect that deleting a task frees up its space. Likewise, if an export fails, I expect the temporary files to be deleted after the failed attempt.

Current Behaviour

Recent example:
Last night I tried to export a dataset. After the export attempt, cvat_cvat_data had grown to 46.65 GB (checked with 'docker system df -v'). The export failed because it used up the 30 GB of free space I had in Docker, bringing the total used to 101.7 GB. Note that CVAT is the only container/app I have in Docker and I only had three labelled tasks; most of the space used was a buildup of previous exports and deleted videos that were never cleared. I stopped and restarted both Docker and CVAT and the available space didn't change.
This morning I deleted an unneeded task to confirm that deletions don't affect the space available. cvat_cvat_data went down to 11.26 GB, yet the disk space used by Docker has stayed at 101.7 GB since the failed export attempt. I have since stopped CVAT and restarted Docker and nothing changed. I have gone through a similar process several times in the past, and it seems that deleted tasks and exports (failed or successful) are never actually removed.
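
For reference, a sketch of commands that should show where the space is going (I've only actually run the first one; the cvat_cvat_data volume name comes from the 'docker system df -v' output, and the last command just mounts that volume into a throwaway container to measure it):

# overall usage per image, container and volume
docker system df -v
# per-container size, including the writable layer
docker ps --all --size
# size of the CVAT data volume, measured from a throwaway container
docker run --rm -v cvat_cvat_data:/data alpine du -sh /data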

Possible Solution

I have tried docker image prune, docker image prune -a, docker volume prune, and docker system prune.

A while ago I managed, with outside IT help, to free up the space that had been used by failed/deleted CVAT files and exports. However, neither of us knew what actually made it work, since the commands kept reporting that 0 B had been reclaimed. I can't seem to replicate whatever the IT help did, and now my Docker space is filled up again.

I could keep increasing the Docker disk image size, but I have already done that several times. This computer has limited space remaining, so I can't keep using that as a solution.
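
Two more variants from the Docker docs that I haven't tried yet (the second removes all unused images, networks and volumes, so it's destructive; treat this as a sketch, not a recommendation):

# clears the Docker build cache
docker builder prune
# removes all unused images, containers, networks and volumes
docker system prune -a --volumes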

Steps to Reproduce (for bugs)

  1. Create a task and label a video
  2. Check space used on Docker
  3. Export or try exporting a dataset (I've mostly used YOLO but I tried COCO just to see if that did anything)
    3b. Alternative: delete the task
  4. Check space used on Docker
    Result: If you deleted the task, the Docker space used will not change. If you exported a dataset, the space used will increase by roughly the size of the zip file (a quick way to capture the before/after numbers is sketched below).
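
A minimal sketch for capturing the numbers in steps 2 and 4:

# snapshot usage, perform the action in the CVAT UI, snapshot again, compare
docker system df -v > before.txt
# ... delete a task or export a dataset in the CVAT UI ...
docker system df -v > after.txt
diff before.txt after.txt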

Context

I can't upload or download anything more until this is resolved, so I'm stuck without a solution.

Your Environment

  • Docker version:
    Client: Docker Engine - Community
    Cloud integration: 1.0.7
    Version: 20.10.2
  • Are you using Docker Swarm or Kubernetes? Docker Desktop
  • Operating System and version:
    macOS Catalina 10.15.7

CVAT:
Server version: 1.3
Core version: 3.10.0
Canvas version: 2.2.2
UI version: 1.13.8

  • Other diagnostic information / logs:

I don't know what logs or details are most helpful to identify the issue. I'll be happy to supply what you need. I am new to CVAT and Docker so there may be something simple I'm missing or doing wrong. Thank you and your help is much appreciated.

@nmanovic added the bug (Something isn't working) and enhancement (New feature or request) labels on Feb 24, 2021
@nmanovic added this to the 1.3.0-beta milestone on Feb 24, 2021
@nmanovic
Contributor

@azhavoro, could you please investigate the issue?

@iraadit

iraadit commented Feb 26, 2021

Hi, I have the exact same problem.
I have uploaded and deleted hundreds of tasks several times over (upload through the CLI, delete through the Django admin interface).
My disk space is now saturated:

1,5T docker_data/containers/7ff721f52a4870daa48881d76a8c13c9c2333bc450eaa088e65d816cf3238934

I don't know what this container is, but here is what I get when executing docker inspect on it:

[
{
"Id": "7ff721f52a4870daa48881d76a8c13c9c2333bc450eaa088e65d816cf3238934",
"Created": "2021-02-22T13:18:29.445595643Z",
"Path": "/docker-entrypoint.sh",
"Args": [
"sh",
"-c",
"./runner.sh"
],
"State": {
"Status": "running",
"Running": true,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 20379,
"ExitCode": 0,
"Error": "",
"StartedAt": "2021-02-22T13:18:32.450575143Z",
"FinishedAt": "0001-01-01T00:00:00Z"
},
"Image": "sha256:b2f2226584d837457fd819be2d60a92823e903593ca907893f9706aad0a7d3ce",
"ResolvConfPath": "/mnt/DATA/docker_data/containers/7ff721f52a4870daa48881d76a8c13c9c2333bc450eaa088e65d816cf3238934/resolv.conf",
"HostnamePath": "/mnt/DATA/docker_data/containers/7ff721f52a4870daa48881d76a8c13c9c2333bc450eaa088e65d816cf3238934/hostname",
"HostsPath": "/mnt/DATA/docker_data/containers/7ff721f52a4870daa48881d76a8c13c9c2333bc450eaa088e65d816cf3238934/hosts",
"LogPath": "/mnt/DATA/docker_data/containers/7ff721f52a4870daa48881d76a8c13c9c2333bc450eaa088e65d816cf3238934/7ff721f52a4870daa48881d76a8c13c9c2333bc450eaa088e65d816cf3238934-json.log",
"Name": "/nuclio",
"RestartCount": 0,
"Driver": "overlay2",
"Platform": "linux",
"MountLabel": "",
"ProcessLabel": "",
"AppArmorProfile": "docker-default",
"ExecIDs": null,
"HostConfig": {
"Binds": [
"/var/run/docker.sock:/var/run/docker.sock:rw",
"/tmp:/tmp:rw"
],
"ContainerIDFile": "",
"LogConfig": {
"Type": "json-file",
"Config": {}
},
"NetworkMode": "cvat_default",
"PortBindings": {
"8070/tcp": [
{
"HostIp": "",
"HostPort": "8070"
}
]
},
"RestartPolicy": {
"Name": "always",
"MaximumRetryCount": 0
},
"AutoRemove": false,
"VolumeDriver": "",
"VolumesFrom": [],
"CapAdd": null,
"CapDrop": null,
"CgroupnsMode": "host",
"Dns": null,
"DnsOptions": null,
"DnsSearch": null,
"ExtraHosts": null,
"GroupAdd": null,
"IpcMode": "private",
"Cgroup": "",
"Links": null,
"OomScoreAdj": 0,
"PidMode": "",
"Privileged": false,
"PublishAllPorts": false,
"ReadonlyRootfs": false,
"SecurityOpt": null,
"UTSMode": "",
"UsernsMode": "",
"ShmSize": 67108864,
"Runtime": "runc",
"ConsoleSize": [
0,
0
],
"Isolation": "",
"CpuShares": 0,
"Memory": 0,
"NanoCpus": 0,
"CgroupParent": "",
"BlkioWeight": 0,
"BlkioWeightDevice": null,
"BlkioDeviceReadBps": null,
"BlkioDeviceWriteBps": null,
"BlkioDeviceReadIOps": null,
"BlkioDeviceWriteIOps": null,
"CpuPeriod": 0,
"CpuQuota": 0,
"CpuRealtimePeriod": 0,
"CpuRealtimeRuntime": 0,
"CpusetCpus": "",
"CpusetMems": "",
"Devices": null,
"DeviceCgroupRules": null,
"DeviceRequests": null,
"KernelMemory": 0,
"KernelMemoryTCP": 0,
"MemoryReservation": 0,
"MemorySwap": 0,
"MemorySwappiness": null,
"OomKillDisable": false,
"PidsLimit": null,
"Ulimits": null,
"CpuCount": 0,
"CpuPercent": 0,
"IOMaximumIOps": 0,
"IOMaximumBandwidth": 0,
"MaskedPaths": [
"/proc/asound",
"/proc/acpi",
"/proc/kcore",
"/proc/keys",
"/proc/latency_stats",
"/proc/timer_list",
"/proc/timer_stats",
"/proc/sched_debug",
"/proc/scsi",
"/sys/firmware"
],
"ReadonlyPaths": [
"/proc/bus",
"/proc/fs",
"/proc/irq",
"/proc/sys",
"/proc/sysrq-trigger"
]
},
"GraphDriver": {
"Data": {
"LowerDir": "/mnt/DATA/docker_data/overlay2/ded822c42c56fcbb1ba85e9c79c17512dac3b374be28776d3a1e2354c62aa34e-init/diff:/mnt/DATA/docker_data/overlay2/1228b645a9189689b6cafb2d612fb1cfe0b2592a73f4bed969ff9a20c90a6dba/diff:/mnt/DATA/docker_data/overlay2/64bcef9e4ddd31d40c5c7a4bf28b2d5e1295498326320d5cb723ec5d01c22bad/diff:/mnt/DATA/docker_data/overlay2/00b85337157216042fd9fda49867dbb15eb3d050333cf6ad186f9ed96b57e802/diff:/mnt/DATA/docker_data/overlay2/ded47b296a461c3659d3c7e862f02bedecaef80639f8474174e0b98baada5981/diff:/mnt/DATA/docker_data/overlay2/bcb455c4d2fbc01991e2dfd42092d7b2a482f8ffb63efac171142d83391d09f7/diff:/mnt/DATA/docker_data/overlay2/717776916a40b6dccc4f8d1c173dd190079aa5f80799803d1ea4bcddd053050d/diff:/mnt/DATA/docker_data/overlay2/7e54e457f76576c5bb194d142a37a705e15e105f86eeee4e9f7442aea9d7600e/diff:/mnt/DATA/docker_data/overlay2/2f0dba495ccee804b0746fd64f7367d0c7bf520a20e862f033dbeff76708d0ad/diff:/mnt/DATA/docker_data/overlay2/23eafd3d8c74f49960ce65ff7905c8afe3f6219fac90ebd9aa295e0bdb82667e/diff:/mnt/DATA/docker_data/overlay2/9010fc790ac4bed1216abceedf7f6f5cbbd8e0b88334db8827963b8686e17ab8/diff:/mnt/DATA/docker_data/overlay2/26d2b31ddc22b382473d9869eab6edb4ef6429b8b818752df3ba82883ee6585d/diff:/mnt/DATA/docker_data/overlay2/0cdb8248bd5203d187d746508a743205f380446d758260e327fe21b14cdc524c/diff",
"MergedDir": "/mnt/DATA/docker_data/overlay2/ded822c42c56fcbb1ba85e9c79c17512dac3b374be28776d3a1e2354c62aa34e/merged",
"UpperDir": "/mnt/DATA/docker_data/overlay2/ded822c42c56fcbb1ba85e9c79c17512dac3b374be28776d3a1e2354c62aa34e/diff",
"WorkDir": "/mnt/DATA/docker_data/overlay2/ded822c42c56fcbb1ba85e9c79c17512dac3b374be28776d3a1e2354c62aa34e/work"
},
"Name": "overlay2"
},
"Mounts": [
{
"Type": "bind",
"Source": "/var/run/docker.sock",
"Destination": "/var/run/docker.sock",
"Mode": "rw",
"RW": true,
"Propagation": "rprivate"
},
{
"Type": "bind",
"Source": "/tmp",
"Destination": "/tmp",
"Mode": "rw",
"RW": true,
"Propagation": "rprivate"
}
],
"Config": {
"Hostname": "7ff721f52a48",
"Domainname": "",
"User": "",
"AttachStdin": false,
"AttachStdout": false,
"AttachStderr": false,
"ExposedPorts": {
"80/tcp": {},
"8070/tcp": {}
},
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": [
"http_proxy",
"https_proxy",
"no_proxy=172.28.0.1,",
"NUCLIO_CHECK_FUNCTION_CONTAINERS_HEALTHINESS=true",
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"NGINX_VERSION=1.18.0",
"NJS_VERSION=0.4.4",
"PKG_RELEASE=2",
"DOWNLOAD_URL=https://download.docker.com/linux/static/stable/x86_64/docker-19.03.12.tgz"
],
"Cmd": [
"sh",
"-c",
"./runner.sh"
],
"Image": "quay.io/nuclio/dashboard:1.5.8-amd64",
"Volumes": {
"/tmp": {},
"/var/run/docker.sock": {}
},
"WorkingDir": "",
"Entrypoint": [
"/docker-entrypoint.sh"
],
"OnBuild": null,
"Labels": {
"com.docker.compose.config-hash": "60ec4ccf829f1258d31d490dc3d35727192884ea83eeb52adbeb3c75eb08891d",
"com.docker.compose.container-number": "1",
"com.docker.compose.oneoff": "False",
"com.docker.compose.project": "cvat",
"com.docker.compose.project.config_files": "docker-compose.yml,docker-compose.override.yml,components/serverless/docker-compose.serverless.yml",
"com.docker.compose.project.working_dir": "/home/fnlocal/Projects/cvat",
"com.docker.compose.service": "serverless",
"com.docker.compose.version": "1.27.4",
"maintainer": "NGINX Docker Maintainers [email protected]",
"nuclio.version_info": "{"git_commit": "c12022750a6003a309d4f20c72a98addacd3e95d", "label": "1.5.8"}"
},
"StopSignal": "SIGQUIT"
},
"NetworkSettings": {
"Bridge": "",
"SandboxID": "6f50940051f38e5d1d193e17d8fbe07fdf8eee11a1634c1f0488b03df2089aa9",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {
"80/tcp": null,
"8070/tcp": [
{
"HostIp": "0.0.0.0",
"HostPort": "8070"
}
]
},
"SandboxKey": "/var/run/docker/netns/6f50940051f3",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"MacAddress": "",
"Networks": {
"cvat_default": {
"IPAMConfig": null,
"Links": null,
"Aliases": [
"serverless",
"nuclio",
"7ff721f52a48"
],
"NetworkID": "8e08e4c262ec23063f2c239725e1ebf3fadb49bce209a2daea6183ae58087ca1",
"EndpointID": "529b9c303af9aa7c9a60887e3722712cac556b00cffa0e19ece16ba3fe113197",
"Gateway": "172.28.0.1",
"IPAddress": "172.28.0.4",
"IPPrefixLen": 24,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:1c:00:04",
"DriverOpts": null
}
}
}
}
]
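
Given the LogPath above, I suspect the 1,5T is mostly the container's json log rather than task data. A sketch of how to confirm it and, as a stopgap, empty the file in place (truncating a running container's log is a common workaround, but I haven't verified it on this setup; note that the Docker root here is /mnt/DATA/docker_data rather than the default /var/lib/docker):

# list the largest container log files
sudo du -sh /mnt/DATA/docker_data/containers/*/*-json.log | sort -h | tail -n 5
# empty the offending log without deleting the file Docker keeps open
sudo truncate -s 0 /mnt/DATA/docker_data/containers/7ff721f52a4870daa48881d76a8c13c9c2333bc450eaa088e65d816cf3238934/*-json.log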

Task data should be deleted when tasks are deleted.
How can I manually delete this data now? Is there a way, or should I "reinstall" CVAT (erase everything and start over)?

I found the related (apparently closed) issue #1925 and pull request #2179.
I'm using CVAT 1.2.0 and the problem is still present.

Thank you

@iraadit

iraadit commented Feb 26, 2021

After more research, it seems the Docker container taking up all that space in my case is quay.io/nuclio/dashboard:1.5.8-amd64 (installed with nuclio for automatic object detection, I guess).
Could it therefore be a separate problem?

Edit:
I deleted the nuclio dashboard container.
The disk is no longer saturated, my server is behaving normally again, and I'm able to access CVAT again.
Since I deleted the container, I now get this error message in CVAT:

Could not get models from the server
Open the Browser Console to get details
Error in console:
Error: Request failed with status code 503. "HTTPConnectionPool(host='nuclio', port=8070): Max retries exceeded with url: /api/functions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ffb08422f70>: Failed to establish a new connection: [Errno -2] Name or service not known'))".
This error was expected as I deleted the container.

I guess I could install nuclio again, but I'm afraid that container would grow out of hand again. I'll create a separate issue.
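
If I do reinstall it, one way to keep the log from growing unbounded might be to cap the json-file log driver, either per service with a logging: section in docker-compose.yml or globally in the daemon config. A sketch of the global variant (it overwrites any existing /etc/docker/daemon.json, so merge by hand if you already have one; I haven't tested this with CVAT):

# cap json-file logs for all containers, then restart the daemon (assuming systemd)
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "3"
  }
}
EOF
sudo systemctl restart docker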
Sorry for the noise.

@nmanovic removed this from the 1.3.0-beta milestone on Mar 8, 2021
@nmanovic added this to the Backlog milestone on Mar 8, 2021
@nmanovic
Contributor

@azhavoro, it looks like a problem with logs from nuclio. Could you please explain how to overcome the issue and possibly prepare a fix for upstream?

@nmanovic removed the enhancement (New feature or request) label on Nov 24, 2021
@nmanovic removed this from the Backlog milestone on Nov 24, 2021
@skarfa-ts

@nmanovic Any update or fix for this? I am facing the same issue; we are running automatic annotation.

@azhavoro
Contributor

The problem with nuclio logs was fixed in #4969.
