
Error:SamAutomaticMaskGenerator function has a large memory footprint #110

Open
lstswb opened this issue Dec 29, 2023 · 12 comments

lstswb commented Dec 29, 2023

GPU: RTX 4090 (24 GB)
System: Ubuntu on WSL2
Model: sam_vit_h
Image size: [1024, 1024]
Parameter settings:
model=sam,
points_per_side=128,
points_per_batch = 64,
pred_iou_thresh=0.86,
stability_score_thresh=0.92,
crop_n_layers=3,
crop_n_points_downscale_factor=2,
min_mask_region_area=100,
process_batch_size=4
Issue:
When I use SamAutomaticMaskGenerator, GPU memory usage climbs to 55 GB.
[screenshot: GPU memory usage]
And there will be an error.
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 16.63 GiB. GPU 0 has a total capacity of 23.99 GiB of which 0 bytes is free. Including non-PyTorch memory, this process has 17179869184.00 GiB memory in use. Of the allocated memory 34.86 GiB is allocated by PyTorch, and 5.63 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
However, this problem does not occur with the original SAM code, where GPU memory does not exceed 24 GB.
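Editor's note: based on how SAM's automatic mask generator builds its point grids (layer 0 covers the full image, layer i has 4**i crops, and the per-side point count is divided by crop_n_points_downscale_factor at each layer), the settings above prompt the model with far more points than the defaults. A rough back-of-envelope sketch, assuming that grid scheme:

```python
# Estimate the total number of point prompts generated by
# SamAutomaticMaskGenerator, assuming SAM's usual crop scheme:
# layer 0 is the full image (1 crop), layer i >= 1 has 4**i crops,
# and the per-side point count shrinks by downscale**i per layer.
def total_points(points_per_side, crop_n_layers, downscale):
    total = 0
    for layer in range(crop_n_layers + 1):
        crops = 1 if layer == 0 else 4 ** layer
        per_side = points_per_side // (downscale ** layer)
        total += crops * per_side ** 2
    return total

issue_settings = total_points(128, 3, 2)  # settings from this issue
defaults = total_points(32, 0, 2)         # SamAutomaticMaskGenerator defaults
print(issue_settings, defaults)           # 65536 vs 1024 point prompts
```

Under these assumptions the issue's settings generate roughly 64x the default number of point prompts, so per-batch buffers that are harmless at default settings can add up quickly.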

@lstswb lstswb changed the title SamAutomaticMaskGenerator function has a large memory footprint Error:SamAutomaticMaskGenerator function has a large memory footprint Jan 3, 2024

lyf6 commented Jan 11, 2024

@lstswb Have you solved this?


lstswb commented Jan 12, 2024

> @lstswb Have you solved this?

Not yet


cpuhrsch commented Jan 18, 2024

Does the code snippet from the example help?

In particular

from segment_anything_fast import sam_model_registry, sam_model_fast_registry, SamAutomaticMaskGenerator
sam_checkpoint = "checkpoints/sam_vit_h_4b8939.pth"
model_type = "vit_h"
device = "cuda"
sam = sam_model_fast_registry[model_type](checkpoint=sam_checkpoint)
sam.to(device=device)
mask_generator = SamAutomaticMaskGenerator(sam, process_batch_size=8)

Note that you can adjust process_batch_size for a smaller memory footprint, and note the use of sam_model_fast_registry.


lstswb commented Jan 18, 2024

> Does the code snippet from the example help?
>
> In particular
>
> from segment_anything_fast import sam_model_registry, sam_model_fast_registry, SamAutomaticMaskGenerator
> sam_checkpoint = "checkpoints/sam_vit_h_4b8939.pth"
> model_type = "vit_h"
> device = "cuda"
> sam = sam_model_fast_registry[model_type](checkpoint=sam_checkpoint)
> sam.to(device=device)
> mask_generator = SamAutomaticMaskGenerator(sam, process_batch_size=8)
>
> Note that you can adjust process_batch_size for a smaller memory footprint, and note the use of sam_model_fast_registry.

I tried adjusting process_batch_size, and the GPU memory footprint was reduced, but it still far exceeded that of the original code.
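Editor's note: one reason these batch parameters matter is that each batch of point prompts produces several candidate masks at full image resolution before filtering. A rough, assumption-laden estimate (3 candidate masks per point, float32, 1024x1024 image; the real allocator behaviour differs, but the scaling with points_per_batch is the point):

```python
# Back-of-envelope memory for one decoder batch of point prompts,
# assuming 3 candidate masks per point, upscaled to a 1024x1024 image
# in float32 (4 bytes per element).
def batch_mask_bytes(points_per_batch, h=1024, w=1024,
                     masks_per_point=3, bytes_per_el=4):
    return points_per_batch * masks_per_point * h * w * bytes_per_el

print(batch_mask_bytes(64) / 2**20)  # 768.0 MiB at points_per_batch=64
print(batch_mask_bytes(16) / 2**20)  # 192.0 MiB at points_per_batch=16
```

If process_batch_size crops are processed concurrently, this per-batch cost is multiplied accordingly, which is consistent with lowering either knob reducing (but not eliminating) the footprint.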

@cpuhrsch

Yes, the batch is larger, but should be faster. The original code uses batch size 1. You can try setting it to batch size 1.


lstswb commented Jan 19, 2024

> Yes, the batch is larger, but should be faster. The original code uses batch size 1. You can try setting it to batch size 1.

I tried setting batch_size=1, but still got a GPU out-of-memory error.
[screenshot: CUDA out-of-memory error]

@cpuhrsch

Hm, I assume you're also using the GPU for the display manager? That will take up additional memory as well. Maybe the solution in #97 will help.

Can you use your onboard GPU (if you have one) for the display manager and the GPU for the model only? Does it work with vit_b?


lstswb commented Jan 19, 2024

> Hm, I assume you're also using the GPU for the display manager? That will take up additional memory as well. Maybe the solution in #97 will help.
>
> Can you use your onboard GPU (if you have one) for the display manager and the GPU for the model only? Does it work with vit_b?

vit_b works normally, and the display takes up only a small portion of GPU memory. With the same settings, vit_h works fine with the original code.

@cpuhrsch

Hm, can you try setting the environment variable SEGMENT_ANYTHING_FAST_USE_FLASH_4 to 0?
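Editor's note: for anyone trying this suggestion, a minimal sketch of setting the variable from Python; the assumption here is that it must be in the environment before the library is imported, so the import is shown only as a commented placeholder:

```python
import os

# Disable the custom flash-attention kernel via the environment variable
# named in this thread. Set it before importing the library, since env
# vars read at import time will not see later changes.
os.environ["SEGMENT_ANYTHING_FAST_USE_FLASH_4"] = "0"

# from segment_anything_fast import sam_model_fast_registry  # then build the model as usual
```

Equivalently, from a shell: SEGMENT_ANYTHING_FAST_USE_FLASH_4=0 python your_script.py (script name is illustrative).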


lstswb commented Jan 19, 2024

> Hm, can you try setting the environment variable SEGMENT_ANYTHING_FAST_USE_FLASH_4 to 0?

I set SEGMENT_ANYTHING_FAST_USE_FLASH_4 to 0, but the problem remains.
[screenshot: out-of-memory error]

@cpuhrsch

Hm, I'm not sure to be honest. It seems to work on other 4090s, but I think they're on Linux and not Windows.


lstswb commented Jan 22, 2024

> Hm, I'm not sure to be honest. It seems to work on other 4090s, but I think they're on Linux and not Windows.

OK. I'll try this on a native Linux system instead of Ubuntu under WSL2.
