[MM] speed up OPs using hf models (clip, ...) #199

drcege · 2024-01-26T09:26:00Z

Currently, with np=28, CLIP vit-base-p32 takes over 1 hour to compute similarities for a 558k dataset, and vit-large-p14-336 takes tens of hours.

Perhaps the following can help:

- loading on GPU (implemented)
- ~~using batched computing (not easy to implement, as batching is closely related to the internal logic of operators)~~
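The two ideas above can be sketched without any model code. This is only an illustration, not the implementation merged in #203: `pick_device` and `iter_batches` are hypothetical helpers, and the round-robin mapping of worker rank to GPU is an assumption about how `np` worker processes might share a few GPUs.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def pick_device(rank: int, gpu_count: int) -> str:
    """Map a worker rank to a device string (hypothetical helper).

    Assumption: workers are spread round-robin over the visible GPUs
    and fall back to CPU when no GPU is available.
    """
    if gpu_count <= 0:
        return "cpu"
    return f"cuda:{rank % gpu_count}"


def iter_batches(samples: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size samples."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    it = iter(samples)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Example: 28 workers (np=28) sharing an assumed 4 GPUs.
devices = [pick_device(rank, gpu_count=4) for rank in range(28)]

# Example: 10 samples grouped into batches of at most 4.
batches = list(iter_batches(range(10), batch_size=4))
```

In a PyTorch-based operator, the device string would typically be passed to the model's `.to(device)` call, and each batch would go through a single forward pass instead of one pass per sample. As the list notes, the batching half is the harder one, because it depends on how each operator processes samples internally.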

github-actions · 2024-02-17T09:31:46Z

This issue is marked as stale because there has been no activity for 21 days. Remove the stale label or add a new comment, or this issue will be closed in 3 days.

drcege self-assigned this Jan 26, 2024

drcege added this to the Basic Multimodal Support milestone Jan 26, 2024

drcege added the enhancement (New feature or request) and dj:multimodal (issues/PRs about multimodal data processing) labels Jan 26, 2024

drcege linked a pull request Jan 30, 2024 that will close this issue

Enhance/gpu support #203

Merged

github-actions bot added the stale-issue label Feb 17, 2024

HYLcool removed the stale-issue label Feb 18, 2024

drcege closed this as completed in #203 Feb 22, 2024

drcege linked a pull request Feb 29, 2024 that will close this issue

Refine _cuda_device_count logic #222

Merged
