Enable Sentence Transformer Inference with Intel Gaudi2 GPU Supported ( 'hpu' ) - Follow up for #2557 #2608
This PR is part of the effort to enable Intel's Gaudi2 accelerator (the 'hpu' device) for Sentence Transformers inference/training. It is a follow-up to #2557 and introduces the following considerations/modifications for the hpu device:
Padding strategy for hpu device
We tested all 37 models listed on the pretrained models page, comparing the 'max_length' padding strategy against the default 'longest' strategy (padding=True) on an evaluation over 100,000 sentences. For more than two thirds of the 37 models, 'longest' outperformed 'max_length' on the hpu device. We therefore propose to keep the hpu padding setting at padding=True ('longest'), the same as the previous default.
We also want to expose this padding setting as an argument of the SentenceTransformer class, so that users can freely choose their padding strategy, since in some cases 'max_length' padding performs better.
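To illustrate the difference between the two strategies discussed above, here is a minimal, self-contained sketch (not the library's actual tokenizer code; `pad_batch`, `pad_id`, and `max_length` are hypothetical names for the example):

```python
def pad_batch(batch, strategy, max_length=8, pad_id=0):
    """Pad a batch of token-id lists using one of two strategies.

    strategy=True          -> pad to the longest sequence in the batch ('longest'),
    strategy="max_length"  -> pad every sequence to `max_length`.
    """
    if strategy == "max_length":
        target = max_length
    else:  # True / 'longest'
        target = max(len(seq) for seq in batch)
    return [seq + [pad_id] * (target - len(seq)) for seq in batch]


batch = [[101, 7, 102], [101, 7, 8, 9, 102]]

longest = pad_batch(batch, True)         # every sequence padded to length 5
fixed = pad_batch(batch, "max_length")   # every sequence padded to length 8
print(longest)
print(fixed)
```

With 'longest', the padded length varies per batch, which often means less wasted computation; with 'max_length', every batch has the same static shape, which some accelerators prefer — this trade-off is why exposing the setting to users is useful.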
hpu support for PR #2573
Testing PR #2573 on the hpu device failed, so we propose a revised version that supports it on hpu. Its test case (test_encode_truncate under tests/test_sentence_transformer.py) now passes.
hpu support for PR #2599
Testing PR #2599 on the hpu device failed, so we propose a revised version that supports it on hpu. Its test case (test_simple_encode under tests/test_image_embeddings.py) now passes.
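The device-handling idea behind these fixes can be sketched as follows. This is a hypothetical helper for illustration only (`resolve_device`, `SUPPORTED_DEVICES`, and the `hpu_available` flag are assumptions, not the library's actual API):

```python
# Sketch: validate a device string such as "hpu" with a CPU fallback
# when the requested accelerator is not available on the machine.
SUPPORTED_DEVICES = ("cpu", "cuda", "mps", "hpu")


def resolve_device(device, hpu_available=False):
    """Return the device string to use, falling back to "cpu" when the
    requested hpu accelerator is unavailable."""
    base = device.split(":")[0]  # allow indexed forms like "cuda:0"
    if base not in SUPPORTED_DEVICES:
        raise ValueError(f"Unknown device: {device!r}")
    if base == "hpu" and not hpu_available:
        return "cpu"
    return device


print(resolve_device("hpu", hpu_available=True))   # "hpu"
print(resolve_device("hpu", hpu_available=False))  # falls back to "cpu"
```

In the real library the availability check would query the Habana runtime rather than take a flag; the sketch only shows the fallback shape of the logic.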
Questions and comments are welcome!
Thanks.