Enable Sentence Transformer Inference with Intel Gaudi2 GPU Supported ( 'hpu' ) - Follow up for #2557 #2608
This PR is part of the effort to enable Intel's Gaudi2 accelerator (the 'hpu' device) for Sentence Transformers inference/training. It is a follow-up to #2557 and introduces the following considerations/modifications for the hpu device:
Padding strategy for hpu device
We tested all 37 models listed on the pretrained models page, comparing the 'max_length' padding strategy against the default 'longest' strategy (padding=True) on an evaluation over 100,000 sentences. For more than two thirds of the 37 models, 'longest' outperformed 'max_length' on the hpu device. We therefore propose to keep the hpu padding setting at padding=True ('longest'), the same as the previous default.
We also want to expose this padding setting as an argument of the SentenceTransformer class, so that users can freely choose their padding strategy, since in some cases 'max_length' padding performs better.
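To illustrate the difference between the two strategies discussed above, here is a minimal, self-contained sketch (not the library's actual tokenizer code; `pad_batch`, `pad_id`, and `max_length` are hypothetical names for the example):

```python
def pad_batch(batch, strategy, max_length=8, pad_id=0):
    """Pad a batch of token-id lists using one of two strategies.

    strategy=True          -> pad to the longest sequence in the batch ('longest'),
    strategy="max_length"  -> pad every sequence to `max_length`.
    """
    if strategy == "max_length":
        target = max_length
    else:  # True / 'longest'
        target = max(len(seq) for seq in batch)
    return [seq + [pad_id] * (target - len(seq)) for seq in batch]


batch = [[101, 7, 102], [101, 7, 8, 9, 102]]

longest = pad_batch(batch, True)         # every sequence padded to length 5
fixed = pad_batch(batch, "max_length")   # every sequence padded to length 8
print(longest)
print(fixed)
```

With 'longest', the padded length varies per batch, which often means less wasted computation; with 'max_length', every batch has the same static shape, which some accelerators prefer — this trade-off is why exposing the setting to users is useful.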
hpu support for PR #2573
Testing PR #2573 on the hpu device failed, so we propose a revised version that supports it on hpu. Its test case (test_encode_truncate under tests/test_sentence_transformer.py) now passes.
hpu support for PR #2599
Testing PR #2599 on the hpu device failed, so we propose a revised version that supports it on hpu. Its test case (test_simple_encode under tests/test_image_embeddings.py) now passes.
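The device-handling idea behind these fixes can be sketched as follows. This is a hypothetical helper for illustration only (`resolve_device`, `SUPPORTED_DEVICES`, and the `hpu_available` flag are assumptions, not the library's actual API):

```python
# Sketch: validate a device string such as "hpu" with a CPU fallback
# when the requested accelerator is not available on the machine.
SUPPORTED_DEVICES = ("cpu", "cuda", "mps", "hpu")


def resolve_device(device, hpu_available=False):
    """Return the device string to use, falling back to "cpu" when the
    requested hpu accelerator is unavailable."""
    base = device.split(":")[0]  # allow indexed forms like "cuda:0"
    if base not in SUPPORTED_DEVICES:
        raise ValueError(f"Unknown device: {device!r}")
    if base == "hpu" and not hpu_available:
        return "cpu"
    return device


print(resolve_device("hpu", hpu_available=True))   # "hpu"
print(resolve_device("hpu", hpu_available=False))  # falls back to "cpu"
```

In the real library the availability check would query the Habana runtime rather than take a flag; the sketch only shows the fallback shape of the logic.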
Questions and comments are welcome!
Thanks.