Hey everyone,
Following @byshiue's recommendations from #46, I built an ensemble model that uses BERT embeddings for preprocessing, the BERT FasterTransformer encoder as the main model, and a QA head for postprocessing. I wanted to test whether I can run a QA task end-to-end as an ensemble.
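To make the last stage concrete, here is a minimal sketch of what a QA head typically does with the encoder output: a single linear layer maps each token's hidden vector to a start logit and an end logit. The function name `qa_head`, the shapes, and the toy weights are illustrative assumptions, not the actual model used in the ensemble.

```python
def qa_head(hidden_states, weight, bias):
    """Project per-token hidden vectors to (start_logits, end_logits).

    hidden_states: list of seq_len vectors, each of length hidden_size
    weight: two rows of length hidden_size (row 0 = start, row 1 = end)
    bias: two floats (start bias, end bias)
    """
    start_logits, end_logits = [], []
    for vec in hidden_states:
        # Dot product of the token vector with each output row, plus bias.
        start_logits.append(sum(h * w for h, w in zip(vec, weight[0])) + bias[0])
        end_logits.append(sum(h * w for h, w in zip(vec, weight[1])) + bias[1])
    return start_logits, end_logits


# Toy input: 3 tokens, hidden size 2 (hypothetical values).
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[2.0, 0.0], [0.0, 3.0]]
bias = [0.5, -0.5]
start, end = qa_head(hidden, weight, bias)
```

In a real setup the hidden size and weights come from the fine-tuned checkpoint; the point here is only the shape of the computation that runs after the encoder.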
The results are spectacular, to say the least: so good that I first want to verify that I am doing everything correctly. I am providing a Python script that compares plain-vanilla PyTorch HuggingFace execution with the Triton/FT setup. Here are the results of running that script a few times. Here are my Triton logs.
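For anyone who wants to sanity-check a setup like this, one possible correctness check (separate from the timing comparison) is to confirm that both paths produce nearly the same start/end logits and pick the same answer span. The sketch below is not the script linked above; the helper names, the tolerance `atol`, and the sample logits are all assumptions for illustration.

```python
import math


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two logit lists."""
    if len(a) != len(b):
        raise ValueError("logit lists must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end) token indices maximizing start + end logit."""
    best, best_score = (0, 0), -math.inf
    for i, s in enumerate(start_logits):
        # Only consider spans that end at or after the start token.
        for j in range(i, min(i + max_answer_len, len(end_logits))):
            score = s + end_logits[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best


def outputs_agree(ref, test, atol=1e-2):
    """True if logits are within atol and both pick the same span.

    atol is an assumed tolerance; choose it to match the precision in use.
    """
    close = (max_abs_diff(ref[0], test[0]) <= atol
             and max_abs_diff(ref[1], test[1]) <= atol)
    return close and best_span(*ref) == best_span(*test)


# Hypothetical logits standing in for the two execution paths.
reference = ([0.1, 2.0, 0.3], [0.2, 0.1, 1.5])
ensemble = ([0.101, 1.998, 0.302], [0.199, 0.102, 1.497])
agree = outputs_agree(reference, ensemble)
```

If the logits match but the timings still look too good, the next things to check would be whether both paths process the same batch size and sequence length.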
Looking forward to your comments and feedback.
Best, Vladimir