
Use Pytorch to make search faster #16

Merged (2 commits into haltakov:main, Feb 4, 2021)
Conversation

FelixGoetze (Contributor) commented on Feb 4, 2021

This changes the matrix multiplication in the find_best_matches function to use PyTorch.
With a CUDA GPU available, it can run more than 100 times faster.
When no GPU is available, the float16 NumPy arrays are converted to float32 tensors, which also runs faster due to better float32 hardware support on the CPU.
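A rough sketch of what such a PyTorch-based search could look like (the function and argument names `find_best_matches`, `text_features`, `photo_features`, and `photo_ids` follow the PR description, but the exact tensor handling and the `count` parameter here are assumptions, not the PR's actual code):

```python
import numpy as np
import torch

def find_best_matches(text_features, photo_features, photo_ids, count=3):
    # Pick the GPU when one is available; otherwise fall back to the CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # float16 matmul is poorly supported on most CPUs, so promote to
    # float32 there; keep float16 on the GPU where it is fast.
    dtype = torch.float16 if device == "cuda" else torch.float32

    text = torch.from_numpy(text_features).to(device=device, dtype=dtype)
    photos = torch.from_numpy(photo_features).to(device=device, dtype=dtype)

    # For normalized feature vectors, cosine similarity reduces to a
    # single matrix product: (1, d) @ (d, n) -> (1, n).
    similarities = (text @ photos.T).squeeze(0)

    # Indices of the `count` highest-scoring photos, best first.
    best = torch.topk(similarities, count).indices.cpu().numpy()
    return [photo_ids[i] for i in best]
```

The only device-dependent parts are the `device`/`dtype` selection at the top; the matrix product and `torch.topk` run unchanged on CPU or GPU.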

Below are the runtimes for the following command:

%time find_best_matches(text_features, photo_features, photo_ids)

PyTorch with float16 on Colab with GPU:

CPU times: user 8.32 ms, sys: 15.5 ms, total: 23.8 ms
Wall time: 24.4 ms
['wL4Pgswo2hM', '1EaAfoo37cM', '9dnyZgq9aPI']

PyTorch with float32 on Colab with CPU:

CPU times: user 696 ms, sys: 29.1 ms, total: 725 ms
Wall time: 737 ms
['wL4Pgswo2hM', '1EaAfoo37cM', '9dnyZgq9aPI']

Previous NumPy implementation using float16:

CPU times: user 3.71 s, sys: 163 ms, total: 3.87 s
Wall time: 3.45 s
['wL4Pgswo2hM', '1EaAfoo37cM', '9dnyZgq9aPI']

haltakov (Owner) commented on Feb 4, 2021

Thank you for your contribution! Doing the multiplication on the GPU was something I wanted to do, but got distracted by other things... I didn't know about the inefficiency of using float16 on the CPU!

I'll merge the PR and will then do some small changes to the notebook (for example keeping the search output).

haltakov merged commit 84c6135 into haltakov:main on Feb 4, 2021