BERT4Rec tutorial #212
Conversation
Let's reorder and use a different structure. It is very hard to read and understand in the current structure. Suggested opening: Transformer models <Introduction. What are we here for? Why should we use transformer models & why from RecTools?>
" Item embeddings from these sequences are fed to multi-head self-attention to acquire user sequence latent representation" |
"After one or several stacked attention blocks, resulting embeddings are used to predict the next item." |
Describe the main differences in the table. Let's go further.
Let's add to the RecTools implementation section references for our implementations (which implementations we took as goals, what we changed). Let's also emphasize that we included the number of negatives as a hyper-parameter.
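A toy sketch of what treating the number of negatives as a hyper-parameter could look like (the function and parameter names here are hypothetical, not the RecTools API):

```python
import torch

def sample_negatives(positives: torch.Tensor, n_negatives: int, n_items: int) -> torch.Tensor:
    """Uniformly sample `n_negatives` negative item ids per position in the batch.

    `n_negatives` acts as a hyper-parameter: more negatives give a better estimate
    of the loss over the catalog at the cost of extra computation per training step.
    """
    batch_size, seq_len = positives.shape
    return torch.randint(1, n_items + 1, (batch_size, seq_len, n_negatives))

negatives = sample_negatives(torch.randint(1, 101, (4, 32)), n_negatives=5, n_items=100)
print(negatives.shape)  # torch.Size([4, 32, 5])
```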
Let's remove.
In the losses section, please add your description of what's happening. What is the difference between the losses?
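To make the difference concrete, a generic sketch contrasting a full softmax cross-entropy loss with a BCE loss over sampled negatives (an assumption-based illustration, not the exact RecTools loss code):

```python
import torch
import torch.nn.functional as F

batch, d_model, n_items, n_negatives = 4, 64, 1000, 5

hidden = torch.randn(batch, d_model)                 # user sequence latent representations
item_emb = torch.randn(n_items + 1, d_model)         # catalog item embeddings
target = torch.randint(1, n_items + 1, (batch,))     # true next items

# 1) softmax loss: score the whole catalog, cross-entropy against the target item
logits_full = hidden @ item_emb.T                    # (batch, n_items + 1)
softmax_loss = F.cross_entropy(logits_full, target)

# 2) BCE loss with sampled negatives: score only the target and a few random items
negatives = torch.randint(1, n_items + 1, (batch, n_negatives))
pos_scores = (hidden * item_emb[target]).sum(-1, keepdim=True)        # (batch, 1)
neg_scores = torch.einsum("bd,bnd->bn", hidden, item_emb[negatives])  # (batch, n_negatives)
scores = torch.cat([pos_scores, neg_scores], dim=1)
labels = torch.zeros_like(scores)
labels[:, 0] = 1.0                                   # only the first column is the positive
bce_loss = F.binary_cross_entropy_with_logits(scores, labels)
```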
Let's remove.
For item-to-item recommendations, please provide a short description of what is happening under the hood. How do we recommend to items (i.e. find similar items for a target item)?
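A short sketch of the usual mechanics behind item-to-item recommendations in embedding models, scoring candidates by similarity of item embeddings (illustrative only, not the actual RecTools code):

```python
import torch
import torch.nn.functional as F

n_items, d_model, top_k = 1000, 64, 10
item_emb = torch.randn(n_items, d_model)              # item embeddings learned by the model

def recommend_to_item(target_item_id: int) -> torch.Tensor:
    """Return ids of the top_k items most similar to the target item by cosine similarity."""
    target = item_emb[target_item_id].unsqueeze(0)
    similarity = F.cosine_similarity(target, item_emb, dim=-1)
    similarity[target_item_id] = -float("inf")        # do not recommend the item to itself
    return similarity.topk(top_k).indices

print(recommend_to_item(42))
```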
In
Transformer models -> Transformer models tutorial
"Categorical feature embeddings are dealt with separately from id embeddings and are concatenated after weight update."
You can just write "Categorical feature embeddings are summed up with other embeddings for each item if they are present in the model."
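To illustrate the suggested wording, a toy example where a categorical feature embedding is summed with the item id embedding (names and shapes are assumptions, not RecTools internals):

```python
import torch
import torch.nn as nn

n_items, n_categories, d_model = 1000, 20, 64

id_emb = nn.Embedding(n_items, d_model)          # per-item id embeddings
cat_emb = nn.Embedding(n_categories, d_model)    # embeddings of a categorical item feature

item_ids = torch.tensor([3, 17, 256])
item_categories = torch.tensor([1, 5, 5])        # category of each item (if the feature is present)

# categorical feature embeddings are summed with the id embeddings, not concatenated
item_repr = id_emb(item_ids) + cat_emb(item_categories)   # (3, d_model)
```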
Merged commit eca31d4 ("Merge with updated experimental/sasrec") into MobileTeleSystems:experimental/sasrec.

Added BERT4Rec tutorial