
BERT4Rec tutorial #212

Merged

Conversation

spirinamayya
Contributor

Added BERT4Rec tutorial

@blondered blondered marked this pull request as ready for review November 27, 2024 11:31
@blondered
Collaborator

blondered commented Nov 27, 2024

Let's reorder and use a different structure. The tutorial is very hard to read and understand as currently structured.

Transformer models

<Introduction. What are we here for? Why should we use transformer models & why from RecTools?>
Table of Contents

  • Prepare data

  • SASRec & BERT4Rec

    • SASRec (remove the SASRec image from the paper because we will have a very similar one very soon)
    • BERT4Rec
    • Main differences (+ 2 imgs for attention direction)
  • RecTools implementation (features in a table: all about losses and other customization options, checklist for both models in two columns)

  • Models application

    • Basic usage (only the main well-known kwargs here: initialization + fit + recommend) (draw attention to mask_prob for BERT) (see the sketch after this list)
    • Adding item features to models (selecting item net block types) (init only)
    • Selecting losses (example for initialization with different losses and n_negatives) (init only)
    • Customizing the model (covering use_pos_emb, train_min_user_interaction, pos_encoding_type, transformer_layers_type, data_preparator_type, lightning_module_type, custom item_net_block_types, trainer) (not covering use_causal_attn and use_key_padding_mask - just drop them from the tutorial for now) (init only)
    • Cross-validation (from previously initialized models)
    • Item-to-item recommendations
    • SASRec with item ids embeddings in ItemNetBlock
    • SASRec with item ids and category features embeddings in ItemNetBlock
    • SASRec with category item features embeddings in ItemNetBlock
    • Additional details: inference tricks (items known to the model and inference for cold users)
  • Detailed SASRec and BERT4Rec description

    • Dataset processing
    • Transformer layers
    • Losses
  • Recommendations (we don't really need implicit ranker info)

  • Links (add Attention Is All You Need) (add the Sasha Petrov paper with info on which BERT4Rec implementation is the best)
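
For reference, a minimal sketch of what the proposed "Basic usage" subsection could look like, assuming the experimental SASRecModel / BERT4RecModel classes and the standard RecTools Dataset.construct / fit / recommend flow; the exact import paths and constructor defaults below are assumptions, not final tutorial code.

```python
# Sketch only: proposed "Basic usage" flow (assumed experimental API, not final tutorial code).
import pandas as pd

from rectools import Columns
from rectools.dataset import Dataset
from rectools.models import BERT4RecModel, SASRecModel  # experimental at the time of this PR

# Interactions in the standard RecTools format.
interactions = pd.DataFrame(
    {
        Columns.User: [1, 1, 1, 2, 2],
        Columns.Item: [10, 20, 30, 10, 40],
        Columns.Weight: [1, 1, 1, 1, 1],
        Columns.Datetime: pd.to_datetime(
            ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-01", "2024-01-02"]
        ),
    }
)
dataset = Dataset.construct(interactions)

# Only the main well-known kwargs here; mask_prob is BERT4Rec-specific.
sasrec = SASRecModel()
bert4rec = BERT4RecModel(mask_prob=0.15)

for model in (sasrec, bert4rec):
    model.fit(dataset)
    recos = model.recommend(users=[1, 2], dataset=dataset, k=10, filter_viewed=True)
```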

@blondered
Collaborator

" Item embeddings from these sequences are fed to multi-head self-attention to acquire user sequence latent representation"
That's not exactly true: embeddings are fed into transformer blocks, which have more logic inside than just multi-head self-attention.

@blondered
Collaborator

"After one or several stacked attention blocks, resulting embeddings are used to predict the next item."
Ambiguous.
The resulting embeddings are used to predict all items in the sequence, and each item is predicted based only on information from the previous items.
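
In other words, the prediction targets are shifted by one position and the attention is causal; a tiny PyTorch illustration of such a mask (illustration only, not RecTools internals):

```python
# Illustration only: a causal (look-ahead) attention mask, as used in SASRec-style training.
# Position i may attend only to positions 0..i, so each item is predicted from previous items only.
import torch

seq_len = 5
causal_mask = torch.triu(torch.ones(seq_len, seq_len), diagonal=1).bool()
# causal_mask[i, j] is True where attention from position i to position j (j > i) must be blocked.
print(causal_mask)
```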

@blondered
Collaborator

About the main differences in the table: let's go further.

  1. You didn't specify losses. The original implementations of these models used different losses.
  2. Some of the differences are conceptual (target and attention direction) and cannot be removed from the models, but others (losses, activation functions, transformer blocks) can be changed. Try to emphasize this explicitly; it really is important.

@blondered
Collaborator

Let's add references for our implementations to the RecTools implementation section (which implementations we took as reference points, what we changed). Let's also emphasize that we expose the number of negatives as a hyper-parameter.

@blondered
Collaborator

blondered commented Nov 29, 2024

Let's remove the def recommend helper from the basic usage section.
We merge recos with items only once, so there is no need for a separate function.
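
That is, a single inline merge in the notebook is enough, something like the line below (continuing from the recos frame in the basic-usage sketch above; the items frame and its "title" column are assumptions for illustration):

```python
# One-off merge of recommendations with item metadata, instead of a separate helper function.
recos_with_titles = recos.merge(items[[Columns.Item, "title"]], on=Columns.Item, how="left")
```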

@blondered
Collaborator

In the losses section, please add your own description of what is happening. What is the difference between the losses?
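
For context, a hedged sketch of what the losses subsection could show: the original SASRec paper trains with BCE over one sampled negative per position, the original BERT4Rec trains with softmax over the full catalog, and the RecTools implementation exposes the loss and the number of sampled negatives as hyper-parameters (the loss and n_negatives kwarg names below follow the PR discussion but are assumptions about the experimental API).

```python
# Sketch only: same model class, different training losses (kwarg names assumed).
from rectools.models import SASRecModel

sasrec_softmax = SASRecModel(loss="softmax")             # softmax over the full catalog (BERT4Rec-style)
sasrec_bce = SASRecModel(loss="BCE", n_negatives=1)      # BCE with sampled negatives (original SASRec-style)
sasrec_gbce = SASRecModel(loss="gBCE", n_negatives=256)  # gBCE (Petrov & Macdonald), also with sampled negatives
```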

@blondered
Collaborator

Let's remove the "TODO: Add category item features embeddings" note.
We will add them afterwards.

@blondered
Collaborator

For item-to-item recommendations, please provide a short description of what is happening under the hood. How do we recommend to items?
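
For context, i2i recommendations go through the standard recommend_to_items method; for these models the ranking is presumably based on similarity between item embeddings, and the description in the tutorial should confirm the details. A short sketch, reusing the fitted model and dataset from the basic-usage sketch above:

```python
# Sketch only: item-to-item recommendations via the standard RecTools API.
# `model` and `dataset` are the fitted model and dataset from the basic-usage sketch above.
i2i_recos = model.recommend_to_items(
    target_items=[10, 20],  # items to find similar items for
    dataset=dataset,
    k=10,
)
```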

@blondered
Collaborator

In "Adding item features to models", let's add a note that we do not support numerical item features for now. Let's also add a short note on how categorical features are used under the hood.
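
For reference, a sketch of passing categorical item features to the dataset and selecting item net blocks; the IdEmbeddingsItemNet / CatFeaturesItemNet names and the item_net_block_types kwarg follow the PR discussion, but the exact import path is an assumption.

```python
# Sketch only: categorical item features (numerical item features are not supported for now).
import pandas as pd

from rectools import Columns
from rectools.dataset import Dataset
from rectools.models import SASRecModel
from rectools.models.nn.item_net import CatFeaturesItemNet, IdEmbeddingsItemNet  # assumed path

# Item features in the "long" RecTools format: item_id / feature / value.
item_features = pd.DataFrame(
    {
        Columns.Item: [10, 20, 30, 40],
        "feature": ["genre"] * 4,
        "value": ["drama", "comedy", "drama", "action"],
    }
)
dataset = Dataset.construct(
    interactions_df=interactions,  # interactions frame from the basic-usage sketch above
    item_features_df=item_features,
    cat_item_features=["genre"],
)

# Use both id embeddings and categorical feature embeddings (summed per item inside the model).
model = SASRecModel(item_net_block_types=(IdEmbeddingsItemNet, CatFeaturesItemNet))
```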

@blondered
Collaborator

Transformer models -> Transformer models tutorial

@blondered
Collaborator

Categorical feature embeddings are dealt with separately from id embeddings and are concatenated after weight update.

  1. They are not concatenated. They are summed up.
  2. Why "after weight update"?

You can just write "Categorical feature embeddings are summed up with other embeddings for each item if they are present in the model."
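
Conceptually, something like this toy PyTorch snippet (illustration only, not the actual ItemNet code):

```python
# Illustration only: categorical feature embeddings are summed with id embeddings per item.
import torch

n_items, n_cat_values, dim = 100, 20, 8
id_emb = torch.nn.Embedding(n_items, dim)
cat_emb = torch.nn.EmbeddingBag(n_cat_values, dim, mode="sum")  # sums each item's category embeddings

item_ids = torch.tensor([3, 7])
item_cat_values = torch.tensor([[0, 5], [2, 9]])  # category feature value ids per item

item_repr = id_emb(item_ids) + cat_emb(item_cat_values)  # summed, not concatenated
```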

@blondered blondered merged commit eca31d4 into MobileTeleSystems:experimental/sasrec Dec 6, 2024
7 checks passed