
BERT4Rec tutorial #212

Merged

Conversation

spirinamayya
Contributor

Added BERT4Rec tutorial

@blondered blondered marked this pull request as ready for review November 27, 2024 11:31
@blondered
Collaborator

blondered commented Nov 27, 2024

Let's reorder and use a different structure. The tutorial is very hard to read and understand as currently structured.

Transformer models

<Introduction. What are we here for? Why should we use transformer models & why from RecTools?>
Table of Contents

  • Prepare data

  • SASRec & BERT4Rec

    • SASRec (remove the SASRec image from the paper because we will have a very similar one very soon)
    • BERT4Rec
    • Main differences (+ 2 imgs for attention direction)
  • RecTools implementation (features in a table: all about losses and other customization options, checklist for both models in two columns)

  • Models application

    • Basic usage (only the main well-known kwargs here: initialization + fit + recommend) (draw attention to mask_prob for BERT) (see the sketch after this list)
    • Adding item features to models (selecting item net block types) (init only)
    • Selecting losses (example for initialization with different losses and n_negatives) (init only)
    • Customizing the model (covering use_pos_emb, train_min_user_interaction, pos_encoding_type, transformer_layers_type, data_preparator_type, lightning_module_type, custom item_net_block_types, trainer) (not covering use_causal_attn and use_key_padding_mask - just drop them from the tutorial for now) (init only)
    • Cross-validation (from previously initialized models)
    • Item-to-item recommendations
    • SASRec with item ids embeddings in ItemNetBlock
    • SASRec with item ids and category features embeddings in ItemNetBlock
    • SASRec with category item features embeddings in ItemNetBlock
    • Additional details: inference tricks (items known to the model and inference for cold users)
  • Detailed SASRec and BERT4Rec description

    • Dataset processing
    • Transformer layers
    • Losses
  • Recommendations (we don't really need implicit ranker info)

  • Links (add Attention Is All You Need) (add the Sasha Petrov paper with info on which BERT4Rec implementation is the best)
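
For reference, a minimal sketch of what the proposed "Basic usage" subsection could look like, assuming the experimental SASRecModel / BERT4RecModel classes and the standard RecTools Dataset.construct / fit / recommend flow; the exact import paths and constructor defaults below are assumptions, not final tutorial code.

```python
# Sketch only: proposed "Basic usage" flow (assumed experimental API, not final tutorial code).
import pandas as pd

from rectools import Columns
from rectools.dataset import Dataset
from rectools.models import BERT4RecModel, SASRecModel  # experimental at the time of this PR

# Interactions in the standard RecTools format.
interactions = pd.DataFrame(
    {
        Columns.User: [1, 1, 1, 2, 2],
        Columns.Item: [10, 20, 30, 10, 40],
        Columns.Weight: [1, 1, 1, 1, 1],
        Columns.Datetime: pd.to_datetime(
            ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-01", "2024-01-02"]
        ),
    }
)
dataset = Dataset.construct(interactions)

# Only the main well-known kwargs here; mask_prob is BERT4Rec-specific.
sasrec = SASRecModel()
bert4rec = BERT4RecModel(mask_prob=0.15)

for model in (sasrec, bert4rec):
    model.fit(dataset)
    recos = model.recommend(users=[1, 2], dataset=dataset, k=10, filter_viewed=True)
```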

@blondered
Collaborator

" Item embeddings from these sequences are fed to multi-head self-attention to acquire user sequence latent representation"
That's not exactly true: embeddings are fed into transformer blocks, which have more logic inside than just multi-head self-attention.

@blondered
Collaborator

"After one or several stacked attention blocks, resulting embeddings are used to predict the next item."
Ambiguous.
The resulting embeddings are used to predict all items in the sequence, and each item is predicted based only on information from the previous items.
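
In other words, the prediction targets are shifted by one position and the attention is causal; a tiny PyTorch illustration of such a mask (illustration only, not RecTools internals):

```python
# Illustration only: a causal (look-ahead) attention mask, as used in SASRec-style training.
# Position i may attend only to positions 0..i, so each item is predicted from previous items only.
import torch

seq_len = 5
causal_mask = torch.triu(torch.ones(seq_len, seq_len), diagonal=1).bool()
# causal_mask[i, j] is True where attention from position i to position j (j > i) must be blocked.
print(causal_mask)
```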

@blondered
Collaborator

About the main differences in the table: let's go further.

  1. You didn't specify losses. The original implementations of these models used different losses.
  2. Some of the differences are conceptual (target and attention direction) and cannot be removed from the models, but others (losses, activation functions, transformer blocks) can be changed. Try to emphasize this explicitly; it really is important.

@blondered
Collaborator

Let's add references for our implementations to the RecTools implementation section (which implementations we took as reference points, what we changed). Let's also emphasize that we expose the number of negatives as a hyper-parameter.

@blondered
Collaborator

blondered commented Nov 29, 2024

Let's remove the def recommend helper from the basic usage section.
We merge recos with items only once, so there is no need for a separate function.
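
That is, a single inline merge in the notebook is enough, something like the line below (continuing from the recos frame in the basic-usage sketch above; the items frame and its "title" column are assumptions for illustration):

```python
# One-off merge of recommendations with item metadata, instead of a separate helper function.
recos_with_titles = recos.merge(items[[Columns.Item, "title"]], on=Columns.Item, how="left")
```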

@blondered
Collaborator

In the losses section, please add your own description of what is happening. What is the difference between the losses?
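
For context, a hedged sketch of what the losses subsection could show: the original SASRec paper trains with BCE over one sampled negative per position, the original BERT4Rec trains with softmax over the full catalog, and the RecTools implementation exposes the loss and the number of sampled negatives as hyper-parameters (the loss and n_negatives kwarg names below follow the PR discussion but are assumptions about the experimental API).

```python
# Sketch only: same model class, different training losses (kwarg names assumed).
from rectools.models import SASRecModel

sasrec_softmax = SASRecModel(loss="softmax")             # softmax over the full catalog (BERT4Rec-style)
sasrec_bce = SASRecModel(loss="BCE", n_negatives=1)      # BCE with sampled negatives (original SASRec-style)
sasrec_gbce = SASRecModel(loss="gBCE", n_negatives=256)  # gBCE (Petrov & Macdonald), also with sampled negatives
```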

@blondered
Collaborator

Let's remove the "TODO: Add category item features embeddings" note.
We will add them afterwards.

@blondered
Collaborator

For item-to-item recommendations, please provide a short description of what is happening under the hood. How do we recommend to items?
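
For context, i2i recommendations go through the standard recommend_to_items method; for these models the ranking is presumably based on similarity between item embeddings, and the description in the tutorial should confirm the details. A short sketch, reusing the fitted model and dataset from the basic-usage sketch above:

```python
# Sketch only: item-to-item recommendations via the standard RecTools API.
# `model` and `dataset` are the fitted model and dataset from the basic-usage sketch above.
i2i_recos = model.recommend_to_items(
    target_items=[10, 20],  # items to find similar items for
    dataset=dataset,
    k=10,
)
```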

@blondered
Collaborator

In "Adding item features to models", let's add a note that we do not support numerical item features for now. Let's also add a short note on how categorical features are used under the hood.
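
For reference, a sketch of passing categorical item features to the dataset and selecting item net blocks; the IdEmbeddingsItemNet / CatFeaturesItemNet names and the item_net_block_types kwarg follow the PR discussion, but the exact import path is an assumption.

```python
# Sketch only: categorical item features (numerical item features are not supported for now).
import pandas as pd

from rectools import Columns
from rectools.dataset import Dataset
from rectools.models import SASRecModel
from rectools.models.nn.item_net import CatFeaturesItemNet, IdEmbeddingsItemNet  # assumed path

# Item features in the "long" RecTools format: item_id / feature / value.
item_features = pd.DataFrame(
    {
        Columns.Item: [10, 20, 30, 40],
        "feature": ["genre"] * 4,
        "value": ["drama", "comedy", "drama", "action"],
    }
)
dataset = Dataset.construct(
    interactions_df=interactions,  # interactions frame from the basic-usage sketch above
    item_features_df=item_features,
    cat_item_features=["genre"],
)

# Use both id embeddings and categorical feature embeddings (summed per item inside the model).
model = SASRecModel(item_net_block_types=(IdEmbeddingsItemNet, CatFeaturesItemNet))
```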

@blondered
Collaborator

Transformer models -> Transformer models tutorial

@blondered
Collaborator

Categorical feature embeddings are dealt with separately from id embeddings and are concatenated after weight update.

  1. They are not concatenated. They are summed up.
  2. Why "after weight update"?

You can just write "Categorical feature embeddings are summed up with other embeddings for each item if they are present in the model."
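
Conceptually, something like this toy PyTorch snippet (illustration only, not the actual ItemNet code):

```python
# Illustration only: categorical feature embeddings are summed with id embeddings per item.
import torch

n_items, n_cat_values, dim = 100, 20, 8
id_emb = torch.nn.Embedding(n_items, dim)
cat_emb = torch.nn.EmbeddingBag(n_cat_values, dim, mode="sum")  # sums each item's category embeddings

item_ids = torch.tensor([3, 7])
item_cat_values = torch.tensor([[0, 5], [2, 9]])  # category feature value ids per item

item_repr = id_emb(item_ids) + cat_emb(item_cat_values)  # summed, not concatenated
```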

@blondered blondered merged commit eca31d4 into MobileTeleSystems:experimental/sasrec Dec 6, 2024
7 checks passed