Changed boldness in the table (#145)
Expertium authored Jan 1, 2025
1 parent 00c9e8e commit 6f97fdf
Showing 2 changed files with 16 additions and 10 deletions.
20 changes: 11 additions & 9 deletions README.md
@@ -48,8 +48,10 @@ We use three metrics in the SRS benchmark to evaluate how well these algorithms
- ACT-R: the model proposed in [this paper](http://act-r.psy.cmu.edu/wordpress/wp-content/themes/ACT-R/workshops/2003/proceedings/46.pdf). It models declarative memory as an activation-based system and explains the spacing effect through the activation of memory traces.
- HLR: the model proposed by Duolingo. Its full name is Half-Life Regression. For further information, please refer to [this paper](https://github.com/duolingo/halflife-regression).
- Transformer: a type of neural network that has gained popularity in recent years due to its superior performance in natural language processing. ChatGPT is based on this architecture. Both GRU and Transformer use the same power forgetting curve as FSRS-4.5 and FSRS-5 to make the comparison fairer (a sketch of this curve is given after this list).
-- SM-2: one of the early algorithms used by SuperMemo, the first spaced repetition software. It was developed more than 30 years ago, and it's still popular today. [Anki's default algorithm is based on SM-2](https://faqs.ankiweb.net/what-spaced-repetition-algorithm.html), [Mnemosyne](https://mnemosyne-proj.org/principles.php) also uses it. This algorithm does not predict the probability of recall natively; therefore, for the sake of the benchmark, the output was modified based on some assumptions about the forgetting curve.
-- SM-2-trainable: a variant of SM-2 where the parameters are trainable.
+- SM-2: one of the early algorithms used by SuperMemo, the first spaced repetition software. It was developed more than 30 years ago, and it's still popular today. [Anki's default algorithm is based on SM-2](https://faqs.ankiweb.net/what-spaced-repetition-algorithm.html), and [Mnemosyne](https://mnemosyne-proj.org/principles.php) also uses it. This algorithm does not predict the probability of recall natively; therefore, for the sake of the benchmark, the output was modified based on some assumptions about the forgetting curve (see the sketch after this list). The algorithm is described by Piotr Wozniak [here](https://super-memory.com/english/ol/sm2.htm).
+- SM-2 trainable: the SM-2 algorithm with optimizable parameters.
+- Anki: a variant of the SM-2 algorithm that is used in Anki.
+- Anki trainable: the Anki algorithm with optimizable parameters.
- NN-17: a neural network approximation of [SM-17](https://supermemo.guru/wiki/Algorithm_SM-17). It has a comparable number of parameters, and according to our estimates, it performs similarly to SM-17.
- Ebisu v2: [an algorithm that uses Bayesian statistics](https://fasiha.github.io/ebisu/) to update its estimate of memory half-life after every review.
- AVG: an "algorithm" that outputs a constant equal to the user's average retention. Has no practical applications and is intended only to serve as a baseline.
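To ground the descriptions above, here is a minimal sketch in Python of two things referenced in this list: the power forgetting curve shared by FSRS-4.5 and FSRS-5, and the classic SM-2 update from Wozniak's write-up linked above. This is an illustration, not the benchmark's actual code: the function names are invented for this sketch, and the curve constants (factor 19/81, decay −0.5, chosen so that R(S, S) = 0.9) are assumed from the published FSRS formulas.

```python
# A minimal, illustrative sketch -- not the benchmark's actual code.


def power_forgetting_curve(elapsed_days: float, stability: float) -> float:
    """Power curve assumed here for FSRS-4.5/FSRS-5: R(t, S) = (1 + 19/81 * t/S) ** -0.5.

    The constants are chosen so that R(S, S) = 0.9, i.e. stability is the
    interval at which the predicted probability of recall drops to 90%.
    """
    return (1 + 19 / 81 * elapsed_days / stability) ** -0.5


def sm2_review(ease: float, reps: int, interval: float, quality: int) -> tuple[float, int, float]:
    """One SM-2 review step, following Wozniak's description.

    quality is the self-graded response quality on a 0-5 scale;
    returns the updated (ease, reps, interval_in_days).
    """
    if quality < 3:
        # Failed recall: repetitions restart from I(1) = 1 day;
        # the original description leaves the E-Factor unchanged on a lapse.
        return ease, 1, 1.0
    # Successful recall: adjust the E-Factor, floored at 1.3 ...
    ease = max(1.3, ease + (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)))
    # ... and grow the interval: I(1) = 1, I(2) = 6, I(n) = I(n-1) * EF.
    reps += 1
    if reps == 1:
        interval = 1.0
    elif reps == 2:
        interval = 6.0
    else:
        interval = interval * ease
    return ease, reps, interval
```

Note that `sm2_review` only schedules intervals; turning that state into a probability of recall for the benchmark requires the extra forgetting-curve assumptions mentioned in the SM-2 entry above.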
@@ -94,11 +96,11 @@ The following tables present the means and the 99% confidence intervals. The bes
| ACT-R | 5 | 0.362±0.0089 | 0.086±0.0024 | 0.534±0.0054 |
| FSRS v1 | 7 | 0.40±0.011 | 0.086±0.0024 | 0.633±0.0046 |
| AVG | 0 | 0.363±0.0090 | 0.088±0.0025 | 0.508±0.0046 |
-| Anki | 7 | 0.41±0.011 | 0.094±0.0030 | 0.616±0.0057 |
+| Anki trainable | 7 | 0.41±0.011 | 0.094±0.0030 | 0.616±0.0057 |
| HLR | 3 | 0.41±0.012 | 0.105±0.0030 | 0.633±0.0050 |
| HLR-short | 3 | 0.44±0.013 | 0.116±0.0036 | 0.615±0.0062 |
-| SM2-trainable | 6 | 0.44±0.012 | 0.119±0.0033 | 0.599±0.0050 |
-| Anki default param. | 0 | 0.49±0.015 | 0.128±0.0037 | 0.597±0.0055 |
+| SM-2 trainable | 6 | 0.44±0.012 | 0.119±0.0033 | 0.599±0.0050 |
+| Anki | 0 | 0.49±0.015 | 0.128±0.0037 | 0.597±0.0055 |
| SM-2-short | 0 | 0.51±0.015 | 0.128±0.0038 | 0.593±0.0064 |
| SM-2 | 0 | 0.55±0.017 | 0.148±0.0041 | 0.600±0.0051 |
| Ebisu-v2 | 0 | 0.46±0.012 | 0.158±0.0038 | 0.594±0.0050 |
@@ -110,7 +112,7 @@ The following tables present the means and the 99% confidence intervals. The bes
| --- | --- | --- | --- | --- |
| **GRU-P-short** | 297 | **0.346±0.0042** | **0.062±0.0011** | 0.699±0.0026 |
| GRU-P | 297 | 0.352±0.0042 | 0.063±0.0011 | 0.687±0.0025 |
-| FSRS-5 recency | 19 | 0.355±0.0043 | 0.072±0.0012 | 0.701±0.0023 |
+| **FSRS-5 recency** | 19 | 0.355±0.0043 | 0.072±0.0012 | **0.701±0.0023** |
| FSRS-rs | 19 | 0.356±0.0045 | 0.074±0.0012 | 0.698±0.0023 |
| FSRS-5 | 19 | 0.357±0.0043 | 0.074±0.0012 | 0.699±0.0023 |
| FSRS-5 preset | 19 | 0.358±0.0045 | 0.074±0.0012 | 0.699±0.0023 |
@@ -133,12 +135,12 @@ The following tables present the means and the 99% confidence intervals. The bes
| HLR | 3 | 0.469±0.0073 | 0.128±0.0019 | 0.637±0.0026 |
| FSRS v1 | 7 | 0.491±0.0080 | 0.132±0.0022 | 0.630±0.0025 |
| HLR-short | 3 | 0.493±0.0079 | 0.140±0.0021 | 0.611±0.0029 |
-| Anki | 7 | 0.513±0.0089 | 0.140±0.0024 | 0.618±0.0023 |
+| Anki trainable | 7 | 0.513±0.0089 | 0.140±0.0024 | 0.618±0.0023 |
| Ebisu-v2 | 0 | 0.499±0.0078 | 0.163±0.0021 | 0.605±0.0026 |
| Transformer | 127 | 0.468±0.0059 | 0.167±0.0022 | 0.531±0.0030 |
-| SM2-trainable | 6 | 0.58±0.012 | 0.170±0.0028 | 0.597±0.0025 |
+| SM-2 trainable | 6 | 0.58±0.012 | 0.170±0.0028 | 0.597±0.0025 |
| SM-2-short | 0 | 0.65±0.015 | 0.170±0.0028 | 0.590±0.0027 |
-| Anki default param. | 0 | 0.62±0.011 | 0.172±0.0026 | 0.613±0.0022 |
+| Anki | 0 | 0.62±0.011 | 0.172±0.0026 | 0.613±0.0022 |
| SM-2 | 0 | 0.72±0.017 | 0.203±0.0030 | 0.603±0.0025 |

Averages weighted by the number of reviews are more representative of "best case" performance when plenty of data is available. Since almost all algorithms perform better when there's a lot of data to learn from, weighting by n(reviews) biases the average towards lower (i.e. better) values.
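As a toy illustration of that bias (the numbers below are made up for this example, not taken from the benchmark): users with large collections tend to have lower, i.e. better, RMSE, so they pull the weighted mean down.

```python
# Hypothetical per-user results, for illustration only.
n_reviews = [500, 5_000, 50_000]  # collection sizes in reviews
rmse = [0.12, 0.07, 0.04]  # RMSE (bins); larger collections fit better

unweighted = sum(rmse) / len(rmse)
weighted = sum(n * r for n, r in zip(n_reviews, rmse)) / sum(n_reviews)

print(f"unweighted mean: {unweighted:.4f}")  # 0.0767
print(f"weighted mean:   {weighted:.4f}")    # 0.0434
```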
6 changes: 5 additions & 1 deletion superiority.py
@@ -128,21 +128,25 @@
# small changes to labels
index_5_dry_run = models.index("FSRS-5-dry-run")
index_anki_dry_run = models.index("Anki-dry-run")
+index_anki_train = models.index("Anki")
index_5_pretrain = models.index("FSRS-5-pretrain")
index_v4 = models.index("FSRSv4")
index_v3 = models.index("FSRSv3")
index_v2 = models.index("FSRSv2")
index_v1 = models.index("FSRSv1")
index_sm2 = models.index("SM2")
+index_sm2_train = models.index("SM2-trainable")
index_sm2_short = models.index("SM2-short")
models[index_5_dry_run] = "FSRS-5 \n def. param."
models[index_anki_dry_run] = "Anki \n def. param."
+models[index_anki_train] = "Anki \n trainable"
models[index_5_pretrain] = "FSRS-5 \n pretrain"
models[index_v4] = "FSRS v4"
models[index_v3] = "FSRS v3"
models[index_v2] = "FSRS v2"
models[index_v1] = "FSRS v1"
-models[index_sm2] = "SM-2"
+models[index_sm2] = "SM-2 \n def. param."
+models[index_sm2_train] = "SM-2 trainable"
models[index_sm2_short] = "SM-2-short"

fig, ax = plt.subplots(figsize=(16, 16), dpi=200)
