Changed boldness in the table (#145)
Expertium authored Jan 1, 2025
1 parent 00c9e8e commit 6f97fdf
Showing 2 changed files with 16 additions and 10 deletions.
20 changes: 11 additions & 9 deletions README.md
@@ -48,8 +48,10 @@ We use three metrics in the SRS benchmark to evaluate how well these algorithms
- ACT-R: the model proposed in [this paper](http://act-r.psy.cmu.edu/wordpress/wp-content/themes/ACT-R/workshops/2003/proceedings/46.pdf). It models declarative memory as an activation-based system and explains the spacing effect through the activation of memory traces.
- HLR: the model proposed by Duolingo. Its full name is Half-Life Regression. For further information, please refer to [this paper](https://github.com/duolingo/halflife-regression).
- Transformer: a type of neural network that has gained popularity in recent years due to its superior performance in natural language processing. ChatGPT is based on this architecture. Both GRU and Transformer use the same power forgetting curve as FSRS-4.5 and FSRS-5 to make the comparison fairer (a sketch of this curve is given after this list).
-- SM-2: one of the early algorithms used by SuperMemo, the first spaced repetition software. It was developed more than 30 years ago, and it's still popular today. [Anki's default algorithm is based on SM-2](https://faqs.ankiweb.net/what-spaced-repetition-algorithm.html), [Mnemosyne](https://mnemosyne-proj.org/principles.php) also uses it. This algorithm does not predict the probability of recall natively; therefore, for the sake of the benchmark, the output was modified based on some assumptions about the forgetting curve.
-- SM-2-trainable: a variant of SM-2 where the parameters are trainable.
+- SM-2: one of the early algorithms used by SuperMemo, the first spaced repetition software. It was developed more than 30 years ago, and it's still popular today. [Anki's default algorithm is based on SM-2](https://faqs.ankiweb.net/what-spaced-repetition-algorithm.html), and [Mnemosyne](https://mnemosyne-proj.org/principles.php) also uses it. This algorithm does not predict the probability of recall natively; therefore, for the sake of the benchmark, the output was modified based on some assumptions about the forgetting curve (see the sketch after this list). The algorithm is described by Piotr Wozniak [here](https://super-memory.com/english/ol/sm2.htm).
+- SM-2 trainable: the SM-2 algorithm with optimizable parameters.
+- Anki: a variant of the SM-2 algorithm that is used in Anki.
+- Anki trainable: the Anki algorithm with optimizable parameters.
- NN-17: a neural network approximation of [SM-17](https://supermemo.guru/wiki/Algorithm_SM-17). It has a comparable number of parameters, and according to our estimates, it performs similarly to SM-17.
- Ebisu v2: [an algorithm that uses Bayesian statistics](https://fasiha.github.io/ebisu/) to update its estimate of memory half-life after every review.
- AVG: an "algorithm" that outputs a constant equal to the user's average retention. Has no practical applications and is intended only to serve as a baseline.
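To ground the descriptions above, here is a minimal sketch in Python of two things referenced in this list: the power forgetting curve shared by FSRS-4.5 and FSRS-5, and the classic SM-2 update from Wozniak's write-up linked above. This is an illustration, not the benchmark's actual code: the function names are invented for this sketch, and the curve constants (factor 19/81, decay −0.5, chosen so that R(S, S) = 0.9) are assumed from the published FSRS formulas.

```python
# A minimal, illustrative sketch -- not the benchmark's actual code.


def power_forgetting_curve(elapsed_days: float, stability: float) -> float:
    """Power curve assumed here for FSRS-4.5/FSRS-5: R(t, S) = (1 + 19/81 * t/S) ** -0.5.

    The constants are chosen so that R(S, S) = 0.9, i.e. stability is the
    interval at which the predicted probability of recall drops to 90%.
    """
    return (1 + 19 / 81 * elapsed_days / stability) ** -0.5


def sm2_review(ease: float, reps: int, interval: float, quality: int) -> tuple[float, int, float]:
    """One SM-2 review step, following Wozniak's description.

    quality is the self-graded response quality on a 0-5 scale;
    returns the updated (ease, reps, interval_in_days).
    """
    if quality < 3:
        # Failed recall: repetitions restart from I(1) = 1 day;
        # the original description leaves the E-Factor unchanged on a lapse.
        return ease, 1, 1.0
    # Successful recall: adjust the E-Factor, floored at 1.3 ...
    ease = max(1.3, ease + (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)))
    # ... and grow the interval: I(1) = 1, I(2) = 6, I(n) = I(n-1) * EF.
    reps += 1
    if reps == 1:
        interval = 1.0
    elif reps == 2:
        interval = 6.0
    else:
        interval = interval * ease
    return ease, reps, interval
```

Note that `sm2_review` only schedules intervals; turning that state into a probability of recall for the benchmark requires the extra forgetting-curve assumptions mentioned in the SM-2 entry above.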
@@ -94,11 +96,11 @@ The following tables present the means and the 99% confidence intervals. The bes
| ACT-R | 5 | 0.362±0.0089 | 0.086±0.0024 | 0.534±0.0054 |
| FSRS v1 | 7 | 0.40±0.011 | 0.086±0.0024 | 0.633±0.0046 |
| AVG | 0 | 0.363±0.0090 | 0.088±0.0025 | 0.508±0.0046 |
-| Anki | 7 | 0.41±0.011 | 0.094±0.0030 | 0.616±0.0057 |
+| Anki trainable | 7 | 0.41±0.011 | 0.094±0.0030 | 0.616±0.0057 |
| HLR | 3 | 0.41±0.012 | 0.105±0.0030 | 0.633±0.0050 |
| HLR-short | 3 | 0.44±0.013 | 0.116±0.0036 | 0.615±0.0062 |
-| SM2-trainable | 6 | 0.44±0.012 | 0.119±0.0033 | 0.599±0.0050 |
-| Anki default param. | 0 | 0.49±0.015 | 0.128±0.0037 | 0.597±0.0055 |
+| SM-2 trainable | 6 | 0.44±0.012 | 0.119±0.0033 | 0.599±0.0050 |
+| Anki | 0 | 0.49±0.015 | 0.128±0.0037 | 0.597±0.0055 |
| SM-2-short | 0 | 0.51±0.015 | 0.128±0.0038 | 0.593±0.0064 |
| SM-2 | 0 | 0.55±0.017 | 0.148±0.0041 | 0.600±0.0051 |
| Ebisu-v2 | 0 | 0.46±0.012 | 0.158±0.0038 | 0.594±0.0050 |
@@ -110,7 +112,7 @@ The following tables present the means and the 99% confidence intervals. The bes
| --- | --- | --- | --- | --- |
| **GRU-P-short** | 297 | **0.346±0.0042** | **0.062±0.0011** | 0.699±0.0026 |
| GRU-P | 297 | 0.352±0.0042 | 0.063±0.0011 | 0.687±0.0025 |
-| FSRS-5 recency | 19 | 0.355±0.0043 | 0.072±0.0012 | 0.701±0.0023 |
+| **FSRS-5 recency** | 19 | 0.355±0.0043 | 0.072±0.0012 | **0.701±0.0023** |
| FSRS-rs | 19 | 0.356±0.0045 | 0.074±0.0012 | 0.698±0.0023 |
| FSRS-5 | 19 | 0.357±0.0043 | 0.074±0.0012 | 0.699±0.0023 |
| FSRS-5 preset | 19 | 0.358±0.0045 | 0.074±0.0012 | 0.699±0.0023 |
@@ -133,12 +135,12 @@ The following tables present the means and the 99% confidence intervals. The bes
| HLR | 3 | 0.469±0.0073 | 0.128±0.0019 | 0.637±0.0026 |
| FSRS v1 | 7 | 0.491±0.0080 | 0.132±0.0022 | 0.630±0.0025 |
| HLR-short | 3 | 0.493±0.0079 | 0.140±0.0021 | 0.611±0.0029 |
-| Anki | 7 | 0.513±0.0089 | 0.140±0.0024 | 0.618±0.0023 |
+| Anki trainable | 7 | 0.513±0.0089 | 0.140±0.0024 | 0.618±0.0023 |
| Ebisu-v2 | 0 | 0.499±0.0078 | 0.163±0.0021 | 0.605±0.0026 |
| Transformer | 127 | 0.468±0.0059 | 0.167±0.0022 | 0.531±0.0030 |
-| SM2-trainable | 6 | 0.58±0.012 | 0.170±0.0028 | 0.597±0.0025 |
+| SM-2 trainable | 6 | 0.58±0.012 | 0.170±0.0028 | 0.597±0.0025 |
| SM-2-short | 0 | 0.65±0.015 | 0.170±0.0028 | 0.590±0.0027 |
-| Anki default param. | 0 | 0.62±0.011 | 0.172±0.0026 | 0.613±0.0022 |
+| Anki | 0 | 0.62±0.011 | 0.172±0.0026 | 0.613±0.0022 |
| SM-2 | 0 | 0.72±0.017 | 0.203±0.0030 | 0.603±0.0025 |

Averages weighted by the number of reviews are more representative of "best case" performance when plenty of data is available. Since almost all algorithms perform better when there's a lot of data to learn from, weighting by n(reviews) biases the average towards lower (i.e. better) values.
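As a toy illustration of that bias (the numbers below are made up for this example, not taken from the benchmark): users with large collections tend to have lower, i.e. better, RMSE, so they pull the weighted mean down.

```python
# Hypothetical per-user results, for illustration only.
n_reviews = [500, 5_000, 50_000]  # collection sizes in reviews
rmse = [0.12, 0.07, 0.04]  # RMSE (bins); larger collections fit better

unweighted = sum(rmse) / len(rmse)
weighted = sum(n * r for n, r in zip(n_reviews, rmse)) / sum(n_reviews)

print(f"unweighted mean: {unweighted:.4f}")  # 0.0767
print(f"weighted mean:   {weighted:.4f}")    # 0.0434
```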
6 changes: 5 additions & 1 deletion superiority.py
@@ -128,21 +128,25 @@
# small changes to labels
index_5_dry_run = models.index("FSRS-5-dry-run")
index_anki_dry_run = models.index("Anki-dry-run")
+index_anki_train = models.index("Anki")
index_5_pretrain = models.index("FSRS-5-pretrain")
index_v4 = models.index("FSRSv4")
index_v3 = models.index("FSRSv3")
index_v2 = models.index("FSRSv2")
index_v1 = models.index("FSRSv1")
index_sm2 = models.index("SM2")
+index_sm2_train = models.index("SM2-trainable")
index_sm2_short = models.index("SM2-short")
models[index_5_dry_run] = "FSRS-5 \n def. param."
models[index_anki_dry_run] = "Anki \n def. param."
+models[index_anki_train] = "Anki \n trainable"
models[index_5_pretrain] = "FSRS-5 \n pretrain"
models[index_v4] = "FSRS v4"
models[index_v3] = "FSRS v3"
models[index_v2] = "FSRS v2"
models[index_v1] = "FSRS v1"
-models[index_sm2] = "SM-2"
+models[index_sm2] = "SM-2 \n def. param."
+models[index_sm2_train] = "SM-2 trainable"
models[index_sm2_short] = "SM-2-short"

fig, ax = plt.subplots(figsize=(16, 16), dpi=200)
