Collection of low priority & breaking ideas for FSRS #300

Closed
user1823 opened this issue Jun 15, 2023 · 31 comments
Labels
enhancement New feature or request

Comments

@user1823
Collaborator

This issue serves to collect low-priority ideas that would break compatibility with the current version of the scheduler and helper. So, these should be implemented only when we release a new major version of FSRS.

  • For the sake of consistency, it is suggested to either replace $w_9$ in the stability_after_lapse formula with $e^{w_9}$ or replace $e^{w_6}$ in the stability_after_recall formula with $w_6$ (see the sketch after this list). I earlier suggested replacing $e^{w_6}$ in the stability_after_recall formula with $w_6$ because that is clearer. But I now think that it might lead to worse results: after this change, the value of $w_6$ would take more reviews to converge. So, a better alternative might be to replace $w_9$ in the stability_after_lapse formula with $e^{w_9}$.
  • For the sake of clarity, it is suggested that all formulas be rewritten so that all the parameters are always positive. For example, if you have $D^w$ (where $w$ is a parameter) and $w$ is always positive - good, don't change anything. If $w$ is always negative - rewrite the formula as $D^{-w}$ and keep $w$ positive. This would make the wiki easier to read. This change has already been implemented in the experimental versions of the optimizer, but I am still mentioning it because it has not yet been implemented in the stable version.
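
A minimal sketch of the trade-off between the two parameterizations, using toy formulas rather than the actual FSRS equations (all function names below are invented for illustration):

import math

# Toy illustration only; these are NOT the real FSRS formulas.

# Raw multiplicative parameter: easy to read, but the optimizer must
# keep w positive itself (e.g. by clamping).
def stability_raw(s, w):
    return w * s

# Exponential parameterization: e^w is positive for every real w, so
# the optimizer can search over the whole real line without constraints.
def stability_exp(s, w):
    return math.exp(w) * s

# Sign convention from the second bullet: if a parameter is always
# negative in D^w, store it as a positive number and write D^-w instead.
def difficulty_term(d, w):
    return d ** -w  # w > 0 by convention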
@user1823 user1823 added the enhancement New feature or request label Jun 15, 2023
@Expertium
Collaborator

We had an issue like that before. I'm pretty sure these 2 ideas are already implemented.
Though I guess we can keep this one for things that aren't related to improving the algorithm and accuracy.

@Expertium
Collaborator

Sherlock didn't close this issue, so I'll leave it here too.
#295

@Expertium
Collaborator

A minor suggestion. I think renaming these options to "Reschedule all cards" and "Reschedule recently reviewed cards" would be nice.
[screenshot of the two rescheduling options]

@Expertium
Collaborator

Expertium commented Jun 29, 2023

@L-M-Sherlock I recommend removing these 2 graphs
[screenshots of the two graphs]
First of all, we already have a calibration graph. Second, they are harder to read than the first calibration graph. Third, if I want to know how well the algorithm performs for different levels of S and D, I just look at the B-W Heatmap.

@Expertium
Collaborator

Expertium commented Jun 29, 2023

Unrelated, but I just wanted to ask: is it still necessary to have this code in each card type's template?
<div id=deck deck_name="{{Deck}}"></div>

@L-M-Sherlock
Member

L-M-Sherlock commented Jun 30, 2023

I recommend removing these 2 graphs

I need them to check the distribution of stability and difficulty.

Unrelated, but I just wanted to ask: is it still necessary to have this code in each card type's template?

If your Anki version is higher than 2.1.62, you don't need to add that code.

@user1823
Collaborator Author

user1823 commented Jun 30, 2023

A major problem with releasing a new major version is that it would force all users to retrain their parameters.

Today, I got an idea that would make this problem less severe.

No action is needed right now, but knowing that the following solution exists should reduce resistance to releasing the next major version of the algorithm.

The scheduler.js code is not a problem because a user can continue to use the older scheduler with the older parameters.

The main issue is with the helper add-on. If the helper add-on insists on the presence of a newer version of the scheduler, the user would have to update both the scheduler and the parameters.

The solution lies in updating the add-on while maintaining backward compatibility.

With the update, the main changes would be related to reschedule.py. So, I recommend creating a new copy of that file and using one file for scheduler v3 and one for scheduler v4 (see the sketch below).

Other features (such as postpone, advance, disperse, stats, browser) would have minimal changes. So, the code for dealing with both the v3 and v4 schedulers can be incorporated in the same file.

One obvious problem with this approach is that any bug fixes to reschedule.py would need to be applied to both files. But this is not a major problem in my opinion, because the two files would remain very similar.

Eventually, we would drop support for the v3 FSRS scheduler, reducing the maintenance load.
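
A minimal sketch of the proposed dispatch inside the helper add-on (the module and function names below are invented for illustration):

# Hypothetical version dispatch; reschedule_v3 / reschedule_v4 stand for
# the two near-identical copies of reschedule.py proposed above.
def reschedule(cards, scheduler_version):
    if scheduler_version >= 4:
        from reschedule_v4 import reschedule_cards  # expects v4 parameters
    else:
        from reschedule_v3 import reschedule_cards  # expects v3 parameters
    return reschedule_cards(cards)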

Another important change, which is related to the optimizer:

Just after the output parameters (in section 2.3), add a message: "Note: These values should be used with FSRS scheduler v4.0 or above."

@Expertium
Collaborator

Expertium commented Jun 30, 2023

Other features (such as postpone, advance, disperse, stats, browser) would have minimal changes

No, it's actually the opposite. Once the R-matrix is implemented, all calculations involving R (postpone, advance, average retention in Stats, etc.) must use weighted R (a weighted average of the theoretical R and its corresponding R-matrix entry). This will require major changes to the code.

A major problem with releasing a new major version is that it would force all users to retrain their parameters.

Today, I got an idea that would make this problem less severe.

I think you are overestimating how much of a problem that is. For people who are already using FSRS, optimizing parameters is a familiar task; they won't mind. For people who are not using FSRS right now, this process will be just as unfamiliar in v4 as in v3.

Backward compatibility is a good idea though.

@user1823
Collaborator Author

Other features (such as postpone, advance, disperse, stats, browser) would have minimal changes

Once the R-matrix is implemented, all calculations involving R (postpone, advance, average retention in Stats, etc.) must use weighted R (a weighted average of the theoretical R and its corresponding R-matrix entry). This will require major changes to the code.

Well, I didn't take the R-matrix into account (partly because I don't completely understand it).

A major problem with releasing a new major version is that it would force all users to retrain their parameters.
Today, I got an idea that would make this problem less severe.

I think you are overestimating how much of a problem that is. For people who are already using FSRS, optimizing parameters is a familiar task; they won't mind.

But nobody would like an update (to the helper add-on) that forces them to immediately update the scheduler code and immediately re-optimize the parameters.

@Expertium
Collaborator

Well, I didn't take the R-matrix into account (partly because I don't completely understand it).

Here's the important part.
First, we group reviews into categories according to their D, S, and predicted (theoretical) R.
Then, within each category, we calculate measured R: take the reviews that fall within the category and do the usual thing, treating "Again" as 0 and the other grades as 1.
Now we have a second estimate of R, so there are two in total: the theoretical one and the R-matrix entry corresponding to this D, this S, and this theoretical R.
Finally, we take a weighted average of the two to obtain the best available estimate of R. The weight depends on the number of reviews that the matrix entry is based on: an entry based on 5 reviews will have very little weight, while an entry based on 300 or more reviews will have the maximum possible weight (the number 300 is somewhat arbitrary).
The logic is that this lets us correct bad theoretical predictions: if theoretical R is systematically over- or underestimated, the R-matrix can help alleviate the problem (a minimal sketch follows).
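
A minimal sketch of the weighted average in Python (the linear weighting curve is an assumption for illustration; only the 300-review cap comes from the description above):

def weighted_r(theoretical_r, matrix_r, n_reviews, full_weight_at=300):
    # Weight grows with the amount of data behind the matrix entry:
    # 0 with no reviews, capped from 300 reviews onward (whether the
    # cap should be exactly 1 is also an assumption).
    w = min(n_reviews / full_weight_at, 1.0)
    return w * matrix_r + (1 - w) * theoretical_r

print(weighted_r(0.90, 0.80, 5))    # ~0.898: 5 reviews barely move the estimate
print(weighted_r(0.90, 0.80, 300))  # 0.80: 300+ reviews dominate it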

@Expertium
Collaborator

@L-M-Sherlock I asked this a long time ago, but I still want to ask: do you have any recommendations at all regarding how often parameters should be re-optimized? Anything at all, even a simple rule of thumb like "4 times per year for collections younger than 1 year, 2 times per year otherwise". Of course, a formula would be better.
The reason I'm asking is not just that I've been wondering about it myself, but also that I've seen people on Reddit ask this a few times, so I assume many people have this question.

@L-M-Sherlock
Member

@L-M-Sherlock I asked this a long time ago, but I still want to ask: do you have any recommendations at all regarding how often parameters should be re-optimized? Anything at all, even a simple rule of thumb like "4 times per year for collections younger than 1 year, 2 times per year otherwise". Of course, a formula would be better.

In other forums, I often say that once a month is enough.

@Expertium
Collaborator

Expertium commented Jul 1, 2023

Sherlock, I also have a bit of an odd question - how do you reply so fast to questions related to FSRS on r/Anki? Like here: https://www.reddit.com/r/Anki/comments/14nrrur/fsrs_ankidroid_compatability/
Either you browse r/Anki every hour, or you have some kind of script or whatever that notifies you whenever someone makes a post that has "FSRS" in it (in the title and/or in the text of the post). That would be pretty cool, actually.

@Expertium
Collaborator

Expertium commented Jul 1, 2023

Also, just for fun, let's add Memrise's "algorithm" to the comparison.

import numpy as np

# Memrise's fixed ladder of intervals (in days): each successful review
# advances one step, and a lapse resets to the first step.
INTERVALS = [1, 6, 12, 24, 48, 96, 180]

def memrise(history):
    ivl = 0
    reps = 0
    for delta_t, rating in history:
        delta_t = delta_t.item()    # unused by this model; kept for symmetry
        rating = rating.item() + 1  # stored ratings are zero-based
        if rating > 1:              # any grade except "Again" is a success
            reps = min(reps + 1, len(INTERVALS))
            ivl = INTERVALS[reps - 1]
        else:                       # lapse
            ivl = 1
            reps = 1
    return ivl

dataset['memrise_interval'] = dataset['tensor'].map(memrise)
dataset['memrise_p'] = np.exp(np.log(0.9) * dataset['delta_t'] / dataset['memrise_interval'])

It's that simple. It has no adaptive properties at all, just constant intervals.
https://memrise.zendesk.com/hc/en-us/articles/360015889057-How-does-the-spaced-repetition-system-work-

@Expertium
Collaborator

Expertium commented Jul 1, 2023

Huh, I'm surprised. It's not that much worse than SM2; I thought it would be far worse. I'm using Sherlock's collection and a version of 4.0 Alpha with some changes.
[screenshot of the comparison results]

@L-M-Sherlock
Member

Sherlock, I also have a bit of an odd question - how do you reply so fast to questions related to FSRS on r/Anki? Like here: https://www.reddit.com/r/Anki/comments/14nrrur/fsrs_ankidroid_compatability/ Either you browse r/Anki every hour, or you have some kind of script or whatever that notifies you whenever someone makes a post that has "FSRS" in it (in the title and/or in the text of the post). That would be pretty cool, actually.

I run the query FSRS Anki site:www.reddit.com and limit the time to the past 24 hours via the Google Search API.
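
Something like the following, using Google's Custom Search JSON API (the key and search-engine id are placeholders; the exact setup may differ):

import requests

# Ask Google for posts mentioning FSRS on Reddit from the past 24 hours.
resp = requests.get(
    "https://www.googleapis.com/customsearch/v1",
    params={
        "key": "YOUR_API_KEY",          # placeholder
        "cx": "YOUR_SEARCH_ENGINE_ID",  # placeholder
        "q": "FSRS Anki site:www.reddit.com",
        "dateRestrict": "d1",           # restrict results to the past day
    },
)
for item in resp.json().get("items", []):
    print(item["title"], item["link"])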

Huh, I'm surprised. It's not that much worse than SM2; I thought it would be far worse. I'm using Sherlock's collection and a version of 4.0 Alpha with some changes.

I have another surprising experiment:

https://github.com/open-spaced-repetition/fsrs4anki/blob/Expt/new-baseline/experiment/RNN.ipynb

FSRS with 13 parameters approaches the performance of an LSTM/GRU with hundreds of parameters.

@Expertium
Collaborator

Expertium commented Jul 1, 2023

I run the query FSRS Anki site:www.reddit.com and limit the time to the past 24 hours via the Google Search API.

I was actually trying something similar: site:reddit.com/r/anki "fsrs" (paste it into Google and limit results to 24 hours), but for some reason it doesn't show some posts.
[screenshot of the search results]

https://github.com/open-spaced-repetition/fsrs4anki/blob/Expt/new-baseline/experiment/RNN.ipynb

FSRS with 13 parameters approaches the performance of an LSTM/GRU with hundreds of parameters.

I'm actually not all that surprised. Thanks to Woz we know that it's possible to make a good algorithm that doesn't involve neural networks.
Out of curiosity, would you make an LSTM with 1000 (or the closest number to 1000) parameters and add it to the comparison? I know it will take a while to optimize such a thing, but I still want to have this option for the sake of future comparisons once v4 is released.

@L-M-Sherlock
Member

L-M-Sherlock commented Jul 1, 2023

I'm actually not all that surprised. Thanks to Woz we know that it's possible to make a good algorithm that doesn't involve neural networks. Out of curiosity, would you make an LSTM with 1000 (or the closest number to 1000) parameters and add it to the comparison? I know it will take a while to optimize such a thing, but I still want to have this option for the sake of future comparisons once v4 is released.

Maybe later. And I have an extra finding:

[screenshot of the intervals predicted by the neural network]

The neural network produces some weird intervals in the long term: the interval stops increasing.

@Expertium
Collaborator

Expertium commented Jul 1, 2023

The neural network produces some weird intervals in the long term: the interval stops increasing.

Isn't that to be expected? Sinc is proportional to S^-w, so Sinc tends to decrease as S itself increases. If anything, I'm actually impressed that the neural net figured it out. Although I would assume that the limit of S would be more like 20 years, not 7 months.
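
A toy illustration of why the curve flattens (the update rule and constants below are made up, not the actual FSRS formula):

# If each successful review multiplies stability by SInc = 1 + c * S^(-w),
# the multiplier shrinks toward 1 as S grows, so growth slows down.
c, w = 5.0, 0.5
s = 1.0
for i in range(10):
    sinc = 1 + c * s ** (-w)
    s *= sinc
    print(f"review {i + 1}: SInc = {sinc:.2f}, S = {s:.1f}")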

@L-M-Sherlock
Member

L-M-Sherlock commented Jul 1, 2023

Isn't that to be expected? Sinc is proportional to S^-w, so Sinc tends to decrease as S itself increases. If anything, I'm actually impressed that the neural net figured it out. Although I would assume that the limit of S would be more like 20 years, not 7 months.

It may depend on the intervals in my collection. If I used a new user's collection, the limit would be even shorter.

@Expertium
Collaborator

I tried the LSTM (181 parameters). It's having trouble with my collection, just like FSRS.
[screenshot of the LSTM results]

@Expertium
Collaborator

Expertium commented Jul 1, 2023

Also, it doesn't seem like running a model with 1000 parameters is possible; Google Colab runs out of RAM way before that.

@L-M-Sherlock
Member

Also, it doesn't seem like running a model with 1000 parameters is possible; Google Colab runs out of RAM way before that.

I updated the notebook. Now it shows the result of an LSTM with 1489 weights.

I tried the LSTM (181 parameters). It's having trouble with my collection, just like FSRS.

Maybe we need to pay more attention to the data instead of the model. There may be inherent conflicts in your data.

@Expertium
Collaborator

Maybe we need to pay more attention to the data instead of the model. There may be inherent conflicts in your data.

Well, we've tried in the past, but got nowhere.

@Expertium
Collaborator

Expertium commented Jul 3, 2023

I noticed that the train and test loss are always the same for the neural net. I wonder if it's a bug (it happened before you updated it too).
[screenshot of the loss graph]
EDIT: wrong graph. The graph above is from before you updated it. The new one just crashes.
[screenshot of the crash]

@L-M-Sherlock
Member

I noticed that the train and test loss are always the same for the neural net. I wonder if it's a bug (it happened before you updated it too).

Because I removed the splits in this notebook. The weights from different splits cannot be averaged in a principled way when we train the LSTM.

@Expertium
Collaborator

Another low-priority idea is to clamp requestRetention: if the user sets it below 0.75, it will be set to 0.75, and if they set it above 0.97, it will be set to 0.97. This will prevent users from shooting themselves in the foot by setting very high or very low levels of R (a minimal sketch follows).
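
A minimal sketch of the clamping, in Python for illustration (the scheduler itself is JavaScript; the bounds are the ones proposed above):

# Clamp the requested retention into the supported range.
def clamp_request_retention(request_retention, lo=0.75, hi=0.97):
    return min(max(request_retention, lo), hi)

assert clamp_request_retention(0.50) == 0.75  # too low -> raised to 0.75
assert clamp_request_retention(0.99) == 0.97  # too high -> lowered to 0.97
assert clamp_request_retention(0.90) == 0.90  # in range -> unchanged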

@Expertium
Collaborator

I still think that "Free Days" should affect all cards, including cards in the (re)learning stage. Otherwise the name is misleading and the feature isn't doing what you would assume.

@L-M-Sherlock
Member

I still think that "Free Days" should affect all cards, including cards in the (re)learning stage. Otherwise the name is misleading and the feature isn't doing what you would assume.

The due date of (re)learning cards is stored in a different format from that of review cards, so it's a technical issue. I don't want to dive into the complicated code around the (re)learning stages; it would introduce more bugs.

@Expertium
Collaborator

I updated the notebook. Now it shows the result of an LSTM with 1489 weights.

It seems to perform about the same, at least on your collection. I thought that such a big increase in the number of parameters would make it far more accurate.

@Expertium
Collaborator

Expertium commented Jul 5, 2023

I tried it on your collection with 2181 parameters (n_hidden = 20) and with 246 parameters (n_hidden = 5); it barely makes a difference. In fact, the larger model actually performed a little worse.
[screenshot of the comparison]
