[Feature Request] Separate retention rates for “learning” and “known” items #694

Open
aedoncassiel opened this issue Sep 28, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@aedoncassiel

Essentially, I have had a ton of success dropping my desired retention rate far down so I can cram many new items without spending five hours per day on review. Then I can quickly get the items I turned out to have learned very easily far out of the way, and focus all my time in the beginning on the harder ones. The items that turn out to be easy and hard aren't always what I thought they would have been, so this is immensely valuable.

However: let's say I mastered the Japanese kana in a handful of days. Now, even if I keep that deck at 99% retention, this only asks me to review a single kana every few weeks—if that.

If I'm dealing with finite, well defined, separate decks, I can just up the retention on that deck. But what if I'm using a "Japanese" deck and it happens to include both kana and thousands of new kanji I'm learning? I can't set the whole deck to 99%, but setting it lower isn’t ideal for the kana because if I take a break from reading in Japanese, there's no reason not to make sure I review a kana every few weeks and keep kana reading perfectly fresh for practically zero cost.

I'm using a simplified example to get the general point across. In fact, for every one of us, all of our decks have a mixture of "kana" and "kanji": new items we're struggling through, and old items we've learned so well we could probably even put them at 99% retention with no downside.

The use case I have in mind here is particular, to be fair. In most cases most people want to learn something new they'll be using for a set number of years, and are fine with forgetting if they take up a different job or hobby, and so on. In my case, I want to maintain high literary proficiency over my lifespan with multiple new languages, even over periods where I'm not fitting much practice with that language in.

So I see this issue very clearly in my Spanish deck. I have always had, and like having one deck for this language. I am very comfortable with several thousand intermediate, non-cognate words. My optimal retention rate to spend minimum time on the deck in the next month? 70%. In the next six months? 70%. In the next year? 70%.

... and in the next decade? Well, suddenly optimal retention rate rockets up to 90%. I believe this is clear evidence of "mature" items being scheduled out too far because new and mature items do not have the same optimal retention.

Now I have a few options. I can lock this deck, up the retention rate, push every unseen card in the deck to Spanish2, and plan to keep this up every year or so with each language for years until my menu has Spanish 1-20, Japanese 1-20, and so on: set Spanish2 to 70% and Spanish1 to 90% this year, then do the same with Spanish3 and Spanish2 next year. Or, I can just set my one deck to 90% and lose tons of efficiency over-reviewing new items. Or, I can leave it at 70% and get new words down much more efficiently, but then wastefully forget too many of them after several more months pass, when a quick review every ~6 months, say, would have sufficed to keep all of them locked perfectly in memory. Of course I can also set a maximum interval, but I still lose efficiency as I inevitably over- or under-guess a good baseline and lump every item into yet another over-generalized standard.

I think a built-in ability to start at low retention and then raise the retention rate, per item, as the cost of keeping that item at high retention becomes trivial, could potentially be as groundbreaking as FSRS itself is.
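
A rough sketch of what such a per-item rule could look like, purely for illustration. Nothing here is an existing FSRS or Anki feature: the function name, the ramp thresholds (30 and 365 days of stability), and the log-scale interpolation are all hypothetical choices. The 70% and 90% endpoints are just the two values from my Spanish deck above.

```python
import math


def desired_retention_for(
    stability_days: float,
    low: float = 0.70,               # retention for items still being learned
    high: float = 0.90,              # retention for items that are well known
    ramp_start_days: float = 30.0,   # hypothetical: below this, use `low`
    ramp_end_days: float = 365.0,    # hypothetical: above this, use `high`
) -> float:
    """Hypothetical per-card desired retention that rises with memory stability."""
    if stability_days <= ramp_start_days:
        return low
    if stability_days >= ramp_end_days:
        return high
    # Interpolate on a log scale so the ramp is not dominated by large stabilities.
    frac = (math.log(stability_days) - math.log(ramp_start_days)) / (
        math.log(ramp_end_days) - math.log(ramp_start_days)
    )
    return low + (high - low) * frac


for s in (5, 30, 100, 365, 2000):
    print(f"stability {s:5d} days -> desired retention {desired_retention_for(s):.3f}")
```

A smooth ramp like this avoids having to pick one hard cut-off between "learning" and "known", at the price of two thresholds that would still need to be chosen or fitted somehow.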

@aedoncassiel aedoncassiel added the enhancement New feature or request label Sep 28, 2024
@brishtibheja

L.M.Sherlock tried a system that varies desired retention for each individual card but it didn't go too far.

> I think a built-in ability to start at low retention and then raise the retention rate, per item, as the cost of keeping that item at high retention becomes trivial, could potentially be as groundbreaking as FSRS itself is.

Agreed, this could be good, but the UI for such a thing would be complex. Also, wouldn't a lot of mature knowledge already be so deeply encoded semantically that you wouldn't need to review it per se? Not sure on that front.

@Expertium
Collaborator

Expertium commented Sep 28, 2024

> Essentially, I have had a ton of success dropping my desired retention rate far down so I can cram many new items without spending five hours per day on review.

Make sure it doesn't go below minimum recommended retention.
image

There are two issues with your idea.

First, a greater cognitive burden for the user, who will have to configure two different values of desired retention instead of one, and people are already struggling with realizing that desired retention affects interval lengths. I'm not sure how many users know it, my pessimistic estimate would be 50%. In other words, I'd say about 50% of users have no idea that desired retention affects interval lengths. I'm saying this because I've been doing the Anki equivalent of tech support for about a year. Maybe 75% know it, if I'm being optimistic. Even fewer have ever touched "Compute minimum recommended retention" or used different values of desired retention for different presets. I think FSRS should remove options and settings rather than adding them, if we ever want FSRS to be used by anyone who isn't a complete nerd.
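
To make the link between desired retention and interval length concrete, here is a minimal sketch. It assumes the FSRS-4.5 power forgetting curve, R(t, S) = (1 + 19/81 · t/S)^(-0.5); other FSRS versions may use different constants. The card with a stability of 100 days is made up for illustration.

```python
# Assumption: FSRS-4.5 power forgetting curve,
#   R(t, S) = (1 + FACTOR * t / S) ** DECAY
DECAY = -0.5
FACTOR = 19 / 81  # chosen so that R(S, S) = 0.9


def next_interval(stability_days: float, desired_retention: float) -> float:
    """Days until predicted recall drops to `desired_retention`."""
    return stability_days / FACTOR * (desired_retention ** (1 / DECAY) - 1)


stability = 100.0  # hypothetical card
for r in (0.70, 0.80, 0.90, 0.95, 0.99):
    print(f"desired retention {r:.2f} -> interval {next_interval(stability, r):7.1f} days")
```

Under this curve the same card gets an interval of roughly 444 days at 0.70 and under 9 days at 0.99, which is why a single setting has such a large effect on workload.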

Second issue - defining what counts as "mature". It's arbitrary. In Anki a card is considered "mature" if its interval is >=21 days, but why not 20 or 22?

Side note: Jarrett has been working on a special "regime" for FSRS where it doesn't maintain a specific level of desired retention, and instead tries to make the memory stability as high as possible as fast (in terms of time spent on reviews) as possible, but it seems that it doesn't always work.

@aedoncassiel
Author

aedoncassiel commented Sep 28, 2024

> Second issue - defining what counts as "mature". It's arbitrary. In Anki a card is considered "mature" if its interval is >=21 days, but why not 20 or 22?

Not necessarily, I think, because we’re basing this on the optimal retention for spending minimal time, which is what makes FSRS so valuable to begin with.

So, if I take any deck with lots of new cards and lots of cards I've seen for months, and I separate those cards in different decks, the calculator is going to tell me (at least in my experience with several decks so far) that the optimal retention of the new cards for the next year is pretty low and the optimal retention for the old cards for the next decade is a lot higher. Remember that for this latter set, the data has actually had time to push these words out ~8 months and then see how many of them I do in fact recall.

So, this shows me that the calculator already knows that the optimal retention to spend minimum time is different for these two sets. I think this is simply because pushing known items out a full year until I forget perhaps 30% of them is very inefficient when perhaps one five second review in the many months prior might even have been enough to keep me at 99% retention for all these items.

Possibly, certainly at least in theory, a more advanced calculator could in and of itself determine what the most effective definition of “matured knowledge” is for different users in different decks by simulating different cut-off points across which to target different retentions, just like it simulates different retention rates to find the optimal retention now. (I can't even see how it would hurt to have this happen silently under the hood, without the user knowing anything different.)

Barring that, of course, I do think even a lazy and arbitrary single cut-off point somewhere would still go some way to address the reality that the optimal retention for minimum time spent is indeed different for, in a broad sense, “things you’re learning” and “things you know”. I struggle to imagine any way an imperfect but partial solution to this could make anything worse.

@Expertium
Collaborator

Btw, why not just adjust max. interval?

@brishtibheja

If the time needed for R to reach .99 is higher than max_interval, then you'll end up with really suboptimal scheduling.
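
A small sketch of that effect, assuming the FSRS-4.5 power forgetting curve and made-up numbers (a desired retention of 0.70, a max interval of 180 days, and two hypothetical cards). The helper names are illustrative, not Anki's API.

```python
# Assumption: FSRS-4.5 power forgetting curve.
DECAY = -0.5
FACTOR = 19 / 81


def retrievability(elapsed_days: float, stability_days: float) -> float:
    """Predicted recall probability after `elapsed_days`."""
    return (1 + FACTOR * elapsed_days / stability_days) ** DECAY


def interval_for(stability_days: float, retention: float) -> float:
    """Days until predicted recall drops to `retention`."""
    return stability_days / FACTOR * (retention ** (1 / DECAY) - 1)


def clamped_review(stability_days: float, desired_retention: float, max_interval: float):
    """Return (actual interval, recall probability at review) with a max interval applied."""
    actual = min(interval_for(stability_days, desired_retention), max_interval)
    return actual, retrievability(actual, stability_days)


for stability in (30.0, 5000.0):  # hypothetical young card and very mature card
    days, r = clamped_review(stability, desired_retention=0.70, max_interval=180.0)
    print(f"S = {stability:6.0f} days: reviewed after {days:5.1f} days at R = {r:.3f}")
```

With these numbers the young card is untouched by the cap and is still reviewed at R = 0.70, while the very mature card is capped at 180 days and gets reviewed at a recall probability above 0.99. A max interval therefore does nothing for most cards and over-reviews the most stable ones.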

@L-M-Sherlock
Member

In fact, if we set aside the optimum retention and just look for the optimum intervals, we get a gradually increasing retention:

image

The recall probability corresponding to optimal interval increases with half-life and decreases with difficulty, as shown in Figure 9(c). It means that the scheduler will instruct learners to review at a lower retrieval strength in the early stages of memorization, which may be a reflection of "desirable difficulties"[2]. As the half-life increases to the target value, the recall probability approaches 100%. According to the equation Δ𝑡 = −h · log2 𝑝 and the trend of 𝑝 on h, Δ𝑡 is first increasing and then decreasing where the peak emerges.

Source: my paper
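
To illustrate the last sentence of the quote, here is a minimal sketch of Δt = −h · log2 p. The trend of p on h used below (a straight line from 0.7 up to 1.0 at a target half-life of 360 days) is invented purely for illustration and is not taken from the paper; only the interval equation itself comes from the quote.

```python
import math


def interval_from_half_life(h: float, p: float) -> float:
    """Delta t = -h * log2(p), written as h * log2(1 / p)."""
    return h * math.log2(1.0 / p)


def hypothetical_recall_target(h: float, h_target: float = 360.0, p_start: float = 0.7) -> float:
    """Made-up trend: p rises linearly from p_start to 1.0 as h approaches h_target."""
    frac = min(h / h_target, 1.0)
    return p_start + (1.0 - p_start) * frac


half_lives = [1, 30, 90, 180, 270, 330, 360]
intervals = [interval_from_half_life(h, hypothetical_recall_target(h)) for h in half_lives]

for h, dt in zip(half_lives, intervals):
    print(f"h = {h:3d} days, p = {hypothetical_recall_target(h):.3f}, interval = {dt:5.1f} days")
```

Because the interval is close to zero both when h is small and when p reaches 1, any such rising trend puts the longest interval somewhere in the middle, which is the peak the quote describes.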

@user1823 user1823 changed the title [Feature Request] A year into using FSRS, I’m convinced we could benefit greatly fromseparate retention rates for “learning” and “known” items. [Feature Request] Separate retention rates for “learning” and “known” items Oct 7, 2024
@JSchoreels

First, thank you @aedoncassiel, you put into words really well what I was also starting to phrase on the Discord: "FSRS being a better prediction tool than SM2, but a worse learning tool".

As you said, FSRS's goal is to make the best possible prediction of a card's retrievability. But there is an underlying feature that users might want: to know what they learn better over time, instead of eventually stagnating at a specific percentage.

With SM2, having ease_factor as a separate thing helped with that: it was a way to adjust, on a card-level basis, how you would like the interval to grow, thus affecting the retention of cards with higher or lower difficulty. But this was already an alteration of the initial design, which was to express different difficulty rather than different desired retention.

Now FSRS, being better at predicting, doesn't really need that user input. It makes good predictions, but it won't let the user go above the desired retention. @L-M-Sherlock mentioned on Discord the w_15 and w_16 parameters, which let you tweak the stability based on HARD/EASY grade usage:

image

But what I'm afraid of is that if you use HARD/EASY to alter the stability and raise your retention from, let's say, 80% to 95%, then the next time you press "Optimize", the optimizer will adapt those two parameters to make them "good again at predicting 80% retention", making them less and less useful.

The concept of "desirable difficulties" however seems extremely helpful to achieve that.

The question is: does it have to be FSRS that solves this issue, by changing from desired retention to desired difficulties? Or, as you @aedoncassiel described earlier, should we keep the desired retention for now but allow Anki to have different desired retention values based on a function, card-level settings, etc.?


5 participants