
Cap steps #1548

Merged · 8 commits · Dec 15, 2021

Conversation

@RumovZ (Collaborator) commented Dec 10, 2021

Closes #1518.
I couldn't reproduce the DB error reported by the user. I got some overflow errors in Rust, but I think they only cause panics in dev builds? So I can only hope this fixes it.

> change `to_secs()` to `to_secs_capped()`

This doesn't seem to be necessary as it already casts f32 to u32, which is saturating. Or did you mean something else?
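For context, the saturating behaviour referred to here can be checked in isolation; Rust's float-to-int `as` casts have saturated (rather than wrapped or panicked) since 1.45:

```rust
fn main() {
    // f32 -> u32 `as` casts clamp to the target type's bounds:
    assert_eq!(1e20_f32 as u32, u32::MAX);
    assert_eq!(-5.0_f32 as u32, 0);
    // NaN casts to zero as well:
    assert_eq!(f32::NAN as u32, 0);
}
```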

@dae (Member) left a comment:

Hmm, this makes me a bit nervous :-) As you know, the due and original_due columns may store a timestamp, a number of days, or a position. Increasing those fields to an i64 will ensure timestamps made after 2038 continue to work, but for the other two cases, it's hard to imagine more than 2 billion days/positions ever being required (or being anything but a bug). And because any learning stamp that crosses a day boundary is automatically converted to a day span, we theoretically should not need to write a number over an i32 into those columns until 2038 comes around - for now, just using an i64 for the days-elapsed calculation (but not the disk format) is probably sufficient.

Some Anki versions will not be able to open a collection that contains due/odue over an i32 IIRC, and there is code in the DB check that warns about and alters due numbers and positions over ~2 billion and 1 million respectively. I'm a bit concerned that if we bump the storage format up to i64 at the moment, we may end up introducing issues when numbers above an i32 accidentally get written into one of those columns due to a bug or unrestrained settings. So maybe we should put off the column change for now, and just focus on doing the calculation as an i64 so that a learning step of 30 years works today?
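To make the 2038 concern concrete: a Unix timestamp in seconds outgrows an i32 on 2038-01-19, which is why an i64 is needed for timestamps but not necessarily for day counts or positions. A quick sketch:

```rust
use std::convert::TryFrom;

fn main() {
    // i32::MAX seconds after the Unix epoch falls on 2038-01-19;
    // any later timestamp no longer fits in an i32 column.
    let last_i32_timestamp: i64 = i32::MAX as i64; // 2_147_483_647
    assert!(i32::try_from(last_i32_timestamp).is_ok());
    assert!(i32::try_from(last_i32_timestamp + 1).is_err());
}
```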

```diff
@@ -37,20 +37,20 @@ impl CardStateUpdater {
         match interval {
             IntervalKind::InSecs(secs) => {
                 self.card.queue = CardQueue::Learn;
-                self.card.due = self.fuzzed_next_learning_timestamp(secs);
+                self.card.due = dbg!(self.fuzzed_next_learning_timestamp(dbg!(secs)));
```
@dae (Member): These look to have been missed.

@RumovZ (Collaborator, Author): Thanks, I wish there were a lint for that.

@dae (Member): Looks like there is one that we could turn on: rust-lang/rust-clippy#3723

@RumovZ (Collaborator, Author): Is it practical to turn that on? It looks like it can only be passed when running Clippy, not enabled via a clippy.toml.

@RumovZ (Collaborator, Author): Wow, my wish came true fast! Feels like Christmas already. 🎄😃

@RumovZ (Collaborator, Author) commented Dec 13, 2021

> And because any learning stamp that crosses a day boundary is automatically converted to a day span

Right, I kind of forgot about that.

The more I look at the code, the less I understand why there would be any problems. Do you have an idea where the DB error could stem from?

> focus on doing the calculation as an i64 so that a learning step of 30 years works today?

Looks like not even that is necessary, because in `apply_learning_state()` we first call `maybe_as_days()`, so for any subsequent calculations, seconds have an upper bound of 60 * 60 * 24.
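A simplified sketch of the conversion being described (the names mirror the discussion, but this is illustrative, not the actual Anki code):

```rust
// Illustrative sketch: intervals that reach a day boundary are
// re-expressed in days, so the seconds variant stays below 86_400.
#[derive(Debug, PartialEq)]
enum IntervalKind {
    InSecs(u32),
    InDays(u32),
}

fn maybe_as_days(secs: u32) -> IntervalKind {
    const DAY: u32 = 60 * 60 * 24;
    if secs >= DAY {
        IntervalKind::InDays(secs / DAY)
    } else {
        IntervalKind::InSecs(secs)
    }
}

fn main() {
    assert_eq!(maybe_as_days(600), IntervalKind::InSecs(600));
    // A 30950d step (2_674_080_000 secs, still within u32) comes back as days:
    assert_eq!(maybe_as_days(30_950 * 86_400), IntervalKind::InDays(30_950));
}
```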

@dae (Member) commented Dec 13, 2021

I can't reproduce it now, unfortunately. I saw the learning step and assumed that was the issue, but perhaps something else was going on there as well - the current state of the card, say - or maybe a code change in the interim has altered things. Apologies for the wild goose chase :-(

@RumovZ (Collaborator, Author) commented Dec 13, 2021

No problem, at least some improvements could be made. 🙂

@dae (Member) commented Dec 13, 2021

Ok, partly reproduced: set learning steps to 30950d, open 'previous card info', then answer 'again' on a new card, and you get this in the console: `anki.errors.DBError: DbError { info: "IntegralValueOutOfRange(4, -2674080000)", kind: Other }`. Presumably that's because we're exceeding the u32 in `maybe_as_days()`.

@RumovZ (Collaborator, Author) commented Dec 13, 2021

Strange.

  • On main, I get `thread '<unnamed>' panicked at 'attempt to multiply with overflow', rslib/src\scheduler\states\steps.rs:54:17` when I try to review a deck with these settings and new cards queued. This is fixed.
  • On this branch, I get `thread '<unnamed>' panicked at 'attempt to multiply with overflow', rslib/src\revlog\mod.rs:82:13`. I'll push a fix and test some more.
  • With a binary (beta 2), I get nothing though. This is in accordance with my understanding that overflows are errors only in dev builds.

Another potential issue is `IntervalKind::as_seconds()`. I'll fix that, too.
I don't know why I can't reproduce the DB errors, but they seem to indicate a failed deserialisation. Maybe we are trying to deserialise -2674080000 into a u32 or i32. I'll look into it.

@dae (Member) commented Dec 13, 2021

You may be aware of the things below already, but just in case:

  • you can test a release build of the source easily with scripts/runopt - no need for a binary release
  • the fact that overflow checks are off on release builds can just push the error down the road - values may silently wrap, and if a negative number gets written out to the database, it may then fail to deserialize later
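The silent wrap described in the second point can be reproduced deterministically with `wrapping_mul`, which behaves like release-mode `*` with overflow checks off (a sketch; the real code paths differ):

```rust
fn main() {
    // 30_950 days * 86_400 secs/day = 2_674_080_000, which exceeds
    // i32::MAX (2_147_483_647). In a release build the plain `*`
    // silently wraps; wrapping_mul makes that behaviour explicit:
    let wrapped = 30_950_i32.wrapping_mul(86_400);
    assert_eq!(wrapped, -1_620_887_296);
    // A dev build of `30_950_i32 * 86_400` would panic instead:
    // 'attempt to multiply with overflow'.
}
```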

@RumovZ (Collaborator, Author) commented Dec 13, 2021

> you can test a release build of the source easily with scripts/runopt - no need for a binary release

Thanks for the reminder! So did you reproduce the error with a release build of this branch?

> the fact that overflow checks are off on release builds can just push the error down the road - values may silently wrap, and if a negative number gets written out to the database, it may then fail to deserialize later

But a wrapped number is still a valid instance of its data type; an unsigned integer remains unsigned. I may be wrong, but I can only see this happening if we deserialise into a different data type than we serialised from.

@dae (Member) commented Dec 13, 2021

For me it happens in both debug and release builds using this branch. With TRACESQL=1, it looks like it's the revlog we're overflowing:

```
sql: insert into revlog values (1639437905953,1639430610404,-1,1,-2674080000,-2674080000,0,3353,0)
```

@RumovZ (Collaborator, Author) commented Dec 14, 2021

Thanks, that helped a lot - the penny finally dropped. I was testing on a v3 profile, where logging is done in Rust and values are wrapped. After reverting to the v2 scheduler, which executes SQL statements in Python, I can reproduce the DB error.
It also explains why the cap in TS didn't solve it: 30950d fits in a u32, but revlogs store intervals in i32s, in which it doesn't fit.
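Spelling out the mismatch (a sketch, assuming the revlog convention that seconds-based intervals are stored negated, which the negative value in dae's trace suggests):

```rust
use std::convert::TryFrom;

fn main() {
    // 30950 days of seconds still fits in a u32...
    let secs: u32 = 30_950 * 86_400; // 2_674_080_000 < u32::MAX
    // ...but stored negated (the seconds convention), the value lies
    // below i32::MIN (-2_147_483_648) and cannot round-trip via an i32:
    let revlog_ivl: i64 = -i64::from(secs);
    assert_eq!(revlog_ivl, -2_674_080_000);
    assert!(i32::try_from(revlog_ivl).is_err());
}
```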

Also replace some `as` with `from` and `try_from` as is recommended, to highlight potential issues.

Whereas large card intervals are converted to days, revlog intervals use i32s to store large numbers of seconds.
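The difference the commit message refers to, in miniature: `as` wraps silently across integer types, while `from`/`try_from` are lossless or surface the problem explicitly.

```rust
use std::convert::TryFrom;

fn main() {
    let big: i64 = 2_674_080_000;
    // `as` truncates to the low 32 bits and reinterprets the sign:
    assert_eq!(big as i32, -1_620_887_296);
    // `try_from` reports the out-of-range value instead of wrapping:
    assert!(i32::try_from(big).is_err());
    // Lossless widening conversions can use `from`:
    assert_eq!(i64::from(42_i32), 42_i64);
}
```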
```diff
@@ -98,9 +99,9 @@ impl Collection {
             cid: card.id,
             usn,
             button_chosen: 0,
             interval: card.interval as i32,
```
@RumovZ (Collaborator, Author): Unrelated to this PR, but shouldn't it check whether interval is in days or secs?

@dae (Member): As this code is only run when manually rescheduling a card (which counts as a review), the assigned interval should always be day-based. We could make last_interval negative if the card is currently a learning card, but that's probably not the highest priority?

@RumovZ (Collaborator, Author): Guess not, just wanted to point it out as I stumbled across it. 🙂

@dae (Member) left a comment:

Thanks as always Rumo!


```diff
@@ -5,6 +5,7 @@ const DEFAULT_SECS_IF_MISSING: u32 = 60;

 #[derive(Clone, Copy, Debug, PartialEq)]
 pub(crate) struct LearningSteps<'a> {
+    /// The steps in minutes.
```

@dae (Member): 👍

@dae dae merged commit 80ed94e into ankitects:main Dec 15, 2021
@RumovZ RumovZ deleted the cap-steps branch December 15, 2021 08:56
Successfully merging this pull request may close these issues:

  • we should ignore crazy values in learning steps