
db/syncstrings: mark syncstrings with too large patches as "huge" #5818

Merged · 2 commits into master · Apr 1, 2022

Conversation

@haraldschilly (Contributor) commented Mar 31, 2022

Description

If the patches of a syncstring are too large:

  • swallow the error, i.e. don't bubble it up via the callbacks and break the whole operation
  • instead, mark the syncstring as "huge" and ignore it from then on (sketched below)
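A minimal sketch of the idea, in the callback style of the surrounding CoffeeScript (not the PR's literal code; only the `huge` column comes from this PR):

```coffee
# Sketch: if serializing one syncstring's patches blows up because they
# are too large, record that fact and keep going, instead of passing the
# error up the callback chain and aborting the whole maintenance run.
try
    blob = JSON.stringify(patches)
catch err
    dbg("syncstring #{string_id} is too large -- marking it huge: #{err}")
    @_query
        query : 'UPDATE syncstrings'
        set   : {huge: true}
        where : {string_id: string_id}
        cb    : cb
    return
```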

Testing: well, this is not so trivial. What I did:

  • built the database and hub; the hub starts fine, ok ... that's no surprise
  • then I added a line throw new Error( ... ) right after the JSON stringify to simulate a problem, built it again … and ran env MIN_AGE_DAYS=3 ./node_modules/.bin/cocalc-hub-maintenance-syncstrings … indeed, the "huge" field got filled in for a few syncstrings (see the check after this list)
  • what's weird is, when I archived a few syncstrings (no error thrown) and tried to open one again, I got errors: INSERT INTO patches ... error: null value in column "time" of relation "patches" violates not-null constraint. I hope this is just a side effect of me tinkering around. I'll try to see if I can reproduce this or understand why I saw this error.
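A check along these lines confirms the marking (hypothetical query; `huge` is the column this PR adds):

```sql
SELECT string_id, huge FROM syncstrings WHERE huge IS TRUE;
```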

Checklist:

  • Testing instructions are provided, if not obvious
  • Release instructions are provided, if not obvious

```diff
@@ -609,7 +609,7 @@ exports.extend_PostgreSQL = (ext) -> class PostgreSQL extends ext
         dbg("determine inactive syncstring ids")
         @_query
             query : 'SELECT string_id FROM syncstrings'
-            where : [{'last_active <= $::TIMESTAMP' : misc.days_ago(opts.age_days)}, 'archived IS NULL']
+            where : [{'last_active <= $::TIMESTAMP' : misc.days_ago(opts.age_days)}, 'archived IS NULL', 'huge IS NOT FALSE']
```
A Contributor commented on this diff:

should this be "huge IS NOT TRUE" ?

@haraldschilly (Contributor Author) replied:

yes
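Under SQL's three-valued logic, huge IS NOT FALSE matches rows where huge is TRUE or NULL — i.e. it would still select the syncstrings just marked as huge — while huge IS NOT TRUE matches FALSE or NULL, the intended set. A quick psql check:

```sql
SELECT b, b IS NOT TRUE AS not_true, b IS NOT FALSE AS not_false
FROM (VALUES (TRUE), (FALSE), (NULL::BOOLEAN)) AS t(b);
--  b | not_true | not_false
-- ---+----------+-----------
--  t | f        | t
--  f | t        | f
--    | t        | t
```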

@haraldschilly (Contributor Author) commented:

So, it took me a while to figure out what went wrong. I think this is something to look into. The problem is the following: in export_patches there is an epoch field, calculated in the database. Then, for each patch, the epoch is converted to a Date object. However, in my setup (node 14, on my own Linux computer, and postgres 14) this epoch suddenly arrives as a string. Hence

```
> new Date('1648746309968.000000')
Invalid Date
```

and from there time=null ends up in the serialized patch stream, and, well, un-archiving is broken.

I added a patch that checks whether the epoch is a string and converts it to an integer first (there are no fractional parts). That works. a57cd96
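Roughly, the guard looks like this (a sketch with illustrative variable names, not the literal commit):

```coffee
# node-pg returns NUMERIC columns as strings, and
# new Date('1648746309968.000000') is an Invalid Date, so coerce to an
# integer first -- parseInt stops reading at the '.', which is safe here
# because the fractional part is always zero.
epoch = row.epoch
epoch = parseInt(epoch) if typeof epoch == 'string'
time  = new Date(epoch)
```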

However, I don't understand the dance from time → epoch → new Date(epoch). Why not just take the time field directly? There was certainly a reason that made you write it this way.

Besides that, maybe we should add a check that the time is really a Date (or, more generally, verify that all patches are valid) before packaging them up as a blob.

@williamstein (Contributor) commented:

> Why not just take the time field directly?

Usually it is because JSON mangles Date objects, e.g.,

```
> JSON.parse(JSON.stringify(new Date()))
'2022-04-01T14:20:44.627Z'
> new Date()
2022-04-01T14:20:46.198Z
```

That said, in this particular case this might have just been a side effect of converting from RethinkDB to PostgreSQL, and not wanting to change too much code.
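A millisecond epoch, by contrast, survives the round trip unchanged, since it's just a number:

```
> JSON.parse(JSON.stringify(1648746309968))
1648746309968
> new Date(1648746309968)
2022-03-31T17:05:09.968Z
```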

> However, in my setup (node 14, on my own Linux computer, and postgres 14) this epoch suddenly arrives as a string.

How is it a string? Maybe we should figure that out... The query is:

```sql
SELECT extract(epoch from time)*1000 AS epoch, * FROM patches
```

and according to the PostgreSQL docs (https://www.postgresql.org/docs/current/functions-datetime.html), extract(epoch from time)*1000 is definitely a number, not a string. So it's weird that you're getting a string. That must have been painful to debug.

In any case, I'm going to merge this as is.

@williamstein merged commit 65251d7 into master on Apr 1, 2022
@haraldschilly (Contributor Author) commented:

I've investigated this, and you won't believe it. It's a matter of "integer" vs. "double precision" vs. "numeric" data types. I did a couple of tests, e.g. create table test (x1 int, x2 numeric); and inserting 123, 123.

Then in a node session using our database package: require("@cocalc/database") → db() → ...async_query({query: "select * from ..."}).then(...) and indeed:

```
[ { x1: 123, x2: '123' } ]
```

ok wow. this is brianc/node-pg-types#28 and technically, yes, it's not safe to convert them to plain JavaScript integers. Although I wonder, aren't there BigInts in JavaScript 🤔
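For what it's worth, node-pg lets you install your own parser for NUMERIC (OID 1700); a sketch, with the caveat that parseFloat silently loses precision above 2^53 (which is why node-pg returns a string in the first place):

```js
const { types } = require('pg');

// By default node-pg hands NUMERIC (OID 1700) back as a string, since
// it can hold values a JS double cannot represent exactly.
types.setTypeParser(1700, (val) => parseFloat(val));
```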

In any case, the postgres docs state:

  • version 13: extract ( field from timestamp ) → double precision
  • version 14: extract ( field from timestamp ) → numeric

also

  • select pg_typeof(extract(epoch from NOW())); → numeric in pg 14
  • while in pg 13 → double precision
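An explicit cast in the query would pin the type down on both versions (an illustration of a possible fix, not what the code currently does):

```sql
-- double precision is what pg 13 returned, and node-pg parses it as a
-- JS number on both pg 13 and pg 14
SELECT extract(epoch FROM time)::double precision * 1000 AS epoch, * FROM patches;
```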

I searched for other uses of "extract" in cocalc's code, but I only found this one.

@williamstein (Contributor) commented:

> version 13: extract ( field from timestamp ) → double precision
> version 14: numeric...

! That explains things. Geez. Incidentally, I was looking at an older version of the docs (first Google hit) when I wrote my message, and only changed the link to the latest version at the last moment.

BigInts don't interoperate with anything else -- they're potentially even more annoying than a string. One surprise to me is that multiplication of huge BigInts in recent V8 is very fast, as in serious math software, because they implemented serious algorithms (unlike Python).

@williamstein (Contributor) commented:

> I searched for other uses of "extract" in cocalc's code, but I only found this one.

I use this a lot in our code. Do a case-insensitive search; it's in capital letters, and I did use it properly elsewhere...

@williamstein (Contributor) commented:

I made this issue -- #5824

Hopefully you can fix it?
