HTTP 500 errors during indexing #328
Comments
Hello, I can't reproduce this. Does this happen when you click on an identifier? Or on a source page?
Wow, it’s working now... Presumably a transient issue, then. It happened whenever I tried accessing ...
The logs have traces:
This is somewhat known, but we have (1) no tracking issue and (2) no fix currently. I haven't dug deeper, but it is surprising that it can resolve itself over time, i.e. we seem to get an error from data read from the database, yet the error stops appearing over time.
I did some research on this and it's not clear to me if Elixir is handling database concurrency correctly. Berkeley DB has this concept of "products" with solutions for different concurrency/availability requirements. See: https://docs.oracle.com/cd/E17276_01/html/programmer_reference/intro_products.html
It seems to me that Elixir currently does not use the Concurrent Data Store. But also, from the products document:
So maybe these are just issues with how databases are shared between threads in the update script. If neither of those is the problem, then perhaps the values in the database are not updated "atomically". What I mean is that it is possible that the update script sometimes writes a temporary value that does not have the correct format into the defs/refs databases. Maybe the key is initially empty and that's why split with unpack fails. @tleb You probably know more about the update script than I do. tl;dr, three things to try/investigate (sketched below):
1. whether switching to Berkeley DB's Concurrent Data Store product would help,
2. how database handles are shared between threads in the update script,
3. whether values are written atomically, i.e. whether a reader can ever observe an empty or partially written value.
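As a hedged illustration of point 1: a minimal sketch of what opening a database inside a Berkeley DB Concurrent Data Store environment could look like, assuming the update script uses the Python berkeleydb (formerly bsddb3) bindings. The directory and file names are made up.

from berkeleydb import db

def open_cds_database(env_dir, db_file):
    # A Concurrent Data Store (CDS) environment lets one writer and many
    # readers share a database without full transactional locking.
    env = db.DBEnv()
    env.open(env_dir, db.DB_CREATE | db.DB_INIT_CDB | db.DB_INIT_MPOOL)

    # Open the database inside that environment (positional arguments:
    # filename, database name, access method, flags).
    database = db.DB(env)
    database.open(db_file, None, db.DB_BTREE, db.DB_CREATE)
    return env, database

For the CDS locking to apply, both the update script and the HTTP workers would have to open their handles through the same environment.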
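And for point 3, a small defensive-read sketch: if a reader can ever observe an empty or half-written value, validating before unpacking at least turns the HTTP 500 into a log line that says which key was affected. The value layout and names below are hypothetical, not Elixir's actual format.

import logging

def lookup_entries(database, ident):
    raw = database.get(ident.encode())
    if not raw:
        # Key absent, or written with an empty placeholder value.
        logging.warning("no value stored for identifier %r", ident)
        return []
    entries = []
    for chunk in raw.split(b'\n'):
        fields = chunk.split(b',')
        if len(fields) != 3:
            # Truncated or malformed entry: report it instead of letting
            # tuple unpacking raise and bubble up as an HTTP 500.
            logging.warning("malformed entry for %r: %r", ident, chunk)
            continue
        path, line, kind = fields
        entries.append((path.decode(), int(line), kind.decode()))
    return entries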
Hm, a somewhat simpler approach would be to avoid any kind of concurrent access. The current production server can hold a duplicate of the data of any project: the biggest project's data is Linux at 31G, and the prod server has 39G left. We could also scale up its disk, that is an option. That way we avoid any potential concurrency issue. That has a few consequences:
Question:
$ du -hc linux/data/* | sort -hr
31G total
15G linux/data/references.db
12G linux/data/versions.db
4.0G linux/data/definitions.db
196M linux/data/hashes.db
149M linux/data/blobs.db
113M linux/data/doccomments.db
102M linux/data/filenames.db
34M linux/data/compatibledts.db
5.2M linux/data/compatibledts_docs.db
8.0K linux/data/variables.db
The same sort of HTTP 500 trace still appears on Oct 28, Oct 29, Nov 1, Nov 2 and Nov 4. All those dates are after the deploy of 0b8d735 (Oct 11), so the shared flag is not enough. I maintain that the simpler approach is to do indexing on the side and update symlinks. Prod server storage size is even less of an issue now that I ran some ...
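For reference, the "index on the side, then update symlinks" idea could be as small as the following sketch: build the new databases in a fresh directory, then atomically repoint a data symlink that the web frontend reads. The paths are hypothetical, and this assumes data is already a symlink rather than a real directory.

import os

def publish_new_index(project_root, new_data_dir):
    live_link = os.path.join(project_root, "data")
    tmp_link = live_link + ".new"

    # Create the replacement symlink under a temporary name first.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(new_data_dir, tmp_link)

    # rename()/replace() is atomic on POSIX, so HTTP workers always see
    # either the old set of databases or the new one, never a mix.
    os.replace(tmp_link, live_link)

The old data directory can be deleted once no worker holds it open, which is what costs the extra disk space discussed above.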
See https://elixir.bootlin.com/linux/v6.10.9/source/block/blk-core.c:
This happens on all versions of the kernel I’ve tried, back to https://elixir.bootlin.com/linux/v2.6.39.4/source/block/blk-core.c at least.