-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ed's pet PR with no flake retries #17831
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: edsantiago The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Debian Build barfing with:
...but all other Build tasks (Fedora) pass. What is different about Debian TOML parser? And if Debian is different in this regard, what other Debian differences are there, and maybe one of those explains #17607? |
Oh, never mind; this is probably because Debian gets 'engine ... = runc'. TOML is annoying. |
75d88a8
to
d2c2cf4
Compare
It works! It works! Lots of failures:
...but these happen in |
Oh, and @vrothberg, this minimizes the need for an envariable |
The bash here scares me, but LGTM |
That's new |
@mheon no, what I mean is, what do we do with this? Obviously we can't enable it, because sqlite isn't ready. Are these failures helpful to you? (And no, "database is locked" is not new, look at |
@edsantiago My understanding was that it is ready, and all tests should be passing. The fact that we're flaking frequently is news to me, but I haven't been spending much time in PRs this week. |
Per SQLite docs it looks like internal concurrency issue across the same DB connection, which I thought their locking would handle... Looking further. |
Just looking at the very first |
We have some pretty clear concurrency issues on display here. The UNIQUE constraint bits look like a DB creation race that I thought would be impossible, but should be fairly easy to code around. The DB locking one is far more problematic. @vrothberg From my reading here, we may need to start using separate DB connections within the same Podman process for each access, or otherwise introduce some sort of locking to prevent more than 1 access to the DB at a time, because the locking on the connection doesn't seem to be sufficient to prevent this sort of thing. |
I'm sorry. These errors are everywhere, constant, so I genuinely thought they were part of the WIP sqlite migration. Didn't even bother to file flakes for them. For the locking, would transactions help? |
The issue appears to be that SQLite is allowing a write transaction to fire at the same time as another thread is reading the same table, resulting in errors. I don't think we want to wrap all reads in transactions for perf reasons, unfortunately. The confusing bit is that this seems to contradict the sqlite docs, which say that by default all accesses through a single connection are sequential only. Simplest fix might just be added a mutex to control access and ensure that remains true. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing against your perl script but maybe it is not very well know: containers.conf supports the common .d
config directory layout.
So instead of throwing this all in /etc/containers/containers.conf
create a new file in /etc/containers/containers.conf.d/
with this entry. Seems much simpler to me.
@vrothberg and I spent a while discussing conf options this week, including the |
Using |
Actually not. Whenever CONTAINERS_CONF is set, only this file will be loaded and all system/user configs will be ignored. |
We've been talking about CONTAINERS_CONF_D or something where all of the normal CONTAINERS_CONF processing would be done, but the value of then environment variable would be processed at the end, perhaps it is time to add that. |
I am working on that at the moment :^) |
0eb4f23
to
b9818cb
Compare
5e48ff5
to
c163a07
Compare
Signed-off-by: Ed Santiago <[email protected]>
Signed-off-by: Ed Santiago <[email protected]>
Signed-off-by: Ed Santiago <[email protected]>
Signed-off-by: Ed Santiago <[email protected]>
Signed-off-by: Ed Santiago <[email protected]>
...it's useless clutter. Signed-off-by: Ed Santiago <[email protected]>
There is no new version yet but we like to use the new code[1] to debug a flake[2] in the podman CI. It will not fix it but the new error might give us a better idea what is going on. [1] checkpoint-restore/go-criu#175 [2] containers#18856 Signed-off-by: Paul Holzinger <[email protected]>
This should make system tests work with the requested DB
Signed-off-by: Ed Santiago [email protected]