internal/record: use direct I/O for the WAL and MANIFEST #1159
Comments
I did testing a while ago which showed that direct I/O writes had similar sync latency to recycled log fsyncs. The latency was actually a few percent better for direct I/O, so there would be a perf win, but not a dramatic one. Direct I/O also uses less CPU. "Direct I/O writes: the best way to improve your credit score" is an interesting recent blog post on this topic.
#41 (comment) is a useful comment on the intricacies of direct I/O.
I've seen it mentioned a few times that in some systems a
Also see the discussion and related links to other discussions on cockroachdb/cockroach#88442 (comment)
There's an interaction with disk stall detection that may have been obvious to others but eluded me. When we write through the page cache, I/O may occur outside the context of a Cockroach syscall. If a background kernel thread is performing the writeback and stalls, Cockroach is oblivious. Direct I/O would ensure all of Cockroach's I/O is directly timed. Maybe this distinction is insignificant, because eventually Cockroach should always issue a timed
WAL and MANIFEST might be good candidates for using direct I/O. The LogWriter already handles organizing writes into contiguous blocks. I'm not sure what impact, if any, on performance direct I/O would have.

I do think it would allow us to retry failed syncs during WAL and MANIFEST writes: DB.Apply Fatalf, logAndApply Fatalfs

Currently, errors in these codepaths are fatal because fsyncs of OS-buffered files cannot be retried. The OS marks errored buffers as clean, meaning a retried fsync will not sync the buffer and the file's contents remain unchanged regardless of a retry. https://wiki.postgresql.org/wiki/Fsync_Errors

This was motivated by thinking about @sumeerbhola's automated ballast file suggestion for detecting out-of-disk conditions. It's a really nice solution. The one sticking point is that an ENOSPC may occur during fsync. I'm not sure under what conditions ENOSPC may surface from fsync rather than the preceding write, but I suspect it may happen when the filesystem needs to allocate new metadata blocks. I'm not sure, but maybe on copy-on-write file systems all block allocations happen during fsync?

Jira issue: PEBBLE-211