-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
log: merge the implementation for file buffering and network buffering #72452
Labels
A-logging
In and around the logging infrastructure.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-observability
X-anchored-telemetry
The issue number is anchored by telemetry references.
Comments
knz
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
X-anchored-telemetry
The issue number is anchored by telemetry references.
A-logging
In and around the logging infrastructure.
labels
Nov 4, 2021
knz
added a commit
to rauchenstein/cockroach
that referenced
this issue
Nov 4, 2021
Release note: None
knz
added a commit
to rauchenstein/cockroach
that referenced
this issue
Nov 4, 2021
Release note: None
thtruo
added
T-observability-inf
and removed
T-server-and-security
DB Server & Security
labels
Nov 29, 2021
abarganier
added a commit
to abarganier/cockroach
that referenced
this issue
Mar 13, 2024
Informs: cockroachdb#72452 For some time, two buffering mechanisms have existed within the log package. One is controlled by `buffered-writes`, which is only used by the file sink. `buffered-writes` buffers writes until a buffer is full, after which it flushes the buffer to an underlying file. This is not an asynchronous operation. Separately, the `bufferedSink` was introduced to provide an async buffering mechanism. This was originally intended for network log sinks such as the `fluentSink` and `httpSink`. Originally, the `bufferedSink` was not allowed for use with the `fileSink` as we already had a buffering mechanism available. Today, we find ourselves in a situation where we aim to make CRDB even more resilient in the face of disk failures. However, some of these efforts like WAL failover are unable to achieve their goals if the logging subsystem isn't resilient to disk unavailability as well. Today, if a file sink is configured to write to an unavailable disk, any goroutine making a logging call to one of those sinks will hang as the underlying i/o operations used are unable to interact with the disk. This can have a severe negative impact on cluster performance and throughput, even when using `buffered-writes: true`. As a way to alleviate this pain, this patch experimentally enables the usage of the `bufferedSink` with the `fileSink`. The `bufferedSink` is used to "wrap" an underlying sink (such as a `fileSink`), and asynchronously outputs buffered data to said sink. It is especially good at shielding writers from any problems with the underlying sink, which makes it a great candidate to help us accomplish our resiliency goals when it comes to disk outage scenarios. This is gated by an environment variable during the experimental period, `COCKROACH_ENABLE_ASYNC_LOG_FILE_BUFFER`. With the env var enabled, users can leverage the `bufferedSink` by using the `buffering` config option. For example: ``` file-defaults: buffered-writes: false buffering: max-staleness: 1s flush-trigger-size: 256KiB max-buffer-size: 50MiB ``` Note that we don't allow both `buffered-writes` and `buffering` to be used together. This patch adds a validation step to ensure this. We recommend a `max-staleness: 1s` and `flush-trigger-size: 256KiB` when using this feature with file log sinks. Release note (ops change): Users have the ability to update their log configuration to enable asynchronous buffering of `file-group` log sinks via the `buffering` [configuration options](https://www.cockroachlabs.com/docs/stable/configure-logs#log-buffering-for-network-sinks). Note that this is an *experimental* feature, and therefore it is gated by the environment variable `COCKROACH_ENABLE_ASYNC_LOG_FILE_BUFFER`. Set `COCKROACH_ENABLE_ASYNC_LOG_FILE_BUFFER=true` to enable the feature. The `buffering` configuration can be applied to `file-defaults` or individual `file-groups` as needed. Note however that `buffering` and `buffered-writes` are incompatible for the time being. If you'd like to try the `buffering` option on a file log sink, it's necessary to explicitly set `buffered-writes: false`. For example: ``` file-defaults: buffered-writes: false buffering: max-staleness: 1s flush-trigger-size: 256KiB max-buffer-size: 50MiB ``` We recommend a `max-staleness: 1s` and `flush-trigger-size: 256KiB` when using this feature with file log sinks.
abarganier
added a commit
to abarganier/cockroach
that referenced
this issue
Mar 18, 2024
Informs: cockroachdb#72452 For some time, two buffering mechanisms have existed within the log package. One is controlled by `buffered-writes`, which is only used by the file sink. `buffered-writes` buffers writes until a buffer is full, after which it flushes the buffer to an underlying file. This is not an asynchronous operation. Separately, the `bufferedSink` was introduced to provide an async buffering mechanism. This was originally intended for network log sinks such as the `fluentSink` and `httpSink`. Originally, the `bufferedSink` was not allowed for use with the `fileSink` as we already had a buffering mechanism available. Today, we find ourselves in a situation where we aim to make CRDB even more resilient in the face of disk failures. However, some of these efforts like WAL failover are unable to achieve their goals if the logging subsystem isn't resilient to disk unavailability as well. Today, if a file sink is configured to write to an unavailable disk, any goroutine making a logging call to one of those sinks will hang as the underlying i/o operations used are unable to interact with the disk. This can have a severe negative impact on cluster performance and throughput, even when using `buffered-writes: true`. As a way to alleviate this pain, this patch experimentally enables the usage of the `bufferedSink` with the `fileSink`. The `bufferedSink` is used to "wrap" an underlying sink (such as a `fileSink`), and asynchronously outputs buffered data to said sink. It is especially good at shielding writers from any problems with the underlying sink, which makes it a great candidate to help us accomplish our resiliency goals when it comes to disk outage scenarios. Now, users can leverage the `bufferedSink` in `file-defaults` and `file-groups by using the `buffering` config option. For example: ``` file-defaults: buffered-writes: false buffering: max-staleness: 1s flush-trigger-size: 256KiB max-buffer-size: 50MiB ``` Note that we don't allow both `buffered-writes` and `buffering` to be used together. This patch adds a validation step to ensure this. We recommend a `max-staleness: 1s` and `flush-trigger-size: 256KiB` when using this feature with file log sinks. Release note (ops change): Users have the ability to update their log configuration to enable asynchronous buffering of `file-group` log sinks via the `buffering` [configuration options](https://www.cockroachlabs.com/docs/stable/configure-logs#log-buffering-for-network-sinks). The `buffering` configuration can be applied to `file-defaults` or individual `file-groups` as needed. Note however that `buffering` and `buffered-writes` are incompatible for the time being. If you'd like to try the `buffering` option on a file log sink, it's necessary to explicitly set `buffered-writes: false`. For example: ``` file-defaults: buffered-writes: false buffering: max-staleness: 1s flush-trigger-size: 256KiB max-buffer-size: 50MiB ``` We recommend a `max-staleness: 1s` and `flush-trigger-size: 256KiB` when using this feature with file log sinks.
craig bot
pushed a commit
that referenced
this issue
Mar 20, 2024
120428: pkg/util/log: enable bufferedSink usage with fileSink as experimental r=dhartunian,jbowens a=abarganier Informs: #72452 Epic: CRDB-35401 For some time, two buffering mechanisms have existed within the log package. One is controlled by `buffered-writes`, which is only used by the file sink. `buffered-writes` buffers writes until a buffer is full, after which it flushes the buffer to an underlying file. This is not an asynchronous operation. Separately, the `bufferedSink` was introduced to provide an async buffering mechanism. This was originally intended for network log sinks such as the `fluentSink` and `httpSink`. Originally, the `bufferedSink` was not allowed for use with the `fileSink` as we already had a buffering mechanism available. Today, we find ourselves in a situation where we aim to make CRDB even more resilient in the face of disk failures. However, some of these efforts like WAL failover are unable to achieve their goals if the logging subsystem isn't resilient to disk unavailability as well. Today, if a file sink is configured to write to an unavailable disk, any goroutine making a logging call to one of those sinks will hang as the underlying i/o operations used are unable to interact with the disk. This can have a severe negative impact on cluster performance and throughput, even when using `buffered-writes: true`. As a way to alleviate this pain, this patch experimentally enables the usage of the `bufferedSink` with the `fileSink`. The `bufferedSink` is used to "wrap" an underlying sink (such as a `fileSink`), and asynchronously outputs buffered data to said sink. It is especially good at shielding writers from any problems with the underlying sink, which makes it a great candidate to help us accomplish our resiliency goals when it comes to disk outage scenarios. Users can leverage the `bufferedSink` by using the `buffering` config option. For example: ``` file-defaults: buffered-writes: false buffering: max-staleness: 5s flush-trigger-size: 1.0MiB max-buffer-size: 50MiB ``` Note that we don't allow both `buffered-writes` and `buffering` to be used together. This patch adds a validation step to ensure this. Release note (ops change): Users have the ability to update their log configuration to enable asynchronous buffering of `file-group` log sinks via the `buffering` [configuration options](https://www.cockroachlabs.com/docs/stable/configure-logs#log-buffering-for-network-sinks). The `buffering` configuration can be applied to `file-defaults` or individual `file-groups` as needed. Note however that `buffering` and `buffered-writes` are incompatible for the time being. If you'd like to try the `buffering` option on a file log sink, it's necessary to explicitly set `buffered-writes: false`. For example: ``` file-defaults: buffered-writes: false buffering: max-staleness: 1s flush-trigger-size: 256KiB max-buffer-size: 50MiB ``` We recommend a `max-staleness: 1s` and `flush-trigger-size: 256KiB` when using this feature with file log sinks. 120725: streamingccl: add show tenant without pts test r=stevendanna a=msbutler This patch adds a test for the fix in #120434 Epic: none Release note: none Co-authored-by: Alex Barganier <[email protected]> Co-authored-by: Michael Butler <[email protected]>
ebembi-crdb
added a commit
to ebembi-crdb/cockroach
that referenced
this issue
Sep 28, 2024
Informs: cockroachdb#72452 For some time, two buffering mechanisms have existed within the log package. One is controlled by `buffered-writes`, which is only used by the file sink. `buffered-writes` buffers writes until a buffer is full, after which it flushes the buffer to an underlying file. This is not an asynchronous operation. Separately, the `bufferedSink` was introduced to provide an async buffering mechanism. This was originally intended for network log sinks such as the `fluentSink` and `httpSink`. Originally, the `bufferedSink` was not allowed for use with the `fileSink` as we already had a buffering mechanism available. Today, we find ourselves in a situation where we aim to make CRDB even more resilient in the face of disk failures. However, some of these efforts like WAL failover are unable to achieve their goals if the logging subsystem isn't resilient to disk unavailability as well. Today, if a file sink is configured to write to an unavailable disk, any goroutine making a logging call to one of those sinks will hang as the underlying i/o operations used are unable to interact with the disk. This can have a severe negative impact on cluster performance and throughput, even when using `buffered-writes: true`. As a way to alleviate this pain, this patch experimentally enables the usage of the `bufferedSink` with the `fileSink`. The `bufferedSink` is used to "wrap" an underlying sink (such as a `fileSink`), and asynchronously outputs buffered data to said sink. It is especially good at shielding writers from any problems with the underlying sink, which makes it a great candidate to help us accomplish our resiliency goals when it comes to disk outage scenarios. Now, users can leverage the `bufferedSink` in `file-defaults` and `file-groups by using the `buffering` config option. For example: ``` file-defaults: buffered-writes: false buffering: max-staleness: 1s flush-trigger-size: 256KiB max-buffer-size: 50MiB ``` Note that we don't allow both `buffered-writes` and `buffering` to be used together. This patch adds a validation step to ensure this. We recommend a `max-staleness: 1s` and `flush-trigger-size: 256KiB` when using this feature with file log sinks. Release note (ops change): Users have the ability to update their log configuration to enable asynchronous buffering of `file-group` log sinks via the `buffering` [configuration options](https://www.cockroachlabs.com/docs/stable/configure-logs#log-buffering-for-network-sinks). The `buffering` configuration can be applied to `file-defaults` or individual `file-groups` as needed. Note however that `buffering` and `buffered-writes` are incompatible for the time being. If you'd like to try the `buffering` option on a file log sink, it's necessary to explicitly set `buffered-writes: false`. For example: ``` file-defaults: buffered-writes: false buffering: max-staleness: 1s flush-trigger-size: 256KiB max-buffer-size: 50MiB ``` We recommend a `max-staleness: 1s` and `flush-trigger-size: 256KiB` when using this feature with file log sinks.
ebembi-crdb
added a commit
to ebembi-crdb/cockroach
that referenced
this issue
Sep 29, 2024
Informs: cockroachdb#72452 For some time, two buffering mechanisms have existed within the log package. One is controlled by `buffered-writes`, which is only used by the file sink. `buffered-writes` buffers writes until a buffer is full, after which it flushes the buffer to an underlying file. This is not an asynchronous operation. Separately, the `bufferedSink` was introduced to provide an async buffering mechanism. This was originally intended for network log sinks such as the `fluentSink` and `httpSink`. Originally, the `bufferedSink` was not allowed for use with the `fileSink` as we already had a buffering mechanism available. Today, we find ourselves in a situation where we aim to make CRDB even more resilient in the face of disk failures. However, some of these efforts like WAL failover are unable to achieve their goals if the logging subsystem isn't resilient to disk unavailability as well. Today, if a file sink is configured to write to an unavailable disk, any goroutine making a logging call to one of those sinks will hang as the underlying i/o operations used are unable to interact with the disk. This can have a severe negative impact on cluster performance and throughput, even when using `buffered-writes: true`. As a way to alleviate this pain, this patch experimentally enables the usage of the `bufferedSink` with the `fileSink`. The `bufferedSink` is used to "wrap" an underlying sink (such as a `fileSink`), and asynchronously outputs buffered data to said sink. It is especially good at shielding writers from any problems with the underlying sink, which makes it a great candidate to help us accomplish our resiliency goals when it comes to disk outage scenarios. Now, users can leverage the `bufferedSink` in `file-defaults` and `file-groups by using the `buffering` config option. For example: ``` file-defaults: buffered-writes: false buffering: max-staleness: 1s flush-trigger-size: 256KiB max-buffer-size: 50MiB ``` Note that we don't allow both `buffered-writes` and `buffering` to be used together. This patch adds a validation step to ensure this. We recommend a `max-staleness: 1s` and `flush-trigger-size: 256KiB` when using this feature with file log sinks. Release note (ops change): Users have the ability to update their log configuration to enable asynchronous buffering of `file-group` log sinks via the `buffering` [configuration options](https://www.cockroachlabs.com/docs/stable/configure-logs#log-buffering-for-network-sinks). The `buffering` configuration can be applied to `file-defaults` or individual `file-groups` as needed. Note however that `buffering` and `buffered-writes` are incompatible for the time being. If you'd like to try the `buffering` option on a file log sink, it's necessary to explicitly set `buffered-writes: false`. For example: ``` file-defaults: buffered-writes: false buffering: max-staleness: 1s flush-trigger-size: 256KiB max-buffer-size: 50MiB ``` We recommend a `max-staleness: 1s` and `flush-trigger-size: 256KiB` when using this feature with file log sinks.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-logging
In and around the logging infrastructure.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-observability
X-anchored-telemetry
The issue number is anchored by telemetry references.
This is follow up work to #70330.
Currently file sinks support buffering via
buffered-writes
and network sinks via
buffering
.We need to merge the config mechanisms somehow.
Also we want to merge the implementations somehow.
Epic CRDB-6669
Jira issue: CRDB-11160
The text was updated successfully, but these errors were encountered: