-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
exporter/prometheusremotewrite: add a Write-Ahead-Log #3017
exporter/prometheusremotewrite: add a Write-Ahead-Log #3017
Conversation
Work in progress |
08c7661
to
e91c130
Compare
Converting to draft per author comment that this is WIP |
Thanks Bogdan! Just sending up some tests soon and then it’ll be ready.
…On Mon, Apr 26, 2021 at 10:41 AM Bogdan Drutu ***@***.***> wrote:
Converting to draft per author comment that this is WIP
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#3017 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABFL3VZHTQZKRGFNSLJFTKLTKWQVPANCNFSM43Q5IXXA>
.
|
c6ba14f
to
d9a6ab1
Compare
Alrighty, tests added and ready for action! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an initiative to add generic persistent queues to all exporters: https://docs.google.com/document/d/1Y4vNthCGdYI61ezeAzL5dXWgiZ73y9eSjIDitk3zXsU/edit#
I do not think we need to have a custom solution just for this exporter unless we believe it is really necessary due to this exporter having unique requirements (in which case I would like to know what they are).
Thank you for the comment @tigrannajaryan! We talked about this a few times in a couple of meeting, both the Prometheus Working Group and the Agent working group. Basically we are taking a pragmatic approach to meet our deadlines, AWS want to get the capability in before a deadline that's coming close and whenever the generic WAL is added, the code can simply be moved. |
972869e
to
21b720e
Compare
c95cddb
to
087e159
Compare
Reviews addressed, please help me with another wave @rakyll @bogdandrutu @tigrannajaryan and thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, can be merged if other reviewers also approve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generally looks good, added a few comments. We can follow up with improvements.
return prwe.wal.WriteBatch(batch) | ||
} | ||
|
||
func (prwe *PrwExporter) readFromWALThenExport(ctx context.Context) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are we flushing the WAL at a Shutdown?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On shutdown, the WAL's .Close() method is invoked and that toggles a shutdown, before invoking this method in (*PrwExporter).turnOnWALIfEnabled(), we derive a cancellable context.Context and after the prwe.closeChan receives a value, it also invokes cancel() which ensures that this method will terminate alright.
080f211
to
19a51fc
Compare
@tigrannajaryan @rakyll @bogdandrutu addressed please take a look, thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rebase from main and make sure the build passes.
This change completes the WAL capability for the Prometheus Remote-Write exporter that allows recovery of data if the prior write requests weren't yet exported. This wires up a Write-Ahead-Log (WAL) that was implemented in PR open-telemetry#3597, which was split off from PR open-telemetry#3017. Note: there is very rare condition for which we can perhaps send the same data many times and it can happen if we haven't yet truncated the WAL, the RemoteWrite endpoint received the prior data but the process received a Ctrl+C, kill signal. However, this will be very rare and would have to be timed so fast and precisely. Replaces PR open-telemetry#3017 Updates PR open-telemetry#3597 Fixes open-telemetry/prometheus-interoperability-spec#9
This change completes the WAL capability for the Prometheus Remote-Write exporter that allows recovery of data if the prior write requests weren't yet exported. This wires up a Write-Ahead-Log (WAL) that was implemented in PR open-telemetry#3597, which was split off from PR open-telemetry#3017. Note: there is very rare condition for which we can perhaps send the same data many times and it can happen if we haven't yet truncated the WAL, the RemoteWrite endpoint received the prior data but the process received a Ctrl+C, kill signal. However, this will be very rare and would have to be timed so fast and precisely. Replaces PR open-telemetry#3017 Updates PR open-telemetry#3597 Fixes open-telemetry/prometheus-interoperability-spec#9
This change completes the WAL capability for the Prometheus Remote-Write exporter that allows recovery of data if the prior write requests weren't yet exported. This wires up a Write-Ahead-Log (WAL) that was implemented in PR open-telemetry#3597, which was split off from PR open-telemetry#3017. Note: there is very rare condition for which we can perhaps send the same data many times and it can happen if we haven't yet truncated the WAL, the RemoteWrite endpoint received the prior data but the process received a Ctrl+C, kill signal. However, this will be very rare and would have to be timed so fast and precisely. Replaces PR open-telemetry#3017 Updates PR open-telemetry#3597 Fixes open-telemetry/prometheus-interoperability-spec#9
This change completes the WAL capability for the Prometheus Remote-Write exporter that allows recovery of data if the prior write requests weren't yet exported. This wires up a Write-Ahead-Log (WAL) that was implemented in PR open-telemetry#3597, which was split off from PR open-telemetry#3017. Note: there is very rare condition for which we can perhaps send the same data many times and it can happen if we haven't yet truncated the WAL, the RemoteWrite endpoint received the prior data but the process received a Ctrl+C, kill signal. However, this will be very rare and would have to be timed so fast and precisely. Replaces PR open-telemetry#3017 Updates PR open-telemetry#3597 Fixes open-telemetry/prometheus-interoperability-spec#9
This change completes the WAL capability for the Prometheus Remote-Write exporter that allows recovery of data if the prior write requests weren't yet exported. This wires up a Write-Ahead-Log (WAL) that was implemented in PR open-telemetry#3597, which was split off from PR open-telemetry#3017. Note: there is very rare condition for which we can perhaps send the same data many times and it can happen if we haven't yet truncated the WAL, the RemoteWrite endpoint received the prior data but the process received a Ctrl+C, kill signal. However, this will be very rare and would have to be timed so fast and precisely. Replaces PR open-telemetry#3017 Updates PR open-telemetry#3597 Fixes open-telemetry/prometheus-interoperability-spec#9
This change completes the WAL capability for the Prometheus Remote-Write exporter that allows recovery of data if the prior write requests weren't yet exported. This wires up a Write-Ahead-Log (WAL) that was implemented in PR open-telemetry#3597, which was split off from PR open-telemetry#3017. Note: there is very rare condition for which we can perhaps send the same data many times and it can happen if we haven't yet truncated the WAL, the RemoteWrite endpoint received the prior data but the process received a Ctrl+C, kill signal. However, this will be very rare and would have to be timed so fast and precisely. Replaces PR open-telemetry#3017 Updates PR open-telemetry#3597 Fixes open-telemetry/prometheus-interoperability-spec#9
This change completes the WAL capability for the Prometheus Remote-Write exporter that allows recovery of data if the prior write requests weren't yet exported. This wires up a Write-Ahead-Log (WAL) that was implemented in PR open-telemetry#3597, which was split off from PR open-telemetry#3017. Note: there is very rare condition for which we can perhaps send the same data many times and it can happen if we haven't yet truncated the WAL, the RemoteWrite endpoint received the prior data but the process received a Ctrl+C, kill signal. However, this will be very rare and would have to be timed so fast and precisely. Replaces PR open-telemetry#3017 Updates PR open-telemetry#3597 Fixes open-telemetry/prometheus-interoperability-spec#9
This change completes the WAL capability for the Prometheus Remote-Write exporter that allows recovery of data if the prior write requests weren't yet exported. This wires up a Write-Ahead-Log (WAL) that was implemented in PR open-telemetry#3597, which was split off from PR open-telemetry#3017. Note: there is very rare condition for which we can perhaps send the same data many times and it can happen if we haven't yet truncated the WAL, the RemoteWrite endpoint received the prior data but the process received a Ctrl+C, kill signal. However, this will be very rare and would have to be timed so fast and precisely. Replaces PR open-telemetry#3017 Updates PR open-telemetry#3597 Fixes #3727 Fixes open-telemetry/prometheus-interoperability-spec#9
This change completes the WAL capability for the Prometheus Remote-Write exporter that allows recovery of data if the prior write requests weren't yet exported. This wires up a Write-Ahead-Log (WAL) that was implemented in PR open-telemetry#3597, which was split off from PR open-telemetry#3017. Note: there is very rare condition for which we can perhaps send the same data many times and it can happen if we haven't yet truncated the WAL, the RemoteWrite endpoint received the prior data but the process received a Ctrl+C, kill signal. However, this will be very rare and would have to be timed so fast and precisely. Replaces PR open-telemetry#3017 Updates PR open-telemetry#3597 Fixes #3727 Fixes open-telemetry/prometheus-interoperability-spec#9
This change completes the WAL capability for the Prometheus Remote-Write exporter that allows recovery of data if the prior write requests weren't yet exported. This wires up a Write-Ahead-Log (WAL) that was implemented in PR open-telemetry/opentelemetry-collector#3597, which was split off from PR open-telemetry/opentelemetry-collector#3017. Note: there is very rare condition for which we can perhaps send the same data a couple of times, and it can happen if we haven't yet truncated the WAL, the RemoteWrite endpoint received the prior data but the process received a Ctrl+C, kill signal. However, this will be very rare and would have to be timed so fast and precisely. Replaces PR open-telemetry/opentelemetry-collector#3017 Updates PR open-telemetry/opentelemetry-collector#3597 Fixes #4751 Fixes open-telemetry/prometheus-interoperability-spec#9
* exporter/prometheusremotewrite: glue up and use Write-Ahead-Log This change completes the WAL capability for the Prometheus Remote-Write exporter that allows recovery of data if the prior write requests weren't yet exported. This wires up a Write-Ahead-Log (WAL) that was implemented in PR open-telemetry/opentelemetry-collector#3597, which was split off from PR open-telemetry/opentelemetry-collector#3017. Note: there is very rare condition for which we can perhaps send the same data a couple of times, and it can happen if we haven't yet truncated the WAL, the RemoteWrite endpoint received the prior data but the process received a Ctrl+C, kill signal. However, this will be very rare and would have to be timed so fast and precisely. Replaces PR open-telemetry/opentelemetry-collector#3017 Updates PR open-telemetry/opentelemetry-collector#3597 Fixes #4751 Fixes open-telemetry/prometheus-interoperability-spec#9 * fix go.mod Signed-off-by: Anthony J Mirabella <[email protected]> * exporter/prw: WALConfig should be exported to allow programmatic manipulation Signed-off-by: Anthony J Mirabella <[email protected]> * Add CHANGELOG entry Signed-off-by: Anthony J Mirabella <[email protected]> * fix lint error Signed-off-by: Anthony J Mirabella <[email protected]> * mod tidy Signed-off-by: Anthony J Mirabella <[email protected]> * tidy up WAL logic and comments Signed-off-by: Anthony J Mirabella <[email protected]> * Handle error from closing WAL Signed-off-by: Anthony J Mirabella <[email protected]> * Ensure WAL processor keeps running after error Signed-off-by: Anthony J Mirabella <[email protected]> * prwe/WAL: refactor WAL run loop Signed-off-by: Anthony J Mirabella <[email protected]> * make gotidy Signed-off-by: Anthony J Mirabella <[email protected]> * lint Signed-off-by: Anthony J Mirabella <[email protected]> * fix data races Signed-off-by: Anthony J Mirabella <[email protected]> * Ensure locking around entire WAL persistence routine Signed-off-by: Anthony J Mirabella <[email protected]> * Address PR feedback Signed-off-by: Anthony J Mirabella <[email protected]> * fix lint issues Signed-off-by: Anthony J Mirabella <[email protected]> * update read index inside WAL reader routine Signed-off-by: Anthony J Mirabella <[email protected]> * Update CHANGELOG.md * Undo unrelated go.mod changes Signed-off-by: Anthony J Mirabella <[email protected]> Co-authored-by: Emmanuel T Odeke <[email protected]> Co-authored-by: Alex Boten <[email protected]> Co-authored-by: Juraci Paixão Kröhling <[email protected]>
* exporter/prometheusremotewrite: glue up and use Write-Ahead-Log This change completes the WAL capability for the Prometheus Remote-Write exporter that allows recovery of data if the prior write requests weren't yet exported. This wires up a Write-Ahead-Log (WAL) that was implemented in PR open-telemetry/opentelemetry-collector#3597, which was split off from PR open-telemetry/opentelemetry-collector#3017. Note: there is very rare condition for which we can perhaps send the same data a couple of times, and it can happen if we haven't yet truncated the WAL, the RemoteWrite endpoint received the prior data but the process received a Ctrl+C, kill signal. However, this will be very rare and would have to be timed so fast and precisely. Replaces PR open-telemetry/opentelemetry-collector#3017 Updates PR open-telemetry/opentelemetry-collector#3597 Fixes open-telemetry#4751 Fixes open-telemetry/prometheus-interoperability-spec#9 * fix go.mod Signed-off-by: Anthony J Mirabella <[email protected]> * exporter/prw: WALConfig should be exported to allow programmatic manipulation Signed-off-by: Anthony J Mirabella <[email protected]> * Add CHANGELOG entry Signed-off-by: Anthony J Mirabella <[email protected]> * fix lint error Signed-off-by: Anthony J Mirabella <[email protected]> * mod tidy Signed-off-by: Anthony J Mirabella <[email protected]> * tidy up WAL logic and comments Signed-off-by: Anthony J Mirabella <[email protected]> * Handle error from closing WAL Signed-off-by: Anthony J Mirabella <[email protected]> * Ensure WAL processor keeps running after error Signed-off-by: Anthony J Mirabella <[email protected]> * prwe/WAL: refactor WAL run loop Signed-off-by: Anthony J Mirabella <[email protected]> * make gotidy Signed-off-by: Anthony J Mirabella <[email protected]> * lint Signed-off-by: Anthony J Mirabella <[email protected]> * fix data races Signed-off-by: Anthony J Mirabella <[email protected]> * Ensure locking around entire WAL persistence routine Signed-off-by: Anthony J Mirabella <[email protected]> * Address PR feedback Signed-off-by: Anthony J Mirabella <[email protected]> * fix lint issues Signed-off-by: Anthony J Mirabella <[email protected]> * update read index inside WAL reader routine Signed-off-by: Anthony J Mirabella <[email protected]> * Update CHANGELOG.md * Undo unrelated go.mod changes Signed-off-by: Anthony J Mirabella <[email protected]> Co-authored-by: Emmanuel T Odeke <[email protected]> Co-authored-by: Alex Boten <[email protected]> Co-authored-by: Juraci Paixão Kröhling <[email protected]>
…#3017) Bumps [setuptools](https://github.com/pypa/setuptools) from 67.6.1 to 67.7.1. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/CHANGES.rst) - [Commits](pypa/setuptools@v67.6.1...v67.7.1) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Follows [open-telemetry#88 (opentelemetry-log-collection)](open-telemetry/opentelemetry-log-collection#88) and [open-telemetry#3017 (opentelemetry-collector-contrib)](open-telemetry/opentelemetry-collector-contrib#3017)
Uses an off the shelf WAL implementation to add capabilities
to the Prometheus remote exporter using github.com/tidwall/wal.
By default the WAL will be on and to configure the WAL location,
please use this prometheusremotewrite exporter YAML configuration:
whose fields are quite similar to Prometheus' WAL fields per https://docs.google.com/document/d/1cCcoFgjDFwU2n823tKuMvrIhzHty4UDyn0IcfUHiyyI/edit#heading=h.mlf37ibqjgov
We are using an off-the-shelf WAL because:
By the time that we get OTLP metrics, we can convert to Prometheus proto,
but trying to implement the Prometheus storage interfaces would involve
either by-passing to-Prometheus-Proto and then save to Prometheus raw Go,
then retrieve Prometheus raw Go and then convert to Prometheus Proto.
It is much easier to get an off the shelf WAL implementation and add
it to the Prometheus implementation.
Fixes open-telemetry/prometheus-interoperability-spec#9