Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support concurrent file storage extensions #10915

Closed
pmm-sumo opened this issue Jun 13, 2022 · 11 comments · Fixed by #13418
Closed

Support concurrent file storage extensions #10915

pmm-sumo opened this issue Jun 13, 2022 · 11 comments · Fixed by #13418
Labels

Comments

@pmm-sumo
Copy link
Contributor

pmm-sumo commented Jun 13, 2022

Is your feature request related to a problem? Please describe.
The contrib has two storage extension implementations - dbstorage and filestorage, for which there are several use cases:

  • persistent buffering in queued_retry (essentially any exporter)
  • filelog receiver, which when storage extensions is enabled will automatically use it for persisting status of file tailing

Currently, only one instance of the storage extension can be used, which is limiting the flexibility to have separate config for each of the purposes. The limitation is likely to avoid referencing the storage ID and instead relying on using the only available one (example) without the need to reference it, though it is also somewhat limiting

It was brought during the review of enabling persistent queue capability by default in the collector build. The proposal was to include additional property, where the specific instance could be references, e.g.

  storage: file_storage/foo

...
extensions:
  file_storage/foo:
    directory: foo

Describe the solution you'd like

Deprecate the current automated attachment of file storage extension to filelog receiver and persistent_storage_enabled flag of queued_retry. Provide new configuration property storage which could be used to specify which file storage extension should be used when more than one is available.

The (draft) implementation for enabling persistent queue addresses this change for queued_retry

@pmm-sumo
Copy link
Contributor Author

Bringing this to wider audience @djaglowski @tigrannajaryan

@djaglowski
Copy link
Member

The limitation is likely to avoid referencing the storage ID and instead relying on using the only available one (example) without the need to reference it, though it is also somewhat limiting

Actually this was just a shortsighted way to make use of an extension. I was not aware at the time of how extension were typically specified to other components.

I think your proposal generally makes sense but I wonder if it is necessary to require that the extension ID be specified. Could the previous behavior be allowed when 1) no extension id is specified on the component, and 2) only one storage extension is configured? I believe this would make this a non-breaking change and therefore be less disruptive to current users.

@pmm-sumo
Copy link
Contributor Author

On the previous behavior, I am pretty sure we should keep it at least for the intermediate period (to make the change non-breaking)

I wonder if it is necessary to require that the extension ID be specified

This was my understanding of what @bogdandrutu was proposing Though I am not sure if we want to deprecate it in favor of explicitly specifying the storage extension ID or we should keep it as alternative way when only one storage extension is defined

@tigrannajaryan
Copy link
Member

I think your proposal generally makes sense but I wonder if it is necessary to require that the extension ID be specified. Could the previous behavior be allowed when 1) no extension id is specified on the component, and 2) only one storage extension is configured? I believe this would make this a non-breaking change and therefore be less disruptive to current users.

I would be in favour of this. Not only it is backwards compatible, but I think it is also a good user experience to require minimal configuration. When there is only one storage extension enable, and a component needs a storage I think it is reasonable to choose the only available extension without explicit config setting. I don't think there is anything surprising in this behavior. Of course the user is also free to explicitly specify the storage extension to use even if only one is available.

@pmm-sumo
Copy link
Contributor Author

I think it's a fair point @tigrannajaryan @djaglowski And in case two file storage extensions are available we should fail with a message saying that user must select which one to choose for a given component using storage: ... property?

@djaglowski
Copy link
Member

@pmm-sumo, that sounds correct to me.

@tigrannajaryan
Copy link
Member

@pmm-sumo sounds good to me.

@djaglowski djaglowski added the priority:p2 Medium label Jun 17, 2022
@andrzej-stencel
Copy link
Member

andrzej-stencel commented Jun 20, 2022

Hi folks,

I think my question is relevant to this issue, but please let me know if I should rather open a separate issue to discuss this:

Question:

How can the user disable persistence for a component that uses storage by default (like e.g. Filelog receiver)?

  1. Should the component accept a special value to the storage property, for example storage: none?
  2. Should the component provide a separate property like storage_enabled?
  3. Other?

Context:

In Przemek's exporterhelper pull request, persistence is not enabled by default, which is consistent with the behavior before this change. If I'm reading the code correctly, persistence is only enabled if the user sets persistent_storage_enabled: true and/or sets storage property to the name of a configured storage extension.

The Filelog receiver is different in that it uses storage by default, as long as there is one configured. I assume in future the Filelog receiver should also provide the storage property that will allow the user to specify which storage extension to use with a specific Filelog receiver. I think it should also be possible to disable persistent storage for such a component, that's where the question comes from.

I'm just implementing persistence for another component and would like to use configuration that would be consistent across various OpenTelemetry components.

@tigrannajaryan
Copy link
Member

  1. Should the component accept a special value to the storage property, for example storage: none?

I think this is the right way. We can use empty string and/or null yaml value to indicate the storage is disabled.

@djaglowski
Copy link
Member

Based on additional discussion on (collector)#5784, I'm proposing that we standardize on Przemek's original proposal of requiring explicit configuration to attach a storage extension to another component. This means deprecating the current "autoattach" behavior on the stanza-based receivers. These are still marked as alpha, so a breaking change to configuration is technically allowable.

@djaglowski
Copy link
Member

A summary of a conversation about this topic in the Collector SIG today is here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment