Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[extension/healthcheckv2] Ability to access the status.Aggregator from another extension #34692

Closed
blakerouse opened this issue Aug 14, 2024 · 8 comments · Fixed by #35648
Closed
Labels

Comments

@blakerouse
Copy link
Contributor

Component(s)

extension/healthcheckv2

Is your feature request related to a problem? Please describe.

Currently the opampextension uses the original healthcheckextension and it determines if the collector is healthy by performing an HTTP request to its own process to determine if the collector is health. https://github.com/open-telemetry/opamp-go/blob/ed38d5f4bf930b57e04581919fbfb676aaa5a5af/internal/examples/supervisor/supervisor/supervisor.go#L435

This is not great from the standpoint of requiring the healthcheck endpoint to be exposed on the host just to get health of the current process that the opampextension is also running in. If you don't want any endpoint exposed then opampextension is unable to determine health. It is also very inefficient to determine status as it requires the a HTTP connection from its own process to the same process to just get status information. Another possible issue is IP table rules that prevent localhost port access prevent the ability to get this information (rare, but I have see it happen). It would be much better if this could be done in process.

It would be better if an extension could access the healthcheckv2extension internal status.Aggregator so it can determine the status of the collector without having to perform HTTP/GRPC requests and expose an endpoint on the host.

Describe the solution you'd like

It would be great if something like the following could be done from inside another extension to access the aggregator:

func (hc *exampleExtension) Start(ctx context.Context, host component.Host) error {
	extensions := host.GetExtensions()
	component, ok := extensions[component.MustNewID("healthcheckv2")]
	if !ok {
		return fmt.Errorf("failed to find healthcheckv2 extension; required dependent")
	}
	aggregatorAccess, ok := component.(healthcheckv2extension.AggregatorAccess)
	if !ok {
		return fmt.Errorf("failed to find healthcheckv2 extension; unable to convert to AggregatorAccess interface")
	}
	aggregator := aggregatorAccess.GetAggregator()
	// can now call aggregator.AggregateStatus or aggregator.Subscribe
	return nil
}

Another option would just to expose AggregateStatus and Subscribe directly on the healthcheckv2extension through an interface without exposing the entire function interface to the caller because RecordStatus should not really be called externally.

Describe alternatives you've considered

Another alternative would allow an extension to register a sub-component inside of the healthcheckv2extension. This would allow another extension to expose a different way of getting the health check information versus only HTTP and/or GRPC. This could be done using the same approach:

func (hc *exampleExtension) Start(ctx context.Context, host component.Host) error {
	extensions := host.GetExtensions()
	component, ok := extensions[component.MustNewID("healthcheckv2")]
	if !ok {
		return fmt.Errorf("failed to find healthcheckv2 extension; required dependent")
	}
	addComponent, ok := component.(healthcheckv2extension.AddSubcomponentInterface)
	if !ok {
		return fmt.Errorf("failed to find healthcheckv2 extension; unable to convert to AddSubcomponentInterface interface")
	}
        component := createNewSubComponent(ctx)
	return addComponent.AddComponent(component)
}

Downside of this approach is if the extension sets the healthcheckv2extension as a Dependent() then it will already be started (I believe) before Start(...) is called on the extension. The AddComponent function would then need to start the sub-component at the time of registering. I also believe this will require exposing more internal interfaces and types to extensions that might not be desired.

Additional context

No response

@blakerouse blakerouse added enhancement New feature or request needs triage New item requiring triage labels Aug 14, 2024
@blakerouse blakerouse changed the title Ability to access the status.Aggregator from another extension [extension/healthcheckv2] Ability to access the status.Aggregator from another extension Aug 14, 2024
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@blakerouse
Copy link
Contributor Author

blakerouse commented Aug 15, 2024

I would be interested in working on this once an agreement is made on the design and if this is something that is wanted.

@jpkrohling
Copy link
Member

cc @mwear

@mwear
Copy link
Member

mwear commented Sep 25, 2024

Thanks for the detailed writeup @blakerouse. I don't have any strong objections to exposing the status aggregator to other extensions, but I will also mention another use case that I believe we eventually want to address, and an additional alternative to consider. I believe it would be useful for the OpAMP extension to use the status aggregator for the component status message. I haven't looked into the feasibility of this just yet, but it's been on my radar. If we extracted the status aggregator into opentelemetry-collector-contrib/pkg than any extension could depend on it and implement the componentstatus.Watcher interface. The upside, is that any extension that needs a status.Aggregator can use the package. The downside, is if you really wanted to use say, the healthcheckv2extension, the opampextension, and a custom extension, all would have their own aggregator. I think this would be useful to drop into #otel-collector-dev if anyone has any thoughts on the best direction.

@mwear
Copy link
Member

mwear commented Sep 26, 2024

After thinking about this a little more, I do think that exposing an in-process healthcheck API, from the healthcheckv2extension makes sense. I'd be more inclined to expose an API interface that includes the AggregateStatus and Subscribe methods of the underlying aggregator, rather than the aggregator itself.

@blakerouse
Copy link
Contributor Author

blakerouse commented Sep 27, 2024

@mwear I am open to either. I am leaning more towards your original comment of just placing the status aggregator in the opentelemetry-collector-contrib/pkg. That would allow other extensions to easily implement the componentstatus.Watcher and just use the aggregator.

If going down the path of just exposing the methods under the healthcheckv2extension does mean from a user standpoint that they need to configure the healthcheckv2extension as well as the extension that depends on it. If that extension is just trying to get internal health information about the collector, just being able to do that internally using the status aggregator would be very nice.

Really the status aggregator doesn't need to be exposed in opentelemetry-collector-contrib/pkg, just making it importable from the healthcheckv2extension module would be enough for others to use it.

If your open to it I could do a PR to just expose the status aggregator under the healthcheckv2extension module.

@atoulme atoulme removed the needs triage New item requiring triage label Oct 2, 2024
@mwear
Copy link
Member

mwear commented Oct 3, 2024

I think it would be preferable to move the aggregator to pkg rather than exporting it from the extension. Ideally we will keep the public API of the extension as small as possible. I think having the aggregator in pkg should work for the OpAMP extension, and it sounds like it'd work for your use case as well.

@blakerouse
Copy link
Contributor Author

@mwear Thanks for the feedback. I will work on a PR to move it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants