
[exporter/tanzuobservability] Make metrics stanza optional in config. #9098

Merged (4 commits) on May 5, 2022

Conversation


@keep94 keep94 commented Apr 5, 2022

Description:
Fixed a bug: the collector failed to start when the tanzuobservability exporter was configured with traces only (no metrics) and the traces endpoint pointed anywhere other than localhost. This is a regression because traces support came first and metrics support was added later; a user who has only traces configured for the tanzuobservability exporter should still be able to start the collector after upgrading to a version that also supports metrics.

To fix this, I changed the Metrics field in our config struct from a value field to a pointer field, so that it is nil when the stanza is missing. If metrics aren't configured in the config file, we don't send any metrics.
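A minimal sketch of the pointer-field approach described above (the struct, field, and tag names are illustrative, not the exporter's actual definitions):

```go
package main

// TracesConfig holds the traces-specific settings (hypothetical).
type TracesConfig struct {
	Endpoint string `mapstructure:"endpoint"`
}

// MetricsConfig holds the metrics-specific settings (hypothetical).
type MetricsConfig struct {
	Endpoint string `mapstructure:"endpoint"`
}

// Config uses a pointer for Metrics so that an absent `metrics:` stanza
// unmarshals to nil instead of a zero-valued struct.
type Config struct {
	Traces  TracesConfig   `mapstructure:"traces"`
	Metrics *MetricsConfig `mapstructure:"metrics"`
}

// hasMetrics reports whether the user actually configured the metrics stanza.
func (c *Config) hasMetrics() bool {
	return c.Metrics != nil
}
```

With a value field, an omitted stanza is indistinguishable from an empty one; the nil check makes the two cases distinct.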

Link to tracking Issue: Not available to public.

Testing:

  1. Unit tests.
  2. Verified the collector starts when the traces endpoint is http://192.168.XX.XX:30001 and metrics are not configured, and that traces are actually sent.
  3. Verified that metrics are sent when the metrics endpoint is set to http://192.168.XX.XX:2878.

Documentation: Changes to README.md

@keep94 keep94 requested review from a team and codeboten April 5, 2022 21:49
@keep94 keep94 force-pushed the bug27996 branch 3 times, most recently from f5ed447 to f23c8c3 Compare April 9, 2022 00:29

keep94 commented Apr 15, 2022

It appears that unit tests in exporter/prometheusremotewriteexporter are failing, which is a different exporter from the one this PR modifies. What can be done to resolve this?

@jpkrohling
Member

Failed tests restarted.


keep94 commented Apr 19, 2022

Even though the tests were restarted, github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusremotewriteexporter is still failing, and I am not sure why. If possible, we would like to have this merged before the next big release because it fixes a severe backward-compatibility bug with the config file.


keep94 commented Apr 20, 2022

Hi. Thank you for restarting the tests. It looks like the build for this PR is now green. Are there any further comments on this PR? As a gentle reminder, we want to get this into a release as soon as possible because it fixes a major backward-compatibility bug. Thank you.


keep94 commented Apr 22, 2022

Hello. This code has been approved by my colleagues and the build is green. Is anything else needed for this code to be reviewed? This PR fixes a backward-compatibility bug in this exporter's configuration, so we would like it merged and released sooner rather than later. Thank you for your time; I hope to hear from you soon.


keep94 commented Apr 25, 2022

@codeboten Hello. This PR, #9098, still needs to be reviewed by an owner, and I see you are listed as a reviewer. Is there anything you need from us to facilitate the review? As a gentle reminder, this PR fixes a major backward-compatibility bug in the config file, so we would like it merged to the main branch as soon as possible. Thank you for your attention.

```go
}
tracesURL = &url.URL{}
*tracesURL = *metricsURL
tracesURL.Host = hostWithPort(metricsURL.Host, 30001)
```
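For context, a helper like `hostWithPort` in the snippet above would replace or append the port on a host string. The actual implementation in the PR may differ; this is an assumed sketch:

```go
package main

import (
	"net"
	"strconv"
)

// hostWithPort returns host with its port replaced by port, adding a
// port if the host had none. (Assumed behavior of the helper referenced
// in the diff; the real implementation may differ.)
func hostWithPort(host string, port int) string {
	if h, _, err := net.SplitHostPort(host); err == nil {
		host = h // strip an existing port
	}
	return net.JoinHostPort(host, strconv.Itoa(port))
}
```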
@TylerHelmuth (Member) commented Apr 27, 2022

I'm a little confused. If the user does not configure traces, this adds a traces configuration using the supplied metrics endpoint, right? The part I'm missing is why the traces settings need to be filled out for the exporter to work when it is only defined in a metrics pipeline. Or is the issue that users are adding the exporter to the traces pipeline without supplying a traces configuration?

@pmm-sumo (Contributor):

Good point. I think the intention (please correct me, @keep94) was to be able to specify the metrics configuration and the traces configuration separately, and when only one is available, assume the other is like it but on a different (default) port.

Perhaps that's something specific to tanzuobservability, but if not, I would suggest considering an approach similar to the one used in OTLP/HTTP: either a single endpoint is configured (in this case, that would be the hostname), or specific traces_endpoint/metrics_endpoint/logs_endpoint values.

Also, if the endpoint for a signal is not defined but that signal is used in the pipeline, it should just trigger an error.

@keep94 (Contributor, Author):

@pmm-sumo you are correct. If only the traces configuration is present, we assume the metrics configuration is like it but on a different port. Initially I wanted to throw an error if our tanzuobservability exporter is used in a metrics pipeline but no metrics configuration is defined, but I wasn't able to figure out how to do this in the code. How can I tell from within the code whether my exporter is used in a metrics pipeline?

@keep94 (Contributor, Author):

I haven't verified this yet, but perhaps the createMetricsExporter function is only called when our exporter is used in a metrics pipeline. If so, all I have to do is return an error from createMetricsExporter when there is no metrics configuration. But I just remembered that tanzuobservability has a unique requirement: our traces exporter uses both the metrics and the traces endpoint. Before we added the metrics exporter, we hardcoded the metrics endpoint to be the same as the traces endpoint but on port 2878. Now that we have both exporters, we want the traces exporter to use the same metrics endpoint as the metrics exporter.
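If createMetricsExporter really is only invoked for metrics pipelines, the check could look roughly like this. A hypothetical sketch: the type and function names mirror the discussion, not the actual factory code.

```go
package main

import "errors"

// MetricsConfig and Config are simplified stand-ins for the exporter's
// real config types (illustrative only).
type MetricsConfig struct{ Endpoint string }

type Config struct {
	Metrics *MetricsConfig
}

// createMetricsExporter rejects a nil metrics section, on the assumption
// that it is only called when the exporter appears in a metrics pipeline.
func createMetricsExporter(cfg *Config) error {
	if cfg.Metrics == nil {
		return errors.New("exporter is in a metrics pipeline but the metrics stanza is missing")
	}
	// ... construct and return the real metrics exporter here ...
	return nil
}
```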

@pmm-sumo (Contributor):

Initially I wanted to throw an error if our tanzuobservability exporter is used in the metrics pipeline but no metrics configuration is defined, but I wasn't able to figure out how to do this in the code.

Unfortunately, Validate() can only be used in the context of Config, without awareness of which pipeline it is in. You might still throw an error when, e.g., createTraces is called and not all of the config is available.

Another option is that you might prefer a hostname plus port numbers. The port numbers would have default values, so they would always point somewhere. This has the limitation of not supporting URLs with paths.

Looking at the ideas, I would circle back to how the otlphttp exporter does it. Perhaps you could have three fields:

  • hostname - if set, the default port numbers are used and you can support both metrics and traces
  • metrics_endpoint - by default "https://<hostname>:2878", but can be overridden here if needed
  • traces_endpoint - by default "https://<hostname>:30001", but can be overridden here if needed

So a valid configuration would be:

hostname: myserver.example.com

or:

hostname: myserver.example.com
metrics_endpoint: https://myotherserver.example.com:2878

or:

traces_endpoint: https://myserver1.example.com:30001
metrics_endpoint: https://myotherserver.example.com:2878

@TylerHelmuth (Member):

I like this idea a lot. I think it will simplify config.go considerably, since you can be sure there is always an endpoint to use.

@oppegard (Member):

Hi @pmm-sumo and @TylerHelmuth: I'm a maintainer of this exporter (in addition to @keep94), and I had a question. I'll let Travis consider your suggestions and reply on his own.

Your config changes are backwards-incompatible with our existing yaml. Is there an otel-collector procedure or policy for introducing breaking changes like this? Whether or not we go down that path in this PR, it'd be good to know how to go about breaking changes in the future. I'm pretty sure I've seen other exporters change or deprecate config options, so I do think it's possible. Thanks!

@pmm-sumo (Contributor):

Hi @oppegard, I think this config change could be done in a non-breaking way: e.g., the new configuration fields could be added, and when a user uses the old one (essentially "endpoint"), a deprecation warning shows up in the logs.
After some time, the old configuration fields can be removed.

Ultimately, I think it's your call since this is your exporter :)
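A non-breaking migration along these lines might look like the following sketch (the field names and the applyLegacyFields helper are hypothetical, not the exporter's actual code):

```go
package main

import "log"

// Config keeps the legacy field alongside the new one during the
// deprecation window (illustrative field names).
type Config struct {
	Endpoint       string // deprecated: use TracesEndpoint instead
	TracesEndpoint string
}

// applyLegacyFields logs a deprecation warning when the legacy field is
// set and maps it onto the new field if that field is still empty.
func applyLegacyFields(cfg *Config) {
	if cfg.Endpoint != "" {
		log.Println("warning: 'endpoint' is deprecated; use 'traces_endpoint' instead")
		if cfg.TracesEndpoint == "" {
			cfg.TracesEndpoint = cfg.Endpoint
		}
	}
}
```

Old configs keep working and emit a warning; once the window closes, the legacy field and this shim can be deleted together.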

@TylerHelmuth (Member):

Example of deprecated config:

// Legacy support for deprecated Metrics config

Agree with @pmm-sumo, it's your exporter so you can handle this however you want.

@oppegard (Member):

Thanks @pmm-sumo and @TylerHelmuth!

@keep94 keep94 force-pushed the bug27996 branch 2 times, most recently from 374fac7 to 860e98f Compare April 29, 2022 00:01
@TylerHelmuth (Member)

Just did a quick review of this, and so far I find this iteration easier to interpret. If I am reading things correctly, both a traces endpoint and a metrics endpoint can be supplied, but they must have the same hostname. The traces endpoint must be supplied, while the metrics endpoint is optional. If the metrics endpoint is not supplied, a default port is used with the traces endpoint's host. If a metrics endpoint is supplied, its port is used as the metrics port.

My only question right now is did you mean to remove your updates to the README.md? That was a beautiful README file with some great detail.


keep94 commented Apr 29, 2022

Hello. Based on what was discussed, I now have brand new code. I will reach out again once my team has had a chance to approve the PR.


keep94 commented Apr 29, 2022

@TylerHelmuth Thanks for pointing out the missing README.md; I will work on getting those changes back into this PR. As for your comment: both the traces and metrics endpoints are optional, so neither has to be supplied. But if there is a traces pipeline, the traces endpoint must be supplied, and if there is a metrics pipeline, the metrics endpoint must be supplied.


keep94 commented May 1, 2022

Hello everyone. My tech lead approved these changes yesterday. You may resume reviewing this PR. Thank you.

@pmm-sumo (Contributor) left a comment:

I think it looks OK; it needs a rebase though (we have released v0.50.0 in the meantime and this impacts the changelog).

@TylerHelmuth (Member)

@TylerHelmuth Thanks for pointing out the missing README.md, I will work on getting those changes back in this PR. As for your comment, both the traces and metrics endpoints are optional, neither have to be supplied, but if there is a traces pipeline, the traces endpoint must be supplied. If there is a metrics pipeline, the metrics endpoint must be supplied.

I either missed the metrics_exporter.go file or I took a look too soon. I looked over things again today, and I continue to like this iteration more than the original.


keep94 commented May 3, 2022

Hello. I just rebased this PR to resolve the conflicts in CHANGELOG.md.


keep94 commented May 3, 2022

It looks like CHANGELOG.md has become out of date again since I rebased this morning. Is there a way we can coordinate a time for me to rebase so that somebody can merge immediately afterward?


keep94 commented May 4, 2022

Just rebased to resolve conflicts with CHANGELOG.md


keep94 commented May 4, 2022

Just now rebased to resolve conflicts in CHANGELOG.md. Please consider merging before CHANGELOG.md changes again. There are failing load tests, but I believe these failures are transient, because the only changes made to this PR since it was approved were to resolve conflicts in CHANGELOG.md.

@codeboten (Contributor) left a comment:

@pmm-sumo please approve if your comments have been addressed 👍


pmm-sumo commented May 4, 2022

@pmm-sumo please approve if your comments have been addressed 👍

Sure, will take a look later today

@pmm-sumo (Contributor) left a comment:

Some nits, but none of them are blocking from my end.


pmm-sumo commented May 4, 2022

It looks like CHANGELOG.md has become out of date again since I rebased this morning. Is there a way we can coordinate a time for me to rebase so that somebody can merge immediately afterward?

I think such conflicts are actually trivial to resolve, and this is usually done by one of the maintainers when merging the PR. So no need to worry about these :)

@codeboten codeboten added the ready to merge Code review completed; ready to merge by maintainers label May 5, 2022
@codeboten codeboten merged commit 78da6cd into open-telemetry:main May 5, 2022
djaglowski pushed a commit to djaglowski/opentelemetry-collector-contrib that referenced this pull request May 10, 2022
…open-telemetry#9098)

* [exporter/tanzuobservability] Make metrics stanza optional in config.

* Minor changes from pmm-sumo.

* Add distribution port for delta histograms.

Co-authored-by: Alex Boten <[email protected]>