[Metricbeat] Simplify testing http Metricbeat modules #10648
Conversation
I like the overall idea, it looks like a quick win for some http modules. I also like the idea of making things similar between filebeat and metricbeat modules.
But I think it is going to be complicated to extend it for the future points you mention: if we want to mock some services we are going to need logic for different endpoints, different responses on the same endpoint, or different protocols. If we do all of this in a single generic place, it is going to grow in complexity.
It works reasonably well in Filebeat because there we read from files most of the time, but it starts to get tricky even there when we want to test with other inputs (#8140) or configs (#10182).
Another possible approach could be to mimic httptest and have some kind of helper to create mock servers: a generalization of this for HTTP, and similar helpers for other protocols. We'd still need code to define the tests, but I think we are going to need Go code for most Metricbeat modules anyway.
(And btw, +1000 to the idea of bringing the check for documented fields to Go.)
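To make the mock-server idea concrete, here is a minimal sketch of such an httptest-based helper, loosely generalizing what the rabbitmq test server does today. The package and function names (`mtest`, `NewJSONServer`) are assumptions for illustration, not existing beats code:

```go
package mtest

import (
	"net/http"
	"net/http/httptest"
	"os"
	"testing"
)

// NewJSONServer starts an HTTP server that answers every request with the
// contents of the given fixture file. A metricset under test can then be
// pointed at server.URL instead of a real service.
func NewJSONServer(t *testing.T, fixturePath string) *httptest.Server {
	body, err := os.ReadFile(fixturePath)
	if err != nil {
		t.Fatalf("reading fixture %s: %v", fixturePath, err)
	}
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write(body)
	}))
}
```

Similar small helpers could be written for other protocols, which keeps each mock simple instead of growing one generic implementation.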
I take this to mean that the idea here is that the Docker environments would still be used in CI as (I think?) they are now, and that this new technique would just be something lighter and faster that a developer could use directly?
@cachedout Exactly. Taking the Elasticsearch module as an example here: long term I expect us to run the tests against many different versions to detect edge cases which might not be covered in the JSON files. These tests would not run as part of each PR but only on master, or even in a separate testing environment, and report in case they find some edge cases. Each module would then provide a test matrix config specifying against which versions things should be tested.
@jsoriano The changes here definitely got heavily influenced by https://github.com/elastic/beats/blob/master/metricbeat/module/rabbitmq/mtest/server.go and I expect us to end up with something similar to that, but in a more generic way. I initially started this PR with a more complex config for the tests but then threw it all away as I realised that we can probably cover 80% of the http modules with a much simpler approach. What about the other 20%? We can either extend the framework to fit, build something specific for these cases, or keep them in Go code. All of this is fine as long as all the test frameworks stay reasonably simple.

In general I agree with all the abstraction requests you made above and I'm pretty sure at one stage we will need to implement them. But I prefer to implement them when we get to the point that we need them and have a test that requires it. So instead of having one massive PR I'd rather have 20 small ones improving the tests and get them in quickly. I hope that works for you.
Currently most modules are tested against a Docker container. This leads to long setup times and potential flakiness, and it requires additional setup to test actual changes on a module without running CI. The goal of this PR is to reduce this overhead and make it possible to easily test new data sets without having to write Go code. Expected files were added to verify that changes had no effect on the generated data. The tests with the environment are still needed but should become less critical during development.

The structure and logic is inspired by the testing of the Filebeat modules. So far 3 metricsets were converted to test the implementation. It's all based on conventions:

* Test outputs from a JSON endpoint must go into `_meta/testdata`.
* A `testdata/config.yml` file must exist to specify the URL under which the testdata should be served.
* A golden file is generated by adding `-expected.json`.

For a metricset to be converted it must use the reporter interface, be HTTP and JSON based, and only request one endpoint at a time. All metricsets should be converted to the reporter interface.

As there is now a more global view on the testing of a metricset, this code can potentially also take over the check that all fields are documented, or at least the generated files can be used for these checks.

To support metricsets which generate one or multiple events, the output is always an array of JSON objects. These arrays can also contain errors, meaning invalid data can be tested as well.

The `data.json` we had so far was hard to update and changed every time it was regenerated because it was pulled from a live instance. For the metricsets that are switched over to this testing, that is not the case anymore. The `data.json` is generated from the first event in the `docs.json`. This is by convention and allows having a `docs.json` with an especially interesting event. This should also make condition checks for which event should be shown partially obsolete.

Future work:

* Support multiple endpoints: the Elasticsearch metricsets do not work with the above model yet as they need multiple endpoints to be available at the same time. Config options for this could be introduced.
* Support more than .json: currently only .json is supported. More config options could be added to support other data formats, for example for the apache module.
* Support other protocols than HTTP: not all modules are HTTP based; two or three other common protocols could be added.
* Extend with additional config options: some metricsets need additional config options to be set for testing. It should be possible to pass these as part of the config.yml file.
* A `data_test.go` file is still needed. This works for now but potentially should be automated.

The overall goal of all the above is to have Metricbeat modules become more and more config based instead of Golang code based.
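To sketch how such a convention-based test could be wired up, here is a hedged example: it serves a recorded response from `_meta/testdata` over a local HTTP server, fetches events through the metricset, and compares them to the `-expected.json` golden file. The `mbtest` helper names are written from memory, and the module/metricset names, file names, and comparison logic are illustrative only; the real framework may normalize events differently:

```go
package somemodule_test

import (
	"bytes"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"os"
	"testing"

	mbtest "github.com/elastic/beats/metricbeat/mb/testing"
)

func TestFetchAgainstGoldenFile(t *testing.T) {
	input := "_meta/testdata/docs.json" // recorded service response
	golden := input + "-expected.json"  // golden file with the expected events

	// Serve the recorded response instead of talking to a real service.
	body, err := os.ReadFile(input)
	if err != nil {
		t.Fatal(err)
	}
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write(body)
	}))
	defer server.Close()

	// Module and metricset names are placeholders.
	ms := mbtest.NewReportingMetricSetV2(t, map[string]interface{}{
		"module":     "somemodule",
		"metricsets": []string{"status"},
		"hosts":      []string{server.URL},
	})
	events, errs := mbtest.ReportingFetchV2(ms)
	if len(errs) > 0 {
		t.Fatalf("fetch returned errors: %v", errs)
	}

	got, err := json.MarshalIndent(events, "", "    ")
	if err != nil {
		t.Fatal(err)
	}
	want, err := os.ReadFile(golden)
	if err != nil {
		t.Fatal(err)
	}
	if !bytes.Equal(got, want) {
		t.Errorf("generated events do not match golden file %s", golden)
	}
}
```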
66cb518 to 976e12c
jenkins, test this
This change is based on #10648 to migrate to golden files instead of the dynamically generated data files. It also adds support for query params to the testing framework.
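As a rough illustration of what query-param support could mean on the mock-server side (not the actual framework code; the helper name and dispatch rule are assumptions), the server can pick a fixture based on the request's query string:

```go
package mtest

import (
	"net/http"
	"net/http/httptest"
	"os"
)

// NewQueryAwareServer serves a different fixture file depending on the raw
// query string of the request, so one mock server can answer several
// variants of the same endpoint.
func NewQueryAwareServer(fixtures map[string]string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		path, ok := fixtures[r.URL.RawQuery]
		if !ok {
			http.NotFound(w, r)
			return
		}
		body, err := os.ReadFile(path)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.Write(body)
	}))
}
```

A test could then register, say, `fixtures["level=cluster"] = "_meta/testdata/cluster.json"` and point the metricset at `server.URL`; the query string and file name here are made up for the example.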