This repository has been archived by the owner on Jun 28, 2024. It is now read-only.

Add tracing test #1255

Closed
jodh-intel opened this issue Feb 28, 2019 · 9 comments

@jodh-intel
Contributor

As shown on kata-containers/runtime#1277, we broke runtime tracing but didn't notice.

Clearly, we need a test to ensure tracing works that does at least the following:

  • Install the Jaeger docker image.
  • Enable tracing for all components.
  • Run a docker workload.
  • Check that Jaeger allows us to query a "reasonable number" of trace spans per component:
    • The minimum number of expected spans is going to differ per component.
    • We'd always expect >0.
    • The actual figures are a little difficult to tie down. We are going to require some magic numbers alas (shudder) since, as the code changes, so does the number of trace spans generated.
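
A rough sketch of the span check described above, in shell. It assumes the Jaeger all-in-one image is running with its query service on the default port 16686, and that the HTTP endpoint `/api/traces?service=<name>` returns JSON with one `"operationName"` key per span. The function names, the `JAEGER_URL` variable and any minimum values passed in are illustrative, not real figures.

```shell
#!/usr/bin/env bash

# Assumed address of the Jaeger query service (16686 is the all-in-one default).
# e.g. started with:
#   docker run -d -p 6831:6831/udp -p 16686:16686 jaegertracing/all-in-one
jaeger_url="${JAEGER_URL:-http://localhost:16686}"

# Fetch the raw JSON for all traces recorded for a service.
# Only defined here, not called: it needs a running Jaeger instance.
fetch_traces() {
    local service="$1"
    curl -fsS "${jaeger_url}/api/traces?service=${service}"
}

# Count spans in a JSON document read from stdin.
# Assumption: each span object carries exactly one "operationName" key.
count_spans() {
    { grep -o '"operationName"' || true; } | wc -l | tr -d ' '
}

# Succeed only if the span count reaches the component's minimum.
check_min_spans() {
    local component="$1" actual="$2" minimum="$3"
    if [ "$actual" -lt "$minimum" ]; then
        echo "ERROR: ${component}: got ${actual} spans, expected at least ${minimum}" >&2
        return 1
    fi
    echo "OK: ${component}: ${actual} spans (minimum ${minimum})"
}
```

With Jaeger running, something like `check_min_spans kata-runtime "$(fetch_traces kata-runtime | count_spans)" 1` would exercise it, where `kata-runtime` is a guessed service name and `1` is the ">0" floor mentioned above.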
@jodh-intel
Contributor Author

Hi @chavafg - any thoughts on how best to arrange the pieces here? The plan is to run a basic busybox true docker test initially and check we get some traces so I was thinking of doing something like this:

  • Create a script to install Jaeger for trace collection + display (".ci/install_jaeger.sh"?)
  • Enable all tracing options in configuration.toml (".ci/configure_tracing_for_kata.sh enable"?)
  • Main shell script test (Where should this live?)
    • Run "docker run busybox true"
    • Run the test to ensure Jaeger has collected a number of trace spans.
  • Disable all tracing options in configuration.toml (".ci/configure_tracing_for_kata.sh disable"?)

So we'd always install Jaeger for a CI run, but only actually use it for this single test (which would then undo the config to stop tracing). Alternatively, we could create custom Jenkins jobs and run all CI tests with Jaeger, but I think that's overkill tbh.
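
As a hedged sketch of what the enable/disable step could look like: the function below toggles tracing by commenting or uncommenting option lines. It assumes the tracing options appear in configuration.toml as `enable_tracing = true` lines that carry a leading `#` when disabled; `configure_tracing` is a made-up name, not the script that was eventually written.

```shell
#!/usr/bin/env bash

# Enable or disable every "enable_tracing" option in a config file.
# Assumption: a disabled option looks like "#enable_tracing = true" and an
# enabled one like "enable_tracing = true".
configure_tracing() {
    local action="$1" config="$2" script tmp status=0

    case "$action" in
        # Strip the leading comment marker.
        enable)  script='s/^[[:space:]]*#[[:space:]]*(enable_tracing[[:space:]]*=)/\1/' ;;
        # Re-add the comment marker, touching only lines that are active.
        disable) script='s/^[[:space:]]*(enable_tracing[[:space:]]*=)/#\1/' ;;
        *)
            echo "usage: configure_tracing enable|disable <config-file>" >&2
            return 1
            ;;
    esac

    tmp="$(mktemp)" || return 1
    # Edit into a temporary file, then copy the result over the original so
    # the original file's permissions and ownership are kept.
    sed -E "$script" "$config" > "$tmp" && cat "$tmp" > "$config" || status=$?
    rm -f "$tmp"
    return "$status"
}
```

Both directions are idempotent, so a CI run that calls `configure_tracing disable <config-file>` twice leaves the file unchanged the second time.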

@chavafg chavafg self-assigned this Mar 4, 2019
@chavafg
Contributor

chavafg commented Mar 4, 2019

Hi @jodh-intel, seems like your approach should work. I'll check this.

@chavafg
Contributor

chavafg commented Mar 4, 2019

Hi @jodh-intel,
I tried to collect some traces but I got the following issue:
BTW, I am using: kata-containers/runtime#1317

[fuentess@centos-prll kata-containers]$ sudo docker run -ti busybox true
docker: Error response from daemon: OCI runtime create failed: Shim tracing requires disable_new_netns for Jaeger agent communication: unknown.

I then set disable_new_netns = true and got:

[fuentess@centos-prll kata-containers]$ sudo docker run -ti busybox true
docker: Error response from daemon: OCI runtime create failed: config disable_new_netns only works with 'none' internetworking_model: unknown.

I then changed the internetworking model to none and got the following error:

[fuentess@centos-prll kata-containers]$ sudo docker run -ti busybox true
docker: Error response from daemon: OCI runtime create failed: exit status 1: stdout: , stderr: time="2019-03-04T20:47:29Z" level=fatal msg="failed to add interface veth6d874f2 to sandbox: error renaming interface \"veth6d874f2\" to \"eth0\": file exists": unknown.
[fuentess@centos-prll kata-containers]$ sudo docker run -ti busybox true
docker: Error response from daemon: OCI runtime create failed: exit status 1: stdout: , stderr: time="2019-03-04T20:48:11Z" level=fatal msg="failed to add interface veth0ac47c9 to sandbox: error renaming interface \"veth0ac47c9\" to \"eth0\": file exists": unknown.

Any idea if I am doing something wrong?

@jodh-intel
Contributor Author

Hi @chavafg - oh thanks! I was happy to add this test, but not a problem if you want to ;)

I've just checked and it looks like shim tracing is broken now too ;(( I've raised kata-containers/shim#148 for that.

As such, I'd just enable tracing for the runtime for now to gather some traces. I'll clean up my scripts to check for traces programmatically and hand them over later...

@jodh-intel
Contributor Author

Hi @chavafg - here's a basic script that performs some basic Jaeger tests. It might need a bit of refactoring (had to add .png to allow the file to be attached :-)

ci-test-jaeger-tracing sh

That's a cut-down version of the script I've got to test agent tracing.

A few other points:

  • It should probably be using set -e.
  • It assumes debug and tracing is enabled for the runtime.
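
For the set -e point, a minimal sketch of one way to combine strict error handling with guaranteed cleanup, so that tracing is switched off again even when the test fails part-way. `run_tracing_test` is a hypothetical helper, not part of the attached script.

```shell
#!/usr/bin/env bash

# Abort on the first failing command, on unset variables, and on any failure
# inside a pipeline.
set -o errexit -o nounset -o pipefail

# Hypothetical helper: run a test command and always run the cleanup command
# afterwards, even if the test command fails.
#   $1     cleanup command line (a string, evaluated when the subshell exits)
#   $2...  test command and its arguments
run_tracing_test() {
    local cleanup_cmd="$1"
    shift
    (
        # The EXIT trap fires when this subshell ends, whatever the outcome.
        trap "$cleanup_cmd" EXIT
        "$@"
    )
}
```

Because of errexit, a bare call to `run_tracing_test` with a failing test still aborts the script, but only after the cleanup command has run.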

@chavafg
Contributor

chavafg commented Mar 5, 2019

Thanks @jodh-intel

@jodh-intel
Contributor Author

@chavafg - I keep forgetting that you need to run docker with --net=none when using the none internetworking model. It's a pita but that information isn't encoded into the config.json so we cannot produce a sensible error message and just fall over when tweaking network stuff. I've just double-checked and shim tracing works fine with that final piece of the puzzle in place ;)
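
Putting the chain of requirements from this thread together (shim tracing needs disable_new_netns = true, which needs the none internetworking model, which needs docker's --net=none), a pre-flight check could be sketched roughly as below. It assumes both settings appear as plain `key = value` lines in configuration.toml; `config_has` and `check_shim_tracing_config` are made-up helper names.

```shell
#!/usr/bin/env bash

# Return success if an uncommented "key = value" line exists in the file.
# Whitespace around "=" is optional.
config_has() {
    local config="$1" key="$2" value="$3"
    grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*${value}" "$config"
}

# Check the settings that shim tracing depends on, based on the error
# messages quoted earlier in this thread. Reports every problem found.
check_shim_tracing_config() {
    local config="$1" failed=0
    if ! config_has "$config" disable_new_netns true; then
        echo "shim tracing requires disable_new_netns = true" >&2
        failed=1
    fi
    if ! config_has "$config" internetworking_model '"none"'; then
        echo 'disable_new_netns requires internetworking_model = "none"' >&2
        failed=1
    fi
    return "$failed"
}
```

Usage would then be along the lines of `check_shim_tracing_config "$config" && sudo docker run --net=none busybox true`, with `$config` pointing at whichever configuration.toml the runtime actually reads.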

@chavafg
Contributor

chavafg commented Mar 6, 2019

oh, good to know, thanks @jodh-intel. I'll work on this today.

chavafg added a commit to chavafg/tests-1 that referenced this issue Mar 7, 2019
This tracing test runs as follows:
1. Enables tracing on runtime and shim components.
2. Runs a Jaeger container that will receive the trace spans.
3. A kata-container is executed.
4. Verifies that Jaeger has collected a number of spans.
5. Disables tracing.

Fixes: kata-containers#1255.

Signed-off-by: Salvador Fuentes <[email protected]>
chavafg added a commit to chavafg/tests-1 that referenced this issue Mar 7, 2019
chavafg added a commit to chavafg/tests-1 that referenced this issue Mar 8, 2019
chavafg added a commit to chavafg/tests-1 that referenced this issue Mar 8, 2019
chavafg added a commit to chavafg/tests-1 that referenced this issue Mar 8, 2019
chavafg added a commit to chavafg/tests-1 that referenced this issue Mar 11, 2019