Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSV output: vu and iter system tags not output after v0.40.0 #2827

Closed
leonyork opened this issue Dec 20, 2022 · 10 comments
Closed

CSV output: vu and iter system tags not output after v0.40.0 #2827

leonyork opened this issue Dec 20, 2022 · 10 comments

Comments

@leonyork
Copy link
Contributor

Brief summary

When upgrading from version 0.40.0 to 0.41.0 the system tags vu and iter are no longer populated in outputs.

This happens whether the --system-tags command line argument is used or the tags are specified in the test javascript options.

I found this when trying to upgrade my own xk6-timestream-output to K6 v0.41.0/v0.42.0. I've replicated the same issue with the CSV output.

k6 version

0.41.0

OS

macOS

Docker version and image (if applicable)

loadimpact/k6:0.41.0

Steps to reproduce the problem

Replication here are for the CSV output.

  1. Clone the repo demonstrating the issue and cd into the folder.
  2. Run make. This will run the same test with v0.40.0 and v0.41.0 of K6, then diff the vu and iter columns of the outputted CSV.

The issue is not fixed in v0.42.0 - run make on the 0.42.0 branch.

The issue also occurs when using cli arguments rather than specifying the system tags in the javascript options - run make on the cli-arg branch.

Expected behaviour

The contents of the vu and iter columns of the CSV output by versions 0.40.0 and 0.41.0 should match.

Actual behaviour

In the CSV output by K6 v0.41.0, the vu and iter columns are empty. This does not match the values correctly output by K6 v0.40.0.

@leonyork leonyork added the bug label Dec 20, 2022
@leonyork
Copy link
Contributor Author

leonyork commented Dec 20, 2022

Just read the release notes on this:

The vu and iter system tags ... various outputs may emit them differently or ignore them completely

It does look like the CSV output should include the vu and iter tags in the metadata column though. However this column is empty (if you look at the raw CSVs generated from running make in the repo demonstrating the issue).

(Also I've managed to "fix" my Timestream output by including the metadata in the output)

@na--
Copy link
Member

na-- commented Dec 21, 2022

Yeah, there was a breaking change in k6 v0.41.0 and vu and iter are not normal tags anymore. Instead, they are non-indexed high-cardinality metadata, so that part is expected.

However, you are also right about the fact that the csv output seems to have a bug and doesn't emit them even when it should 😞 Thank you for reporting this! 🙇

@na-- na-- added the high prio label Dec 21, 2022
@na-- na-- added this to the v0.43.0 milestone Dec 21, 2022
@na--
Copy link
Member

na-- commented Dec 21, 2022

The problem is actually even worse, not only doesn't the csv output include vu and iter, it doesn't include them twice 😅 😭

A simple example that demonstrates the issue is running k6 run --out csv --system-tags url,vu,iter github.com/grafana/k6/samples/http_get.js, the result of cat file.csv | column -t -s, will be something like this:

metric_name               timestamp   metric_value  iter  url                      vu  extra_tags  metadata
http_reqs                 1671608269  1.000000            https://test-api.k6.io/                  
http_req_duration         1671608269  137.798250          https://test-api.k6.io/                  
http_req_blocked          1671608269  375.694105          https://test-api.k6.io/                  
http_req_connecting       1671608269  146.392095          https://test-api.k6.io/                  
http_req_tls_handshaking  1671608269  206.278935          https://test-api.k6.io/                  
http_req_sending          1671608269  0.042768            https://test-api.k6.io/                  
http_req_waiting          1671608269  137.650450          https://test-api.k6.io/                  
http_req_receiving        1671608269  0.105032            https://test-api.k6.io/                  
http_req_failed           1671608269  0.000000            https://test-api.k6.io/                  
data_sent                 1671608269  446.000000                                                   
data_received             1671608269  21364.000000                                                 
iteration_duration        1671608269  513.650476                                                   
iterations                1671608269  1.000000 

Notice that not only is the metadata column empty, but the csv output explicitly adds a vu and iter columns, because they were explicitly enabled with --system-tags. However, they are empty too, because they are no longer actual tags... 😞

The problem are the resTags:

k6/output/csv/output.go

Lines 46 to 52 in 074b532

func newOutput(params output.Params) (*Output, error) {
resTags := []string{}
ignoredTags := []string{}
tags := params.ScriptOptions.SystemTags.Map()
for tag, flag := range tags {
if flag {
resTags = append(resTags, tag)

k6/output/csv/output.go

Lines 214 to 217 in 074b532

writeTag := func(tag, val string) bool {
if IsStringInSlice(resTags, tag) || IsStringInSlice(ignoredTags, tag) {
return true // continue
}

The whole output probably needs to be refactored, it's buggy and has quite sub-optimal performance anyway. We should have rewritten it in #2654 / #2727 instead of just adding this TODO:

k6/output/csv/output.go

Lines 205 to 210 in 074b532

// TODO: optimize all of this - do not use tags.Map(), flip resTags, fix the
// for loops, get rid of IsStringInSlice(), etc.
sampleTags := sample.Tags.Map()
for ind, tag := range resTags {
row[ind+3] = sampleTags[tag]
}

@leonyork
Copy link
Contributor Author

Hey!

Thanks for the quick response. Yeah, you're exactly right it's all around that resTags variable - the change in this draft PR manages to get the vu and iter into the extra_tags column, but not the metadata column. It also removes the vu and iter columns that are empty anyway. The results of that change are that you get an output like:

metric_name               timestamp   metric_value  extra_tags   metadata
http_reqs                 1671580418  1.000000      vu=1&iter=0
http_req_duration         1671580418  103.763000    vu=1&iter=0
http_req_blocked          1671580418  221.097000    vu=1&iter=0
http_req_connecting       1671580418  105.392000    vu=1&iter=0
http_req_tls_handshaking  1671580418  113.879000    vu=1&iter=0
http_req_sending          1671580418  0.081000      vu=1&iter=0
http_req_waiting          1671580418  103.586000    vu=1&iter=0
http_req_receiving        1671580418  0.096000      vu=1&iter=0
http_req_failed           1671580418  0.000000      vu=1&iter=0
data_sent                 1671580418  438.000000    iter=0&vu=1
data_received             1671580418  16962.000000  vu=1&iter=0
iteration_duration        1671580418  825.789910    vu=1&iter=0
iterations                1671580418  1.000000      vu=1&iter=0

But it doesn't fully fix the issue and adds more tech debt.

I'm up for taking a look at how we could refactor that output, but:

  1. I can't commit to doing it for the 0.43.0 release
  2. I'm not sure if you'd prefer to tackle it internally at Grafana?
  3. Would like to see some integration tests around it - so would need some help with that :)

I'm gonna update the title of this issue just to make it clear that this is only for the CSV output.

@leonyork leonyork changed the title vu and iter system tags not output after v0.40.0 CSV output: vu and iter system tags not output after v0.40.0 Dec 21, 2022
@codebien codebien self-assigned this Dec 21, 2022
@na--
Copy link
Member

na-- commented Dec 23, 2022

@leonyork, thanks again for finding and reporting the issue, and for the draft attempt to try and fix it!

If you want to continue working on the fix, we'll try to support you. Absolutely no pressure though - if you don't have the time, we will pick this up and get a fix before v0.43.0, which is scheduled to be released in February. Since this was a recent regression, even if it's a relatively minor bug, we decided we'd like to fix it in the next k6 release.

We still don't know if we will just go for a fix or for a complete rewrite of the output ourselves, it depends on the time and complexity, to be decided by whoever picks the issue 😅 But yeah, a few integration tests are a must in either case! We already have some for the json output (e.g. this one), so we should add some for the csv output as well. Though we have an open PR that refactors the integration tests quite a bit (#2821), so probably after it is merged would be simpler.

@leonyork
Copy link
Contributor Author

Hey @na--, turns out the fix does actually work. column on Mac removes empty fields when formatting. So, if there's no value in a field, it'll shift all the fields left by one in the columns to the right...

i.e.

col1,col2,col3
f1.1,f1.2,f1.3
f2.1,,f2.3

becomes

col1  col2  col3
f1.1  f1.2  f1.3
f2.1  f2.3

The results in my previous comment were based on formatting the CSV results with column on Mac.

🤦

Anyway I updated #2829 with the tests I ran using a version of column that doesn't have the "empty field removal" feature.

I'm not sure how close you are with #2821, but a plan of attack could be:

  1. Fix the issue (e.g. what's in fix: csv output for non indexable metrics #2829)
  2. Once Refactor integration tests #2821 is merged, then add some integration tests around the CSV output.
  3. Once the integration tests are in, it'll be a much safer refactor of the CSV output.

@na--
Copy link
Member

na-- commented Jan 3, 2023

Sounds like a great plan of attack ❤️ As you have seen, we are currently in the midst of some heavy refactoring of the integration tests and some other core parts of k6... Most of that should hopefully be wrapped up by some time next week 🤞

So I'd say that it will be safe enough to make a PR with csv output integration tests once #2821, #2800, #2803, and #2810 are merged. The rest of the PR chain that culminates in #2815 should not cause any conflicts, I think 🤔

@leonyork
Copy link
Contributor Author

Excellent! I'll try and keep an eye on those PRs and move mine to ready for review.

@na--
Copy link
Member

na-- commented Jan 12, 2023

We've been somewhat delayed with the integration test refactoring, so I think it's fairly safe to merge your PR without an integration test and add it later 🤔

Besides, as @codebien has commented in #2829 (review), it might even be possible to easily edit the regular unit tests to cover this behavior. So yeah, ignore the integration tests and the PRs I linked in my previous comment for now 😅

@codebien
Copy link
Contributor

This issue has been fixed by #2829, thanks to @leonyork's contribution.

@codebien codebien removed their assignment Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants