
Support filtering of info files? #48

Closed
wsnyder opened this issue May 31, 2020 · 19 comments · Fixed by #49

Comments

@wsnyder
Contributor

wsnyder commented May 31, 2020

I have been using fastcov -X to make a master .info file, and I also use fastcov -C to join infos. Thanks!

In later stages I have been using a mix of lcov and scripts to process that master info, but I'm having several problems with lcov, and rather than rewriting my own filter I would ideally like to use the filtering already in fastcov.

Basically I'm looking to read the info, process source exclusions, and write an info, something like

fastcov.py -f totaled_coverage.info -c sources --exclude /usr -o filtered.info

But as far as I can tell from peeking at the sources, there's no way to read an info except with -C, and -C doesn't do source processing. I tried replacing most of the guts of processGcdas with:

    for filename in coverage_files:
        fastcov_jsons.append(parseInfo(filename))

but this hack fails, and the error suggests that parseInfo doesn't produce quite the same internal format as what gcovWorker creates. (If that had worked, I was eventually going to make this conditional on a filename ending in .info: do the parseInfo, then strip that filename from the list of files for gcov processing.)

I would also be willing to use json format instead of info format if that helps, but I don't think that works either.

The naive way I thought this worked (until I looked at the sources) is that you could feed in any number of gcov, info, or json files; they get gcov'ed (as appropriate), joined (any format), filtered (unless -X), and finally written as .info or .json.

Is there a way to do what I want, or perhaps you would be willing to add it? Thanks

@RPGillespie6
Owner

the error suggests that parseInfo doesn't have quite the same internal format as what gcovWorker creates

Yes, you are correct. It's a little confusing, but there are actually 2 json formats - gcov json and fastcov json. Gcov json is the json that comes directly from gcov. The coverage structure is array-based, meaning performing searches/deletions on files/functions/lines/branches requires iterating over the array to find what you are looking for (or binary search, assuming you sort the arrays first). I wanted to avoid all that complexity, so for that reason I convert it as soon as possible to "fastcov json" which is very similar to gcov json, except it is dictionary-based instead of array-based. This makes the resulting python much simpler - I can check if a source/function/line/branch exists, or delete a source/function/line/branch, etc. without iterating over a list since they are now essentially keys in a map.
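Roughly, the difference between the two shapes looks like this (field names are simplified for illustration, not the exact gcov or fastcov schemas):

```python
# gcov-style json keeps files in an array, so lookup requires a linear scan.
gcov_style = {
    "files": [
        {"file": "src/a.cpp", "lines": [{"line_number": 10, "count": 3}]},
        {"file": "src/b.cpp", "lines": [{"line_number": 20, "count": 0}]},
    ]
}

# fastcov-style json keys the same data by name, so lookup and deletion
# are direct dictionary operations.
fastcov_style = {
    "sources": {
        "src/a.cpp": {"lines": {10: 3}},
        "src/b.cpp": {"lines": {20: 0}},
    }
}

# Array-based: scan the list to find a particular file.
entry = next(f for f in gcov_style["files"] if f["file"] == "src/b.cpp")

# Dict-based: membership test and deletion without iterating.
if "src/b.cpp" in fastcov_style["sources"]:
    del fastcov_style["sources"]["src/b.cpp"]
```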

So, long story short, filtering is currently performed on gcov json, because there is no point in converting/processing it if the user doesn't even want it in the final report.

However, I agree with you that it would be nice to allow filtering on existing info files. This would not be hard to add. Basically I'll change this:

def combineCoverageFiles(args):
    logging.info("Performing combine operation")
    fastcov_json = parseAndCombine(args.combine)
    dumpFile(fastcov_json, args)

to this:

def combineCoverageFiles(args):
    logging.info("Performing combine operation")
    fastcov_json = parseAndCombine(args.combine)
    applyFilters(fastcov_json, args)
    dumpFile(fastcov_json, args)

I will probably have some time this coming week to implement applyFilters.
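A minimal sketch of what such a filter pass over fastcov json could look like (the function name matches the plan above, but the keyword arguments and logic here are assumptions, not fastcov's actual code):

```python
# Hypothetical applyFilters: drop sources by substring match against
# include/exclude patterns, operating on the dict-based fastcov json.
def applyFilters(fastcov_json, exclude=(), include=()):
    # Snapshot the keys with list() so deletion during the loop is safe.
    for source in list(fastcov_json["sources"].keys()):
        if include and not any(p in source for p in include):
            del fastcov_json["sources"][source]
        elif any(p in source for p in exclude):
            del fastcov_json["sources"][source]

report = {"sources": {"/usr/include/vector": {}, "src/main.cpp": {}}}
applyFilters(report, exclude=["/usr"])
```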

@RPGillespie6
Owner

RPGillespie6 commented May 31, 2020

The naive way I was thinking this worked (until I looked at sources) is you would be able to feed in any number of gcov, info or json, they get gcov'ed (as appropriate), joined (any format), filtered (unless -X), and finally written as .info or .json.

Currently there is no support for .gcov files; however, you can feed in any number of info or (fastcov) json files to a combine operation. It would be very easy to add support for combine operations on gcov json files, but it would be a little more difficult to support .gcov files, since they are text files that would need custom parsing.

Hopefully, after I add filtering support to combine operations, it will work as you expect. You'd be able to do:

$ fastcov -C report1.info report2.info report3.json --exclude /usr/include --lcov -o report_final.info

This would combine the 3 reports, remove coverage for files coming from /usr/include, and then write out the final report.

@wsnyder
Contributor Author

wsnyder commented May 31, 2020

Makes sense. I didn't mean to suggest adding reading .gcov files, I was referring to the current calling of gcov when you find a .gcno/.gcda.

RPGillespie6 added a commit that referenced this issue Jun 3, 2020
Description:
- Fix #48
- Filter options now apply during combine operations
@RPGillespie6
Owner

I have created an initial implementation on the branch bpg/add_combine_filtering. I need to add a few more functional tests to cover these new scenarios before I open a PR, but it should be done soonish.

@wsnyder
Contributor Author

wsnyder commented Jun 3, 2020

Not sure if this was intended to be tested yet, but I gave it a try.

Do I use -f to specify files? That seems most obvious, but I had problems.

$ ls -l app_total.info
... 11287283 ... app_total.info

# There's sources in it
$ fgrep SF: app_total.info
SF:src/V3Active.cpp
...

$ fastcov.py -f app_total.info --exclude /usr -s . -o app_total_f.info
[0.003s] [info]: Found 1 coverage files (user provided)
[0.003s] [info]: Spawned 1 gcov processes, each processing at most 5 coverage files
[0.005s] [info]: Processed 0 .gcov files (0 total, 0 skipped)
{}  // Didn't expect 0 above, this is value of fastcov_json["sources"]
[0.000s] [info]: Spawned 0 threads each scanning at most 5 source files
[0.000s] [info]: Scanned 0 source files for exclusion markers
[0.000s] [info]: Created fastcov json file 'app_total_f.info'

# The info wasn't read.
$ ls -l app_total_f.info
... 15 ... app_total_f.info

# Or, am I supposed to use -C?
$ fastcov.py -C app_total.info --exclude /usr -s . -o app_total_f.info
[0.002s] [info]: Performing combine operation
[0.180s] [info]: Setting app_total.info as base report
Traceback (most recent call last):
  File "fastcov.py", line 695, in <module>
    main()
  File "fastcov.py", line 655, in main
    combineCoverageFiles(args)
  File "fastcov.py", line 557, in combineCoverageFiles
    filterFastcov(fastcov_json, args)
  File "fastcov.py", line 189, in filterFastcov
    for source in fastcov_json["sources"].keys():
RuntimeError: dictionary changed size during iteration

@RPGillespie6
Owner

RPGillespie6 commented Jun 3, 2020

Oops, sorry, I just let you know the branch name so you can keep an eye on the changes/progress if you are interested (feel free to test as well, but it might not be stable). It's intended to be used with -C:

$ fastcov.py -C app_total.info --exclude /usr -s . -o app_total_f.info

Should be a valid command. In this case the issue is:

for source in fastcov_json["sources"].keys():

should be

for source in list(fastcov_json["sources"].keys()):
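A minimal reproduction of why the list() snapshot is needed: in CPython, deleting entries while iterating a dict's live keys view raises RuntimeError, which is exactly the traceback above.

```python
# Deleting from a dict while iterating its live keys view fails.
sources = {"src/a.cpp": {}, "/usr/include/vector": {}}
error = ""
try:
    for source in sources.keys():
        if source.startswith("/usr"):
            del sources[source]
except RuntimeError as e:
    error = str(e)  # "dictionary changed size during iteration"

# Snapshotting the keys first makes deletion during the loop safe.
sources = {"src/a.cpp": {}, "/usr/include/vector": {}}
for source in list(sources.keys()):
    if source.startswith("/usr"):
        del sources[source]
```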

I haven't worked out all the kinks yet; I'll open a PR when I think it is ready.

@RPGillespie6
Owner

RPGillespie6 commented Jun 3, 2020

-f is for specifying which .gcda files to process (those are the files generated when you run unit tests; by default fastcov does a recursive glob to discover all of them). Maybe that's counter-intuitive; I'm not sure what -f does under lcov...

RPGillespie6 added a commit that referenced this issue Jun 3, 2020
Description:
- Fix #48
- Filter options now apply during combine operations
RPGillespie6 added a commit that referenced this issue Jun 4, 2020
Description:
- Fix #48
- Filter options now apply during combine operations
@RPGillespie6
Owner

RPGillespie6 commented Jun 4, 2020

@wsnyder Can you confirm the latest fastcov.py on master resolves the issue? I've tested with --exclude, --include, and --source-files filters. If so I will tag v1.7, otherwise I will re-open this issue.

Run with, e.g., $ fastcov.py -C app_total.info --exclude /usr -s . -o app_total_f.info

@wsnyder
Contributor Author

wsnyder commented Jun 4, 2020

--include seems to work as expected.
--exclude seems to work as expected.

# Here's a source in my info
$ fgrep SF: app_total.info
SF:src/V3Active.cpp
...
# And the source exists with exclusions
$ grep EXCL src/V3Active.cpp
 UINFO(2, __FUNCTION__ << ": " << endl);  //code_coverage: // LCOV_EXCL_LINE LCOV_EXCL_BR_LINE
# Let's filter
$ fastcov.py -C app_total.info -s src/V3AstNodes.cpp -lcov -o app_total_f.info
[0.002s] [info]: Performing combine operation
[0.163s] [info]: Setting app_total.info as base report
[0.006s] [info]: Created fastcov json file 'coverage.json'
# But app_total_f.info is empty?

I made patches, which I'll put in a pull request in a moment, that I humbly suggest make the flow closer to what I expected.

The remaining problem after applying that fix is that I have:

fastcov.py -C app_total.info  --exclude '/usr/*' --exclude '*/verilog.c' -s . -o app_total_f.info
[0.002s] [info]: Performing combine operation
[0.171s] [info]: Setting app_total.info as base report
Exception in thread Thread-19:
[0.092s] [info]: Spawned 30 threads each scanning at most 8 source files
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/svaha/wsnyder/SandBox/homecvs/v4/verilator/nodist/fastcov.py", line 280, in exclMarkerWorker
    for i, line in enumerate(getSourceLines(source, fallback_encodings), 1):
  File "/svaha/wsnyder/SandBox/homecvs/v4/verilator/nodist/fastcov.py", line 266, in getSourceLines
    with open(source, encoding=encoding) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'src/obj_dbg/verilog.c'
[0.022s] [info]: Scanned 234 source files for exclusion markers
[0.252s] [info]: Created fastcov json file 'app_total_f.info'

Note that I asked for "*/verilog.c" to be excluded, but then I get an error on it. Trying to exclude src/obj_dbg/verilog.c still doesn't work.

@wsnyder
Contributor Author

wsnyder commented Jun 4, 2020

Here are the three issues I see, hopefully all minor, as this seems close!

  • Posted Patch for filtering #50 with the code that I think fixes EXCL parsing.
  • As described in Patch for filtering #50 you need to decide how --skip-exclusion-markers/--exclusion-markers should work with -C if you care about backwards compatibility (maybe doesn't matter?).
  • The */verilog.c issue mentioned above.

@wsnyder
Contributor Author

wsnyder commented Jun 4, 2020

I realize I lost a filterFastcov call in there. It could go back into combineCoverageFiles, or maybe it's better in main? I'll leave it to your preference.

@RPGillespie6
Owner

RPGillespie6 commented Jun 4, 2020

A few things: fastcov does not currently use regexes for filtering, just substring matching. So this would likely work:

fastcov.py -C app_total.info  --exclude /usr/ /verilog.c -s . -o app_total_f.info

(Note that because it is substring matching, you need to provide enough of the path to prevent false positives. So if you have a path .../src/usr/../file.c, then /usr/ would match, but if you passed --exclude /usr/include/ it would not.)
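The substring matching described above can be sketched in a few lines (the helper name is hypothetical, not fastcov's actual function):

```python
# A source is excluded if any pattern appears as a substring of its path.
def is_excluded(source, patterns):
    return any(p in source for p in patterns)

# "/verilog.c" as a plain substring matches with no glob syntax needed.
assert is_excluded("src/obj_dbg/verilog.c", ["/verilog.c"])
# "/usr/" matches anywhere in the path, including the false-positive case...
assert is_excluded("project/src/usr/lib/file.c", ["/usr/"])
# ...which a longer, more specific pattern avoids.
assert not is_excluded("project/src/usr/lib/file.c", ["/usr/include/"])
```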

If I understand your PR correctly, you want fastcov -C to simply switch the source of coverage (gcov vs. import .info/.json), but still apply the rest of the fastcov pipeline (filtering, EXCL scanning, etc)?

@RPGillespie6
Owner

I don't really use the combine operation feature, so I'm not a typical user, but if you think it is more intuitive that way, I can certainly make that change.

I should also probably update the README so it is clearer what the fastcov pipeline is and how -C alters it.

RPGillespie6 reopened this Jun 4, 2020
@RPGillespie6
Owner

Okay, I've made a new branch bpg/fastcov_flow. This should address everything in the PR. It passes the test fixture; however, I still need to update the README before I open a PR.

Feel free to test.

@wsnyder
Contributor Author

wsnyder commented Jun 4, 2020

Looks like it's working, and EXCLs are also properly handled, unlike lcov. Much thanks.
The only catch is that I have lower coverage than I got under the lcov flow, but it's probably something in my script changes.

@wsnyder
Contributor Author

wsnyder commented Jun 4, 2020

P.S. I would suggest that instead of throwing an exception when a source file isn't found, fastcov just print a "normal" warning without a backtrace. I think that's a better user experience; a backtrace suggests your code is broken, which it isn't.

@RPGillespie6
Owner

RPGillespie6 commented Jun 4, 2020

Was it branch coverage that dropped, or line coverage? If the former, it could be because you are passing --branch-coverage (attempts to auto-remove "useless" branches) instead of --exceptional-branch-coverage (includes all branches).

If it is line coverage, please determine if there is any loss of correctness. fastcov does not (or rather, should not) alter the line coverage it gets from gcov (other than with EXCL comments). If you can demonstrate a loss of correctness case, I will definitely fix it (and more importantly, update the test fixture to catch future regressions).

And yes, fastcov is very light on try/except; usually I just let the error bubble out if fastcov hits an exceptional situation. I can make fastcov print a warning if an EXCL source file isn't found instead of letting the error bubble out.
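A sketch of the suggested behavior (an assumed shape, not fastcov's actual exclMarkerWorker): warn and skip a missing source file instead of letting FileNotFoundError escape as a traceback.

```python
import logging

def scan_sources(sources):
    """Return the sources that could be opened, warning on missing files."""
    scanned = []
    for source in sources:
        try:
            with open(source, encoding="utf-8") as f:
                f.read()  # a real scanner would look for LCOV_EXCL_* markers here
            scanned.append(source)
        except FileNotFoundError:
            # A warning instead of a traceback: the tool isn't broken,
            # the input just references a file that no longer exists.
            logging.warning("Skipping exclusion scan, file not found: %s", source)
    return scanned
```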

@wsnyder
Contributor Author

wsnyder commented Jun 4, 2020

Fully working. I had a bug in my script computing the coverage delta; all resolved.

@RPGillespie6
Owner

Closing this issue. I created a new issue #53 to track the exclusion warnings enhancement.
