
Use process_time instead of just time for measuring test performance. #1413

Closed

Conversation

mergezalot
Contributor

I hope this removes a handful of random test failures on supposedly busy cloud runners. At least it is worth a try.

`thread_time` (https://docs.python.org/3/library/time.html#time.thread_time) could help with parallel test runs, too, but it is only available since Python 3.7.

`process_time` is available since Python 3.3.
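
A minimal sketch of the idea (the workload and threshold are illustrative placeholders, not the actual PyPDF2 test):

```python
import time

def test_runs_fast_enough():
    # time.time() measures wall-clock time, so a busy CI runner that keeps
    # scheduling other work inflates the measurement. time.process_time()
    # counts only the CPU time of this process, which is much less
    # sensitive to a loaded machine.
    start = time.process_time()
    sum(i * i for i in range(100_000))  # stand-in for the real parsing work
    elapsed = time.process_time() - start
    assert elapsed < 1.0  # illustrative threshold
```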

@codecov

codecov bot commented Oct 30, 2022

Codecov Report

Base: 94.19% // Head: 94.19% // No change to project coverage 👍

Coverage data is based on head (e73032e) compared to base (613b370).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1413   +/-   ##
=======================================
  Coverage   94.19%   94.19%           
=======================================
  Files          28       28           
  Lines        5085     5085           
  Branches      968      968           
=======================================
  Hits         4790     4790           
  Misses        176      176           
  Partials      119      119           
| Impacted Files | Coverage Δ |
| --- | --- |
| PyPDF2/_page.py | 92.16% <100.00%> (ø) |


@pubpub-zz
Collaborator

This is a just-in-time fix!
I faced the issue about an hour ago when pushing my PR; I will use your solution 😍

pubpub-zz added a commit to pubpub-zz/pypdf that referenced this pull request Oct 30, 2022
@pubpub-zz
Collaborator

@mergezalot,
I've tried your solution, but it is still failing on my PR under Python 3.11. Any ideas?

@mergezalot
Contributor Author

mergezalot commented Oct 31, 2022

@pubpub-zz educated guess: either parallel tests, super-busy runners, or low-performance runners.

I have tried a different approach now:
use `thread_time` on Python >= 3.7 and skip the test otherwise. If that still fails, I suggest we drop my test. It is not that likely to catch future problems, and it has many false positives.
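
A rough sketch of that fallback (the test body and threshold are placeholders, not the actual patch):

```python
import time

import pytest

@pytest.mark.skipif(
    not hasattr(time, "thread_time"),  # time.thread_time exists since Python 3.7
    reason="time.thread_time requires Python 3.7+",
)
def test_runs_fast_enough():
    # thread_time counts only the CPU time of the current thread, so other
    # test workers running in parallel do not inflate the measurement.
    start = time.thread_time()
    sum(i * i for i in range(100_000))  # stand-in for the real parsing work
    assert time.thread_time() - start < 1.0  # illustrative threshold
```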

MartinThoma added a commit that referenced this pull request Nov 1, 2022
The test before was too brittle. We need to keep an eye on the
benchmarks in the future, but also be careful when interpreting the
numbers.

Credits to mergezalot in PR #1413
MartinThoma added a commit that referenced this pull request Nov 1, 2022
@MartinThoma
Member

I've added this test as a benchmark: https://py-pdf.github.io/PyPDF2/dev/bench/

In the following test you can see that the performance is very different from run to run:

[screenshot: benchmark timings varying between runs]

You will get one datapoint for every future commit in main:

[screenshot: benchmark chart, one data point per commit on main]

@MartinThoma
Member

@mergezalot Thank you for your work on this topic. `time.process_time` was new to me 🙏

However, I think timing is only suitable for distinguishing orders of magnitude, e.g. as for `test_do_not_get_stuck_on_large_files_without_start_xref` introduced in #808. There the difference was more than 5 minutes for the old solution vs. less than a second for the new one. @dsk7 chose to make the test fail if the time exceeds one minute; that has worked very well so far.
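
Roughly the shape of such an order-of-magnitude check (the workload and limit are illustrative; see #808 for the real test):

```python
import time

def test_does_not_get_stuck():
    # Wall-clock time is fine here: a regression takes several minutes,
    # while the healthy path takes well under a second, so even a heavily
    # loaded runner stays far below the one-minute limit.
    start = time.time()
    sum(i * i for i in range(100_000))  # stand-in for parsing the large file
    assert time.time() - start < 60
```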

@MartinThoma
Member

Would it be ok to close this PR?

@mergezalot
Contributor Author

@MartinThoma yes, let's close this PR. Benchmarking is much better. Thank you.

@mergezalot mergezalot closed this Nov 1, 2022