
Add performance tracking machinery #1509

Open

serramatutu wants to merge 2 commits into main
Conversation

@serramatutu (Contributor) commented Nov 5, 2024

Summary

This PR introduces some new testing functionality that we can use to avoid performance regressions. From a user perspective, the performance benchmarking workflow is:

  1. Add a new function you want to start tracking to the list in performance_helpers.py.
  2. Create a new test in tests_metricflow/performance/ that exercises something whose performance we want to track (see the sketch after this list). You can use the following fixtures:
  • measure_compilation_performance: measures the compilation of some SQL based on a query spec and a manifest.
  • perf_tracker: a lower-level fixture that you can use to create measurements and sessions (more on that later).
  • ...: we can create more performance-tracking fixtures as needed, all built on the perf_tracker base.
  3. Run make perf to generate a performance-report.json.
  4. Run make perf-compare A=base-report.json B=other-report.json to generate a Markdown file with a report on what changed.
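For illustration, a new performance test might look roughly like the sketch below. The fixture names come from this PR, but the call signature and the `simple_query_spec`/`simple_manifest` fixtures are placeholders I am assuming for the example, not actual fixtures from the repo.

```python
# tests_metricflow/performance/test_simple_manifest.py -- illustrative sketch only


def test_simple_query(
    measure_compilation_performance,  # fixture added in this PR
    simple_query_spec,  # placeholder: a query spec fixture
    simple_manifest,  # placeholder: a semantic manifest fixture
) -> None:
    # The session id becomes "test_simple_manifest.py::test_simple_query", and the
    # tracked methods (e.g. DataflowPlanBuilder.build_plan) record CPU and wall time
    # while the query compiles.
    measure_compilation_performance(query_spec=simple_query_spec, manifest=simple_manifest)
```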

The performance report

The performance-report.json file contains the "raw" data for a given run. It looks like this:

```json
{
    "sessions": {
        "test_simple_manifest.py::test_simple_query": {
            "session_id": "test_simple_manifest.py::test_simple_query",
            "contexts": {
                "DataflowPlanBuilder.build_plan": {
                    "context_id": "DataflowPlanBuilder.build_plan",
                    "cpu_ns_average": 5089000,
                    "cpu_ns_median": 5089000,
                    "cpu_ns_max": 5089000,
                    "wall_ns_average": 5204000,
                    "wall_ns_median": 5204000,
                    "wall_ns_max": 5204000
                },
                "global": {
                    "context_id": "global",
                    "cpu_ns_average": 10156000,
                    "cpu_ns_median": 10156000,
                    "cpu_ns_max": 10156000,
                    "wall_ns_average": 10343000,
                    "wall_ns_median": 10343000,
                    "wall_ns_max": 10343000
                }
            }
        },
        ...
    }
}
```

Each "report" contains a list of "sessions". A session is a run where multiple measurements have been taken. Each session contains a variety of "contexts", which are aggregations (avg, mean, max) of CPU and wall time measurements based on a context_id such as DataflowPlanBuilder.build_plan. We can add memory tracking later using psutil.

Report comparison

I created a script called compare_reports.py which takes two report files and calculates the difference between them (in absolute values and as a percentage). It then generates a Markdown file called performance-comparison.md with a nice little table like the one in my first comment.
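The core of the comparison is simple arithmetic. A minimal sketch of the absolute/percent difference for a single metric is shown below; the class and field names here are assumptions for illustration, not the PR's actual Comparison API.

```python
# Illustrative sketch only: absolute and percent difference between two report values.
from dataclasses import dataclass


@dataclass
class MetricComparison:
    absolute_ns: int
    percent: float


def compare_metric(base_ns: int, other_ns: int) -> MetricComparison:
    """Negative values mean `other` is faster than `base`."""
    absolute = other_ns - base_ns
    percent = (absolute / base_ns) * 100 if base_ns else float("inf")
    return MetricComparison(absolute_ns=absolute, percent=percent)


# Example (arbitrary numbers):
# compare_metric(base_ns=5_204_000, other_ns=4_900_000)
# -> MetricComparison(absolute_ns=-304000, percent=-5.84...)
```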

How it works

When you run make perf, the perf_tracker fixture uses the track_performance context manager, which:

  • Instantiates a PerformanceTracker
  • Monkeypatches every class method listed in _tracking_class_methods, as well as any references to those methods that other modules have already imported. I used unittest.mock.patch for this.
  • Yields the performance tracker

All the tests then run, and at the end the fixture compiles the reports by calling get_report_set(). This generates the ContextReports, SessionReports and SessionReportSet from the raw list of Calls to the tracked methods.

After that, it writes the report to the JSON file.
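As a minimal sketch of the patching mechanism described in the list above, a single method could be wrapped with timing instrumentation via unittest.mock.patch.object as shown below. This is not the PR's actual code: the real track_performance patches a whole list of methods (_tracking_class_methods), including already-imported references, and records calls on a PerformanceTracker with sessions and contexts.

```python
import time
from contextlib import contextmanager
from unittest import mock


@contextmanager
def track_method(cls, method_name, calls):
    """Patch cls.method_name so every call appends (cpu_ns, wall_ns) to `calls`."""
    original = getattr(cls, method_name)

    def wrapper(*args, **kwargs):
        start_wall = time.time_ns()
        start_cpu = time.process_time_ns()
        try:
            return original(*args, **kwargs)
        finally:
            calls.append(
                (time.process_time_ns() - start_cpu, time.time_ns() - start_wall)
            )

    # Replace the method for the duration of the context, then restore it.
    with mock.patch.object(cls, method_name, wrapper):
        yield
```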

Then, when you run make perf-compare, the script loads both JSON files and calls the compare() methods on each of the report classes. These generate Comparison objects containing the percent and absolute differences. The script then renders them to Markdown and writes the result to a file.

Next steps

Make a GitHub Actions bot that runs this on PRs to compare performance against main.

This commit adds the basic functionality needed for measuring performance. This includes:
- classes to measure performance in different sessions using context managers
- a context manager that can be used to instrument certain methods with performance tracking machinery
- test fixtures for automatically generating reports using these classes

This commit adds `Comparison` classes which contain comparisons between performance measurements taken at two different points in time. It uses these classes in a new script that can be called to generate a Markdown report of what changed regarding performance.
github-actions bot commented Nov 5, 2024

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

@serramatutu (Contributor, Author) commented:

Performance comparison

Comparing performance-report.json against performance-report-2.json

Worst performance hit: -3.93% in test_simple_manifest.py::test_simple_query

test_simple_manifest.py::test_simple_query_2 🚀

| context | CPU avg | CPU median | CPU max | Wall avg | Wall median | Wall max |
| --- | --- | --- | --- | --- | --- | --- |
| DataflowPlanBuilder.build_plan | -0.0318ms (-7.27%) | -0.0318ms (-7.27%) | -0.0318ms (-7.27%) | -0.0383ms (-8.76%) | -0.0383ms (-8.76%) | -0.0383ms (-8.76%) |
| global | -0.0583ms (-5.97%) | -0.0583ms (-5.97%) | -0.0583ms (-5.97%) | -0.0642ms (-6.56%) | -0.0642ms (-6.56%) | -0.0642ms (-6.56%) |

test_simple_manifest.py::test_simple_query 🚀

| context | CPU avg | CPU median | CPU max | Wall avg | Wall median | Wall max |
| --- | --- | --- | --- | --- | --- | --- |
| DataflowPlanBuilder.build_plan | -0.0377ms (-7.41%) | -0.0377ms (-7.41%) | -0.0377ms (-7.41%) | -0.0281ms (-5.40%) | -0.0281ms (-5.40%) | -0.0281ms (-5.40%) |
| global | -0.0572ms (-5.63%) | -0.0572ms (-5.63%) | -0.0572ms (-5.63%) | -0.0406ms (-3.93%) | -0.0406ms (-3.93%) | -0.0406ms (-3.93%) |

```python
start_wall = time.time_ns()
start_cpu = time.process_time_ns()

yield self
```
A Contributor commented on these lines:

High-level question - what's the advantage of doing this ourselves instead of using the output of cProfile?

@serramatutu (Contributor, Author) replied:

Honestly I just didn't really know how cProfile worked and this looked easier. Happy to change how PerformanceTracker works to use cProfile under the hood. We can reuse the thing that compares reports though.
