
Perf checker #1840

Closed
wants to merge 3 commits into from

Conversation

@beroy (Collaborator) commented Oct 31, 2023

Early version of the profiler.

Changes: #1607

Notes for Reviewer:

@beroy beroy force-pushed the perf_checker branch 2 times, most recently from 36ffb65 to d673b9f Compare November 1, 2023 20:13
@codecov-commenter commented Nov 1, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅


see 75 files with indirect coverage changes


@beroy beroy requested review from atolopko-czi and ebezzi November 2, 2023 17:38
import tiledbsoma as soma

census_S3_latest = dict(census_version="latest")
census_local_copy = dict(uri="/Users/brobatmili/projects/census_data/")
@atolopko-czi (Member) commented Nov 3, 2023

Remove the user path; this will require accessing the data on S3. Ideally, to eliminate the impact of network variability on profiling, the census would be copied locally first, but I don't think the host will have sufficient disk space, and it's not worth copying the whole census for the single eye-tissue-based test. So S3 access is probably best for now, but network variability should be handled by averaging a few runs (maybe that could be built into the profiler as a feature).
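A minimal sketch of how that run-averaging could be built into the profiler, assuming a `timed_average` helper (a hypothetical name) wrapped around each profiled query:

```python
import statistics
from time import perf_counter


def timed_average(fn, runs=3):
    """Run fn several times and return the mean wall-clock seconds.

    Averaging several runs smooths out per-run network variability
    when the census is read from S3 instead of a local copy.
    """
    samples = []
    for _ in range(runs):
        start = perf_counter()
        fn()
        samples.append(perf_counter() - start)
    return statistics.mean(samples)


# Stand-in workload; a real profiler run would time the census query.
mean_sec = timed_average(lambda: sum(range(100_000)), runs=3)
```
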

@atolopko-czi (Member) commented Nov 3, 2023

Since query times can change across census release versions, due to ever-increasing data sizes and schema changes, all profiling comparisons need to be made against a single census data release. The data release should therefore be fixed, and updated on a reasonable schedule, such as when a new LTS data release becomes available.
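For example, the open parameters at the top of the script could pin a concrete release tag instead of "latest" (the tag below is an illustrative placeholder, not a known release date):

```python
# Pinned census release so every profiling run compares against identical
# data; "2023-07-25" is a placeholder tag, to be replaced with a real LTS.
census_S3_pinned = dict(census_version="2023-07-25")
```
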

- "main"

paths:
- ".github/workflows/profiler"
@atolopko-czi (Member):

Was this intended to be indented under `on:`?
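If the intent was for the branch and path filters to sit under the push trigger, the block would read something like this (illustrative; the full workflow file isn't shown in the diff):

```yaml
on:
  push:
    branches:
      - "main"
    paths:
      - ".github/workflows/profiler"
```
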


def main():
t1 = perf_counter()
with cellxgene_census.open_soma(**census_local_copy) as census:
@atolopko-czi (Member):

I don't think we should use the Census API here. We can do the equivalent with just the TileDBSOMA API.
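A sketch of the TileDBSOMA-only equivalent: `tiledbsoma.open` reads any SOMA object by URI, so the census-specific wrapper isn't needed. The import is kept lazy so the sketch stands alone:

```python
def open_census_readonly(uri):
    """Open a SOMA store read-only using the TileDBSOMA API directly,
    instead of going through cellxgene_census.open_soma."""
    import tiledbsoma  # lazy import; requires the tiledbsoma package

    return tiledbsoma.open(uri, mode="r")
```
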

@@ -4,5 +4,5 @@
name="soma-profiler",
version="1.0",
packages=find_packages(),
requires=["gitpython", "psutil"],
requires=["gitpython", "comacore", "psutil", "tiledbsoma", "cellxgene_census"],
@atolopko-czi (Member):

somacore (typo). Though that shouldn't be needed, since tiledbsoma is already required (it's a transitive dependency).
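A possible cleanup of the dependency list (a sketch; the closing `setup(**SETUP_KWARGS)` call is elided so the fragment stands alone, and note that `install_requires`, not the older `requires`, is the keyword pip honors):

```python
from setuptools import find_packages

SETUP_KWARGS = dict(
    name="soma-profiler",
    version="1.0",
    packages=find_packages(),
    # "somacore" (not "comacore") would be the correct spelling, but it is
    # omitted here because tiledbsoma already pulls it in transitively.
    install_requires=["gitpython", "psutil", "tiledbsoma", "cellxgene_census"],
)
```
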


db = data.FileBasedProfileDB()
actual_max_ts = 0
dt = db.find("python ann_data.py")
@atolopko-czi (Member) commented Nov 3, 2023

I would parameterize the profile run name(s) used to run this. Only perf_checker.sh should know which profiling scripts are being run.
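One way to keep the report script generic is a small CLI, so perf_checker.sh passes in the profiled command; the argument names here are illustrative:

```python
import argparse


def parse_args(argv=None):
    # perf_checker.sh would pass the profiled command on the command line,
    # so this report script needs no hard-coded script names.
    parser = argparse.ArgumentParser(description="Report on stored profile runs")
    parser.add_argument("command",
                        help='profiled command to look up, e.g. "python ann_data.py"')
    parser.add_argument("--threshold", type=float, default=1.1,
                        help="ratio above which a slowdown is flagged")
    return parser.parse_args(argv)


args = parse_args(["python ann_data.py", "--threshold", "1.2"])
```
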

@atolopko-czi (Member):

I would rename this to profile_report.py; a top-like name implies it monitors the profiler while it runs.

last_two = dt[-2:]
c = 0

for s in dt:
@atolopko-czi (Member):

Suggested change:
- for s in dt:
+ for s in last_two:

I think this is the intention...

L[1] = dt[1].user_time_sec
for i in range(0, len(dt)):
print(f"{i} dt[{i}].user_time_sec = {dt[i].user_time_sec} ts {dt[i].timestamp}")
print(f"L0 = {L[0]} L1 {L[1]}")
@atolopko-czi (Member) commented Nov 3, 2023

Suggested change:
- print(f"L0 = {L[0]} L1 {L[1]}")
+ print(f"Prev = {L[0]}, Curr = {L[1]}")


L = [1, 2]
L[0] = dt[0].user_time_sec
L[1] = dt[1].user_time_sec
@atolopko-czi (Member):

Maybe elapsed time? Or both.

for s in dt:
new_db = sorted(dt, key=lambda ProfileData: ProfileData.timestamp)

L = [1, 2]
@atolopko-czi (Member) commented Nov 3, 2023

Suggested change:
- L = [1, 2]
+ L = []

also, no need to capitalize
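Building on both suggestions, the previous/current pair can be taken directly from the timestamp-sorted records, with no hand-maintained indices (plain dicts stand in for the ProfileData records returned by `db.find(...)`):

```python
# Stand-ins for ProfileData records returned by db.find(...).
records = [
    {"timestamp": 3, "user_time_sec": 2.0},
    {"timestamp": 1, "user_time_sec": 1.5},
    {"timestamp": 2, "user_time_sec": 1.8},
]

# Sort by timestamp and keep only the last two runs: previous and current.
prev, curr = sorted(records, key=lambda r: r["timestamp"])[-2:]
```
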


if threshold * float(L[1]) < float(L[0]) or float(L[1]) > threshold * float(L[0]):
    raise SystemExit(f"Potential performance degradation detected {L[0]} vs {L[1]}")
print("No recent performance degradation detected")
@atolopko-czi (Member):

The report should also output the previous and current tiledbsoma versions.
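A sketch of the check with the message typo fixed ("va" should be "vs") and room for the version report; a fuller report would also log the `tiledbsoma.__version__` recorded with each run:

```python
def check_degradation(prev_sec, curr_sec, threshold=1.1):
    """Fail if either run is more than `threshold` times the other.

    A fuller report would also print the tiledbsoma version recorded
    with each run, so version-driven regressions are easy to spot.
    """
    if threshold * curr_sec < prev_sec or curr_sec > threshold * prev_sec:
        raise SystemExit(
            f"Potential performance degradation detected {prev_sec} vs {curr_sec}"
        )
    print("No recent performance degradation detected")
```
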

@atolopko-czi (Member) left a review comment:

A few lower-level comments. But per a recent meeting, the consensus is that we should avoid running this on GHA at all, unless we can do so on an instance with consistent specs. Also, the more expensive the operations being profiled, the more informative the results will be. So we might consider GHA large runners, if we can work out the paid-plan details between the CZI and TileDB organizations.

@beroy (Collaborator, Author) commented Nov 13, 2023

I think we may still have the spec issue with GHA large instances. Maybe infra can help by providing us with a fixed instance; another possible option is to use an AWS bare-metal instance.

@beroy (Collaborator, Author) commented Jan 19, 2024

Closing; the profiler harness will go to the census repo.

@johnkerl (Member):

Noting the above:

> Closing; the profiler harness will go to the census repo

-- closing this PR

@johnkerl closed this May 28, 2024