
Perf checker #1840

Closed
wants to merge 3 commits into from

Conversation

@beroy (Collaborator) commented Oct 31, 2023

Early version of the profiler.

Changes: #1607

Notes for Reviewer:

@beroy beroy force-pushed the perf_checker branch 2 times, most recently from 36ffb65 to d673b9f Compare November 1, 2023 20:13
@codecov-commenter commented Nov 1, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅


see 75 files with indirect coverage changes


@beroy beroy requested review from atolopko-czi and ebezzi November 2, 2023 17:38
import tiledbsoma as soma

census_S3_latest = dict(census_version="latest")
census_local_copy = dict(uri="/Users/brobatmili/projects/census_data/")
@atolopko-czi (Member) commented Nov 3, 2023

Remove the user path; this will require accessing the data on S3. Ideally, to eliminate the impact of network variability on profiling, the census would be copied locally first, but I don't think the host will have sufficient disk space, and it's not worth copying the whole census for the single eye-tissue-based test. So S3 access is probably best for now, but network variability should be handled by averaging a few runs (maybe that could be built into the profiler as a feature).
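A minimal sketch of how that run-averaging could be built into the profiler, assuming a `timed_average` helper (a hypothetical name) wrapped around each profiled query:

```python
import statistics
from time import perf_counter


def timed_average(fn, runs=3):
    """Run fn several times and return the mean wall-clock seconds.

    Averaging several runs smooths out per-run network variability
    when the census is read from S3 instead of a local copy.
    """
    samples = []
    for _ in range(runs):
        start = perf_counter()
        fn()
        samples.append(perf_counter() - start)
    return statistics.mean(samples)


# Stand-in workload; a real profiler run would time the census query.
mean_sec = timed_average(lambda: sum(range(100_000)), runs=3)
```
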

@atolopko-czi (Member) commented Nov 3, 2023

Since query times can change across census release versions, due to ever-increasing data sizes and schema changes, all profiling comparisons need to be made against a single census data release. The data release should therefore be fixed, and updated on a reasonable schedule, such as when a new LTS data release becomes available.
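For example, the open parameters at the top of the script could pin a concrete release tag instead of "latest" (the tag below is an illustrative placeholder, not a known release date):

```python
# Pinned census release so every profiling run compares against identical
# data; "2023-07-25" is a placeholder tag, to be replaced with a real LTS.
census_S3_pinned = dict(census_version="2023-07-25")
```
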

- "main"

paths:
- ".github/workflows/profiler"
@atolopko-czi (Member):

Was this intended to be indented under `on:`?
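If the intent was for the branch and path filters to sit under the push trigger, the block would read something like this (illustrative; the full workflow file isn't shown in the diff):

```yaml
on:
  push:
    branches:
      - "main"
    paths:
      - ".github/workflows/profiler"
```
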


def main():
t1 = perf_counter()
with cellxgene_census.open_soma(**census_local_copy) as census:
@atolopko-czi (Member):

I don't think we should use the Census API here. We can do the equivalent with just the TileDBSOMA API.
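A sketch of the TileDBSOMA-only equivalent: `tiledbsoma.open` reads any SOMA object by URI, so the census-specific wrapper isn't needed. The import is kept lazy so the sketch stands alone:

```python
def open_census_readonly(uri):
    """Open a SOMA store read-only using the TileDBSOMA API directly,
    instead of going through cellxgene_census.open_soma."""
    import tiledbsoma  # lazy import; requires the tiledbsoma package

    return tiledbsoma.open(uri, mode="r")
```
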

@@ -4,5 +4,5 @@
name="soma-profiler",
version="1.0",
packages=find_packages(),
requires=["gitpython", "psutil"],
requires=["gitpython", "comacore", "psutil", "tiledbsoma", "cellxgene_census"],
@atolopko-czi (Member):

somacore (typo). Though that shouldn't be needed, since tiledbsoma is already required (it's a transitive dependency).
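A possible cleanup of the dependency list (a sketch; the closing `setup(**SETUP_KWARGS)` call is elided so the fragment stands alone, and note that `install_requires`, not the older `requires`, is the keyword pip honors):

```python
from setuptools import find_packages

SETUP_KWARGS = dict(
    name="soma-profiler",
    version="1.0",
    packages=find_packages(),
    # "somacore" (not "comacore") would be the correct spelling, but it is
    # omitted here because tiledbsoma already pulls it in transitively.
    install_requires=["gitpython", "psutil", "tiledbsoma", "cellxgene_census"],
)
```
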


db = data.FileBasedProfileDB()
actual_max_ts = 0
dt = db.find("python ann_data.py")
@atolopko-czi (Member) commented Nov 3, 2023

I would parameterize the profile run name(s) used to run this. Only perf_checker.sh should know which profiling scripts are being run.
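One way to keep the report script generic is a small CLI, so perf_checker.sh passes in the profiled command; the argument names here are illustrative:

```python
import argparse


def parse_args(argv=None):
    # perf_checker.sh would pass the profiled command on the command line,
    # so this report script needs no hard-coded script names.
    parser = argparse.ArgumentParser(description="Report on stored profile runs")
    parser.add_argument("command",
                        help='profiled command to look up, e.g. "python ann_data.py"')
    parser.add_argument("--threshold", type=float, default=1.1,
                        help="ratio above which a slowdown is flagged")
    return parser.parse_args(argv)


args = parse_args(["python ann_data.py", "--threshold", "1.2"])
```
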

@atolopko-czi (Member):

I would rename this to profile_report.py; a top-like name implies it monitors the profiler while it runs.

last_two = dt[-2:]
c = 0

for s in dt:
@atolopko-czi (Member):

Suggested change:
- for s in dt:
+ for s in last_two:

I think this is the intention...

L[1] = dt[1].user_time_sec
for i in range(0, len(dt)):
print(f"{i} dt[{i}].user_time_sec = {dt[i].user_time_sec} ts {dt[i].timestamp}")
print(f"L0 = {L[0]} L1 {L[1]}")
@atolopko-czi (Member) commented Nov 3, 2023

Suggested change:
- print(f"L0 = {L[0]} L1 {L[1]}")
+ print(f"Prev = {L[0]}, Curr = {L[1]}")


L = [1, 2]
L[0] = dt[0].user_time_sec
L[1] = dt[1].user_time_sec
@atolopko-czi (Member):

Maybe elapsed time? Or both.

for s in dt:
new_db = sorted(dt, key=lambda ProfileData: ProfileData.timestamp)

L = [1, 2]
@atolopko-czi (Member) commented Nov 3, 2023

Suggested change:
- L = [1, 2]
+ L = []

also, no need to capitalize
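Building on both suggestions, the previous/current pair can be taken directly from the timestamp-sorted records, with no hand-maintained indices (plain dicts stand in for the ProfileData records returned by `db.find(...)`):

```python
# Stand-ins for ProfileData records returned by db.find(...).
records = [
    {"timestamp": 3, "user_time_sec": 2.0},
    {"timestamp": 1, "user_time_sec": 1.5},
    {"timestamp": 2, "user_time_sec": 1.8},
]

# Sort by timestamp and keep only the last two runs: previous and current.
prev, curr = sorted(records, key=lambda r: r["timestamp"])[-2:]
```
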


if threshold * float(L[1]) < float(L[0]) or float(L[1]) > threshold * float(L[0]):
    raise SystemExit(f"Potential performance degradation detected {L[0]} vs {L[1]}")
print("No recent performance degradation detected")
@atolopko-czi (Member):

The report should also output the previous and current tiledbsoma versions.
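A sketch of the check with the message typo fixed ("va" should be "vs") and room for the version report; a fuller report would also log the `tiledbsoma.__version__` recorded with each run:

```python
def check_degradation(prev_sec, curr_sec, threshold=1.1):
    """Fail if either run is more than `threshold` times the other.

    A fuller report would also print the tiledbsoma version recorded
    with each run, so version-driven regressions are easy to spot.
    """
    if threshold * curr_sec < prev_sec or curr_sec > threshold * prev_sec:
        raise SystemExit(
            f"Potential performance degradation detected {prev_sec} vs {curr_sec}"
        )
    print("No recent performance degradation detected")
```
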

@atolopko-czi (Member) left a review comment:

A few lower-level comments. But per a recent meeting, the consensus is that we should avoid running this on GHA at all, unless we can do so on an instance with consistent specs. Also, the more expensive the operations being profiled, the more informative the results will be. So we might consider GHA large runners, if we can work out the paid-plan details between the CZI and TileDB organizations.

@beroy (Collaborator, Author) commented Nov 13, 2023

I think we may still have the spec issue with GHA large instances. Maybe infra can help by providing us with a fixed instance; another possible option is to use an AWS bare-metal instance.

@beroy (Collaborator, Author) commented Jan 19, 2024

Closing; the profiler harness will go to the census repo.

@johnkerl (Member):

Noting the above:

> Closing; the profiler harness will go to the census repo

-- closing this PR

@johnkerl closed this May 28, 2024