-
Notifications
You must be signed in to change notification settings - Fork 385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make mp4 parsing **a lot** faster & tremendously lower memory overhead #7860
Conversation
We still copy each sample when we enqueue a chunk, which feels like it should not be necessary. It would be nice to have direct access to the video data arrow buffer in decoders, so that we can index into it there, but I wasn't 100% sure how to make that work here so I left it out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar to the commentary on the parent PR I'm super happy about this being so low-hanging, but also like there the distinction of "what data is this" needs more love. In partiulcar VideoData
is already a type and now we're passing a lot of video_data
which is not related to it.
Ideally we'd use some reference counted object from the guts of Blob, but I haven't looked into whether that's viable (we really don't want to end up passing around raw arrow buffers, infecting everything with that dependency)
If we advertise this in the changelog, we should have before after perf & memory numbers. Just give it a quick spin with one of the larger videos on our internal video repo, thank you! :)
(trying to measure this by adding a benchmark) |
Some rough numbers:
Measured on Total Rerun memory usage for a 297 MiB video (the entire Sintel movie):
|
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome!
.unwrap() | ||
.parent() | ||
.ancestors() | ||
.nth(3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤯
### What This is now feasible thanks to @jprochazk's * #7860 ![image](https://github.com/user-attachments/assets/0cd063d9-7e44-4892-a751-022784e12d5d) ### Checklist * [x] I have read and agree to [Contributor Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and the [Code of Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md) * [x] I've included a screenshot or gif (if applicable) * [x] I have tested the web demo (if applicable): * Using examples from latest `main` build: [rerun.io/viewer](https://rerun.io/viewer/pr/7869?manifest_url=https://app.rerun.io/version/main/examples_manifest.json) * Using full set of examples from `nightly` build: [rerun.io/viewer](https://rerun.io/viewer/pr/7869?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json) * [x] The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG * [x] If applicable, add a new check to the [release checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)! * [x] If have noted any breaking changes to the log API in `CHANGELOG.md` and the migration guide - [PR Build Summary](https://build.rerun.io/pr/7869) - [Recent benchmark results](https://build.rerun.io/graphs/crates.html) - [Wasm size tracking](https://build.rerun.io/graphs/sizes.html) To run all checks from `main`, comment on the PR with `@rerun-bot full-check`.
What
Some rough numbers:
Measured on
Big_Buck_Bunny_1080_10s_av1.mp4
from our test assets (downloaded viatests/assets/download_test_assets.py
)Total Rerun memory usage for a 297 MiB video (the entire Sintel movie):
Checklist
main
build: rerun.io/viewernightly
build: rerun.io/viewerCHANGELOG.md
and the migration guideTo run all checks from
main
, comment on the PR with@rerun-bot full-check
.