Use context manager for multiprocessing in the ingestion server #1057
Full-stack documentation: https://docs.openverse.org/_preview/1057
Please note that GitHub pages takes a little time to deploy newly pushed code. If the links above don't work or you see old versions, wait 5 minutes and try again. You can check the GitHub pages deployment action list to see the current status of the deployments.
7cf5b20 to 3f91633
Based on this guide and the multiprocessing docs, I think the flow and
Tested on my machine, works great!
Co-authored-by: Dhruv Bhanushali <[email protected]>
Bumps [ipython](https://github.com/ipython/ipython) from 8.11.0 to 8.12.0. - [Release notes](https://github.com/ipython/ipython/releases) - [Commits](ipython/ipython@8.11.0...8.12.0) --- updated-dependencies: - dependency-name: ipython dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Bhanushali <[email protected]> Co-authored-by: sarayourfriend <[email protected]>
Bumps [orjson](https://github.com/ijl/orjson) from 3.8.8 to 3.8.9. - [Release notes](https://github.com/ijl/orjson/releases) - [Changelog](https://github.com/ijl/orjson/blob/master/CHANGELOG.md) - [Commits](ijl/orjson@3.8.8...3.8.9) --- updated-dependencies: - dependency-name: orjson dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: openverse-bot <null>
* Move the svgs for radiomark and check to components * Add files to tsconfig and fix types * Mock report service in the unit test * Type svg?inline as vue Component * Better License code type checking * Update frontend/src/components/VFilters/VFilterChecklist.vue * Revert unnecessary changes * Update frontend/src/components/VFilters/VFilterChecklist.vue Co-authored-by: Zack Krida <[email protected]> * Rename `ownValue` to `value_` --------- Co-authored-by: Zack Krida <[email protected]>
* Convert VPill story to MDX * Convert VItemGroup story to mdx * Fixing argTypes issue and fixing the headers
* Rename button sizes and apply some styles only to 'old' buttons * Rename the snapshot tests to v-button-old * Fix VTab focus style * Small fixes (large-old, border, group/button) * Revert VTab focus changes Moved to a different PR * Revert "Revert VTab focus changes" This reverts commit ec9312d. * Use only focus-visible for consistency
Bumps [boto3](https://github.com/boto/boto3) from 1.26.99 to 1.26.105. - [Release notes](https://github.com/boto/boto3/releases) - [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst) - [Commits](boto/boto3@1.26.99...1.26.105) --- updated-dependencies: - dependency-name: boto3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…1122) * Add more docs for Plausible and auto-initialise custom event names * Update existing docs * Add caveat that it is not necessary to run Plausible if not working on custom events * Fix ToC
* Add new VButton sizes and variants * Add new Storybook tests * Add border to transparent- buttons * Update bordered and transparent buttons * Update stories * Update snapshots * Remove pressed variants * Add missing snapshots * Fix transparent buttons * Update paddings In accordance with #860 (comment) * Update snapshots * Update frontend/src/components/VButton.vue Co-authored-by: Zack Krida <[email protected]> --------- Co-authored-by: Zack Krida <[email protected]>
Works well for me locally, but I don't understand why 😅.
with multiprocessing.Pool(processes=num_workers) as pool:
    log.info(f"Starting {len(jobs)} cleaning jobs")

    for result in pool.starmap(_clean_data_worker, jobs):
Since `pool.starmap` is blocking (like `pool.map`), as per this StackOverflow answer, I don't think the `close()` and `join()` should be needed at the end.
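To illustrate the blocking behaviour mentioned here, below is a minimal sketch. The `_add` worker and `run_jobs` helper are hypothetical stand-ins, not code from this PR; the point is only that `pool.starmap` returns once every job has finished, with results in the same order as the input.

```python
import multiprocessing


def _add(a, b):
    # Hypothetical stand-in for a worker such as _clean_data_worker.
    return a + b


def run_jobs(jobs, num_workers=2):
    with multiprocessing.Pool(processes=num_workers) as pool:
        # starmap unpacks each tuple into the worker's arguments and
        # blocks until all jobs are done; results keep the input order.
        return pool.starmap(_add, jobs)


if __name__ == "__main__":
    print(run_jobs([(1, 2), (3, 4), (5, 6)]))
```

Note that this only shows that the results are complete when `starmap` returns; it says nothing about whether the worker processes themselves have exited at that point.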
Without the calls to `close()` and `join()`, the ingestion test was failing in CI. Probably because the audio ingestion was throwing an error about the index already existing.
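For reference, the `multiprocessing` docs state that leaving a `Pool` context manager calls `terminate()`, not `close()` followed by `join()`. Whether that explains the CI failure is only a guess, but it does mean the explicit calls are not redundant with the `with` block. A minimal sketch of the pattern, using a hypothetical `_square` worker and `run_and_wait` helper that are not part of this PR:

```python
import multiprocessing


def _square(n):
    # Hypothetical worker used only for illustration.
    return n * n


def run_and_wait(values, num_workers=2):
    results = []
    with multiprocessing.Pool(processes=num_workers) as pool:
        for result in pool.map(_square, values):
            results.append(result)
        # Stop accepting new tasks, then wait for the worker
        # processes to exit before leaving the block.
        pool.close()
        pool.join()
    # Leaving the `with` block calls pool.terminate().
    return results


if __name__ == "__main__":
    print(run_and_wait([1, 2, 3]))
```

`join()` must be preceded by `close()` or `terminate()`, which is why the two calls appear together inside the block.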
Size Change: -2 kB (0%) Total Size: 846 kB
* 🔄 synced local '.github/workflows/new_issues.yml' with remote '.github/workflows/new_issues.yml' * 🔄 synced local '.pre-commit-config.yaml' with remote 'templates/.pre-commit-config.yaml.jinja' * Add ruff config, fix noqa settings --------- Co-authored-by: openverse-bot <null> Co-authored-by: Madison Swain-Bowden <[email protected]>
Follow up on #904
Description
This PR adds a context manager to the multiprocessing pool in the ingestion server. I originally added it in #904, but without `.join()` it was causing the tests to fail.

I am not sure about the placement of `results` saving: putting it after the call to `join` might mean that there is too much data in memory (all of the data that has been updated from several child processes), but if we handle the results as they come, will there be a race condition with the files?

Testing Instructions
Developer Certificate of Origin