Use context manager for multiprocessing in the ingestion server (#1057)
* Fix issues in the workflow simplifications of #1054 (#1058)

* Retry `up` recipe in case port is occupied (#990)

* Fix typo in docs building on `main` (#1067)

* Restore Django Admin views (#1065)

* Update other references of media count to 700 million (#1098)

* Dispatch workflows instead of regular reuse to show deployment runs (#1034)

* Use label.yml to determine required labels (#1063)

Co-authored-by: Dhruv Bhanushali <[email protected]>

* Add `GITHUB_TOKEN` to GitHub CLI step (#1103)

* Pass actor for staging deploys with the `-f` flag (#1104)

* Bump ipython from 8.11.0 to 8.12.0 in /api (#1113)

Bumps [ipython](https://github.com/ipython/ipython) from 8.11.0 to 8.12.0.
- [Release notes](https://github.com/ipython/ipython/releases)
- [Commits](ipython/ipython@8.11.0...8.12.0)

---
updated-dependencies:
- dependency-name: ipython
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Absorb `actionlint` into pre-commit (#1028)

Co-authored-by: Dhruv Bhanushali <[email protected]>
Co-authored-by: sarayourfriend <[email protected]>

* Bump orjson from 3.8.8 to 3.8.9 in /api (#1114)

Bumps [orjson](https://github.com/ijl/orjson) from 3.8.8 to 3.8.9.
- [Release notes](https://github.com/ijl/orjson/releases)
- [Changelog](https://github.com/ijl/orjson/blob/master/CHANGELOG.md)
- [Commits](ijl/orjson@3.8.8...3.8.9)

---
updated-dependencies:
- dependency-name: orjson
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add Sentry to the ingestion server (#1106)

* Add a wait to filter button test to fix CI (#1124)

* Bump boto3 from 1.26.100 to 1.26.104 in /ingestion_server (#1110)

* Bump sentry-sdk from 1.17.0 to 1.18.0 in /api (#1112)

* Bump pillow from 9.4.0 to 9.5.0 in /api (#1115)

* Update redis Docker tag to v4.0.14 (#1109)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* 🔄 synced file(s) with WordPress/openverse-infrastructure (#1127)

Co-authored-by: openverse-bot <null>

* Update other references of media count to 700 million (#1100)

* Fix prod deployment workflow dispatch call (#1117)

* Add a Slack notification job to the CI + CD workflow (#1066)

* Fix types in VFilters and VContentReport (#1030)

* Move the svgs for radiomark and check to components

* Add files to tsconfig and fix types

* Mock report service in the unit test

* Type svg?inline as vue Component

* Better License code type checking

* Update frontend/src/components/VFilters/VFilterChecklist.vue

* Revert unnecessary changes

* Update frontend/src/components/VFilters/VFilterChecklist.vue

Co-authored-by: Zack Krida <[email protected]>

* Rename `ownValue` to `value_`

---------

Co-authored-by: Zack Krida <[email protected]>

* Convert VPill and VItemGroup stories to mdx (#1092)

* Convert VPill story to MDX
* Convert VItemGroup story to mdx
* Fixing argTypes issue and fixing the headers

* Update ci to use github.token (#1123)

* Add `SLACK_WEBHOOK_TYPE` env var to reporting job (#1131)

* Add consent decision-making process documentation (#887)

* Prepare VButton for updates (#1002)

* Rename button sizes and apply some styles only to 'old' buttons

* Rename the snapshot tests to v-button-old

* Fix VTab focus style

* Small fixes (large-old, border, group/button)

* Revert VTab focus changes

Moved to a different PR

* Revert "Revert VTab focus changes"

This reverts commit ec9312d.

* Use only focus-visible for consistency

* Bump boto3 from 1.26.99 to 1.26.105 in /api (#1133)

Bumps [boto3](https://github.com/boto/boto3) from 1.26.99 to 1.26.105.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](boto/boto3@1.26.99...1.26.105)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add more docs for Plausible and auto-initialise custom event names (#1122)

* Add more docs for Plausible and auto-initialise custom event names

* Update existing docs

* Add caveat that it is not necessary to run Plausible if not working on custom events

* Fix ToC

* Add new buttons variants and sizes (#1003)

* Add new VButton sizes and variants

* Add new Storybook tests

* Add border to transparent- buttons

* Update bordered and transparent buttons

* Update stories

* Update snapshots

* Remove pressed variants

* Add missing snapshots

* Fix transparent buttons

* Update paddings

In accordance with #860 (comment)

* Update snapshots

* Update frontend/src/components/VButton.vue

Co-authored-by: Zack Krida <[email protected]>

---------

Co-authored-by: Zack Krida <[email protected]>

* Pass `GITHUB_TOKEN` to deploy docs (#1134)

* Add context manager and join()

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Dhruv Bhanushali <[email protected]>
Co-authored-by: Krystle Salazar <[email protected]>
Co-authored-by: Madison Swain-Bowden <[email protected]>
Co-authored-by: sarayourfriend <[email protected]>
Co-authored-by: Tomvth <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Adarsh Rawat <[email protected]>
Co-authored-by: Dhruv Bhanushali <[email protected]>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Openverse (Bot) <[email protected]>
Co-authored-by: Zack Krida <[email protected]>
Co-authored-by: Sepehr Rezaei <[email protected]>
Co-authored-by: Sumit Kashyap <[email protected]>
14 people authored Apr 5, 2023
1 parent eeff3e0 commit cf9febd
Showing 1 changed file with 9 additions and 10 deletions.
19 changes: 9 additions & 10 deletions ingestion_server/ingestion_server/cleanup.py
@@ -351,16 +351,15 @@ def clean_image_data(table):
                 cleanable_fields_for_table,
             )
         )
-    pool = multiprocessing.Pool(processes=num_workers)
-    log.info(f"Starting {len(jobs)} cleaning jobs")
-
-    results = pool.starmap(_clean_data_worker, jobs)
-
-    for result in results:
-        batch_cleaned_counts = save_cleaned_data(result)
-        for field in batch_cleaned_counts:
-            cleaned_counts_by_field[field] += batch_cleaned_counts[field]
-    pool.close()
+    with multiprocessing.Pool(processes=num_workers) as pool:
+        log.info(f"Starting {len(jobs)} cleaning jobs")
+
+        for result in pool.starmap(_clean_data_worker, jobs):
+            batch_cleaned_counts = save_cleaned_data(result)
+            for field in batch_cleaned_counts:
+                cleaned_counts_by_field[field] += batch_cleaned_counts[field]
+        pool.close()
+        pool.join()

     num_cleaned += len(batch)
     batch_end_time = time.time()