Speed up uploads by uploading chunks in parallel #263

Open
daviddavis opened this issue Jun 3, 2021 · 6 comments
Labels
feature request New feature request (template-set)

Comments

@daviddavis
Contributor

Right now chunk uploads happen sequentially, but we could speed them up by running them in parallel.

@daviddavis daviddavis changed the title Speed up uploads by uploading chunk in parallel Speed up uploads by uploading chunks in parallel Jun 3, 2021
@mdellweg mdellweg added feature request New feature request (template-set) triaged labels Jun 7, 2021
gerrod3 added a commit to gerrod3/pulp-cli that referenced this issue Aug 18, 2021
@gerrod3 gerrod3 self-assigned this Aug 19, 2021
@gerrod3 gerrod3 removed their assignment Aug 31, 2022
@lubosmj lubosmj self-assigned this Jul 10, 2024
@lubosmj
Member

lubosmj commented Jul 10, 2024

There was another request from the services team to make uploads run in parallel. I am a bit sceptical about this idea because we usually saturate the uplink when uploading serially. However, there might be data centers where saturation cannot be reached because reverse-proxy configurations disallow a larger chunk size.

I am going to introduce an option to the upload/artifact command that enables parallelism (e.g., --parallel).

@daviddavis
Contributor Author

You may want to make the number of parallel threads or processes configurable, as I imagine different Pulp instances could support different levels of throughput.
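To make the suggestion concrete, here is a minimal sketch of chunked uploading with a configurable worker count. It is not pulp-cli or pulp-glue code: `chunk_ranges`, `upload_in_parallel`, and the `upload_chunk` callback are hypothetical names, and the sketch uses threads from the standard library rather than the processes used in the experiments later in this thread.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def chunk_ranges(total_size: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) byte offsets that cover the whole file."""
    for start in range(0, total_size, chunk_size):
        yield start, min(start + chunk_size, total_size) - 1


def upload_in_parallel(
    total_size: int,
    chunk_size: int,
    upload_chunk: Callable[[int, int], T],  # hypothetical callback: sends one chunk
    workers: int = 4,                        # assumed default; meant to be user-configurable
) -> List[T]:
    """Call upload_chunk(start, end) for every chunk using a pool of threads."""
    ranges = list(chunk_ranges(total_size, chunk_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order; an exception raised by any
        # upload_chunk call is re-raised here when its result is consumed.
        return list(pool.map(lambda r: upload_chunk(*r), ranges))


def fake_upload(start: int, end: int) -> int:
    """Stand-in for a real HTTP request: report how many bytes were 'sent'."""
    return end - start + 1


if __name__ == "__main__":
    sent = upload_in_parallel(25, 10, fake_upload, workers=2)
    print(sent, sum(sent))
```

Passing the uploader in as a callback keeps the sketch independent of any HTTP client; whether the real client objects can be shared safely between workers is a separate question.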

@mdellweg
Member

Doing uploads in parallel requires adding some notion of parallel execution to the CLI codebase (which would be unprecedented). I would like to see some clear statements with numbers before gauging the need for adding this kind of complexity to the project. For example, I cannot say whether pulp-glue is thread-safe. Do we want to rewrite it in async Python, using aiohttp instead of requests?

@lubosmj
Member

lubosmj commented Jul 11, 2024

I ran a couple of experiments locally (using oci_env); the results are below. Uploading chunks in parallel cut the wall-clock time roughly in half (about 45% with artifact creation and about 55% without). I did not spend much time on code quality or optimization beyond splitting the uploaded file into 4 parts and then uploading those parts in sub-chunks in parallel.


Test 1: With creating an artifact (db reset between runs, uploading one commit in a tarball, 717.7MB, 10MB chunks)

SERIAL (current implementation):
(venv) [lmjachky@lmjachky-thinkpadt14gen4 services]$ time pulp ostree repository import-all --name fedora-iot --file a9598e5a-1f0c-48b8-abda-14915a4d051a-commit.tar --repository_name repo --chunk-size 10MB
........................................................................Upload complete.
Creating artifact.
Started background task /pulp/api/v3/tasks/0190a1b6-19eb-7fe1-9b36-c2faf44e516e/
.....Done.

real	0m55.106s
user	0m13.641s

PARALLEL (4 processes for chunked uploading):
(venv) [lmjachky@lmjachky-thinkpadt14gen4 services]$ time pulp ostree repository import-all-parallel --name fedora-iot --file a9598e5a-1f0c-48b8-abda-14915a4d051a-commit.tar --repository_name repo --chunk-size 10MB
.....................................................................Upload complete.
.Upload complete.
.Upload complete.
.Upload complete.
Creating artifact.
Started background task /pulp/api/v3/tasks/0190a1b7-caae-759b-b9cc-81da6ca042b8/
.....Done.

real	0m30.569s
user	0m18.488s

Test 2: Without creating an artifact (db reset between runs, uploading one commit in a tarball, 717.7MB, 10MB chunks)

SERIAL (current implementation):
(venv) [lmjachky@lmjachky-thinkpadt14gen4 services]$ time pulp ostree repository import-all --name fedora-iot --file a9598e5a-1f0c-48b8-abda-14915a4d051a-commit.tar --repository_name repo --chunk-size 10MB
........................................................................Upload complete.

real	0m49.346s
user	0m14.368s

PARALLEL (4 processes for chunked uploading):
(venv) [lmjachky@lmjachky-thinkpadt14gen4 services]$ time pulp ostree repository import-all-parallel --name fedora-iot --file a9598e5a-1f0c-48b8-abda-14915a4d051a-commit.tar --repository_name repo --chunk-size 10MB
.....................................................................Upload complete.
.Upload complete.
.Upload complete.
.Upload complete.

real	0m22.449s
user	0m16.966s

Changes made to pulp-glue: https://gist.github.com/lubosmj/1d736226c1816fb019430e7fb78cdd55. Changes made to pulp-cli-ostree: https://gist.github.com/lubosmj/3bc14338713ab9a55343359ff49829b1. I used processes (https://pypi.org/project/multiprocess/ for easier function pickling) to perform the action.
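For reference, the speedup implied by the `real` times above can be worked out as follows. The helper names are hypothetical; the numbers are copied from the four runs in this comment.

```python
def speedup(serial_s: float, parallel_s: float) -> float:
    """How many times faster the parallel run finished."""
    return serial_s / parallel_s


def time_saved_pct(serial_s: float, parallel_s: float) -> float:
    """Percentage of wall-clock time saved by the parallel run."""
    return 100.0 * (1.0 - parallel_s / serial_s)


# (serial, parallel) wall-clock seconds taken from the `real` lines above.
RUNS = {
    "with artifact": (55.106, 30.569),
    "without artifact": (49.346, 22.449),
}

if __name__ == "__main__":
    for label, (serial_s, parallel_s) in RUNS.items():
        print(
            f"{label}: {speedup(serial_s, parallel_s):.2f}x faster, "
            f"{time_saved_pct(serial_s, parallel_s):.1f}% less wall-clock time"
        )
```

That works out to roughly 1.8x with artifact creation and roughly 2.2x without. These are single runs on one machine, so the figures are indicative only.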

@lubosmj
Member

lubosmj commented Jul 11, 2024

TCP congestion control is designed to manage the flow of data to prevent network congestion and ensure fairness among multiple connections. However, this mechanism primarily operates on a per-connection basis. This is what we are trying to bypass by uploading in parallel, right? Multiple TCP connections from a single host can then easily saturate the uplink.
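A rough way to picture this is a toy model, not a measurement: a single connection cannot move more than about one window of data per round trip, and several connections add up until the link itself becomes the limit. The function names and all numbers below are illustrative assumptions.

```python
def per_connection_limit(window_bytes: float, rtt_s: float) -> float:
    """Upper bound for one TCP connection: about one window per round trip (bytes/s)."""
    return window_bytes / rtt_s


def aggregate_throughput(
    connections: int, per_connection: float, link_capacity: float
) -> float:
    """Idealized total: connections add up until the link capacity caps them."""
    return min(connections * per_connection, link_capacity)


if __name__ == "__main__":
    # Hypothetical values: 64 KiB effective window, 50 ms RTT, 10 MB/s uplink.
    one = per_connection_limit(64 * 1024, 0.05)
    for n in (1, 2, 4, 8, 16):
        total = aggregate_throughput(n, one, 10_000_000)
        print(f"{n:2d} connection(s): {total / 1e6:.2f} MB/s")
```

The model ignores loss, slow start, and server-side limits, so it only shows why more connections can help when a single one is window- or latency-bound, and why the gain stops once the uplink is full.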

@lubosmj
Member

lubosmj commented Jul 11, 2024

The following experiment supports that theory. When uploading commits to staging, I am getting amazing results: almost 4 times better upload performance, judging by the upload speed and the uplink usage.


Test 1: Serial uploading (1MB chunk, 1 TCP connection, 1.3GB in total)

(venv) [lmjachky@lmjachky-thinkpadt14gen4 services]$ time pulp ostree repository import-all --name rhivos-test-non-parallel --file "auto-osbuild-aws-autosd9-cki-ostree-x86_64-1368897263.017a82ff.repo.tar" --repository_name "auto-osbuild-aws-autosd9-cki-ostree-x86_64-1368897263.017a82ff.repo" --chunk-size 1MB
Uploading file auto-osbuild-aws-autosd9-cki-ostree-x86_64-1368897263.017a82ff.repo.tar
........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................Upload complete.
Creating artifact.
Started background task /api/pulp/default/api/v3/tasks/0190a2cd-235c-7a9f-adf2-10aa8529519d/
..........................................................................................................................................................................................Done.
Started background task /api/pulp/default/api/v3/tasks/0190a2d1-1cd9-76aa-8c75-07f7ef5f418a/
...............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................Done.

real	42m34.483s
user	0m41.393s
  • 26 minutes uploading chunks (pulpcore), 5 minutes assembling chunks (pulpcore), 11 minutes running import-commit (pulp-ostree)
  • average upload speed: 2MB/s

image

Test 2: Parallel uploading (1MB chunk, 4 parallel processes, 4 TCP connections, 1.4GB in total)

(venv) [lmjachky@lmjachky-thinkpadt14gen4 services]$ time pulp ostree repository import-all --name rhivos-test-parallel --file "auto-osbuild-qemu-autosd9-qa-ostree-x86_64-1368897263.017a82ff.repo.tar" --repository_name "auto-osbuild-qemu-autosd9-qa-ostree-x86_64-1368897263.017a82ff.repo" --chunk-size 1MB --parallel
Uploading file auto-osbuild-qemu-autosd9-qa-ostree-x86_64-1368897263.017a82ff.repo.tar
.....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................Upload complete.
........Upload complete.
.......Upload complete.
....Upload complete.
Creating artifact.
Started background task /api/pulp/default/api/v3/tasks/0190a2f1-332f-7c86-8f4c-735a63265275/
..............................................................................................................................................................................Done.
Started background task /api/pulp/default/api/v3/tasks/0190a2f4-fb30-7bb8-81e9-456a7ebc16f2/
.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
ERROR, SOMEONE RESTARTED GATEWAY!!! BUT WE DO NOT CARE!
requests.exceptions.HTTPError: 503 Server Error: Service Unavailable for url: https://XXXXXXXX.com/api/pulp/default/api/v3/tasks/0190a2f4-fb30-7bb8-81e9-456a7ebc16f2/

real	26m27.961s
user	0m43.025s
  • 7 minutes uploading chunks (pulpcore), 4 minutes assembling chunks (pulpcore), 15+ minutes running import-commit (pulp-ostree)
  • average upload speed: 8MB/s
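The per-phase minutes reported for the two staging runs can be compared directly. This is a sketch with hypothetical helper names; note that the two runs used different tarballs (1.3GB vs 1.4GB) and that the parallel run was cut off by the 503, so its import-commit figure is only a lower bound.

```python
from typing import Dict

# Minutes per phase, copied from the bullet points of the two tests.
SERIAL = {"upload": 26, "assemble": 5, "import-commit": 11}
PARALLEL = {"upload": 7, "assemble": 4, "import-commit": 15}  # 15+ (interrupted run)


def phase_ratios(serial: Dict[str, float], parallel: Dict[str, float]) -> Dict[str, float]:
    """Serial time divided by parallel time for each phase (above 1 means parallel was faster)."""
    return {phase: serial[phase] / parallel[phase] for phase in serial}


def share_of_total(phases: Dict[str, float], phase: str) -> float:
    """Fraction of the summed phase time spent in one phase."""
    return phases[phase] / sum(phases.values())


if __name__ == "__main__":
    for phase, ratio in phase_ratios(SERIAL, PARALLEL).items():
        print(f"{phase}: {ratio:.2f}x")
    print(f"upload share of serial run: {share_of_total(SERIAL, 'upload'):.0%}")
```

Only the upload phase benefits from parallel chunks (about 3.7x here); assembling and import-commit happen on the server side, so the end-to-end gain is smaller than the upload gain.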

image


Tested with the following changes applied on the respective main branches: lubosmj@8d57381, lubosmj/pulp-cli-ostree@0c4f3ae. OSTree commits were taken from https://autosd.sig.centos.org/AutoSD-9/nightly/ostree-repos/.

@lubosmj lubosmj removed their assignment Jul 11, 2024