[C++] how to set use_virtual_addressing for pyarrow S3FileSystem #36827

Closed
windpiger opened this issue Jul 23, 2023 · 12 comments

Comments

@windpiger

windpiger commented Jul 23, 2023

Currently, pyarrow.fs.S3FileSystem initializes its S3Client using the following logic to set use_virtual_addressing:

const bool use_virtual_addressing = options_.endpoint_override.empty();

use_virtual_addressing);

But if I set endpointOverride to a non-empty value such as "https://mytest.endpoint", use_virtual_addressing will be false, because use_virtual_addressing = options_.endpoint_override.empty().

So how can I set use_virtual_addressing to true while also setting a non-empty endpointOverride?

Maybe we should decouple these two settings so they can be set independently.
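
To make the coupling concrete, here is a minimal Python model of the decision. It is only a sketch, not Arrow's actual code: current_rule mirrors the C++ line quoted above, while decoupled_rule and its force flag are illustrative names I made up to show one way the two settings could be separated.

```python
def current_rule(endpoint_override: str) -> bool:
    """Mirror of the quoted C++ line: virtual addressing is on
    only when no endpoint override is configured."""
    return endpoint_override == ""


def decoupled_rule(endpoint_override: str, force: bool = False) -> bool:
    """Illustrative alternative (hypothetical): an explicit flag can
    turn virtual addressing on even with a custom endpoint, while the
    default behaviour stays the same as current_rule."""
    return force or endpoint_override == ""


if __name__ == "__main__":
    endpoint = "https://mytest.endpoint"
    print(current_rule(endpoint))                # custom endpoint disables it today
    print(decoupled_rule(endpoint, force=True))  # explicit opt-in re-enables it
```

Keeping the default tied to the empty-endpoint check means existing users would see no change unless they opt in.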

Component(s)

C++

@windpiger windpiger added the "Type: usage" label (Issue is a user question) Jul 23, 2023
@windpiger
Author

@pitrou could you please help review this? Thank you!

@pitrou pitrou changed the title from "how to set use_virtual_addressing for pyarrow S3FileSystem" to "[C++] how to set use_virtual_addressing for pyarrow S3FileSystem" Jul 24, 2023
@pitrou
Member

pitrou commented Jul 24, 2023

@windpiger When writing this code, I thought that virtual addressing would be incompatible with a custom endpoint, but it turns out the two should work together (the bucket name would be prepended to the custom hostname: http(s)://<bucket>.endpoint).
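
As a rough illustration of the difference between the two addressing styles with a custom endpoint, the sketch below builds the object URL both ways. The object_url helper, the bucket name and the key are all hypothetical; this shows only the URL shape, not how the AWS SDK or Arrow constructs requests.

```python
from urllib.parse import urlsplit, urlunsplit


def object_url(endpoint: str, bucket: str, key: str, virtual_addressing: bool) -> str:
    """Build an object URL against a custom endpoint (illustrative helper).

    Virtual-hosted style puts the bucket in the hostname;
    path style puts it at the start of the path.
    """
    parts = urlsplit(endpoint)
    if virtual_addressing:
        netloc = f"{bucket}.{parts.netloc}"
        path = f"/{key}"
    else:
        netloc = parts.netloc
        path = f"/{bucket}/{key}"
    return urlunsplit((parts.scheme, netloc, path, "", ""))


if __name__ == "__main__":
    endpoint = "https://mytest.endpoint"
    print(object_url(endpoint, "mybucket", "data/file.parquet", True))
    print(object_url(endpoint, "mybucket", "data/file.parquet", False))
```

With virtual addressing the endpoint must resolve <bucket>.endpoint hostnames, which is why some backends only accept one of the two styles.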

@pitrou
Member

pitrou commented Jul 24, 2023

Would you like to submit a PR for this?

@pitrou
Member

pitrou commented Jul 24, 2023

Also cc @felipecrv if you want an entrypoint into the filesystem subsystem.

@windpiger
Author

windpiger commented Jul 25, 2023

Would you like to submit a PR for this?

OK, it's my pleasure.
I have another question: how can I use docker-compose to build a distribution wheel on macOS?
Would that be docker-compose python-wheel-manylinux-2014? @pitrou

@pitrou
Member

pitrou commented Jul 25, 2023

I have another question: how can I use docker-compose to build a distribution wheel on macOS?
Would that be docker-compose python-wheel-manylinux-2014? @pitrou

I have no idea. @raulcd Do you know?

@raulcd
Member

raulcd commented Jul 25, 2023

I have usually reproduced the Python wheel generation from our CI with, for example:

$ PYTHON=3.11 docker-compose build --no-cache --progress plain python-wheel-manylinux-2014
$ PYTHON=3.11 docker-compose run --rm python-wheel-manylinux-2014

You can also run our wheel tests (once generated) with:

$ docker-compose build --no-cache --progress plain python-wheel-manylinux-test-unittests
$ docker-compose run --rm python-wheel-manylinux-test-unittests

I haven't tested those right now but I have used them in the past.

@windpiger
Author

Thank you, guys. I got it working with docker-compose python-wheel-manylinux-2014. @pitrou @raulcd

@mapleFU
Member

mapleFU commented Sep 20, 2023

Thank you, guys. I got it working with docker-compose python-wheel-manylinux-2014.

@windpiger would you mind pushing your changes to GitHub and opening a PR?

@pitrou
Member

pitrou commented Dec 7, 2023

This was done in #38858.

@pitrou pitrou closed this as completed Dec 7, 2023
@MissiontoMars

This PR only modifies the C++ layer, so the "force_virtual_addressing" parameter cannot be passed through the Python API. For example, this does not work:

fs = pyarrow.fs.S3FileSystem(access_key='aaa', secret_key='bbb', endpoint_override='https://xxx', force_virtual_addressing=True)
@pitrou

@pitrou
Member

pitrou commented Jan 24, 2024

Yes, sorry. I should have added a separate issue to expose it in Python.
