Can we use Azure::Storage::Blobs::BlobClient::CopyFromUri() to copy a file on hierarchical namespace supported storage #5542

Closed
2 tasks done
kou opened this issue Apr 17, 2024 · 4 comments
Assignees
Labels
  • Client — This issue points to a problem in the data-plane of the library.
  • customer-reported — Issues that are reported by GitHub users external to the Azure organization.
  • issue-addressed — Workflow: The Azure SDK team believes it to be addressed and ready to close.
  • question — The issue doesn't require a change to the product in order to be resolved. Most issues start as that.
  • Service Attention — Workflow: This issue is the responsibility of the Azure service team.
  • Storage — Storage Service (Queues, Blobs, Files)

Comments

@kou
Contributor

kou commented Apr 17, 2024

Query/Question

I'm an Apache Arrow (https://github.com/apache/arrow) developer. We're developing a file system interface on Azure Blob Storage: https://github.com/apache/arrow/blob/main/cpp/src/arrow/filesystem/azurefs.cc

We want to support both Azure Blob Storage and Azure Data Lake Storage Gen2. We're using Azure::Storage::Blobs::BlobClient::CopyFromUri() to implement the copy operation: https://github.com/apache/arrow/blob/63c91ff32b8547f0bfd6ff827d7a6901d9e7ca5c/cpp/src/arrow/filesystem/azurefs.cc#L2865-L2891

It works with Azure Blob Storage but doesn't work with Azure Data Lake Storage Gen2 with hierarchical namespace support. #41095 is our issue for this.

Can we use Azure::Storage::Blobs::BlobClient::CopyFromUri() to copy a file on hierarchical-namespace-enabled Azure Data Lake Storage Gen2 storage? Or do we need a different API?

Why is this not a Bug or a Feature Request?

I'm not sure whether we can use the API for copy on hierarchical-namespace-enabled Azure Data Lake Storage Gen2 storage, so I can't judge whether this is a bug or a feature request.

Setup (please complete the following information if applicable):

Information Checklist
Kindly make sure that you have added all the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report.

  • Query Added
  • Setup information Added
@github-actions github-actions bot added the Client, customer-reported, needs-team-attention, question, Service Attention, and Storage labels Apr 17, 2024

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @EmmaZhu @Jinming-Hu @vinjiang.

@Jinming-Hu Jinming-Hu self-assigned this Apr 18, 2024
@Jinming-Hu
Member

Hi @kou, I'm looking at the code you referenced; it seems the source URL doesn't contain any authentication information.

You can use a SAS token to authenticate the copy source. Below is a working example.

#include <chrono>
#include <string>

#include <azure/storage/blobs.hpp>
#include <azure/storage/files/datalake.hpp>

int main() {
  const std::string datalakeConnectionString = "";
  const std::string blobConnectionString = "";

  // Source: a file on the hierarchical-namespace (Data Lake Gen2) account.
  auto srcDataLakeFileClient
      = Azure::Storage::Files::DataLake::DataLakeFileClient::CreateFromConnectionString(
          datalakeConnectionString, "sample-file-system", "sample-directory/sample-file");

  // Generate a short-lived, read-only SAS token for the source file so the
  // service can authenticate against the source URL during the copy.
  std::string sasToken;
  {
    Azure::Storage::Sas::DataLakeSasBuilder builder;
    builder.ExpiresOn = std::chrono::system_clock::now() + std::chrono::minutes(60);
    builder.FileSystemName = "sample-file-system";
    builder.Path = "sample-directory/sample-file";
    builder.Resource = Azure::Storage::Sas::DataLakeSasResource::File;
    builder.SetPermissions(Azure::Storage::Sas::DataLakeSasPermissions::Read);

    auto keyCredential
        = Azure::Storage::_internal::ParseConnectionString(datalakeConnectionString).KeyCredential;

    sasToken = builder.GenerateSasToken(*keyCredential);
  }

  // GenerateSasToken() returns the token with a leading '?', so it can be
  // appended directly to the file URL.
  auto srcUrl = srcDataLakeFileClient.GetUrl() + sasToken;

  // Destination: a blob on the Blob Storage account.
  auto destBlobClient = Azure::Storage::Blobs::BlobClient::CreateFromConnectionString(
      blobConnectionString, "sample-container", "copyDestBlob");

  destBlobClient.CopyFromUri(srcUrl);
  return 0;
}
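An aside on why `srcDataLakeFileClient.GetUrl() + sasToken` works in the example above: as far as I can tell, the SAS builders in the Azure SDK for C++ return the token already prefixed with `?`, so plain concatenation yields a valid URL. A minimal standalone sketch of that concatenation, using a hypothetical `AppendSas` helper (not an SDK function) that stays correct even if the token lacks the prefix:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: append a SAS token to a URL that has no query
// string yet. If the token is missing its leading '?', insert one.
std::string AppendSas(const std::string& url, const std::string& sasToken) {
  if (!sasToken.empty() && sasToken.front() != '?') {
    return url + '?' + sasToken;
  }
  return url + sasToken;
}
```

If the destination URL already carried query parameters, the separator would have to be `&` instead; the SDK-generated `GetUrl()` in the example above returns a bare resource URL, so that case doesn't arise there.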

@Jinming-Hu Jinming-Hu added the issue-addressed Workflow: The Azure SDK team believes it to be addressed and ready to close. label Apr 18, 2024
@github-actions github-actions bot removed the needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team label Apr 18, 2024

Hi @kou. Thank you for opening this issue and giving us the opportunity to assist. We believe that this has been addressed. If you feel that further discussion is needed, please add a comment with the text "/unresolve" to remove the "issue-addressed" label and continue the conversation.

@kou
Contributor Author

kou commented Apr 18, 2024

Thanks! It works!

@kou kou closed this as completed Apr 18, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jul 17, 2024