S3 utility methods for reading and writing manifests, bundles, and logs in Jenkins #367

Merged
merged 54 commits into from
Sep 7, 2021

Conversation

Contributor

@setiah setiah commented Sep 1, 2021

Description

This PR adds utility methods for reading from and writing to S3. It provides the following methods:

  • download_folder()
  • download_file()
  • upload_file()

These methods will serve as the base for:

  1. Downloading built Maven dependencies from S3
  2. Downloading the min and full distribution bundles from S3 for spinning up a local test cluster
  3. Uploading the test results and test logs to S3

This PR supersedes #316.

Issues Resolved

#260

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

meghasaik and others added 30 commits August 25, 2021 08:57
…verting an op file to HTML file and then calling another function to put it inside an S3 bucket.
…t files to be stored in the path given by user.
@VachaShah
Contributor

You can use mypy-boto3-s3 to resolve the mypy errors and then use the specific client and resource:

self.s3_resource: S3ServiceResource = boto3.resource("s3")
self.s3_client: S3Client = boto3.client("s3")

@owaiskazi19
Member

mypy needs a stubbed version of the library to check types. boto3-stubs (https://pypi.org/project/boto3-stubs/) should be present in the Pipfile to resolve the errors related to boto3.

@VachaShah
Contributor

Yeah, this will require changes to the Pipfile as well. I have made the changes in a8a58c2 to resolve the errors.
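
A minimal sketch of the typing suggestion in this thread, assuming the stubs are installed only for type checking. The class name `TypedS3Holder` is hypothetical, and the imports are guarded so the snippet still runs where `mypy-boto3-s3` is absent:

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by mypy; needs boto3-stubs / mypy-boto3-s3 in the Pipfile.
    from mypy_boto3_s3.client import S3Client
    from mypy_boto3_s3.service_resource import S3ServiceResource


class TypedS3Holder:  # hypothetical name, for illustration only
    """Holds an S3 client and resource with service-specific type hints."""

    def __init__(self, s3_client: S3Client, s3_resource: S3ServiceResource) -> None:
        # mypy checks calls on these attributes against the S3 stubs.
        self.s3_client: S3Client = s3_client
        self.s3_resource: S3ServiceResource = s3_resource
```

In real code the arguments would come from `boto3.client("s3")` and `boto3.resource("s3")`, as in the snippet quoted above.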

Member

@dblock dblock left a comment

This is a lot cleaner and a much more sane interface, don't you think?

Some comments/improvements + tests and this is good to go.

else os.environ.get("AWS_ROLE_SESSION_NAME")
)
assumed_role_cred = self.__assume_role()
self.__s3_client = boto3.client(
Member

Nit: I would save __assumed_role_cred and create __s3_client and __s3_resource on demand the first time you need them.
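
One way the on-demand creation suggested here could look, as a sketch only. The class name `LazyS3Clients` and the injected `client_factory` parameter are assumptions made so the example runs without AWS access; in the PR the factory would be `boto3.client`:

```python
from typing import Any, Callable, Dict, Optional


class LazyS3Clients:  # hypothetical name, for illustration only
    """Stores assumed-role credentials and builds the S3 client on first use."""

    def __init__(self, assumed_role_cred: Dict[str, str], client_factory: Callable[..., Any]) -> None:
        self.__assumed_role_cred = assumed_role_cred
        self.__client_factory = client_factory  # assumption: boto3.client in real code
        self.__s3_client: Optional[Any] = None

    @property
    def s3_client(self) -> Any:
        # Create the client only when it is first requested, then reuse it.
        if self.__s3_client is None:
            self.__s3_client = self.__client_factory(
                "s3",
                aws_access_key_id=self.__assumed_role_cred["AccessKeyId"],
                aws_secret_access_key=self.__assumed_role_cred["SecretAccessKey"],
                aws_session_token=self.__assumed_role_cred["SessionToken"],
            )
        return self.__s3_client
```

The same pattern would apply to the resource. Constructing the object then costs nothing until a download or upload actually needs a client.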

target = Path(local_dir) / Path(file_name)
return s3bucket.__file_download_helper(bucket, key, str(target))

@staticmethod
Member

You have this to wrap ClientError into S3DownloadFailureException. Since I can also call bucket.download_file externally myself, I would get a ClientError there. I think it's surprising to get different exceptions depending on whether I call a static helper or the implementation in it, so I think you should remove this code and do the except inside download_file.

Contributor Author

__file_download_helper exists to wrap the logic common to both download_folder and download_file. Not sure how you want to replace this?

Also, the S3DownloadFailureException wraps the ClientError, so the caller can still see it in the stack trace. The reason for having the wrapper is to have a high-level exception class that catches all S3 download failures. We can revisit this later to add things like retryability and make this more robust.
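
Both points in this thread can be reconciled in a small sketch: the translation happens in one place, and the original error stays reachable through exception chaining. `FakeClientError` is a stand-in for `botocore.exceptions.ClientError` so the example runs without boto3, and the function below is a hypothetical free function rather than the PR's method:

```python
class FakeClientError(Exception):
    """Stand-in for botocore.exceptions.ClientError (assumption for this sketch)."""


class S3DownloadFailureException(Exception):
    """High-level error raised for any failed S3 download."""


def download_file(bucket, key, target):
    """Download key to target, translating low-level errors in one place."""
    try:
        bucket.download_file(key, target)
    except FakeClientError as e:
        # "from e" keeps the original error as __cause__, so it still
        # appears in the traceback for whoever catches the wrapper.
        raise S3DownloadFailureException(f"Failed to download {key}") from e
```

Callers then see a single exception type regardless of which entry point they use, while `err.__cause__` still holds the underlying client error.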

"""
s3bucket = cls(bucket_name, role_arn, role_session_name)
try:
    s3bucket.__s3_client.upload_file(source, bucket_name, key)
Member

You shouldn't be reaching into a private member of s3bucket. Add an s3bucket#upload_file method.
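
A sketch of what this could look like, assuming a simplified constructor that receives an already-built client (the PR's constructor takes a role ARN and session name instead). Inside the class body `s3bucket.__s3_client` does resolve, because Python mangles it to `_S3Bucket__s3_client`, but a public instance method keeps the private state in one place:

```python
class S3Bucket:
    """Simplified stand-in for the class under review (constructor is an assumption)."""

    def __init__(self, bucket_name, s3_client):
        self.bucket_name = bucket_name
        self.__s3_client = s3_client  # private; mangled to _S3Bucket__s3_client

    def upload_file(self, source, key):
        """Public instance method; the only place that touches the private client."""
        self.__s3_client.upload_file(source, self.bucket_name, key)

    @classmethod
    def upload(cls, source, bucket_name, key, s3_client):
        """Class-level convenience that delegates to the public instance method."""
        cls(bucket_name, s3_client).upload_file(source, key)
```

The class-level helper stays a one-liner, and tests can exercise `upload_file` directly on an instance.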

raise S3UploadFailureException(e)


class S3DownloadFailureException(Exception):
Member

I think the Python convention is to call these things S3DownloadFailureError vs. Exception, but they all derive from the Exception class.

I would create an S3FileError as a base class and store the key in it. Then S3UploadError and S3DownloadError would switch the error message.
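
The hierarchy described here might be sketched as follows. The `action` class attribute and the exact message wording are assumptions chosen for illustration:

```python
class S3FileError(Exception):
    """Base class for S3 file errors; stores the offending key."""

    action = "access"  # overridden by subclasses to switch the message

    def __init__(self, key):
        self.key = key
        super().__init__(f"Failed to {self.action} {key}")


class S3UploadError(S3FileError):
    action = "upload"


class S3DownloadError(S3FileError):
    action = "download"
```

Callers can catch `S3FileError` to handle both cases, or a subclass to handle one, and read `err.key` to find out which object failed.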

DurationSeconds=3600,
)["Credentials"]
except Exception as e:
    print("Assume role failed due to ", str(e.__repr__()))
Member

Wrap in a specialized STSError.
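
A possible shape for that wrapper, as a sketch. The free function and its parameters are hypothetical (the PR uses a private method), and the STS client is passed in so the example runs without AWS access:

```python
class STSError(Exception):
    """Raised when assuming a role through STS fails."""


def assume_role(sts_client, role_arn, role_session_name):
    """Return temporary credentials, raising STSError instead of printing."""
    try:
        return sts_client.assume_role(
            RoleArn=role_arn,
            RoleSessionName=role_session_name,
            DurationSeconds=3600,
        )["Credentials"]
    except Exception as e:
        # Chain the original error so the root cause stays in the traceback.
        raise STSError(f"Assume role failed for {role_arn}") from e
```

Raising instead of printing stops later code from running with missing credentials, and gives callers one error type to catch.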

Himanshu Setia added 2 commits September 3, 2021 09:58
Signed-off-by: Himanshu Setia <[email protected]>
Signed-off-by: Himanshu Setia <[email protected]>
@setiah setiah marked this pull request as ready for review September 7, 2021 14:21
@codecov-commenter

codecov-commenter commented Sep 7, 2021

Codecov Report

Merging #367 (61c443c) into main (035bb47) will increase coverage by 2.14%.
The diff coverage is 96.66%.

@@            Coverage Diff             @@
##             main     #367      +/-   ##
==========================================
+ Coverage   56.89%   59.04%   +2.14%     
==========================================
  Files          37       38       +1     
  Lines        1051     1111      +60     
==========================================
+ Hits          598      656      +58     
- Misses        453      455       +2     
Impacted Files Coverage Δ
bundle-workflow/src/aws/s3_bucket.py 96.66% <96.66%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@setiah
Contributor Author

setiah commented Sep 7, 2021

Thanks @dblock for reviewing the PR. If this looks good, can we merge it?

Signed-off-by: Himanshu Setia <[email protected]>
mock_sts = MagicMock()
mock_s3_resource = MagicMock()
mock_s3_client = MagicMock()
bucket_name = "unitTestBucket"
Member

Any reason to reuse these globals instead of just doing MagicMock() in the side effect code?

Contributor Author

Yeah, somehow mock_s3_resource wasn't picking up the same mock object when MagicMock() was created inside the side effect. Will revisit this later and fix.
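
One possible explanation, not confirmed in this thread, is that calling `MagicMock()` inside a side effect returns a new object on every call, so the test and the code under test end up holding different mocks. A sketch of a side effect that caches one mock per service name, using a `SimpleNamespace` as a stand-in for the `boto3` module (an assumption so the example runs without boto3):

```python
from types import SimpleNamespace
from unittest.mock import MagicMock, patch

# Stand-in for the boto3 module; the real test would patch boto3.client.
fake_boto3 = SimpleNamespace(client=lambda service, **kwargs: None)


def make_side_effect():
    """Return a side effect plus the dict of mocks it hands out."""
    mocks = {}

    def side_effect(service, **kwargs):
        # Reuse the same mock for repeated requests of the same service.
        if service not in mocks:
            mocks[service] = MagicMock(name=f"mock_{service}")
        return mocks[service]

    return side_effect, mocks


side_effect, mocks = make_side_effect()
with patch.object(fake_boto3, "client", side_effect=side_effect):
    first = fake_boto3.client("s3")
    second = fake_boto3.client("s3")
    sts = fake_boto3.client("sts")
```

The test can then assert against `mocks["s3"]` without module-level globals, since every request for `"s3"` yields that same object.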

Member

@owaiskazi19 owaiskazi19 left a comment

LGTM!

@setiah setiah merged commit 45bae98 into opensearch-project:main Sep 7, 2021
@setiah setiah mentioned this pull request Sep 8, 2021
3 tasks
@dblock dblock mentioned this pull request Sep 9, 2021