-
Notifications
You must be signed in to change notification settings - Fork 278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Copy manifest and bundle from S3 to jenkins #316
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This needs to be called by something before we merge.
- Dependencies need to be added into Pipfile.
- It needs some OO, create a class that represents a bucket/destination and implement put/get methods on that. Maybe a
Builder
-like pattern, maybePublisher
that we use to collect results? - The code swallows exceptions and fails silently, that can't be good.
- Add tests and documentation of where/how this is done in README's.
Could the use cases for this code described in Jenkins pipeline DSL? There is a built-in function that is being used by the build system for many common tasks so they don't need to be reimplemented https://www.jenkins.io/doc/pipeline/steps/pipeline-aws/ |
…verting an op file to HTML file and then calling another function to put it inside an S3 bucket.
Change made for pulling and adding few changes
|
||
return args | ||
|
||
def get_S3files(self, args): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's name it as download_files_from_S3
bundle-workflow/src/read_write_s3.py
Outdated
import boto3 | ||
|
||
|
||
class Read_Write_Files: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class name should be like ReadWriteFiles.
Ref: https://www.python.org/dev/peps/pep-0008/#class-names
bundle-workflow/Pipfile
Outdated
@@ -9,6 +9,7 @@ requests = "~=2.26" | |||
isort = "~=5.9" | |||
flake8 = "~=3.9" | |||
pytest = "*" | |||
boto3 = "*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we assign it with the latest stable version and not *
?
@@ -0,0 +1,36 @@ | |||
# !/usr/bin/env python3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Name this file as download_files_from_S3
or something similar following the same naming convention
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow naming convention as mentioned here: https://www.python.org/dev/peps/pep-0008/#prescriptive-naming-conventions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The surface area of this pull request has moved considerable from its initial revision. By separating many changes into smaller pull requests this will allow for faster and better feedback to get approval on the pull requests.
I would recommend the follow kinds of pull requests for the work I am seeing done here:
- Overall workflow updates - when where is this code called and how does it tie back to the test systems, codifies the high level requirements, e.g. aws permissions, data in/out contracts
- Remote File interactions
- File parsing mechanics
- Reporting UX
- Transformation of local data into reporting format
bundle-workflow/src/read_write_s3.py
Outdated
# get_S3Objects:Used for getting the files from S3 | ||
# bucket_name: Name of the bucket you want to get the file from | ||
# role_arn: Amazon Resource Name (ARN) of the role that you want to assume | ||
# role_session_name: Name for the session | ||
# bucket_bundle: Object path from where you want to get the bundle | ||
# bucket_manifest: Object path from where you want to get the manifest | ||
# path_to_save_bundle: Path where you want to save the bundle at | ||
# path_to_save_manifest: Path where you want to save the manifest at |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add these comments as pydoc
"""
"""
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added them in the
"""
"""
pydoc block.
bundle-workflow/src/read_write_s3.py
Outdated
# put_S3Objects:Used for putting the files into S3 | ||
# file_name: The file that you want to put in the S3 bucket | ||
# file_path: The Object path where you want to save the file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add these comments as pydocs
|
||
|
||
import argparse | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove extra space
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs tests and documentation to show how the functionality can be used.
bundle-workflow/src/read_write_s3.py
Outdated
self.role_arn = role_arn | ||
self.role_session_name = role_session_name | ||
|
||
def get_S3_objects(self, bucket_name, bucket_bundle, bucket_manifest, path_to_save_bundle, path_to_save_manifest): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is getting the manifest mandatory for this method? What if I just have an S3 path and want to get the files that are in that bucket, for example for DependencyInstaller. Currently the code here is assuming that all the user passes in all the parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have modified the method to take only the bundle parameter where through the S3 path we can get the files that are in that bucket
bundle-workflow/src/read_write_s3.py
Outdated
client.download_file(bucket_name, bucket_bundle, path_to_save_bundle) | ||
client.download_file(bucket_name, bucket_manifest, path_to_save_manifest) | ||
except: | ||
print("An exception occurred. No file is present in the bucket.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There might be other valid exceptions which need to be raised, currently for all the exceptions, this is printing a "no file present" error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have modified it by printing out the exception that has occurred as either of the three exceptions might occur which are 1.bucket name does not exist 2. Not a directory for the object path 3. local file path when not given properly
bundle-workflow/src/read_write_s3.py
Outdated
try: | ||
client.upload_file(file_name, bucket_name, file_path) | ||
except: | ||
print("An exception occurred. No file found for uploading.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see my comment for exceptions above.
bundle-workflow/src/read_write_s3.py
Outdated
self.role_arn = role_arn | ||
self.role_session_name = role_session_name | ||
|
||
def get_S3_objects(self, bucket_name, bucket_bundle, bucket_manifest, path_to_save_bundle, path_to_save_manifest): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to make this more generic to achieve two use cases -
- Download a particular S3 object
- Download all maven dependencies from a given s3 location
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have modified it now, as the bundle_path arg will be used for downloading the entire file/folder.
bundle-workflow/src/read_write_s3.py
Outdated
def __init__(self, role_arn, role_session_name): | ||
self.role_arn = role_arn | ||
self.role_session_name = role_session_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make the role parameters optional. Abstract the role passing part from caller, by using defaults from environment vars (for now) to read/write from s3. By keeping these optional, the caller can choose to pass these for specific use cases, while the defaults work for most
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The role parameters will be optional and the default values will there for these parameters to take their values from.
… name, added comments and also made changes to download the folder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- needs tests
bundle-workflow/src/read_write_s3.py
Outdated
import boto3 | ||
|
||
|
||
class ReadWriteFiles: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should represent a thing, e.g. S3
or S3Bucket
and take bucket as a parameter since it's required in both put/get.
bundle-workflow/src/read_write_s3.py
Outdated
self.role_arn = role_arn | ||
self.role_session_name = role_session_name | ||
|
||
def get_S3_objects(self, bucket_name, bucket_bundle, path_to_save_bundle=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to get
or download
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameters bucket_bundle
and path_to_save_bundle
don't make sense in S3 context. Should be prefix
and dest
.
What's a None
dest? I would just require it.
bundle-workflow/src/read_write_s3.py
Outdated
continue | ||
my_bucket.download_file(obj.key, target) | ||
except Exception as e: | ||
print(e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't swallow exception, let it get raised.
bundle-workflow/src/read_write_s3.py
Outdated
file_name: The file that you want to put in the S3 bucket | ||
file_path: The Object path where you want to save the file | ||
""" | ||
def put_S3_objects(self, bucket_name, file_name, file_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to put
or upload
.
|
||
try: | ||
client.upload_file(file_name, bucket_name, file_path) | ||
except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't swallow exception.
bundle-workflow/src/read_write_s3.py
Outdated
RoleSessionName=self.role_session_name, | ||
DurationSeconds=3600)['Credentials'] | ||
|
||
boto3.client('s3', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is the same between get
and put
. Extract into the constructor that creates a new client
for a bucket.
This is replaced with #367 |
Description
[Describe what this change achieves]
This change basically achieves two things:
Issues Resolved
[List any issues this PR will resolve]
Issue No. 260
#260
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.