Enable workflow for pull requests in dockerimage.yml #32

Closed · wants to merge 31 commits
6f81234
Enable workflow for pull requests in dockerimage.yml
jaamarks Feb 26, 2024
79b0ab5
rtibiocloud -> jessemarks
jaamarks Feb 26, 2024
030bdd1
create a new tool to test our github actions
jaamarks Feb 26, 2024
c42cd38
Delete .github/actions/build-image directory
jaamarks Feb 26, 2024
c5840a1
Delete .github/workflows directory
jaamarks Feb 26, 2024
388e72c
Merge pull request #1 from jaamarks/test-github-actions-01
jaamarks Feb 26, 2024
f0bb7e0
Revert "create a new tool to test our github actions"
jaamarks Feb 26, 2024
ac1fc03
Merge pull request #2 from jaamarks/revert-1-test-github-actions-01
jaamarks Feb 26, 2024
eb48297
test github actions with this copied docker tool
jaamarks Feb 26, 2024
2f6cea0
cleanup empty directory in cov-ldsc
jaamarks Feb 26, 2024
128e9f1
change from alpine 12 to 11 for testing purposes
jaamarks Feb 26, 2024
90e12d8
alpine to slim. apt install fails with alpine
jaamarks Feb 26, 2024
21ae8aa
set env var instead of deprecated save-output: and test
jaamarks Feb 26, 2024
4fe9f06
try with env.files
jaamarks Feb 26, 2024
45a104b
test 001
jaamarks Feb 26, 2024
0af9781
test 002
jaamarks Feb 26, 2024
3065a57
convert to environmental variables
jaamarks Feb 26, 2024
f4f7fa8
test 003
jaamarks Feb 26, 2024
11ba035
use env vars test
jaamarks Feb 26, 2024
98c54a2
test004
jaamarks Feb 26, 2024
25b56d5
test 005
jaamarks Feb 26, 2024
313704c
Merge pull request #3 from jaamarks/test-branch
jaamarks Feb 26, 2024
5cf9c90
change diff-tree to diff
jaamarks Feb 26, 2024
f4e7c3f
Revert "change diff-tree to diff"
jaamarks Feb 26, 2024
93d36ab
Merge branch 'master' of github.com-rti:jaamarks/biocloud_docker_tools
jaamarks Feb 26, 2024
7397fe9
test 006
jaamarks Feb 26, 2024
76ddec8
test 007
jaamarks Feb 26, 2024
5ecca1d
test008
jaamarks Feb 26, 2024
212d0f8
09
jaamarks Feb 26, 2024
d3899e0
test010
jaamarks Feb 26, 2024
d29dce9
test 011
jaamarks Feb 26, 2024
3 changes: 0 additions & 3 deletions .github/actions/build-image/action.yml
@@ -5,9 +5,6 @@ branding:
   color: 'green'
 description: 'Builds the specified Dockerfile and pushes the image to Docker Hub.'
 inputs:
-  changed_files:
-    description: 'The files changed in the triggering commit.'
-    required: true
   username:
     description: 'The login username for the registry'
     required: true
9 changes: 5 additions & 4 deletions .github/actions/build-image/entrypoint.sh
@@ -8,10 +8,10 @@ function main() {
   sanitize "${INPUT_USERNAME}" "username"
   sanitize "${INPUT_PASSWORD}" "password"
   sanitize "${INPUT_ORGANIZATION}" "organization"
-  sanitize "${INPUT_CHANGED_FILES}" "changed_files"
+  sanitize "${files}" "changed_files"

   # CHANGED_FILES=$(git diff-tree --no-commit-id --name-only -r ${GITHUB_SHA}) # dfe37af2c9a8c753fcd6392ea2f5e711a04b38e1
-  CHANGED_FILES="${INPUT_CHANGED_FILES}"
+  CHANGED_FILES="${files}"

   # Can only build 1 Docker image in 1 actions run/commit
   if [[ $(echo $CHANGED_FILES | tr " " "\n" | grep -c "Dockerfile") -gt 1 ]]; then
@@ -92,9 +92,10 @@ function main() {

   push

-  echo "::set-output name=tag::${FIRST_TAG}"
+  # Write the outputs to environment variables
+  echo "tag=${FIRST_TAG}" >> "$GITHUB_ENV"
   DIGEST=$(docker inspect --format='{{index .RepoDigests 0}}' ${DOCKERNAME})
-  echo "::set-output name=digest::${DIGEST}"
+  echo "digest=${DIGEST}" >> "$GITHUB_ENV"

   docker logout
 }
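The output-passing change in this hunk can be tried outside a runner. A minimal sketch, simulating the runner-provided file with `mktemp` (note: GitHub's documented replacement for step *outputs* is `$GITHUB_OUTPUT`; this PR routes values through `$GITHUB_ENV`, which exposes them as environment variables to later steps):

```shell
# Simulate the file GitHub Actions normally provides via $GITHUB_ENV.
GITHUB_ENV=$(mktemp)
FIRST_TAG="v1_abc1234"   # hypothetical tag value

# Old (deprecated workflow command, disabled by GitHub in 2022):
#   echo "::set-output name=tag::${FIRST_TAG}"

# New, as this PR does it: append key=value to the env file.
echo "tag=${FIRST_TAG}" >> "$GITHUB_ENV"
cat "$GITHUB_ENV"   # tag=v1_abc1234
```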
13 changes: 10 additions & 3 deletions .github/workflows/dockerimage.yml
@@ -9,6 +9,14 @@ on:
       - '.gitignore'
       - 'README.md'
       - '*/*/README.md'
+  pull_request:
+    branches:
+      - master
+    paths-ignore:
+      - '.github/**'
+      - '.gitignore'
+      - 'README.md'
+      - '*/*/README.md'

 jobs:
   build:
@@ -21,11 +29,10 @@ jobs:
       - name: get changed files
         id: getfile
         run: |
-          echo "::set-output name=files::$(git diff-tree --no-commit-id --name-only -r ${{ github.sha }} | xargs)"
+          echo "files=$(git diff --no-commit-id --name-only ${{ github.event.before }} ${{ github.sha }} | xargs)" >> "$GITHUB_ENV"
       - name: Build, Tag, Publish Docker
         uses: ./.github/actions/build-image
         with:
-          organization: rtibiocloud
-          changed_files: ${{ steps.getfile.outputs.files }}
+          organization: jessemarks
           username: ${{ secrets.DOCKER_USERNAME }}
           password: ${{ secrets.DOCKER_PASSWORD }}
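The switch from `git diff-tree` to `git diff` changes which commits are inspected: one commit's files versus everything changed between two refs. A throwaway-repo sketch (the `HEAD~1 HEAD` refs stand in for `github.event.before` and `github.sha`):

```shell
# Build a two-commit repo so both commands have something to compare.
repo=$(mktemp -d) && cd "$repo" && git init -q .
git -c user.email=ci@example.com -c user.name=ci commit -q --allow-empty -m "init"
echo "FROM python:3.11-slim" > Dockerfile
git add Dockerfile
git -c user.email=ci@example.com -c user.name=ci commit -q -m "add Dockerfile"

# Old workflow step: files touched by the single head commit.
git diff-tree --no-commit-id --name-only -r HEAD   # Dockerfile

# New workflow step: files changed between two refs (before..sha in the PR).
git diff --name-only HEAD~1 HEAD                   # Dockerfile
```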
1 change: 0 additions & 1 deletion cov_ldsc/v1/cov-LDSC/cov-ldsc
Submodule cov-ldsc deleted from 8cd5ab
26 changes: 26 additions & 0 deletions s3_presigned_url_generator_jesse/v1/Dockerfile
@@ -0,0 +1,26 @@
# Use an official Python runtime as the base image
FROM python:3.11-slim

# Add Container Labels
LABEL maintainer="Jesse Marks <[email protected]>"
LABEL description="A script to generate presigned URLs to upload to S3."

# Install System Dependencies
RUN apt-get update && apt-get install -y \
vim \
&& rm -rf /var/lib/apt/lists/*

# Set the working directory in the container
WORKDIR /opt/

# Copy the script and requirements file to the container
COPY s3_presigned_upload.py requirements.txt ./

# Install the required dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Set the entry point command
ENTRYPOINT ["python", "s3_presigned_upload.py"]

# Set the default command arguments
CMD ["--help"]
179 changes: 179 additions & 0 deletions s3_presigned_url_generator_jesse/v1/README.md
@@ -0,0 +1,179 @@
# S3 Presigned URL Generator

A command-line interface (CLI) tool to generate a bash script containing `curl` commands with presigned URLs for uploading files to Amazon S3. This tool enables external collaborators to upload their files to your S3 bucket securely using presigned URLs, eliminating the need for separate AWS accounts.

<br>


[Jump to the recommended Docker usage example](#docker-anchor)

<br>






## Usage

```shell
python s3_presigned_upload.py \
--infile <input_file> \
--outfile <output_file> \
--bucket <bucket_name> \
--key-prefix <key_prefix> \
--expiration-days <expiration_days> \
--aws-access-key <access_key> \
--aws-secret-access-key <secret_access_key>
```


Replace the following placeholders with the appropriate values:

- `<input_file>`: Path to the input file containing a list of file names to generate presigned URLs for.
- `<output_file>`: Path to the output bash script file that will contain the generated curl commands.
- `<bucket_name>`: Name of the S3 bucket where the files will be uploaded.
- `<key_prefix>`: Prefix to be prepended to each file name as the S3 object key.
- `<expiration_days>`: Expiration duration in days for the generated presigned URLs.
- `<access_key>`: AWS access key ID for authentication.
- `<secret_access_key>`: AWS secret access key for authentication.

* _Note_: your access keys can typically be found in your AWS CLI configuration files (`~/.aws/credentials`).

Example:

Let's assume you have an input file named `file_list.txt` containing the following filenames:

```
file1.txt
file2.jpg
file3.pdf
```

You want to generate a bash script named `upload_script.sh` that will contain the curl commands with presigned URLs for uploading these files to the S3 bucket `my-bucket` with the key prefix `uploads/` and a URL expiration of 7 days.

You can execute the script as follows:

```shell
python s3_presigned_upload.py \
--infile file_list.txt \
--outfile upload_script.sh \
--bucket my-bucket \
--key-prefix uploads/ \
--expiration-days 7 \
--aws-access-key YOUR_ACCESS_KEY \
--aws-secret-access-key YOUR_SECRET_ACCESS_KEY
```

The generated `upload_script.sh` will contain the curl commands necessary to upload the files using presigned URLs. Share the `upload_script.sh` with the external collaborators, and they can execute it in the same folder as their files to upload them to your S3 account.
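For a concrete picture, a generated `upload_script.sh` has roughly the shape below (the presigned URL is a made-up placeholder, not a working signature):

```shell
# Illustrative contents of a generated upload_script.sh; only the structure
# (comment line, curl PUT, final echo) matches the generator's real output.
cat <<'EOF'
#!/bin/bash

##file1.txt
curl --request PUT --upload-file file1.txt 'https://my-bucket.s3.amazonaws.com/uploads/file1.txt?AWSAccessKeyId=...&Signature=...&Expires=...'

echo 'File(s) successfully uploaded to S3!'
EOF
```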

<br>






## Docker usage <a id="docker-anchor"></a>
**This is the recommended approach.**<br>
Here is a toy example of running the script with a single Docker command.
```shell
docker run --rm -v $PWD/:/data/ rtibiocloud/s3_presigned_url_generator:v1_23c8ea4 \
--infile /data/file_list.txt \
--outfile /data/upload_script3.sh \
--bucket rti-cool-project \
--key-prefix scratch/some_rti_user/ \
--expiration-days 7 \
--aws-access-key AKIACCESSkeyEXAMPLE \
--aws-secret-access-key qFyQSECRECTaccessKEYexample
```
* _Note_: check the DockerHub rtibiocloud repository for the latest tag (i.e., replace `v1_23c8ea4` if necessary), and don't forget to change the access keys in this toy example.

<br>






## Using the Upload Script

The generated `upload_script.sh` contains the necessary `curl` commands to upload files to the S3 location using presigned URLs. To use the script, follow these steps:

1. Ensure that you have the `upload_script.sh` and the files you want to upload in the same directory.
2. Open a terminal and navigate to the directory containing the `upload_script.sh` and the files.
3. Make the `upload_script.sh` file executable: `chmod +x upload_script.sh`
4. Execute the script: `./upload_script.sh`

The script will start executing the `curl` commands, uploading each file to the specified S3 location using the presigned URLs.

_Note_: Depending on the number and size of the files, the upload process may take some time. Monitor the progress in the terminal.
Once the script finishes executing, all the files should be successfully uploaded to the S3 bucket and location specified in the script.

<br>





## Communicating with Collaborators

To ensure the successful upload of files by external collaborators, it is recommended to communicate with them and provide necessary instructions. Here's a template for an email you can send to collaborators:

<details>
<summary>mock email</summary>

<br>

**Subject**: Uploading files to [Your Project Name] - Action Required

Dear Collaborator,

We are excited to work with you on [Your Project Name]. As part of our collaboration, we kindly request you to upload your files to our Amazon S3 bucket using the provided presigned URLs. This process ensures secure and efficient file transfers without requiring separate AWS accounts.

Here are the steps to upload your files:

1. Place the attached `upload_script.sh` file in the same directory as the files you want to upload.

2. Open a terminal and navigate to the directory containing the `upload_script.sh` and your files.

3. Execute the `upload_script.sh` script:
```shell
bash upload_script.sh
```

This will start the upload process. The script will automatically upload your files to our S3 bucket using presigned URLs.
Once the upload is complete, please reply to this email with the MD5 checksum for each uploaded file. This will allow us to verify the integrity of the transferred files.

If you encounter any issues or have any questions during the upload process, please feel free to reach out to us. We are here to assist you.

Thank you for your collaboration!

Best regards,<br>
[Your Name]<br>
[Your Organization]
</details>
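For the checksum step requested in the email above, collaborators on Linux can run something like the following (GNU coreutils `md5sum`; macOS ships `md5` instead):

```shell
# Compute an MD5 checksum for a file. The example file created here is just
# a stand-in for one of the uploaded files.
printf 'hello' > example.txt
md5sum example.txt   # 5d41402abc4b2a76b9719d911017c592  example.txt
```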


<br>






## Limitations
When using AWS presigned URLs with a single `PUT` request, uploads are limited to 5 GB per object ([reference](https://docs.aws.amazon.com/AmazonS3/latest/userguide/upload-objects.html)). Files larger than this require an alternative approach, such as a multipart upload or splitting the file into smaller parts.



<br><br>
___






## Support
For support or any questions, please reach out to Jesse Marks ([email protected])
1 change: 1 addition & 0 deletions s3_presigned_url_generator_jesse/v1/requirements.txt
@@ -0,0 +1 @@
boto3==1.24.28
98 changes: 98 additions & 0 deletions s3_presigned_url_generator_jesse/v1/s3_presigned_upload.py
@@ -0,0 +1,98 @@
import argparse
import boto3

def generate_presigned_urls(infile, outfile, bucket, key_prefix, expiration_days, access_key, secret_access_key):
"""
Generate a bash script containing curl commands with presigned URLs for uploading files to S3.

This script takes an input file containing a list of file names, and for each file, it generates a presigned URL
using the provided AWS credentials. The presigned URL allows external collaborators to upload their files to the
specified S3 bucket using curl commands. The generated curl commands are written to the output file as a bash script.

Args:
infile (str): Path to the input file containing the list of file names to generate presigned URLs for.
outfile (str): Path to the output bash script file that will contain the generated curl commands.
bucket (str): Name of the S3 bucket where the files will be uploaded.
key_prefix (str): Prefix to be prepended to each file name as the S3 object key.
expiration_days (int): Expiration duration in days for the generated presigned URLs.
access_key (str): AWS access key to be used for authentication.
secret_access_key (str): AWS secret access key to be used for authentication.

Example:
Let's assume you have an input file named 'file_list.txt' containing the following filenames:
```
file1.txt
file2.jpg
file3.pdf
```

You want to generate a bash script named 'upload_script.sh' that will contain the curl commands with presigned
URLs for uploading these files to the S3 bucket 'my-bucket' with the key prefix 'uploads/' and a URL expiration
of 7 days.

You can execute the script as follows:
```
python s3_presigned_upload.py \
--infile file_list.txt \
--outfile upload_script.sh \
--bucket my-bucket \
--key-prefix uploads/ \
--expiration-days 7 \
--aws-access-key YOUR_ACCESS_KEY \
--aws-secret-access-key YOUR_SECRET_ACCESS_KEY
```

The generated 'upload_script.sh' will contain the curl commands to upload the files using presigned URLs.
You can share the 'upload_script.sh' with the external collaborators, and they can execute it in the same
folder as their files to upload them to your S3 account.
"""

session = boto3.Session(aws_access_key_id=access_key, aws_secret_access_key=secret_access_key)
s3 = session.client("s3")

with open(infile) as inF, open(outfile, "w") as outF:
outF.write("#!/bin/bash\n\n")
line = inF.readline()

while line:
seconds = expiration_days * 60 * 60 * 24

key = "{}{}".format(key_prefix, line.strip())
outurl = s3.generate_presigned_url(
'put_object',
Params={'Bucket': bucket, 'Key': key},
ExpiresIn=seconds,
HttpMethod='PUT'
)

outline1 = "##{}".format(line) # comment line
outline2 = "curl --request PUT --upload-file {} '{}'\n\n".format(line.strip(), outurl)

outF.write(outline1)
outF.write(outline2)
line = inF.readline()

outF.write("echo 'File(s) successfully uploaded to S3!'")
print(f"\n\nSuccess!\nCreated the bash script '{outfile}' for uploading files to S3 via presigned URLs.")

if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Generate presigned URLs for S3 objects")
parser.add_argument("--infile", required=True, help="Input file path")
parser.add_argument("--outfile", required=True, help="Output file path")
parser.add_argument("--bucket", required=True, help="S3 bucket name")
parser.add_argument("--key-prefix", "-k", dest="key_prefix", required=True, help="S3 key prefix")
parser.add_argument("--expiration-days", "-e", dest="expiration_days", required=True, type=int, help="URL expiration in days")
parser.add_argument("--aws-access-key","-a", dest="access_key", required=True, type=str, help="AWS access key ID")
parser.add_argument("--aws-secret-access-key", "-s", dest="secret_access_key", required=True, type=str, help="AWS secret access key")

args = parser.parse_args()

generate_presigned_urls(
args.infile,
args.outfile,
args.bucket,
args.key_prefix,
args.expiration_days,
args.access_key,
args.secret_access_key
)
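The two per-line transformations the loop applies are small enough to sketch in isolation (the helper names below are illustrative, not part of the script):

```python
# Standalone sketch of the key construction and expiration conversion done
# for each input line; build_key/days_to_seconds are hypothetical names.

def build_key(key_prefix: str, line: str) -> str:
    """Prepend the S3 key prefix to a (possibly newline-terminated) filename."""
    return "{}{}".format(key_prefix, line.strip())

def days_to_seconds(days: int) -> int:
    """Convert the --expiration-days value into the seconds ExpiresIn expects."""
    return days * 60 * 60 * 24

print(build_key("uploads/", "file1.txt\n"))  # uploads/file1.txt
print(days_to_seconds(7))                    # 604800
```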