Add fetch_wasabi_file() utility function. #141

Merged
merged 3 commits into from
Feb 20, 2024

Conversation

israelmcmc
Collaborator

Includes a unit test.

Usage:

from cosipy.util import fetch_wasabi_file

fetch_wasabi_file('test_file.txt', override = True)

test_file.txt is an actual file I added to the public wasabi folder in order to test this with a small file.

@israelmcmc israelmcmc marked this pull request as ready for review February 20, 2024 18:10
@ckarwin ckarwin self-requested a review February 20, 2024 18:57
@ckarwin
Contributor

ckarwin commented Feb 20, 2024

Excellent, thanks @israelmcmc!

The code works well from the command line. However, I still get the following error when trying to run in my Jupyter Notebook:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In [2], line 1
----> 1 fetch_wasabi_file('ComptonSphere/mini-DC2/GalacticScan.inc1.id1.crab2hr.extracted.tra.gz', override = True)

File /zfs/astrohe/ckarwin/COSI/COSIpy_Development/AWS_PR/cosipy/cosipy/util/data_fetching.py:17, in fetch_wasabi_file(file, output, override, bucket, endpoint, access_key_id, access_key)
     14 if os.path.exists(output) and not override:
     15     raise RuntimeError(f"File {output} already exists.")
---> 17 subprocess.run(['aws', 's3api', 'get-object',
     18                 '--bucket', bucket,
     19                 '--key', file,
     20                 '--endpoint-url', endpoint,
     21                 output], 
     22                env = os.environ.copy() | {'AWS_ACCESS_KEY_ID':access_key_id,
     23                                           'AWS_SECRET_ACCESS_KEY':access_key})

File /zfs/astrohe/Software/COSIMain_u2/lib/python3.9/subprocess.py:505, in run(input, capture_output, timeout, check, *popenargs, **kwargs)
    502     kwargs['stdout'] = PIPE
    503     kwargs['stderr'] = PIPE
--> 505 with Popen(*popenargs, **kwargs) as process:
    506     try:
    507         stdout, stderr = process.communicate(input, timeout=timeout)

File /zfs/astrohe/Software/COSIMain_u2/lib/python3.9/subprocess.py:951, in Popen.__init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, user, group, extra_groups, encoding, errors, text, umask)
    947         if self.text_mode:
    948             self.stderr = io.TextIOWrapper(self.stderr,
    949                     encoding=encoding, errors=errors)
--> 951     self._execute_child(args, executable, preexec_fn, close_fds,
    952                         pass_fds, cwd, env,
    953                         startupinfo, creationflags, shell,
    954                         p2cread, p2cwrite,
    955                         c2pread, c2pwrite,
    956                         errread, errwrite,
    957                         restore_signals,
    958                         gid, gids, uid, umask,
    959                         start_new_session)
    960 except:
    961     # Cleanup if the child failed starting.
    962     for f in filter(None, (self.stdin, self.stdout, self.stderr)):

File /zfs/astrohe/Software/COSIMain_u2/lib/python3.9/subprocess.py:1821, in Popen._execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, gid, gids, uid, umask, start_new_session)
   1819     if errno_num != 0:
   1820         err_msg = os.strerror(errno_num)
-> 1821     raise child_exception_type(errno_num, err_msg, err_filename)
   1822 raise child_exception_type(err_msg)

FileNotFoundError: [Errno 2] No such file or directory: 'aws'

I have a similar problem when trying to run:

import os
os.system("AWS_ACCESS_KEY_ID=GBAL6XATQZNRV3GFH9Y4 AWS_SECRET_ACCESS_KEY=GToOczY5hGX3sketNO2fUwiq4DJoewzIgvTCHoOv aws s3api get-object  --bucket cosi-pipeline-public --key ComptonSphere/mini-DC2/GalacticScan.inc1.id1.crab2hr.extracted.tra.gz --endpoint-url=https://s3.us-west-1.wasabisys.com GalacticScan.inc1.id1.crab2hr.extracted.tra.gz")

Error:
sh: aws: command not found

I think @fieldrog and @saurabhmittal23 mentioned that they had a similar issue and needed to use the install instructions from the AWS page: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
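
The `FileNotFoundError` above is raised by `subprocess` itself, before any request is sent: the `aws` executable is not on the `PATH` seen by the notebook kernel, which can differ from the `PATH` of a login shell. A minimal sketch for checking this from inside a notebook, using only the standard library (the helper name `find_executable_or_explain` is hypothetical and not part of cosipy):

```python
import os
import shutil


def find_executable_or_explain(name="aws"):
    """Return the full path of `name` if it is on PATH, else raise with a hint.

    Hypothetical helper for illustration only; not part of cosipy.
    """
    # shutil.which searches the directories listed in this process's PATH.
    path = shutil.which(name)
    if path is None:
        searched = os.environ.get("PATH", "")
        raise FileNotFoundError(
            f"'{name}' was not found on PATH. Directories searched: {searched}"
        )
    return path
```

Running `find_executable_or_explain("aws")` in a notebook cell and in a terminal Python session should show whether the two environments resolve the command differently.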

One additional comment: Can you please add documentation for this new method (i.e., a docstring)? As part of this, please make clear that the passed file needs to be the full Wasabi path.

@israelmcmc
Collaborator Author

israelmcmc commented Feb 20, 2024

@ckarwin Can you try again, please? I think the last change should fix this. I realized awscli does have an underlying Python API, but it was not documented. I also added the documentation, thanks for noticing that.
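
For context, awscli is installed as a Python package, and its `aws` command is built on `awscli.clidriver.create_clidriver()`, whose `main()` method accepts the same arguments as the command line. The sketch below shows how a download could be run in-process this way, so that it no longer depends on an `aws` executable being on `PATH`. It is an illustration under assumptions, not necessarily what the commit does: both helper names are hypothetical, this interface is not a documented public API, and credentials are passed through the standard `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables.

```python
import os


def build_get_object_args(file, output, bucket, endpoint):
    """Build the arguments for `aws s3api get-object`, without the leading 'aws'."""
    return ['s3api', 'get-object',
            '--bucket', bucket,
            '--key', file,
            '--endpoint-url', endpoint,
            output]


def fetch_in_process(file, output, bucket, endpoint, access_key_id, access_key):
    """Hypothetical helper: run the download through awscli's Python driver."""
    # Imported lazily so this module loads even when awscli is not installed.
    from awscli.clidriver import create_clidriver

    # Note: this changes the environment of the current process.
    os.environ['AWS_ACCESS_KEY_ID'] = access_key_id
    os.environ['AWS_SECRET_ACCESS_KEY'] = access_key

    driver = create_clidriver()
    return_code = driver.main(build_get_object_args(file, output, bucket, endpoint))

    # Assumption: a non-zero return value signals failure, as with the CLI exit code.
    if return_code:
        raise RuntimeError(f"awscli returned code {return_code}")
```

Keeping the argument list in its own function makes it easy to compare against the original `subprocess.run` call shown in the traceback.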

@ckarwin
Contributor

ckarwin commented Feb 20, 2024

Awesome @israelmcmc, it works now. Please double check the doc string format and let me know if it's ready to be merged.

@israelmcmc
Collaborator Author

I added the space before :, but it doesn't seem to have any impact. However, this made me realize that I hadn't added this function to the Sphinx documentation. This is fixed in the last commit. I think it's ready to merge.
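
For reference, numpydoc-style docstrings write each parameter as `name : type`, with a space on both sides of the colon, which is presumably the spacing mentioned above. A sketch of what such a docstring could look like for this function follows. The parameter names are taken from the traceback earlier in the thread; the types, default values, and descriptions are assumptions, and the body is a placeholder rather than the merged implementation.

```python
def fetch_wasabi_file(file, output=None, override=False, bucket=None,
                      endpoint=None, access_key_id=None, access_key=None):
    """
    Download a file from a Wasabi bucket.

    Parameters
    ----------
    file : str
        Full path of the file within the bucket, e.g.
        'ComptonSphere/mini-DC2/GalacticScan.inc1.id1.crab2hr.extracted.tra.gz'.
    output : str, optional
        Local destination path (assumed default: chosen by the function).
    override : bool, optional
        If True, replace an existing local file instead of raising an error.
    bucket : str, optional
        Name of the bucket to read from.
    endpoint : str, optional
        URL of the S3-compatible endpoint.
    access_key_id : str, optional
        Access key ID used for the request.
    access_key : str, optional
        Secret access key used for the request.
    """
    # Placeholder body: this block only illustrates the docstring layout.
    raise NotImplementedError("docstring sketch only")
```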

Contributor

@ckarwin ckarwin left a comment

Ready to be merged.

@ckarwin ckarwin merged commit f27cd0b into cositools:main Feb 20, 2024
1 check passed