Globus GCSv5 Endpoints need the data_access consent added #298

Closed
tylern4 opened this issue Sep 12, 2023 · 16 comments · Fixed by #304
Labels
Globus Globus semver: bug Bug fix (will increment patch version)

Comments

@tylern4
Contributor

tylern4 commented Sep 12, 2023

A NERSC user (@mzelinka) noticed an issue transferring data with zstash using Globus on Perlmutter.

ERROR: ('POST', 'https://transfer.api.globus.org/v0.10/transfer', 'Bearer', 403, 'ConsentRequired', 'Missing required data_access consent', 'fD5Thunop')

This looks to be an error coming from newer GCSv5 collections, like the "NERSC Perlmutter" endpoint, which require extra consents to be added in order to transfer data. The Globus SDK has some documentation on how you can handle getting the data_access consent for endpoints that require it.

NERSC is in the process of updating all our endpoints to the newest version of Globus (GCSv5), and all other Globus endpoints will need to be updated before the older version, GCSv4, is no longer supported in December 2023.

Full error logs:

INFO: Final Step of 3-legged OAuth2 Flows: Exchanging authorization code for token(s)
INFO: Fetching new token from Globus Auth
INFO: request done (success)
INFO: Setting up RefreshTokenAuthorizer with auth_client=[instance:140250644028176]
INFO: Setting up a RenewingAuthorizer. It will use an auth type of Bearer and can handle 401s.
INFO: RenewingAuthorizer will start by using access_token with hash "..."
INFO: Setting up RefreshTokenAuthorizer with auth_client=[instance:140250644028176]
INFO: Setting up a RenewingAuthorizer. It will use an auth type of Bearer and can handle 401s.
INFO: RenewingAuthorizer will start by using access_token with hash "..."
INFO: Creating client of type <class 'globus_sdk.services.transfer.client.TransferClient'> for service "transfer"
INFO: TransferClient.endpoint_autoactivate(6bdc7956-fc0f-4ad2-989c-7aa5ee643a79)
INFO: request done (success)
INFO: TransferClient.endpoint_autoactivate(9cd89cfd-6d04-11e5-ba46-22000b92c6ec)
INFO: request done (success)
INFO: TransferClient.operation_ls(9cd89cfd-6d04-11e5-ba46-22000b92c6ec, {'path': '/home/projects/e3sm/www/WaterCycle/E3SMv2/LR/v2.LR.amip_0101'})
INFO: request done (success)
INFO: Creating a new TransferData object
INFO: TransferClient.get_submission_id(None)
INFO: request done (success)
INFO: TransferData.DATA_TYPE = transfer
INFO: TransferData.DATA = []
INFO: TransferData.source_endpoint = 9cd89cfd-6d04-11e5-ba46-22000b92c6ec
INFO: TransferData.destination_endpoint = 6bdc7956-fc0f-4ad2-989c-7aa5ee643a79
INFO: TransferData.label = v2LRamip_0101 index
INFO: TransferData.submission_id = ...
INFO: TransferData.verify_checksum = True
INFO: TransferData.preserve_timestamp = True
INFO: TransferData.encrypt_data = False
INFO: TransferData.skip_source_errors = False
INFO: TransferData.fail_on_quota_errors = True
INFO: TransferData.delete_destination_extra = False
INFO: TransferData.notify_on_succeeded = True
INFO: TransferData.notify_on_failed = True
INFO: TransferData.notify_on_inactive = True
INFO: TransferClient.submit_transfer(...)
INFO: request done (success)
ERROR: ('POST', 'https://transfer.api.globus.org/v0.10/transfer', 'Bearer', 403, 'ConsentRequired', 'Missing required data_access consent', 'fD5Thunop')
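
For anyone scripting around this, a minimal sketch of how the logged error could be recognized as a missing-consent failure. It assumes the tuple layout seen in the log above (method, URL, auth type, HTTP status, error code, message, request ID); the function name `is_consent_required` is hypothetical and not part of zstash or the Globus SDK. Inside SDK code the caught exception itself would be the better thing to inspect; this only looks at the logged values.

```python
from typing import Sequence


def is_consent_required(error_args: Sequence[object]) -> bool:
    """Return True if a logged Transfer API error tuple signals a missing consent.

    Assumed layout (taken from the log above, not from any documented contract):
    (method, url, auth_type, http_status, code, message, request_id)
    """
    if len(error_args) < 5:
        return False
    http_status, code = error_args[3], error_args[4]
    return http_status == 403 and code == "ConsentRequired"


# The error exactly as it appears in the log above.
logged_error = (
    "POST",
    "https://transfer.api.globus.org/v0.10/transfer",
    "Bearer",
    403,
    "ConsentRequired",
    "Missing required data_access consent",
    "fD5Thunop",
)

if is_consent_required(logged_error):
    print("The data_access consent is missing for one of the collections.")
```

A check like this separates "needs consent" from other 403 responses, such as PermissionDenied, which call for a different fix.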

Nick

@ndkeen

ndkeen commented Sep 13, 2023

It's not obvious to me from what I know about the issue yet, but is it possible the user is trying to work with data on NERSC tape and NERSC scratch, but using Globus? I've not actually tried to use the new Globus feature, but is it expected to work on the same machine (ie pm-to-pm)?

@forsyth2
Collaborator

is it expected to work on the same machine (ie pm-to-pm)?

Yes, the testLsGlobus unit test runs pm-to-pm.

@forsyth2 forsyth2 added semver: bug Bug fix (will increment patch version) Globus Globus labels Sep 20, 2023
@forsyth2
Collaborator

@tylern4 Does this have a work-around that does not require a code change? I.e., can consent be granted via the Globus website before running zstash, to avoid this issue? If not, the just-released version of zstash will not work for anyone using Globus... we would need to do a patch release of zstash with this bug fix.

(I've started trying to make the necessary code changes for a sustainable solution in #300).

@tylern4
Contributor Author

tylern4 commented Sep 21, 2023

I believe this code will give the zstash client_id the data_access consent needed for a GCSv5 endpoint, even without modifying your current code. It prints a URL where you can authenticate to grant the client_id that consent.

import argparse

import globus_sdk

# Native app client identified by the zstash client_id
CLIENT = globus_sdk.NativeAppAuthClient(
    client_id="6c1629cf-446c-49e7-af95-323c6412397f",
    app_name="Zstash",
)


def globus_flow(ep=""):
    # Request the Transfer scope with the endpoint's data_access scope
    # attached as a dependent scope.
    scopes = "urn:globus:auth:scope:transfer.api.globus.org:all"
    endpoint_scope = f"[ *https://auth.globus.org/scopes/{ep}/data_access ]"
    data_access_scopes = scopes + endpoint_scope
    CLIENT.oauth2_start_flow(refresh_tokens=True, requested_scopes=data_access_scopes)
    authorize_url = CLIENT.oauth2_get_authorize_url()
    print(f"Please go to this URL and login:\n\n{authorize_url}\n\n")

    # Paste the code shown by Globus after logging in and approving the consent.
    auth_code = input("Please enter the code you get after login here: ").strip()
    token_response = CLIENT.oauth2_exchange_code_for_tokens(auth_code)
    print(token_response)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("uuid", help="GCSv5 endpoint UUID")
    args = parser.parse_args()
    globus_flow(args.uuid)
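
If more than one GCSv5 collection is involved, the scope string built in the script above could be generalized along these lines. This is only a sketch: the helper names are hypothetical, and it assumes that several dependent scopes can be listed inside the brackets separated by spaces, with a leading `*` marking each one as optional.

```python
from typing import Iterable

TRANSFER_ALL = "urn:globus:auth:scope:transfer.api.globus.org:all"


def data_access_scope(collection_id: str) -> str:
    """Scope URL for one collection, in the same form the script above uses."""
    return f"https://auth.globus.org/scopes/{collection_id}/data_access"


def transfer_scope_with_consents(collection_ids: Iterable[str]) -> str:
    """Attach a data_access dependency to the Transfer scope for each collection."""
    deps = " ".join(f"*{data_access_scope(cid)}" for cid in collection_ids)
    # With no collections, fall back to the plain Transfer scope.
    return f"{TRANSFER_ALL}[{deps}]" if deps else TRANSFER_ALL


# Perlmutter endpoint UUID from the log above.
print(transfer_scope_with_consents(["6bdc7956-fc0f-4ad2-989c-7aa5ee643a79"]))
```

The resulting string would be passed as `requested_scopes` to `oauth2_start_flow`, exactly as the script does with its single-endpoint version.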

@forsyth2
Collaborator

@tylern4 Thanks, that does indeed work! That is very good as it means we don't need to rush a patch release of zstash then. That said, I'm assuming we'll want to incorporate this logic into the next release to streamline the process for users.

For reference, steps I followed:

# Copy the above to zstash_globus_consent.py 
$ python zstash_globus_consent.py "6bdc7956-fc0f-4ad2-989c-7aa5ee643a79" # Perlmutter endpoint uuid
# Copy Link
# Log in to Globus
# Name the consent
# Copy the code and paste it to the terminal
$ source /global/common/software/e3sm/anaconda_envs/load_latest_e3sm_unified_pm-cpu.sh
$ python -m unittest tests/test_globus.py
# Test passes

@forsyth2 forsyth2 moved this from Todo to In Progress in forsyth2 current tasks Sep 22, 2023
@forsyth2
Collaborator

forsyth2 commented Oct 2, 2023

Relevant discussion post: #302

@forsyth2
Collaborator

forsyth2 commented Oct 2, 2023

@tylern4 I'm now running into further difficulties.

When I run the test on the main branch, I now get a long error message beginning with ('POST', 'https://transfer.api.globus.org/v0.10/transfer', 'Bearer', 403, 'PermissionDenied', 'Error validating login to endpoint \'NERSC Perlmutter (6bdc7956-fc0f-4ad2-989c-7aa5ee643a79)\. It goes on to say "None of your identities are from domains allowed by resource policies".

Furthermore, when I run the python zstash_globus_consent.py line, I don't see any consents populate at https://auth.globus.org/v2/web/consents.

@tylern4
Contributor Author

tylern4 commented Oct 2, 2023

Have you logged into Globus with your NERSC identity recently? The error "None of your identities are from domains allowed by resource policies" points to you being logged into Globus, but not with a NERSC identity.

@forsyth2
Collaborator

forsyth2 commented Oct 2, 2023

I'm logged into the Globus web site. I used to enter my NERSC credentials for the Cori endpoint, which required activation. The Perlmutter endpoint doesn't seem to require activation though, so I'm not sure where I would enter those credentials now.

@forsyth2
Collaborator

forsyth2 commented Oct 3, 2023

I tried just activating a NERSC endpoint I didn't actually need (in order to use my NERSC credentials), but I still run into the same "Permission denied" error.

@forsyth2
Collaborator

forsyth2 commented Oct 4, 2023

@tylern4 @lukaszlacinski Yeah, still running into this issue -- how would I "log into a NERSC identity"? I'm logged into Globus, I have NERSC endpoints activated, I'm not sure what else I could do.

@tylern4
Contributor Author

tylern4 commented Oct 4, 2023

It's related to this error in Globus. At NERSC we require you to log in with your @nersc.gov domain to access our endpoints. If you can log in and see your data in the NERSC Perlmutter collection, you should also be able to use it from your tool when you're logged in. Maybe you need to refresh your token loaded into zstash?
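
To illustrate the domain requirement described above, here is a rough sketch of the kind of check involved. The real check is done by Globus against the collection's policies, not by client code; the function name and the way identities are passed in are assumptions made purely for illustration.

```python
from typing import Iterable


def has_allowed_identity(usernames: Iterable[str], allowed_domains: Iterable[str]) -> bool:
    """Return True if any identity username belongs to an allowed domain.

    Illustration only: mirrors the idea behind the "None of your identities
    are from domains allowed by resource policies" error.
    """
    allowed = {domain.lower().lstrip("@") for domain in allowed_domains}
    return any(
        name.rsplit("@", 1)[-1].lower() in allowed
        for name in usernames
        if "@" in name
    )


# Hypothetical identities: being logged into Globus is not enough unless
# one of the linked identities is in the required domain.
print(has_allowed_identity(["someone@example.org"], ["nersc.gov"]))
print(has_allowed_identity(["someone@example.org", "someone@nersc.gov"], ["nersc.gov"]))
```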

@forsyth2
Collaborator

forsyth2 commented Oct 4, 2023

see your data in the NERSC Perlmutter collection

Oh interesting; while it doesn't require endpoint activation, it does require me to enter my credentials to review the contents in the file manager.

I'm now able to run with the latest Unified once again, so I can get back to #304. (I think this required re-running the consent script as well). Thanks @tylern4!

@forsyth2
Collaborator

forsyth2 commented Oct 6, 2023

@tylern4 Following #302 (reply in thread), if I go to https://auth.globus.org/v2/web/consents (either by direct link or by following "Settings" > "Consents" > "Manage Your Consents"), I only see a couple consents, both of which were granted 2 years ago. There is nothing specific to the Perlmutter endpoint.

I went to the Perlmutter endpoint (https://app.globus.org/file-manager/collections/6bdc7956-fc0f-4ad2-989c-7aa5ee643a79/overview) > "Manage Consent", where I did see relevant consents. However, I deleted them and tried re-running zstash with Globus and it worked fine. I would have expected to need to re-establish the consents. Am I missing something? Thanks.

@forsyth2
Collaborator

forsyth2 commented Oct 6, 2023

@tylern4 Expanding on this, I did a few manual tests (as opposed to using test_globus.py):

On Chrysalis:

# Create something to archive
$ emacs setup.sh 
mkdir zstash_demo
mkdir zstash_demo/empty_dir
mkdir zstash_demo/dir
echo 'file0 stuff' > zstash_demo/file0.txt
echo '' > zstash_demo/file_empty.txt
echo 'file1 stuff' > zstash_demo/dir/file1.txt
$ chmod 755 setup.sh
$ ./setup.sh
$ source /lcrc/soft/climate/e3sm-unified/load_latest_e3sm_unified_chrysalis.sh
$ zstash create --hpss=globus://nersc/~/n298_chrysalis zstash_demo

# ERROR: The 61f9954c-a4fa-11ea-8f07-0a21f750d19b endpoint is not activated or the current activation expires soon. Please go to https://app.globus.org/file-manager/collections/61f9954c-a4fa-11ea-8f07-0a21f750d19b and (re)activate the endpoint.
# => Activate lcrc#dtn_bebop endpoint via web interface

$ zstash create --hpss=globus://nersc/~/n298_chrysalis zstash_demo

# ERROR: ('POST', 'https://transfer.api.globus.org/v0.10/transfer', 'Bearer', 403, 'ConsentRequired', 'Missing required data_access consent', 'ghHoWXlRW')
# => Run:
$ python zstash_globus_consent.py "61f9954c-a4fa-11ea-8f07-0a21f750d19b" # lcrc#dtn_bebop endpoint

# Error on Globus website:
# client_id=6c1629cf-446c-49e7-af95-323c6412397f requested unknown scopes: ['https://auth.globus.org/scopes/61f9954c-a4fa-11ea-8f07-0a21f750d19b/data_access']
# => Run:
$ python zstash_globus_consent.py "9cd89cfd-6d04-11e5-ba46-22000b92c6ec" # NERSC HPSS endpoint
# => The consent script needs the destination endpoint only, not the source endpoint.

$ zstash create --hpss=globus://nersc/~/n298_chrysalis zstash_demo

# Now, on Perlmutter:
$ hsi
$ cd n298_chrysalis
$ ls
# 000000.tar   index.db
# Success
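
The manual sequence above (run, hit ConsentRequired, grant consent, run again) could be automated roughly as follows. This is a sketch under stated assumptions, not zstash's implementation: `ConsentRequiredError`, `run_with_consent_retry`, and the two callables are hypothetical placeholders for whatever submits the transfer and whatever runs the consent flow.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class ConsentRequiredError(Exception):
    """Hypothetical stand-in for the 403 ConsentRequired error seen above."""


def run_with_consent_retry(
    submit: Callable[[], T],
    grant_consent: Callable[[], None],
    max_attempts: int = 2,
) -> T:
    """Call submit(); on a consent error, run grant_consent() and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return submit()
        except ConsentRequiredError:
            if attempt == max_attempts:
                raise  # consent was granted but the transfer still fails
            grant_consent()
    raise ValueError("max_attempts must be at least 1")


# Tiny simulation: the first submission fails until consent has been granted.
state = {"granted": False}


def fake_submit() -> str:
    if not state["granted"]:
        raise ConsentRequiredError("Missing required data_access consent")
    return "transfer submitted"


def fake_grant_consent() -> None:
    state["granted"] = True


print(run_with_consent_retry(fake_submit, fake_grant_consent))
```

Capping the number of attempts keeps a misconfigured scope (for example, one requested for the wrong endpoint) from looping forever.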

Then, I ran on Perlmutter:

# Same setup script as above
$ source /global/common/software/e3sm/anaconda_envs/load_latest_e3sm_unified_pm-cpu.sh
$ zstash create --hpss=globus://nersc/~/n298_pm zstash_demo

$ hsi
$ cd n298_pm
$ ls
# 000000.tar   index.db
# Success

# Disable consents to test consent activation
# Go to NERSC HPSS on web interface
# https://app.globus.org/file-manager/collections/9cd89cfd-6d04-11e5-ba46-22000b92c6ec/overview
# > Manage Consent
# No consents found
# Go to settings page on web interface
# https://app.globus.org/settings/identities
# > Consents
# > Manage Your Consents
# All consents are from years ago

So, clearly I'm not disabling consents correctly. That's crucial to testing any consent automation.

@tylern4
Contributor Author

tylern4 commented Oct 6, 2023

It's not the endpoints you need to delete consents from to test, it's the consents you've given to your tool that you would need to delete.

You've given the client_id the name "Globus Endpoint Performance Monitoring", so on the Globus Manage Consents page you'll want to find that row (it should say [Native App] next to it), open the dropdown, and remove all the consents you have given to it.

@github-project-automation github-project-automation bot moved this from In Progress to Done in forsyth2 current tasks Oct 24, 2023
3 participants