Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Rewrite AAD token authentication #1298

Open
jikuja opened this issue Sep 15, 2022 · 2 comments
Open

[Feature Request] Rewrite AAD token authentication #1298

jikuja opened this issue Sep 15, 2022 · 2 comments
Labels
Feature Feature Work required.

Comments

@jikuja
Copy link
Contributor

jikuja commented Sep 15, 2022

Is your feature request related to a problem? Please describe.
AAD token authentication is really flaky, hides error messages and has terrible UX

Splitted from issue #1258

Describe the solution you'd like
Rewrite most of the related code

Background Story

Current status

  1. Extension uses subscription information from AZ CLI(1) subscriptions = profile.load_cached_subscriptions(False) loops through the information
  2. tenant of the active subscription is added as first item if tenantsDict (2)
  3. other tenant information from the subscription list is being collected into tenantsDict
  4. the code goes through collected tenants from tenantsDict
  • token for tenant is being fetch from AZ CLI: token = get_token_from_az_login(profile, key[0])
    1. problems on this step described later
  • returned token is validated with ADO endpoint: validate_token_for_instance(organization, credentials)
    1. if validation passed then correct token/tenant was found and and the token can be used
    2. if validation failed than token us not usable
    3. if suitable token not found then empty token is returned from get_token_from_az_logins() and _get_credentials() will raise exception
    4. issues with validate_token_for_instance() is described later

get_token_from_az_login() problems

def get_token_from_az_login(profile, tenant):
try:
raw = profile.get_raw_token(
resource='499b84ac-1321-427f-aa17-267ca6975798', tenant=tenant)
creds = raw[0]
auth_token = creds[1]
return auth_token
except BaseException as ex: # pylint: disable=broad-except
logger.debug('not able to get token from az login')
logger.debug(ex, exc_info=True)
return ""

  • Exceptions raised in profile.get_raw_token() are masked by logger.debug()
    • outdated refresh tokens
    • outdated MFA token
    • other problems

validate_token_for_instance() problems

def validate_token_for_instance(organization, credentials):
logger.debug("instance recieved in validate_token_for_instance %s", organization)
organization = uri_parse_instance_from_git_uri(organization)
logger.debug("instance processed in validate_token_for_instance %s", organization)
connection = _get_connection(organization, credentials)
core_client = connection.get_client(VSTS_MODULE + 'v5_0.core.core_client.CoreClient')
try:
core_client.get_projects(state_filter='all', top=1, skip=0)
return True
except BaseException as ex2: # pylint: disable=broad-except
logger.debug(ex2, exc_info=True)
logger.debug("Failed to connect using provided credentials")
return False

  • Error messages from ADO API are masked and dislayed only with debug logging
  • Minor issue

get_token_from_az_logins() problem

def get_token_from_az_logins(organization, pat_token_present):
profile = Profile()
dummy_user = profile.get_current_account_user() # noqa: F841
subscriptions = profile.load_cached_subscriptions(False)
tenantsDict = OrderedDict()
# first loop to make sure the first identity we try with is coming from selected subscription
for subscription in subscriptions:
if subscription['isDefault']:
tenantsDict[(subscription['tenantId'], subscription['user']['name'])] = ''
for subscription in subscriptions:
tenantsDict[(subscription['tenantId'], subscription['user']['name'])] = ''
skipValidateToken = False
if pat_token_present is False and len(tenantsDict) == 1:
skipValidateToken = True
try:
for key, dummy_value in tenantsDict.items():
try:
logger.debug('trying to get token (temp) for tenant %s and user %s ', key[0], key[1])
token = get_token_from_az_login(profile, key[0])
credentials = BasicAuthentication('', token)
if skipValidateToken is True:
return token
if validate_token_for_instance(organization, credentials):
return token
logger.debug('invalid token obtained for tenant %s', key[0])
except BaseException as ex2: # pylint: disable=broad-except
logger.debug(ex2)
logger.debug('failed while trying to get token for tenant %s', key[0])
except BaseException as ex: # pylint: disable=broad-except
logger.debug(ex)
return ''

  • Errors are again hidden to debug level to decrease amount of console spam

Issues

  • login token problems are hidden and only visible with --debug
    • probably done to decrease amount of console spam but also errors with the correct tenant token is hidden
  • goes through all tenants available on AZ CLI until working tenant is found
  • Current error message is not really helpful for end-users
    • should mention --tenant, --allow-no-subscriptions and maybe even give hint about active subscription selection

Proposed fixes

Add --tenant parameter

If user can applies tenant id then

for subscription in subscriptions:
if subscription['isDefault']:
tenantsDict[(subscription['tenantId'], subscription['user']['name'])] = ''
for subscription in subscriptions:
tenantsDict[(subscription['tenantId'], subscription['user']['name'])] = ''

logic of that code could be simplified and looping all available tenants is not needed

Request tenant id from Azure Devops service

ADO API returns tenant id with 403 replies

The following headers are returned from ADO:

Looping through all tenant information is not needed if tenant information is fetch from the ADO API.

Removal of token validation

  • will be result of other changes?

Removal of the tenant loops

  • will be result of other changes?

(1)

subscriptions = profile.load_cached_subscriptions(False)

(2) this code was broken. Fix is in main but not released yet


Current state:

sequenceDiagram
    autonumber
    participant User
    participant Extension
    participant ADO_API
    participant CLI
    User->>Extension: run command
    Extension->>CLI: get subscriptions
    CLI-->>Extension: 
    %%Extension->>Extension: 
    %%activate Extension
    loop Validation: subscription/user pairs to find working token
        Extension->>CLI: Get token for sub/user
        CLI-->Extension: 
        Extension->>ADO_API: Try fetching data with token
        alt failure
        ADO_API-->>Extension: fail
        Note over ADO_API,Extension: continue loop
        else success 
        ADO_API-->>Extension: success
        Note over ADO_API,Extension: exit loop
        end
        opt After loop exhaustion
        Extension-->>User: Display AAD token error
        end
        opt After successfull token validation
        Extension->>ADO_API: Fetch data
        ADO_API-->>Extension: 
        Extension-->>User: Display for ADO API request
        end
    end
    Note right of ADO_API: slow and unneeded loop
Loading

Future?:

sequenceDiagram
    autonumber
    participant User
    participant Extension
    participant ADO_API
    participant CLI
    User->>Extension: run command
    Extension->>ADO_API: get data from protected enpoint
    ADO_API-->>Extension: 
    Extension->>CLI: get subscriptions
    CLI-->>Extension: 
    Extension->Extension: find correct subscription/username pair for given tenant id
    Extension->>CLI: Get token for sub/user
    CLI-->Extension: 
    Extension->Extension: handle profile.get_raw_token() errors
    
    opt After getting token for given tenant
    Note over ADO_API,Extension: validation not really needed
    Extension->>ADO_API: Fetch data
    ADO_API-->>Extension: 
    Extension-->>User: Display for ADO API request
    end

    opt Token not found or other profile.get_raw_token() exception
    Extension-->>User: Display AAD token error
    end
Loading
@jikuja jikuja added the Feature Feature Work required. label Sep 15, 2022
@jikuja
Copy link
Contributor Author

jikuja commented Nov 23, 2022

And even more things that could be classified as bugs:

login process does full Azure AD interragotion even when user has:

  • entered PAT or OAuth token into AZURE_DEVOPS_EXT_PAT environment variable
  • or used az devops login

If user has large number of tenants in their az cli login credentials cache this will add huge delay before the actual API call is made.

In my use case those CLI has to make extra 8 queries to https://login.microsoftonline.com/ to fetch a token and other 8 extra calls to validate token by calling https://dev.azure.com:443/{organization}/_apis/projects?stateFilter=all&$top=1&$skip=0 even I want to use PAT.

Does this logic make sense? Is there any documentation how current authentication code is supposed to work?

@jikuja
Copy link
Contributor Author

jikuja commented Oct 15, 2024

Anything or is this project just dead?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature Feature Work required.
Projects
None yet
Development

No branches or pull requests

1 participant