Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Credential type not saved in test #249

Closed
mih opened this issue Feb 18, 2023 · 10 comments · Fixed by #256
Closed

Credential type not saved in test #249

mih opened this issue Feb 18, 2023 · 10 comments · Fixed by #256

Comments

@mih
Copy link
Member

mih commented Feb 18, 2023

This is evident from #248 even after it fixes #245

FAILED ../commands/tests/test_download.py::test_download_new_bearer_token - AssertionError: Got 0 instead of 1 expected results matching
3179  {
3180   "cred_secret": "token123",
3181   "cred_type": "token"
3182  }
3183Inspected 1 record(s):
3184  [
3185   {
3186    "action": "credentials",
3187    "cred_secret": "token123",
3188    "name": "dataladtest_test_download_new_bearer_token",
3189    "status": "ok"
3190   }
3191  ]
3192

So far, this does not replicate for me locally, and also not outside the main CI run on an appveyor machine. It seems to be some evil interaction with the test setup. Grrrr.

@adswa
Copy link
Member

adswa commented Feb 21, 2023

I tried this locally on my machine and the test passes.

@mih
Copy link
Member Author

mih commented Feb 21, 2023

Thx for trying!

I pushed a small push to get some insight. Here it is:

diff --git a/datalad_next/commands/tests/test_download.py b/datalad_next/commands/tests/test_download.py
index 45fff9f..96a83c1 100644
--- a/datalad_next/commands/tests/test_download.py
+++ b/datalad_next/commands/tests/test_download.py
@@ -208,9 +208,16 @@ def test_download_no_credential_leak_to_http(capsys):
     'dataladtest_test_download_new_bearer_token',
 ])
 def test_download_new_bearer_token(capsys):
+    from pprint import pprint
+    print('====query=====')
+    credentials('query')
+    print('====query done=====')
     try:
         download({f'{hbsurl}/bearer': '-'})
         # and it was saved under this name
+        print('====checkcred=====')
+        pprint(credentials('get', name='dataladtest_test_download_new_bearer_token', on_failure='ignore'))
+        print('====checkcred done=====')
         assert_result_count(
             credentials(
                 'get',
@@ -218,10 +225,15 @@ def test_download_new_bearer_token(capsys):
             1, cred_secret='token123', cred_type='token',
         )
     finally:
-        credentials(
+        print('====remove=====')
+        pprint(credentials(
             'remove',
             name='dataladtest_test_download_new_bearer_token',
-        )
+        ))
+        print('====remove done=====')
+        print('====checkcred=====')
+        pprint(credentials('get', name='dataladtest_test_download_new_bearer_token', on_failure='ignore'))
+        print('====checkcred done=====')

Which gives:

748====query=====
< results do not contain the credential in question > 
749====query done=====
750{
751  "authenticated": true, 
752  "token": "token123"
753}
754====checkcred=====
755[{'action': 'credentials',
756  'cred_secret': 'token123',
757  'name': 'dataladtest_test_download_new_bearer_token',
758  'status': 'ok'}]
759====checkcred done=====
760====remove=====
761[{'action': 'credentials',
762  'name': 'dataladtest_test_download_new_bearer_token',
763  'status': 'notneeded'}]
764====remove done=====
765====checkcred=====
766[{'action': 'credentials',
767  'message': 'credential not found',
768  'name': 'dataladtest_test_download_new_bearer_token',
769  'status': 'error'}]
770====checkcred done=====

which is very different from what I get locally:

====query=====
< results do not contain the credential in question > 
====query done=====
{
  "authenticated": true, 
  "token": "token123"
}
====checkcred=====
[{'action': 'credentials',
  'cred_last-used': '2023-02-21T13:17:12.580451',
  'cred_realm': 'https://httpbin.org/',
  'cred_secret': 'token123',
  'cred_type': 'token',
  'name': 'dataladtest_test_download_new_bearer_token',
  'status': 'ok'}]
====checkcred done=====
====remove=====
[{'action': 'credentials',
  'name': 'dataladtest_test_download_new_bearer_token',
  'status': 'ok'}]
====remove done=====
====checkcred=====
[{'action': 'credentials',
  'message': 'credential not found',
  'name': 'dataladtest_test_download_new_bearer_token',
  'status': 'error'}]
====checkcred done=====

It looks as if the credential storing code path though the credential manager never happened in the CI. That fact that it fails uniformly across all CIs and has not been observed outside the CIs is confusing.

With a locally deployed httpbin, the stored credential looks like this, locally for me.

[{'action': 'credentials',
  'cred_last-used': '2023-02-21T13:31:42.743980',
  'cred_realm': 'http://localhost:8765/',
  'cred_secret': 'token123',
  'cred_type': 'token',
  'name': 'dataladtest_test_download_new_bearer_token',
  'status': 'ok'}]

@mih
Copy link
Member Author

mih commented Feb 21, 2023

This may be of no particular relevance, but to get the "wrong" credential, this could be what is running:

$ DATALAD_CREDENTIAL_MIKE_SECRET=some datalad -f json_pp credentials get mike
{
  "action": "credentials",
  "cred_secret": "some",
  "name": "mike",
  "status": "ok"
}

So maybe the credential retrieval setup, running inside the downloader is somehow not passing any identifying info, besides a secret.

@mih
Copy link
Member Author

mih commented Feb 23, 2023

OK, next round. Still cannot replicate. I have now enabled datalad/datalad#7298 and the following patch

diff --git a/datalad_next/commands/tests/test_download.py b/datalad_next/commands/tests/test_download.py
index 45fff9f..235362a 100644
--- a/datalad_next/commands/tests/test_download.py
+++ b/datalad_next/commands/tests/test_download.py
@@ -207,9 +207,10 @@ def test_download_no_credential_leak_to_http(capsys):
     # after download, it asks for a name
     'dataladtest_test_download_new_bearer_token',
 ])
-def test_download_new_bearer_token(capsys):
+def test_download_new_bearer_token(memory_keyring, capsys):
     try:
         download({f'{hbsurl}/bearer': '-'})
+        breakpoint()
         # and it was saved under this name
         assert_result_count(
             credentials(

with that I see (again on a machine where things work):

(Pdb) memory_keyring.store
{('datalad-dataladtest_test_download_new_bearer_token', 'secret'): 'token123'}
(Pdb) os.environ['GIT_CONFIG_GLOBAL']
'/tmp/datalad_temp_cfwgqbhn/.gitconfig'
(Pdb) print(pathlib.Path(os.environ['GIT_CONFIG_GLOBAL']).read_text())
...
[datalad "credential.dataladtest_test_download_new_bearer_token"]
        type = token
        realm = https://httpbin.org/
        last-used = 2023-02-23T09:47:03.379781

adswa added a commit to adswa/datalad-next that referenced this issue Feb 23, 2023
adswa added a commit to adswa/datalad-next that referenced this issue Feb 23, 2023
@mih
Copy link
Member Author

mih commented Feb 23, 2023

So it seems on a FAILING github action AND on appveyor run I get the exact same thing:

{('datalad-dataladtest_test_download_new_bearer_token', 'secret'): 'token123'}
/crippledfs/datalad_temp_i4ws5fji/.gitconfig
....
[datalad "credential.dataladtest_test_download_new_bearer_token"]
	type = token
	realm = http://localhost:8765/
	last-used = 2023-02-23T09:00:55.167905
FAILED

So the credential is saved properly, but something is broken with the reporting. progress at last!!

adswa added a commit to adswa/datalad-next that referenced this issue Feb 23, 2023
adswa added a commit to adswa/datalad-next that referenced this issue Feb 23, 2023
adswa added a commit to adswa/datalad-next that referenced this issue Feb 23, 2023
@mih
Copy link
Member Author

mih commented Feb 23, 2023

CredentialManager().get('dataladtest_test_download_new_bearer_token')

returns

{'type': 'token', 'realm': 'http://localhost:8765/', 'last-used': '2023-02-23T09:15:19.591822', 'secret': 'token123'}

So that is fine too...

@mih
Copy link
Member Author

mih commented Feb 23, 2023

So inside credentials there is a CredentialManager.get() call that is just like the one above (but with a CredentialManager instance inside the credential command code.

It returns the crippled credential!

{'secret': 'token123'}

The call is exactly the same, same arguments, just the instance is different.

@mih
Copy link
Member Author

mih commented Feb 23, 2023

Maybe...maybe for some reason the underlying ConfigManager is not detecting the config change that came from storing the credential. Hence it is not reporting on it?!

@mih
Copy link
Member Author

mih commented Feb 23, 2023

A dedicated datalad.cfg.reload() or even datalad.cfg.reload(force=True) has no impact on the result of credentials('get')

@mih
Copy link
Member Author

mih commented Feb 23, 2023

I filed datalad/datalad#7299 which is what I currently believe is causing this unfortunate behavior -- that I have merely triggered with recent changes:

Recently the credentials command started using parameter validation.

As part of that EnsureDataset is used. If will always produce a dataset instance for a given path, or CWD if there was no path given. Importantly, if allowed, it will also return a Dataset instance for directories that have no dataset (aka uninstalled/non-existing datasets). This is what triggered datalad/datalad#7299

Whenever any test had caused the instantiation of a Dataset on a path that contained no repository, this instance and its ConfigManager would be preserved via DataLad's flyweight pattern implementation. Each and every code creating a dataset instance for the same path would get this instance (until garbage collected, which is not instantaneous).

But because of datalad/datalad#7299 and the way the ConfigManager works, subsequent config manipulations (after the first dataset instance was created) would no be reflected in that private instance. So any code, like credentials() that works with the Dataset-bound ConfigManagers would see a configuration that is no longer valid.

This explanation is backed up by a report from @adswa that a prior run of the credentials command is required to trigger the failure, even on the systems where the failure is occurring at all. It also explains we I could never replicate the failure locally -- I am running typically inside the source repo, hence I always have a valid ConfigManager.

I believe datalad/datalad#7299 should be fixed to have Dataset.config directly return datalad.cfg if there is no actual repository to be found. This would avoid this particular issue, and also a range of other related issues of stale configuration knowledge.

Until that happens, each command using .config with potentially uninstalled datasets must be carefully checked whether it is susceptible to this problem.

mih added a commit to mih/datalad-next that referenced this issue Feb 23, 2023
When called with `dataset=None`, `credentials()` would auto-locate
a dataset containing CWD (and use its configuration), whereas
`download()` would use the global configuration in this case.

This change makes the behavior uniform and chooses what `download()`
did as the model behavior.

This uniformatization is meaningful on its own.

However, due to datalad/datalad#7299 it is
also a necessity, because otherwise credentials deposited by
`download()` remain invisible to `credentials()` -- which is the
superficial trigger for datalad#249

Closes datalad#249
@mih mih closed this as completed in #256 Feb 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants