Store and manage credentials separate from datasets #6612
Please note that there are different types of credentials that we want to support (e.g. S3 access keys, HTTP basic auth, GCS service accounts, HTTP tokens).
Currently, the MagLocator case class has a field credentials (case class FileSystemCredentials), which means the credentials are stored directly in the datasource-properties.json. Instead, I’d suggest storing a credentialsReference there: an id pointing to the postgres database. The explore route should add supplied credentials to the database and return a json with the matching reference. Later, when the datastore uses the MagLocator to load data via zarrMag.remoteSource, the credentials should be looked up via an RPC to webKnossos (with caching), so a get route needs to be added to the wkRemoteDatastoreController, and a matching client method to the datastore. I’m not sure yet how accessing this client method can work in the ZarrBucketProvider; we’ll have to think about this. But I’d say you could already start designing the postgres entries. I’m not sure this is the time to add support for other kinds of credentials that are not essentially user/password (as we have now with basic auth and s3). But we should keep in mind that this will be added later, and we need to be able to adapt to it without having to rewrite the existing datasource-properties.json files.
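A minimal sketch of the suggested change, assuming the current field is credentials: Option[FileSystemCredentials] (field and type definitions here are illustrative, not the actual webKnossos code):

```scala
// Sketch only: names are assumptions, not the real definitions.

// Current state: credentials are embedded directly in datasource-properties.json.
case class FileSystemCredentials(user: String, password: Option[String])

case class MagLocator(
    path: Option[String],
    credentials: Option[FileSystemCredentials] // stored inline today
)

// Proposed state: only a reference is stored; the datastore resolves it
// via an RPC to webKnossos (with caching) when the mag is accessed.
case class MagLocatorProposed(
    path: Option[String],
    credentialId: Option[String] // points to a row in the postgres credentials table
)
```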
I think we should care about the GCS case in this iteration. Maybe the pem file could be represented as a password string. I think it would be useful if the credential object had a field indicating which type it belongs to (e.g. S3, basic auth, GCS), with a scope (e.g. S3 bucket, HTTP domain) and an optional user-definable name. That will make it easier for users to manage these credential objects.
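One way the type/scope/name idea could look in Scala (a hypothetical sketch; all names are assumptions):

```scala
// Hypothetical credential hierarchy with a type per subclass, plus scope and name.
sealed trait RemoteCredential {
  def name: Option[String]  // optional user-definable label, for management UIs
  def scope: Option[String] // e.g. an S3 bucket or an HTTP domain
}

case class S3AccessKeyCredential(
    name: Option[String],
    scope: Option[String], // e.g. "s3://my-bucket"
    accessKeyId: String,
    secretAccessKey: String
) extends RemoteCredential

case class HttpBasicAuthCredential(
    name: Option[String],
    scope: Option[String], // e.g. "https://data.example.org"
    username: String,
    password: String
) extends RemoteCredential

case class GoogleServiceAccountCredential(
    name: Option[String],
    scope: Option[String],
    secretJson: String // the pem/service-account file, kept as a single string
) extends RemoteCredential
```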
So as I understand it, in the datasource-properties.json there should be a field "credential" containing an object id. This object id is then used to look up the credential in the db when accessing the dataset. Since different credential types have to be supported, should this be done via subclassing and using a different table for each type, or with more generic table columns and a credential type enum?
Your understanding is right :) In Scala, subclassing sounds fair; in Postgres I’d say it’s better to have it in one table, with some nullable fields and a way to distinguish what the type is (an enum may be a solution). I don’t have a super strong opinion here, though. If during the implementation you find that using different tables seems better, feel free to go that way too.
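A sketch of the single-table variant in Postgres (column names and the enum values are assumptions, not the final schema):

```sql
-- Hypothetical schema sketch: one table, nullable type-specific columns,
-- and an enum to distinguish the credential type.
CREATE TYPE webknossos.CREDENTIAL_TYPE AS ENUM ('HTTP_Basic_Auth', 'S3_Access_Key', 'HTTP_Token', 'GCS');

CREATE TABLE webknossos.credentials (
  _id CHAR(24) PRIMARY KEY,
  type webknossos.CREDENTIAL_TYPE NOT NULL,
  name VARCHAR(256),        -- optional user-definable label
  scope VARCHAR(1024),      -- e.g. S3 bucket or HTTP domain, optional
  identifier VARCHAR(1024), -- user name / access key id, depending on type
  secret TEXT,              -- password / secret key / service-account json
  created TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  isDeleted BOOLEAN NOT NULL DEFAULT false
);
```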
#6646 progress report: S3 access keys and HTTP basic auth now work.
There is a scope value in the db which is optional; however, it is not yet validated. Should the use of the credential fail if the scope does not match the request?
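If such validation is added, it could start as a simple prefix check against the requested URI (a sketch, under the assumption that scope is stored as a bucket or domain prefix):

```scala
// Hypothetical scope check: a credential with scope "s3://my-bucket" would be
// usable for "s3://my-bucket/dataset1/color" but not for other buckets.
def scopeMatches(scope: Option[String], requestedUri: String): Boolean =
  scope match {
    case None         => true // unscoped credentials stay usable everywhere
    case Some(prefix) => requestedUri.startsWith(prefix)
  }
```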
This is not yet implemented. Are there examples of datasets with these credentials? Also, while the routes to create credentials are already working, credentials are not yet automatically created when exploring a dataset from the frontend.
Wohoo! 🎉
I think this should be implemented in this iteration. The json returned by the explore routes should contain a reference to the newly created credentials object (and no longer the credentials themselves). This is exactly the benefit of this change. I’d forward the other two questions to @normanrz |
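Sketched shape of what the explore route could then return, built with Play JSON (function and field names are assumptions for illustration):

```scala
import play.api.libs.json.{JsObject, Json}

// Hypothetical: after inserting the supplied credentials into postgres,
// the explore route returns only the reference, never the secrets themselves.
def exploreResult(dataSourceJson: JsObject, insertedCredentialId: String): JsObject =
  dataSourceJson ++ Json.obj("credentialId" -> insertedCredentialId)
```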
How are these scopes defined?
I have GCS credentials, but GCS is not yet implemented as a filesystem provider. For HTTP token, the wk auth token would be a candidate (not sure if it is accepted via
Detailed Description
Credentials for remote datasets should be stored and managed separately from datasets. The credentials should be created by dataset managers or admins. Credentials should never be sent back to the client. Credentials can then be attached to layers (or mags).
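Since credentials must never be sent back to the client, any serialization for the frontend could expose only metadata (a sketch with Play JSON; the type and its fields are assumptions):

```scala
import play.api.libs.json.{Json, OWrites}

// Hypothetical public view of a credential: only metadata, the secret is
// simply not part of this type and thus can never be serialized.
case class CredentialPublicInfo(_id: String,
                                name: Option[String],
                                scope: Option[String],
                                credentialType: String)

object CredentialPublicInfo {
  implicit val jsonWrites: OWrites[CredentialPublicInfo] = Json.writes[CredentialPublicInfo]
}
```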
Not sure if we should allow deleting credentials that are referenced by a dataset. Pro: it makes it very easy to revoke access. Con: it makes datasets unusable, and it may be hard to find out why.