-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create a new KeyInfoManager based on SQLite #424
Comments
Proposed Design OverviewThis is a living proposal for the Contents
Motivation
Config Changes
Add a
Add a optional Additional Dependencies
It should only be the
Database Schema (Structure Overview)This section aims to give an overview of what current storage schema is with the Current OnDiskKeyInfoManager Schema:
Where:
Proposed SqliteKeyInfoManager Schema:
|
Field Name | Field Type | Primary Key | Description |
---|---|---|---|
authenticator_id | INT | ✔ | Id of the authenticator used. |
application_name | TEXT | ✔ | Name of the application. |
key_name | TEXT | ✔ | The unique unicode key name. |
provider_uuid | TEXT | ❌ | UUID of the provider |
provider_name | TEXT | ❌ | Name of the provider, used to distinguish two providers of the same type running in parallel. |
key_id | BLOB | ❌ | A key_id representing a reference to a key in the provider, encoded into bytes using bincode::serialize(key_id) . |
key_id_version | INT | ❌ | An integer version number tracking the serialization used to encode the key_id |
key_attributes | BLOB | ❌ | A psa_key_attributes::Attributes struct encoded into bytes using bincode::serialize(key_attributes) . |
key_attributes_version | INT | ❌ | An integer version number tracking the serialization used to encode the key_attributes |
The primary key of the table will consist of a authenticator_id
-application_name
-key_name
composite key. Therefore, unless a key/request matches all 5 of these properties it will be treated as a separate key.
manager_metadata
table:
Field Name | Field Type | Primary Key | Description |
---|---|---|---|
key | TEXT | ✔ | The metadata item key name. |
int_value | INT | ❌ | The value of the metadata item. |
For example:
Key | Value |
---|---|
"schema_version" | 1 |
Execution Flow
Initialization:
- If the db does not exist, create it (
sqlite-key-info-manager.sql
). - Check the
parsec_version
value from themanager_metadata
table. - If the KIM
parsec_version
> Parsecparsec_version
(KIM:0.9.0
, Parsec:0.8.0
for example):- Throw an error as we don't support backwards-migration of Parsec.
- If they want to do this at their own risk, they can make a backup of the key mappings and then edit the
parsec_version
manually but compatibility is not guaranteed as the KIM structure may have changed.
- If the KIM
parsec_version
== Parsecparsec_version
(KIM:0.8.0
, Parsec:0.8.0
for example):- Do nothing as it is correct pairing.
- If the KIM
parsec_version
< Parsecparsec_version
(KIM:0.7.0
, Parsec:0.9.0
for example):- Make a copy of
sqlite-key-info-manager.sql
assqlite-key-info-manager-0.7.0-backup.sql
in the same directory: - While (KIM
parsec_version
< the Parsecparsec_version
):- Iteratively read migration files:
- Read
sqlite-kim-migration-0.7.0-to-0.8.0.sql
from disk. - Execute migration SQL on the database.
- Read
sqlite-kim-migration-0.8.0-to-0.9.0.sql
from disk. - Execute migration SQL on the database.
- Read
- Iteratively read migration files:
- Make a copy of
sqlite-key-info-manager.sql
key_mapping
table queried usingSELECT
statements to retrieve persistent data.- Persistent data saved to a
key_store<KeyTriple, KeyInfo>
HashMap, acts as a read cache. - New
SqliteKeyInfoManager
is returned.
Inserting KeyInfo:
SqliteKeyInfoManager.insert(KeyTriple, KeyInfo)
action is called.- First perform
INSERT
onkey_mapping
table with the data supplied. - Then add the mapping to the
key_store
HashMap read cache.
Testing
- Currently the
OnDiskKeyInfoManager
has some unit tests that would be applicable to theSqliteKeyInfoManager
, such as testing large key names with emoticons. This logic should be split out, so that the non-manager-specific tests are run against all current and future Key Info Managers. SqliteKeyInfoManager
-specific unit tests.
Discussion Points
- Should we consider moving from a
KeyTriple
? To aKeyQuadruple
orKeyQuintuple
? Relates to Enable multiple authenticators to work simultaneously #271- For Example:
KeyIdentity { application: ApplicationIdentity { name: String, authenticator_id: u8 }, provider: ProviderIdentity { uuid: String, name: String, }, key_name: String, }
- For Example:
- Are we going to allowing migration of data stored using
OnDiskKeyInfoManager
toSqliteKeyInfoManager
? - Do we think the database schema is appropriate?
Any feedback is appreciated.
Thanks,
Matt
Thanks for the design. It generally looks good to me. I have some comments/discussion points! Some comments on the proposal above:
About the discussion points:
I think we could also keep the
Note that those changes would also impact the
I think we could have a script that reads the OnDiskKeyInfoManager info and insert it into the database? We will have to think how to populate the information that is not implicit like
I personally think so. A related question that I have is how this schema handles stability? What happens if in future Parsec version we add new fields? Can they be primary keys? What if Parsec is newer than the database and tries to read fields that do not yet exist? Other
|
Seems good to me.
Yes very true, I thought it had already been implemented but I was mistaken.
Sounds sensible to me.
I think an optional migration script is definitely the way to go.
I would suggest on startup reading the current parsec version of the
A typical Initialization Flow:
In terms of can we add new primary keys, yes. If we have an appropriate default value to set the new primary key as upon creation this shouldn't be an issue.
As Sqlite stores the database as a file on the filesystem, I think the same model as the previous KIM would also apply here.
Should just be the Thanks for the feedback! 👌 |
A few thoughts/questions in no apparent order.
Maybe not a tuple and use a normal struct instead? Tuples are a bit annoying in terms of readability.
Hmm, I'd argue we should accept some sort of configuration file for the script so we can tell it exactly what to convert and to what. For example, we could even consider expanding it to cover app name migration. Maybe it would accept something like [authenticator]
id = 123
[[providers]]
id = 456
name = "some-name" And the script would then use these mappings to shift things around. I'm leaning towards something like this because it's nice to have some persistent representation of what got converted to what, maybe you want to refer to it in the future or compare with your old/new service config file. We should probably worry about this migration after the new KIM is available, though.
I thought I read that |
Yeah that's a good idea.
Added some details about dependencies earlier, don't know if it was in the revision you saw. You can have |
Are we intending to encapsulate all of this with a feature flag? Should we? (All providers and authenticators have feature flags, and it possibly should follow that KIMs have them as well, although the existing on-disk KIM doesn't). |
I like the proposal, although it has started me thinking about the slightly awkward relationship that currently exists between the KIM and the providers. Even the on-disk KIM today is effectively using both the application identity and the provider identity as part of the namespace. But this isn't how we normally describe the semantics of Parsec namespaces, which are supposed to be based only on the client identity. If I was starting from a blank slate, I think I would recommend the following:
This would mean that a given KIM would not be able to store two keys with the same name in two different providers. But I would argue that this is actually the correct data model, and probably should have been to begin with. It allows a use case where a client can choose the "best" provider at key-creation time, but any subsequent code that uses the key no longer needs to care which provider it is in, because the combination of application-identity/key-name implies the provider from that point. We are, of course, not starting from a blank slate! We necessarily have to preserve all aspects of how Parsec is working today. But if we're introducing a new KIM, then perhaps it's a good opportunity to make sure that we have the semantics exactly right, on the assumption that everyone will eventually move to it. And of course, with single-provider deployments, it doesn't matter anyway. If the KIM were 1-1 bound to an authenticator, then there would be no need for |
BTW, maybe |
I'm not familiar with how we are currently using feature flags, what would be the advantage of this? Improved compilation time?
Yes, maybe something like this would be more appropriate?
But yes, as you pointed out this would cause stability problems; creating 2 keys with the same name in this structure would throw an error whereas in previous versions of parsec it could have been a valid call.
Would you suggest a new KIM instance for each authenticator? In this case, on-disk, we would end up with
Yeah, I feel like a name change is appropriate. Maybe |
Yes, but also flexibility, binary size etc. For example, some providers require various dependencies locally when you try to compile them, so it might be desirable to only compile, say, the TPM provider on your system if that's all you need. Similarly for authenticators, the JWT SVID authenticator adds a lot of dependencies. Adding a new feature, at least until the KIM work is done sounds reasonable.
This seems a bit odd to me, we'd be introducing artificial constraints even though they're not needed. I think we all agree that you shouldn't "cross" these boundaries - e.g. a key created in provider A shouldn't be accessible in provider B - but I can't grasp the (overwhelming) benefit of not allowing the name to be reused across the boundaries in a case like this. Being able to use the same name for keys that have identical "profiles" in different providers seems just as useful as being able to identify an exact provider to use given a key name.
Well yes, but if you want to support multiple authenticators then you need to maintain two KIMs, potentially of the same kind, in parallel (e.g. two mappings folders on disk, or two databases). Maybe that's an acceptable tradeoff, again, not sure. |
It is not an "artificial constraint" at all. It is exactly how we characterise Parsec multi-tenancy. The client identity is the namespace in the key store. The provider is simply where the key happens to have been created. Once a client has created some key K, then that key name is "taken", and its resident provider is just a detail of it. Whether it has benefits - overwhelming or otherwise - is something we can argue. I'm coming more from the angle of making the implementation match our stated design. However, I think it may have benefits for clients, because key creation and key usage might occur in completely different parts of the client's code base. Key creation might occur in a factory provisioning tool, for instance, while usage occurs in general runtime code. The provisioning code would need to have the intelligence to select the best provider if more than one is available. But the runtime code wouldn't need to care - it would only need the name of the key, and Parsec can route this to the correct provider. It might be hard to conclude on this now, because we have no real-world experience of multi-provider deployments and the kinds of client dev experiences that would work well with them. I'm just trying to stay true to the original design principles. |
When I refer to "original design principles", I'm mainly talking about the way we describe Parsec multi-tenancy in conference talks and presentations/demos and so forth. The concept was also captured in writing in the now-archived "system architecture" document:
Note in particular: "all storage and all activities within the service". We have never spoken of providers as playing a role in namespaces. That said, this behaviour might be something that we aren't documenting well enough in the live sections of the Parsec Book. So perhaps whatever we decide in terms of the data model, the main thing we need to do is ensure that the behaviour is very clearly documented in a suitable part of the book. |
The last point I was trying to make in the community meeting was that we should avoid making any changes we don't have to make, especially in the config file. We need to introduce provider names to allow the provider ID to become dynamic - that's alright, we can introduce it with some defaulting in the config file and add lots of warnings about what the implications are. But we don't need to tie the KIM to the authenticator (in the config) for now, simply because we have to keep the old way of linking KIMs to providers anyway, and we can just enhance that one to support the SQLite KIM when it's done. Saying this not because I think that linking isn't important or desirable, but more because I'd be keen to separate away anything that doesn't strictly fall under the label of creating a new KIM. We can debate and implement that linking later under a different issue. |
Regarding KIM namespaces, one issue we may run into is that currently keys are scoped to their provider in the on-disk KIM. Therefore, it is currently/previously possible for a client to create 2 keys with the same name provided they exist on different providers. If we are to move to a new KIM where provider IDs/names are not part of the key identity, then I believe this would be a breaking change. This is as migration would only be possible for users who have unique key names, from the old KIM to the new KIM. |
Not only is it possible, we have tests to check that this is the case, e.g. see the I think that's a good reason to start the transition through the SQLite KIM. We're going to release it in 0.9.0 so we can iron out the details until then (including the bigger picture on where the KIM belongs conceptually and how that should be implemented). Apart from that, maybe we can version the on-disk KIM to preserve the current one for compatibility and change the "schema" for the new version. Or simply use the authenticator-to-KIM linking to make this transition. But overall I think we should separate the concern about introducing a breaking change in the on-disk KIM from the implementation of this one. |
Random thought: could we make the namespace policy configurable? I will stand by everything I've said above in terms of what I think the namespace policy should be. I don't think that providers should play a role in it. I think it should be based purely on the client identity. But:
The first of these points is quite weak, because it's effectively basing design on implementation, which is backwards. But it does impact things, especially if we want a coherent migration story from systems that have benefitted (rightly or wrongly) from how it works now. The second point is perhaps a stronger reason to hedge our bets. But suppose, hypothetically, we had I guess, as config switches go, that this is a pretty drastic one, right? It fundamentally changes the data model as we've already seen in the above discussion. Maybe it would point towards continuing to have the provider as part of the lookup. but changing how inserts and selects are done, depending on that policy choice. It's been a while since I've hacked any SQL, but maybe there's a way to make that not be too awful. Again, just a suggestion. |
I'll just go ahead and throw some ideas out here as well; could it make more sense to let the end-developer decide with some client config, instead of the admin? Because we're essentially debating whether the end-developer would prefer to:
Would it not be best to let the developer decide how they want to consume the service? We could offer both the |
As a general rule, we try to take as many decisions as possible away from the client. The idea is to give the client a minimal, transparent dev experience - which is why we invest so much in auto-configuring providers, authenticators, key types and so forth. The namespace policy is, I think, just not something that the client would be in a position to make an informed choice about - and it could get very messy if the client started changing this choice, or if multiple clients behaved in mutually-inconsistent ways. Clients might not even know what their own identity is, or how it is being determined (because this detail is figured out within the internals of the client library, rather than in the client's actual code). This really feels like a central service policy decision to me., if indeed we want to offer this option. |
I agree with Ionut here and also adding that we should try to keep things simple if we can. The starting reason we are doing this work is to enable:
That does not mean that we have to implement everything now but I think we should do everything to at least allow it. To the extent of this task, I am in favour of:
I believe those are the minimal changes to have a strong basis? In the future we will still be able to move the KIM section somewhere more appropriate. NoteIn your schema above you noted for
I believe that needs to be the serialised Protobuf version of the attributes instead of the |
@MattDavis00 - can we close this, since you already created separate issues for everything else? |
Yes that's good with me. I'll close it |
See #394 for history and the reason behind this. See also these community meeting minutes.
This new KIM could enable:
psa-crypto
However, there would need to be a way to transfer key mappings from the old KIM to the new one.
Also we need to think if multiple KIMs are allowed in parallel.
The text was updated successfully, but these errors were encountered: