Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatic Subject/Object encoding #792

Closed
4 of 6 tasks
g13013 opened this issue Dec 5, 2021 · 8 comments · Fixed by #840
Closed
4 of 6 tasks

Automatic Subject/Object encoding #792

g13013 opened this issue Dec 5, 2021 · 8 comments · Fixed by #840
Assignees
Labels
feat New feature or request.
Milestone

Comments

@g13013
Copy link

g13013 commented Dec 5, 2021

Preflight checklist

Describe your problem

We are willing to use Keto in our stack as we think it's a perfect fit for our projects. We've been following the progress of Keto since the earliest versions. Before version 0.6 there was an AWS-like implementation I believe, it was convenient for us as our implementations are also inspired by AWS techs, for example, we are using URNs in our applications to reference objects between services, the change in recent Keto specs are fine except one thing, Objects/Subjects naming limitations. forcing every application to translate objects/subjects usually referenced by ARN's to UUIDs or so is far from being ideal. We believe there is a little change in Keto that could remove this limitation without forcing applications to obscure their implementation to overcome this limitation.

Describe your ideal solution

In fact, this limitation could easily be addressed in Keto itself. Keto could do the encoding/decoding of the object/subject before hitting the database. The encoding could be done by using SHA256 or UUID-5 (with tuple's namespace).

  • Write: Request -> Encode -> Backend
  • Read: Request -> Encode -> Backend

Also, the object and subject type and format could be relaxed to include objects as well, which could be transformed to JSON and then encoded.

So taking the example from the Doc:

A request to keto like:

{
    "action": "insert",
    "relation_tuple": {
        "namespace": "network",
        "object": "tcp/22",
        "relation": "access",
        "subject_id":  {
            "subnet": "192.168.0.0/24",
            "office_hours": true
         }
}

Assuming the namespace network's uuid is 4fee2932-561d-11ec-8a6d-ff46757a867c the request could be translated using uuid-5 to the following before write/read:

{
    "action": "insert",
    "relation_tuple": {
        "namespace": "network",
        "object": "23c60557-883b-5032-a4ca-5ee40d408b75",
        "relation": "access",
        "subject_id":  "ae60432e-c611-5bf0-b222-b3d7dda64f54"
}

Also this could be an option enabled in the config

namespaces:
  - id: 0
    name: network
    auto_encode:  uuid5
    uuid5_namespace: 4fee2932-561d-11ec-8a6d-ff46757a867c

  - id: 0
    name: videos
    auto_encode:  sha256

Workarounds or alternatives

  • Write the encoding logic in the applications which makes it difficult to have consistent implementation across all applications especially using microservices with different teams.
  • implementing some sort of request rewriter in front of keto which could be a viable solution but with a risk of data corruption or bugs if done wrong.

Version

v0.7

Additional Context

No response

@g13013 g13013 added the feat New feature or request. label Dec 5, 2021
@zepatrik
Copy link
Member

zepatrik commented Dec 7, 2021

I would like to know what limitations you are hitting. The URNs would be good to use directly IMO. Are some of them longer than 64 characters? If the use of the characters :#@ is the problem, that limitation should actually not be there anymore since #638.

@g13013
Copy link
Author

g13013 commented Dec 7, 2021

@zepatrik , basically we don't have any limitation for the URNs length except what the standard states, but even the smallest of the URN's could could easily hit the limit or 64.

Also In a scenario where 2 services (or more) read/write the same permission from Keto they need to share the same encoding logic which puts the application at risk of bugs or even security issues if any of the services changes the encoding without the others updating the same encoding in the exact same time.

it's a little addition to Keto that would make it's usage straight forward and safer.

If the use of the characters :#@ is the problem

this is good news !

@zepatrik
Copy link
Member

zepatrik commented Dec 9, 2021

Also In a scenario where 2 services (or more) read/write the same permission from Keto they need to share the same encoding logic which puts the application at risk of bugs or even security issues if any of the services changes the encoding without the others updating the same encoding in the exact same time.

Ideally you would not trans- or encode your internal identifiers but use them directly. The 64 char limit on them is mainly based on the fact that they will be potentially passed around quite a lot. I guess having a mapping would make sense in some cases. We might then use UUID column types instead of strings, and you can either supply your own, or have a string <-> UUID mapping done by Keto. What do you think @aeneasr ?

@zepatrik
Copy link
Member

After a quick offline discussion, this sounds reasonable and will be added to the roadmap 👍

@zepatrik zepatrik added the help wanted We are looking for help on this one. label Dec 10, 2021
@zepatrik zepatrik added this to the v0.8.0 milestone Dec 10, 2021
@RichiCoder1
Copy link

RichiCoder1 commented Dec 26, 2021

So would the idea be a generic mechanism to define encoding raw ID inputs to a uuid?

For string ID's the conversion could be something like uuidv5(<config-defined-ns>, <id>), but the above also mentions an DTO as an ID. Would an object id then (as an example) be something like uuidv5(<config-defined-ns>, sha256(json.Marshal(<id>)))? Or entirely unsupported?

Edit: Reading the API Docs it sounds like the answer is no, so the client would still need to encode if they wanted to do an DTO subject like in the example.

Also how would this interact with API's like https://www.ory.sh/keto/docs/guides/list-api-display-objects#listing-objects ? This would effectively prevent consumers from groking the return of that no? Since encoding is a transparent process and non-reversible?

@zepatrik
Copy link
Member

zepatrik commented Jan 7, 2022

I think what makes most sense is to use only UUIDs internally in the relation tuple table. Then for all cases where you don't want to provide a UUID, but any string, we can use a mapping table that is basically only <string, UUIDv4>. This will allow us to have unbound strings.
All requests will then first resolve the string to UUID, use only UUIDs internally, and before response re-translate UUID to string.
We have to use a table for the mapping because UUIDv5 are only one-way (SHA1) and we can't inverse that to get back the external facing ID.
This will affect all APIs btw.

@hperl
Copy link
Collaborator

hperl commented Jan 7, 2022

Hi, I'd like to take a stab at it :)

@zepatrik
Copy link
Member

zepatrik commented Feb 17, 2022

Ok, to sum up the current state:

This feature is developed on the feature branch https://github.com/ory/keto/tree/feat/uuid-mapping to not block releases from current master. Further progress can be tracked at #840.

@hperl hperl self-assigned this May 6, 2022
hperl added a commit that referenced this issue May 9, 2022
In relation tuples and related API objects such as queries, subjects
(including subject sets) and objects are now automatically mapped to and
from UUIDs.  The mapping is done in each handler and for each protocol
(HTTP, gRPC) separately.

Below the handler, all objects and subjects are now UUIDs, but still
handled as strings, for the following reasons:
 * No duplication of datastructures (one for type `string`, one for type
   `uuid.UUID`).
 * No unnecessary copying, the mapping is done in-place and batched
   across multiple objects.
hperl added a commit that referenced this issue May 10, 2022
In relation tuples and related API objects such as queries, subjects
(including subject sets) and objects are now automatically mapped to and
from UUIDs.  The mapping is done in each handler and for each protocol
(HTTP, gRPC) separately.

Below the handler, all objects and subjects are now UUIDs, but still
handled as strings, for the following reasons:
 * No duplication of datastructures (one for type `string`, one for type
   `uuid.UUID`).
 * No unnecessary copying, the mapping is done in-place and batched
   across multiple objects.

Additionally, the migration (adding the string values to the mapping
table and replacing the values with UUIDs) is now performed
automatically as part of the migrations.  Tests are in `migratest`,
which now also handles `GoMigrations` properly.  This method replaces
the dedicated command in `cmd/migrate/...` for the UUID mappings.

XXX WIP

XXX WIP

XXX WIP migrations_test package

XXX WIP migrations_test package

XXX WIP migrations_test package

XXX WIP migrations_test package

XXX WIP migrations_test package

XXX WIP migrations_test package

WIP
hperl added a commit that referenced this issue May 16, 2022
In relation tuples and related API objects such as queries, subjects
(including subject sets) and objects are now automatically mapped to and
from UUIDs.  The mapping is done in each handler and for each protocol
(HTTP, gRPC) separately.

Below the handler, all objects and subjects are now UUIDs, but still
handled as strings, for the following reasons:
 * No duplication of datastructures (one for type `string`, one for type
   `uuid.UUID`).
 * No unnecessary copying, the mapping is done in-place and batched
   across multiple objects.

Additionally, the migration (adding the string values to the mapping
table and replacing the values with UUIDs) is now performed
automatically as part of the migrations.  Tests are in `migratest`,
which now also handles `GoMigrations` properly.  This method replaces
the dedicated command in `cmd/migrate/...` for the UUID mappings.
@hperl hperl removed the help wanted We are looking for help on this one. label Jun 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat New feature or request.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants