
proto/agent: add privileged API #7

Conversation

@mauriciovasquezbernal (Contributor) commented May 31, 2021

Add proto definitions for the new privileged API. For now, this API
includes the FetchX509SVIDBySelectors RPC, which allows a privileged
client to get the X509-SVID of a given workload by passing a set of
selectors. X509-SVIDs for registration entries matching all the passed
selectors are returned.

Fixes #4


Questions

  • "privileged" seems to be too generic. What else should be a good name?
  • I created a definition for message X509SVID, it seems wrong to have it as it's a common type. However the existing one doesn't include the private key, crl nor the federated bundles. Should we create a new X509SVIDWithKey or similar message type?

I implemented a POC of this API in spire -> https://github.com/kinvolk/spire/commits/mauricio-privileged-attestation-api-poc-v3.

@evan2645 (Member) commented Jun 1, 2021

"privileged" seems to be too generic. What else should be a good name?

Yes... 🤔

Delegation API? Platform API?

I created a definition for message X509SVID, it seems wrong to have it as it's a common type. However the existing one doesn't include the private key, crl nor the federated bundles. Should we create a new X509SVIDWithKey or similar message type?

I was wondering how to deal with this too. I bet @azdagron has an opinion here.

Overall, I think the alignment with the workload api makes sense as a first step. It will be a smaller lift. Eventually, we may think about a different shape in order to better facilitate callers that are subscribing to multiple SVIDs, but that can be introduced later as a new RPC.

@evan2645 (Member) commented Jun 1, 2021

I implemented a POC of this API in spire

I had a quick peek at this, happy to see that it seems to have been relatively straightforward! Some minor things here and there, but overall looking good IMO.

@mauriciovasquezbernal (Contributor, author)

Eventually, we may think about a different shape in order to better facilitate callers that are subscribing to multiple SVIDs,

Yes, it's something we'll probably need for Cilium's use case. I agree with doing this incrementally.

@azdagron (Member) commented Jun 1, 2021

Overall, I think the alignment with the workload api makes sense as a first step

I was leaning the other direction 😬 There are a few bits of "baggage" in the Workload API that we may or may not want to replicate here until we need it. A few examples:

  • crl, SPIRE does not populate this and may never
  • bundle in each X509SVID message can be lifted directly into the response message

I guess it depends on whether we want this API to look more like the Workload API or the other SPIRE APIs. Another thing we may wish to change: currently, federated trust domain bundles are keyed by the SPIFFE ID of the trust domain (e.g. spiffe://example.org), but everywhere else in the SPIRE API surface they are keyed by the name of the trust domain (e.g. example.org).

I'm sort of leaning towards making it look and feel like other SPIRE APIs, but I could see how one might want to go the other direction, depending on how the delegatee is going to pass along these materials to the workload.

message X509SVID, it seems wrong to have it as it's a common type

I agree. What would you all think of something like the following, declared locally in this API definition:

message X509SVIDWithKey {
    spire.api.types.X509SVID x509_svid = 1;
    bytes x509_svid_key = 2;
}
@mauriciovasquezbernal (Contributor, author)

bundle in each X509SVID message can be lifted directly into the response message

Aren't bundles rotated?

I guess it depends on whether we want this API to look more like the workload API or the other SPIRE APIs.

I don't have strong opinions. Whatever shape that contains the data we need is good for me.

message X509SVIDWithKey {
    spire.api.types.X509SVID x509_svid = 1;
    bytes x509_svid_key = 2;
}

Seems good to me. I only have the question about the bundle rotation above.

@azdagron (Member) commented Jun 2, 2021

I presume a bundle rotation would result in a new response on the stream, similar to how the Workload API operates. The difference between the two is that instead of repeating that same bundle on each X509SVID in the response, we'd just return one bundle.

@evan2645 (Member) commented Jun 2, 2021

instead of repeating that same bundle on each X509SVID in the response, we'd just return one bundle.

This makes me wonder if we need any special consideration around scale and gRPC message size.

@mauriciovasquezbernal (Contributor, author) commented Jun 2, 2021

we'd just return one bundle.

I tried to put this idea in proto terms and ended up with something like:

// FetchX509SVIDsBySelectorsResponse can be either a list of SVIDs or a list of
// bundles.
message FetchX509SVIDsBySelectorsResponse {
    oneof Response {
        X509SVIDs x509_svids = 1;
        X509Bundles x509_bundles = 2;
    }
}
...

I'm not sure we should be mixing different messages over the same RPC. What about having a separate RPC to get bundle updates?

Edit: I'm not so sure that having a different RPC would be good; it just adds complexity. Maybe an option to avoid sending the same bundle again and again would be to make it optional and only send it when there is a change. It'd need some logic on the SPIRE side to detect the change, and the client would always have to check this field. I'm happy to hear more ideas on this @azdagron.

@evan2645 (Member) commented Jun 7, 2021

I'm not sure if we should be mixing different messages over the same rpc, what about having a separated rpc to get bundle updates?
...
Edit: I'm not that sure that having a different rpc would be good. It just adds complexity.

Yes I agree. I think perhaps @azdagron was suggesting not a separate message for bundle, but having it in the top-level message instead e.g.

message FetchX509SVIDsBySelectorsResponse {
    repeated X509SVIDWithKey svids = 1;
    map<string, bytes> bundles = 2;
    ...
}

message X509SVIDWithKey {
    spire.api.types.X509SVID x509_svid = 1;

    // @azdagron why no type for this one?
    bytes x509_svid_key = 2;

    string spiffe_id = 3;
    string trust_domain = 4;
}

@mauriciovasquezbernal (Contributor, author)

Yes I agree. I think perhaps @azdagron was suggesting not a separate message for bundle, but having it in the top-level message instead e.g.

I think you're right, I was over-complicating it.

message FetchX509SVIDsBySelectorsResponse {
    repeated X509SVIDWithKey svids = 1;
    map<string, bytes> bundles = 2;

What about the bundle for this trust domain? Isn't it better to have it as a separate field to avoid always having to perform the lookup in the map? (No strong opinion here, just wondering.)

If we put the federated bundles here then we have to include a list of federates_with for each SVID, right? (The returned SVIDs may not all federate with the same trust domains.)

string spiffe_id = 3;
string trust_domain = 4;

spiffe_id and trust_domain are part of spire.api.types.X509SVID. Should we repeat those?

@evan2645 (Member) commented Jun 9, 2021

What about the bundle for this trust domain?, isn't it better to have it as a separated field to avoid always having to perform the lookup in the map? (No strong opinion here, just wondering)

Historically, we have the local bundle and the federated bundles in our response message(s). What we've observed is that this arrangement leads to client implementations that don't support federation, because federation ends up being a totally different codepath. As a result, we've been leaning towards an "always check the map" approach, as it unifies the "this" trust domain authN and the federated codepaths rather than special casing the latter.
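The "always check the map" approach can be sketched as a single client-side lookup path, where the local trust domain's bundle sits in the same map as the federated ones. This is only a hypothetical Go sketch with made-up names, not SPIRE code:

```go
package main

import "fmt"

// bundleFor looks up the CA bundle for a trust domain in one map that holds
// both the local trust domain's bundle and any federated bundles, keyed by
// trust domain name (e.g. "example.org"). One codepath for both cases.
func bundleFor(bundles map[string][]byte, trustDomain string) ([]byte, error) {
	b, ok := bundles[trustDomain]
	if !ok {
		return nil, fmt.Errorf("no bundle for trust domain %q", trustDomain)
	}
	return b, nil
}

func main() {
	// The client never special-cases "its own" trust domain.
	bundles := map[string][]byte{
		"example.org":   []byte("local-bundle-der"),
		"federated.org": []byte("federated-bundle-der"),
	}
	for _, td := range []string{"example.org", "federated.org"} {
		b, err := bundleFor(bundles, td)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %d bytes\n", td, len(b))
	}
}
```

Because federated lookups go through the exact same function, a client that works at all works with federation too.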

If we put the federated bundles here then we have to include a list of federates_with for each svid, right? (it's possible that all returned svids don't federate with the same trust domains).

Hmm ... yes this is a good point. Reading this, I've also realized that I think our bundle repetition optimization is probably in vain because we expect callers to hit this RPC once per workload, rather than once for all workloads... So I think you were probably right in your original assertion of a different RPC, if the goal is to reduce overhead there. It definitely causes client friction. @azdagron and I are chatting offline and will provide some more feedback here soon

@evan2645 (Member) commented Jun 9, 2021

I think our bundle repetition optimization is probably in vain because we expect callers to hit this RPC once per workload, rather than once for all workloads

I should clarify that it is not fully in vain, as we get the optimization over SVID responses on a per-workload basis, but in practice there is usually a very small number (almost always 1) of these.

@azdagron (Member)

Having had some more time to think on this and chatting with @evan2645 a little bit, I think maybe I was overzealous in my suggestions. I think we should model this closely to what is needed for Cilium's use case, and emphasize, through the name and shape of the RPC and in the documentation, that the intention of this API is for a delegatee to obtain the SVID+bundle materials on a per-workload basis. I also worry that I could personally yak-shave this thing to death trying to find the optimal shape that covers all use cases, so I think we should just settle on whatever makes sense for Cilium for now. If that means a shape identical to the Workload API, so be it. If at a future point we determine a better API, we can always introduce additional RPCs or a V2 API.

@mauriciovasquezbernal (Contributor, author)

I have been experimenting a bit more with the definition of this API, the different optimization ideas, and the Cilium needs. I think the Cilium needs and the optimization ideas fit together well.

what is needed for cilium's use case

For the Cilium use case we need:

  1. Get SVIDs (SPIFFE ID, cert + key) for each pod on the cluster
  2. Get bundles (including federated ones)
  3. An efficient way to watch SVIDs for multiple pods (up to hundreds)

I think points 1 and 2 are not that difficult; all the different shapes of the API we have discussed include this information. I wanted to avoid discussing point 3, but I think it's better to have that discussion now because it could also affect the definition of this API.

I fear having a single gRPC stream for each workload won't scale well, and we'd need to "multiplex" updates for multiple workloads in a single stream. (Any input from your previous experience on this point is more than welcome.)

We'd then need to define an API that sends updates for multiple workloads on the same stream and that allows the client to add and remove "watched" workloads. After some thinking I have the following idea in mind:

service Privileged {
   //... other rpcs not important for this discussion

   // Fetch X.509-SVIDs for multiple workloads. 
   rpc FetchX509SVIDs(stream FetchX509SVIDsRequest) returns (stream FetchX509SVIDsResponse);
}

// FetchX509SVIDsRequest is used by the clients to modify the set of workloads
// that they receive SVIDs for.
message FetchX509SVIDsRequest {
    enum Operation {
        UNSPEC = 0;
        // Add a new workload to the watched set.
        ADD = 1;
        // Delete a workload from the watched set.
        DEL = 2;
    }

    // Operation to perform. Add/delete a workload to/from the watched set.
    Operation operation = 1;

    // Workload's identifier. The meaning of this field depends on the
    // operation:
    // ADD: The server will include this id in the messages containing updates
    // for this workload.
    // DEL: The id of the workload to remove from the watched set.
    uint64 id = 2;

    // Selectors of the workload to add to the watched set. Only used when
    // operation is ADD.
    repeated spire.api.types.Selector selectors = 3;
}

message FetchX509SVIDsResponse {
    // Id of the workload. This value corresponds to the one used when adding
    // the workload to the watched set.
    uint64 id = 1;

    repeated X509SVIDWithKey x509_svids = 2;
}

The RPC uses bidirectional streaming to allow the client to add / remove workloads. It's needed in the Cilium case because the workloads to watch change when pods are created or deleted. The id field is an optimization to avoid sending all the selectors in each reply.

I'm not sure if it makes sense. What do you think about it? Do you have any different ideas on how to implement a scalable way to watch multiple workloads? Is it feasible to implement in SPIRE? Another possibility is to continue with the one-stream-per-workload RPC and see how far we can get with that before looking for a more optimized way...
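The id-based multiplexing proposed above implies a small demux step on the client: each streamed response is routed back to the workload the client registered under that id when it sent the ADD operation. A minimal sketch with hypothetical Go types (not generated gRPC code):

```go
package main

import "fmt"

// fetchX509SVIDsResponse mirrors the shape of the proposed
// FetchX509SVIDsResponse message (hypothetical, hand-written).
type fetchX509SVIDsResponse struct {
	ID        uint64
	X509SVIDs [][]byte // stand-in for the X509SVIDWithKey messages
}

// demux maps a streamed response back to the workload the client
// registered under that id via an ADD request.
func demux(watched map[uint64]string, resp fetchX509SVIDsResponse) (string, error) {
	workload, ok := watched[resp.ID]
	if !ok {
		return "", fmt.Errorf("update for unknown id %d", resp.ID)
	}
	return workload, nil
}

func main() {
	// The client assigned id 1 to pod-a and id 2 to pod-b when subscribing.
	watched := map[uint64]string{1: "pod-a", 2: "pod-b"}
	w, err := demux(watched, fetchX509SVIDsResponse{ID: 2})
	if err != nil {
		panic(err)
	}
	fmt.Println("update is for", w) // update is for pod-b
}
```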

@evan2645 (Member)

Ok, this is good ... I feel like we are making some progress. What I'm hearing is:

  • We want to scale efficiently to hundreds of pod watches
  • We are ok with more complicated stream semantics in order to do it efficiently

I think the single stream, with subscription control messages in one direction and multi-tenant SVID data in the other, makes sense for this goal... I also think that we will want a separate stream for bundles. Some considerations:

  • We don't want local bundle update to trigger SVID update message for all subscribed pods
  • We want to ensure that any message we send to the client represents an actual change in state for the referenced thing

The consequence will be that clients will have some codepaths for SVID management and bundle management, and those paths may be separate. Here's an idea that iterates on @mauriciovasquezbernal's latest example

service Privileged {
   // Fetch X.509-SVIDs for multiple workloads. 
   rpc FetchX509SVIDs(stream FetchX509SVIDsRequest) returns (stream FetchX509SVIDsResponse);

  // Fetch local and federated bundles
  rpc FetchX509Bundles(empty) returns (stream FetchX509BundlesResponse);
}

// FetchX509SVIDsRequest is used by the clients to modify the set of workloads
// that they receive SVIDs for.
message FetchX509SVIDsRequest {
    enum Operation {
        UNSPEC = 0;
        // Add a new workload to the watched set.
        ADD = 1;
        // Delete a workload from the watched set.
        DEL = 2;
    }

    // Operation to perform. Add/delete a workload to/from the watched set.
    Operation operation = 1;

    // Workload's identifier. The meaning of this field depends on the
    // operation:
    // ADD: The server will include this id in the messages containing updates
    // for this workload.
    // DEL: The id of the workload to remove from the watched set.
    uint64 id = 2;

    // Selectors of the workload to add to the watched set. Only used when
    // operation is ADD.
    repeated spire.api.types.Selector selectors = 3;
}

message FetchX509SVIDsResponse {
    // Id of the workload. This value corresponds to the one used when adding
    // the workload to the watched set.
    uint64 id = 1;

    repeated X509SVIDWithKey x509_svids = 2;
}

message X509SVIDWithKey {
    spire.api.types.X509SVID x509_svid = 1;
    bytes x509_svid_key = 2;

    string spiffe_id = 3;
    string trust_domain = 4;
    repeated string federates_with = 5;
}

message FetchX509BundlesResponse {
    string trust_domain_name = 1;
    bytes bundle = 2;
}

The consumer calls ADD/DEL on the fetch-SVID stream to manage its subscriptions. The consumer also attaches to the fetch-bundles stream to receive updates on all bundles the agent has. Bundles are sent one per message (this lets us send an update for a single bundle without retransmitting all the rest). X509SVIDWithKey carries a repeated string that lists the trust domain names the workload should federate with. The consumer manages the lookup against data received from the bundle stream.

There are some smaller gotchas in here, like the id field you added to track state from the consumer side, and some typing-related stuff (and naming of the messages)... mostly curious to hear thoughts on the overall shape and interaction pattern of an approach like the above, given everything we've discussed.

@azdagron (Member)

How does a caller know when it can drop a previously received bundle from the state it is caching? Relying on the aggregated federates_with fields of the X509SVIDWithKey response could be racy and a little chicken-and-egg like.

@evan2645 (Member)

How does a caller know when it can drop a previously received bundle from the state it is caching? Relying on the aggregated federates_with fields of the X509SVIDWithKey response could be racy and a little chicken-and-egg like.

Hmm, that is a good question... In general, I expected the client to track all bundles that the agent tracks. Do we have mechanisms for bundle redaction in agent core? We could send an update for the trust domain name with an empty bundle? Open to other ideas :) I'm having a hard time coming up with an efficient answer that includes bundle data in the SVID response...

@evan2645 (Member)

The other option is to package all bundle data in one message, but still keep it separate from the SVID message(s).

@azdagron (Member)

I think the agent implementation is already going to require some level of "diffing" old state vs. new on the agent side. We could probably send a deletion event... an empty bundle field in the response sounds as good as anything.
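From the client's side, the convention discussed here (an empty bundle field signalling removal) keeps the bundle cache logic trivial. A hypothetical Go sketch, assuming the per-bundle message shape proposed earlier in the thread:

```go
package main

import "fmt"

// applyBundleUpdate maintains the client's bundle cache from per-bundle
// stream messages. An empty bundle field is treated as a deletion event
// for that trust domain (the convention discussed above). Hypothetical
// sketch, not SPIRE code.
func applyBundleUpdate(cache map[string][]byte, trustDomain string, bundle []byte) {
	if len(bundle) == 0 {
		delete(cache, trustDomain)
		return
	}
	cache[trustDomain] = bundle
}

func main() {
	cache := map[string][]byte{}
	applyBundleUpdate(cache, "federated.org", []byte("der-bytes"))
	fmt.Println(len(cache)) // 1

	// The agent diffed its state and sent a deletion event.
	applyBundleUpdate(cache, "federated.org", nil)
	fmt.Println(len(cache)) // 0
}
```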

@evan2645 (Member)

@mauriciovasquezbernal curious to hear if you have any concerns about handling SVID and bundle data separately? It will push some complexity onto the client, but I think the end result will be much more efficient than the proposals we had that included bundle and SVID data together.

@mauriciovasquezbernal (Contributor, author)

like the id field you added to track state from consumer side

The goal here is to avoid sending the full selectors each time. I also considered the case where the server creates this ID field; however, I found it more difficult because the server would have to tell the client which ID corresponds to a given set of selectors before sending any updates for them. Another idea that could work is to use a hash of the selectors instead. @evan2645 any comment?

Relying on the aggregated federates_with fields of the X509SVIDWithKey response could be racy and a little chicken-and-egg like.

Yeah, this could be a little difficult on the client side, for instance when there is an SVID that federates with a trust domain that the client hasn't received the bundle for yet. IMO it's something the clients could live with anyway.

@mauriciovasquezbernal curious to hear if you have any concerns about handling svid and bundle data separately? It will push some complexity on to the client, but the end result I think will be much more efficient than the proposals we had that include bundle and svid data together

@evan2645 I don't have any concerns so far. I think it makes a lot more sense now that we'd be able to watch multiple SVIDs from the same client. My concern is actually more about the SPIRE-side implementation; I blindly trust your assessment there.

I'll update this PR so we can discuss directly on the code.

@mauriciovasquezbernal force-pushed the mauricio-add-privileged-api-proto branch 2 times, most recently from 13845f5 to b9ccd1f on June 24, 2021 14:47
Add proto definitions for new delegation API. This API includes two
RPCs: FetchX509SVIDs and FetchX509Bundles.

The FetchX509SVIDs RPC allows a privileged client to get X509-SVIDs
for a given set of workloads. This method uses a bidirectional gRPC
stream. The server-to-client flow is used to send SVID updates and
the client-to-server flow is used as a control mechanism to add /
remove workloads from the watched set.

The client can add workloads to the watched set by specifying a list of
the workload's selectors. X509-SVIDs for registration entries matching
*all* the passed selectors are returned.

The FetchX509Bundles RPC streams all bundles (the federated ones and
the trust domain's own) to the client.

Signed-off-by: Mauricio Vásquez <[email protected]>
Signed-off-by: Mauricio Vásquez <[email protected]>
@mauriciovasquezbernal (Contributor, author)

Hi @azdagron @evan2645. I have been working with the AccuKnox folks (@nyrahul @navarrothiago @rscampos) on this PR and they are going to take over the task. I'll try to participate in the discussions as much as possible. Thanks so much for the help and keep rocking with this project!

@evan2645 (Member) left a comment

Hey @mauriciovasquezbernal - this is looking pretty great. I left a handful of comments around naming and encoding. Many of the questions are focused on making sure we have the behavior, encoding, etc. documented as proto comments. I'd like to get @azdagron's thoughts as he is better at this kind of thing than I am :)

Also wanted to apologize for the delay the last couple of weeks. A US holiday plus the SPIRE 1.0 release tied us up for quite a while.

I have been working with the AccuKnox folks (@nyrahul @navarrothiago @rscampos) on this PR and they are going to take over the task. I'll try to participate in the discussions as much as possible. Thanks so much for the help and keep rocking with this project!

Thank you very much for everything thus far @mauriciovasquezbernal. It has been very nice working with you.

And also nice to meet you @nyrahul @navarrothiago and @rscampos. Perhaps you'd like to have a quick introductory call to get things handed off?

// clients on the spire-agent configuration.
service Delegation {
// Fetch X.509-SVIDs for multiple workloads.
rpc FetchX509SVIDs(stream FetchX509SVIDsRequest) returns (stream FetchX509SVIDsResponse);
Member:

Since we have a significant departure from the Workload API here, perhaps a different RPC name makes sense?

To me, this RPC has a sessionized feel to it.

SubscribeToX509SVIDs?
StartSessionX509SVID?
ReceiveX509SVIDs?
AttachX509SVIDStream?

The bundle RPC would be updated as well to match e.g. AttachBundleStream

Member:

Out of those, I'm leaning towards SubscribeToX509SVIDs the most I think?

//
// The caller must be local and its identity must be listed in the allowed
// clients on the spire-agent configuration.
service Delegation {
Member:

I am not sure of the idiomatic naming here ... is it Delegate or Delegation? @azdagron do you have a preference?

Maybe DelegateSVIDIssuance or DelegateIssuance or DelegateWorkloadIdentity? I do think the term "delegate" is appropriate here

Member:

Sticking with nouns is important, so I'd rather see "Delegated*" than "Delegate*". Even though this isn't quite shaped like our other resource-oriented APIs, I wonder if a resource-oriented name makes sense. Maybe DelegatedIdentity?

Contributor:

Very good, we think DelegatedIdentity is more explicit, thanks :D

// - ADD: The server will include this id in the messages containing updates
// for this workload.
// - DEL: The id of the workload to remove from the watched set.
uint64 id = 2;
Member:

The more I think about this field, the more it feels like we're building a multiplexed stream on top of a multiplexed stream (i.e. our custom scheme on top of HTTP/2). I can't help but wonder if there are other options that utilize this facility at a lower level rather than duplicate.

Putting that aside, I think this field name could be a bit more descriptive like workload_identifier or selector_set_id. Is the field optional?

@navarrothiago commented Jul 15, 2021:

Regarding the first line, about building a multiplexed stream: do you think it would be better to split FetchX509SVIDs? Doing so, the new shape would contain the following RPCs:

    rpc SubscribeToX509SVIDs(stream SubscribeToX509SVIDsRequest) returns (SubscribeToX509SVIDsResponse);
    rpc UnsubscribeToX509SVIDs(stream UnsubscribeToX509SVIDsRequest) returns (UnsubscribeToX509SVIDsResponse);
    rpc AttachX509SVIDsStream(AttachX509SVIDsStreamRequest) returns (stream AttachX509SVIDsStreamResponse);

The SubscribeToX509SVIDs would be the same as FetchX509SVIDs (operation=ADD)
The UnsubscribeToX509SVIDs would be the same as FetchX509SVIDs (operation=DEL)
The AttachX509SVIDsStream would be the FetchX509SVIDs, but without the operations.

This way, we'd just remove the operation field from AttachX509SVIDsStream (the old FetchX509SVIDs API).

If we just want to subscribe to a pod, we use SubscribeToX509SVIDs. If we just want to fetch the certificate, we use AttachX509SVIDsStream. This way we can separate all the logic.

Contributor (author):

The more I think about this field, the more it feels like we're building a multiplexed stream on top of a multiplexed stream (i.e. our custom scheme on top of HTTP/2). I can't help but wonder if there are other options that utilize this facility at a lower level rather than duplicate.

I understand your point. I tried to think about that but didn't find a solution. Perhaps the input of a gRPC / HTTP/2 expert could help here.

Is the field optional?

I don't think it should be optional. If we make it optional, how would the client be able to establish the response-to-workload relationship?

I think we have two options regarding this field:

  1. Keep it mandatory and specified by the client, as it is now in the PR.
  2. Remove it completely and replace it with something like selectors_hash. It would serve the same purpose (establishing the response-to-workload relationship) but would be calculated from the set of selectors instead of being specified by the client. We'd have to define how to calculate the hash of a set of selectors, of course.
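Option 2 above (a selectors_hash) could be computed order-independently, for example by sorting a canonical encoding of the selectors before hashing. This is only a sketch under an assumed encoding; the real scheme would have to be agreed on by client and server:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// selector is a hand-written stand-in for spire.api.types.Selector.
type selector struct{ Type, Value string }

// selectorsHash derives an order-independent identifier for a selector
// set: encode each selector unambiguously, sort, then hash. The NUL/SOH
// separators are an assumed convention, not a SPIRE-defined one.
func selectorsHash(sels []selector) string {
	keys := make([]string, 0, len(sels))
	for _, s := range sels {
		keys = append(keys, s.Type+"\x00"+s.Value)
	}
	sort.Strings(keys)
	sum := sha256.Sum256([]byte(strings.Join(keys, "\x01")))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := selectorsHash([]selector{{"k8s", "pod-uid:1"}, {"unix", "uid:0"}})
	b := selectorsHash([]selector{{"unix", "uid:0"}, {"k8s", "pod-uid:1"}})
	fmt.Println(a == b) // true: the ordering of the selectors does not matter
}
```

Both sides can then derive the same id for a subscription without the client having to pick and transmit one.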

Contributor (author):

    rpc SubscribeToX509SVIDs(stream SubscribeToX509SVIDsRequest) returns (SubscribeToX509SVIDsResponse);
    rpc UnsubscribeToX509SVIDs(stream UnsubscribeToX509SVIDsRequest) returns (UnsubscribeToX509SVIDsResponse);
    rpc AttachX509SVIDsStream(AttachX509SVIDsStreamRequest) returns (stream AttachX509SVIDsStreamResponse);

It should work fine too. I kept everything in a single stream because of some possible race conditions, but now that the id (workload_identifier / selector_set_id) is set by the client, those are not relevant anymore.

@azdagron (Member) commented Jul 19, 2021:

I think we should strive very hard to bind the "scope" of the subscription to an individual RPC, otherwise it puts the burden of cleanup on the caller, which could go away at any time. We don't want the server holding onto a subscription because it doesn't know that the caller disappeared.

I think the choice here is either 1) have callers start a stream per subscription or 2) have callers aggregate all subscriptions to a single stream, which is how it is currently defined. Both have tradeoffs. I think the only thing you really save with the aggregate approach is maybe some resource overhead imposed by the individual gRPC streams. I don't have a strong preference. From the server implementation perspective, option 1 seems a little easier to implement. Option 2 at first glance seemed easier to implement from a client perspective, but I think it still requires the client to implement code to demux responses and manage subscriptions. It isn't a clear win for me. I a-little-more-than-slightly prefer option 1, but could be convinced to stick with option 2. I also think we could easily change this up later once we learn more. We don't need to get it perfect now.

Contributor:

I don't know if we got your point. When you say individual RPC, do you mean just one RPC to do all the functionality (the way it's done today: subscribe/unsubscribe/get X.509 cert)? In that case, we'd just use the SubscribeToX509SVIDs RPC for everything (per your comment above about the name of the RPC).

Or having three different streams, one for each RPC: SubscribeToX509SVIDs, UnsubscribeToX509SVIDs, and AttachX509SVIDsStream?

Member:

Ah, sorry I was confusing :)

What I mean is that the lifetime of a subscription should be tied to an invocation of a gRPC method. If we split the individual RPC we have now into a Subscribe and Unsubscribe RPC, then we have to handle clients never calling Unsubscribe, which complicates the server needlessly. If we only have a single RPC that they interact with, then the lifetime of the subscription naturally aligns to the lifetime of the stream; when the stream is closed, all of the subscriptions opened by that stream can be cleaned up.

Considering all of that, I'm a fan of a single RPC. I slightly prefer having a single subscription for each invocation instead of multiplexing multiple subscriptions over the same gRPC stream, but like I mentioned above, I can live with the latter. I don't see how it really reduces code complexity in the client that much, but maybe it saves per-stream gRPC overhead.
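The lifetime guarantee described above (subscriptions die with the stream, so nothing leaks when a client disappears) can be sketched server-side with a per-stream registry and a deferred cleanup. All names here are hypothetical, not the SPIRE agent's actual implementation:

```go
package main

import "fmt"

// registry tracks which subscription ids each open stream created.
type registry struct {
	subs map[string][]uint64 // stream id -> subscription ids
}

func (r *registry) add(stream string, id uint64) {
	r.subs[stream] = append(r.subs[stream], id)
}

func (r *registry) dropStream(stream string) {
	delete(r.subs, stream)
}

// handleStream simulates a single-RPC stream handler: subscriptions
// opened during the call are released when the handler returns, even if
// the client never sends an explicit unsubscribe.
func handleStream(r *registry, stream string, ids []uint64) {
	defer r.dropStream(stream)
	for _, id := range ids {
		r.add(stream, id)
	}
	// ... serve updates until the stream closes ...
}

func main() {
	r := &registry{subs: map[string][]uint64{}}
	handleStream(r, "stream-1", []uint64{1, 2, 3})
	fmt.Println(len(r.subs)) // 0: nothing leaks after the stream closes
}
```

With separate Subscribe/Unsubscribe RPCs there is no equivalent scope to defer on, which is the cleanup burden being discussed.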

Contributor:

No problem @azdagron, I think I got it now :)

So for now we just keep SubscribeToX509SVIDs and do everything in only one gRPC. In addition, our current PoC today sends all requests over a single stream; doing so, it's possible to do the subscribe process for multiple clients, informing the id of the client. Would that be OK?

// empty bundle indicates that it was removed.
message FetchX509BundlesResponse {
string trust_domain_name = 1;
bytes bundle = 2;
Member:

Bundle encoding and format?

I think the Workload API does DER-encoded X.509 certs here, which would probably be fine.

Should it be repeated bytes ca_certificates or something? The term "bundle" is pretty overloaded, but it still feels appropriate as the message name.

Contributor:

We think repeated bytes ca_certificates would be better, as it's more descriptive.

Member:

I also like repeated here.

// X.509 SPIFFE Verifiable Identity Document with the private key.
message X509SVIDWithKey {
spire.api.types.X509SVID x509_svid = 1;
bytes x509_svid_key = 2;
Member:

Key encoding? I recommend DER PKCS#8

Contributor:

Are you saying we should define this encoding (DER PKCS#8) in the comment?

Member:

Yes, I think that is what @evan2645 intends. Otherwise it isn't immediately clear to callers how to parse the result.

spire.api.types.X509SVID x509_svid = 1;
bytes x509_svid_key = 2;

repeated string federates_with = 3;
Member:

Some more description here would be good

Suggested change
repeated string federates_with = 3;
// Names of the trust domains that this workload should federate with
repeated string federates_with = 3;

@azdagron (Member) left a comment

One more general comment: once we finalize the field names here, we should go through and add the Required. prefix to the field comments for required fields. It makes it easy to see at a glance what is minimally expected in each request/response. See the other API descriptions for examples.

@rscampos (Contributor)

Guys, we just sent a new PR, #8, addressing all the considerations discussed here :D

@evan2645 (Member) commented Aug 9, 2021

Going to go ahead and close this PR in favor of #8.

Thank you @rscampos and @navarrothiago for preserving @mauriciovasquezbernal's authorship.

And special thank you to @mauriciovasquezbernal for kicking this off and getting everyone to a point where we're almost ready to go 😄 🚀

@evan2645 evan2645 closed this Aug 9, 2021
guilhermocc pushed a commit to guilhermocc/spire-api-sdk that referenced this pull request Apr 10, 2023
Successfully merging this pull request may close these issues:

  • Create a protobuf for a new privileged API on SPIRE Agent