-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(adr): initial ADR to address service list #377
Conversation
Listing requirements, considerations, and alternative implementation considerations around having a common service list for all EdgeX services. Signed-off-by: Jim White <[email protected]>
Co-authored-by: Bryon Nevis <[email protected]> Signed-off-by: Jim White <[email protected]>
Co-authored-by: Bryon Nevis <[email protected]> Signed-off-by: Jim White <[email protected]>
Co-authored-by: Bryon Nevis <[email protected]> Signed-off-by: Jim White <[email protected]>
Co-authored-by: Bryon Nevis <[email protected]> Signed-off-by: Jim White <[email protected]>
Co-authored-by: Bryon Nevis <[email protected]> Signed-off-by: Jim White <[email protected]>
I would like to mention a possible solution to the dynamic token generation problem that might go well with this dynamic service list issue. There is an existing CNCF incubation project out there called SPIFEE/SPIRE. https://github.com/spiffe/spire This project used Unix domain sockets to authenticate the calling process and issue them credentials. It does this by looking up the remote end's process ID, and collecting metadata on it. So, the path and sha of the executable for a Linux process, a docker image id for a process in a Docker container, or pod names or other metadata for a kubernetes pod. This data is looked up in a central table that is used to make an authorization decision. If a match is found, a JWT or X.509 certificate is delivered to the calling process. This technology would be effective in replacing the static list of secret store tokens with a file-based mechanism to authorize new services on-demand. It would also get us 90% there for EdgeX microservice authentication. The catch? There is no support for Windows. |
Would WLS2 in Windows consider support for Windows? If so, then we should be ok. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments added inline...
design proposal stemming from the 5/17 Monthly Archs meeting Also cleaned up and added content to requirements and context sections based off additional review feedback Signed-off-by: Jim White <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: jpwhitemn <[email protected]>
cad080b
- The user interface needs to know which services, specifically device and application services, are available in order to provide interaction with those services, and potentially to display some metrics on operational services. | ||
- The CLI could refer to a dynamic list of which services are operational and which device services and application services to include in start/stop operations. *Note: Alternatively, the CLI could simply issue a command to a service and if the service is not available, issue an error response. However, if a new service was added (such as a new device service), the CLI would not know about it unless it had access to some dynamic list of services.* | ||
- Security proxy setup requires a list of services for which a reverse proxy route should be established. This is a subset of the full list of EdgeX services. | ||
- Core Metadata must maintain a list of device-type services. This list is used to validate device additions (i.e., any new device definition must include a valid named device service which it is going to be associated with). This list is also used when routing commands (via core command) to the appropriate device service. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposed delayed start services ADR depends on service keys hard-coded into the docker-compose file in order to build a mapping from service to SPIFFE ID. The ADR also distinguishes service-name from service-key, in the case where there are multiple instances of the same executable executing within the system, personized by configuration arguments. The service-name list can be added-to by end-users to create new services.
Currently, the list of services is haphazardly managed by the needing service. | ||
|
||
- The security bootstrapping services use a combination of static lists for known EdgeX services and environment variables for services that are not statically known. (This list is needed for both Vault and Consul to setup appropriate access control lists.) | ||
- The user interface and CLI do not deal with service lists today and assume all services are operational (with known service host and port defined in configuration). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the UI now using the new Registry GetAllServiceEndpoints() API to get all registered services?
- In some cases, a service may be “disabled” at runtime because of resource constraints (example: notification service is turned off as it is not used and there is a need to operate with limited memory and CPU capability). Security would still need to know about disabled services in order to create a token for such services that may be enabled later. However, the other services (or 3rd party applications) may need to know that they are disabled vs enabled. | ||
- In the case that the services are distributed (especially when device services run on different hosts - as is likely), access to a distributed file system or volume mount becomes more of an issue and requires additional technology. *Note: distributing device services while in secure mode is now more complex due to the fact that all services require Vault access and the services get their Vault token from file. Distributed device services in secure mode would therefore would require a distributed file system.* | ||
- The service list would have to be a list of services and service locations (URI). Therefore, the service list will really be a map of service names (keys) and the location of the services URIs (values) and potentially other information such as `known secrets` as used by the secret store setup (`redisdb` is the only `known secret` to date). This will help support things like the API gateway setup as well. In some cases, just a list of the service names is needed (in security token setup for example). In this latter case, the service would just pick off the service keys. | ||
- Generation of security tokens for the new services under most of the conceived alternatives is complex. In the current security bootstrapping, secret store setup will require a restart of all the services in order to add or remove a service. The ORRA project and some adopters are clear that they would like to have the ability to add/remove services dynamically at runtime. *Note: in some circumstances, services can be added/removed dynamically today, but not when EdgeX security services are enabled. CLI and UI also do not allow for the addition/remove of services - such as device or application services.* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now out of date with recent enhancements. Secret Store setup can now be restarted w/o requiring the other service to be restarted.
Per the Monthly Architect's meeting of 10/18/21 - we are rejecting this ADR for now.
|
Listing requirements, considerations, and alternative implementation considerations around having a common service list for all EdgeX services.
Signed-off-by: Jim White [email protected]
PR Checklist
Please check if your PR fulfills the following requirements:
If your build fails due to your commit message not passing the build checks, please review the guidelines here: https://github.com/edgexfoundry/edgex-docs/blob/master/.github/Contributing.md.
What is the current behavior?
n/a - new ADR for service list across all services
Issue Number:
none
What is the new behavior?
n/a
Does this PR introduce a breaking change?
Specific Instructions
Are there any specific instructions or things that should be known prior to reviewing?
Other information