This repository has been archived by the owner on Feb 24, 2021. It is now read-only.

feat(security): Deploy new security-bootstrapper service #372

jim-wang-intel
Contributor

@jim-wang-intel jim-wang-intel commented Jan 5, 2021

Docker-compose deploys with a new security-bootstrapper service which controls the security bootstrapping steps for various phases.
The details are summarized in ADR secure bootstrapping.

Signed-off-by: Jim Wang [email protected]

PR Checklist

Please check if your PR fulfills the following requirements:

  • [ ] Tests for the changes have been added (for bug fixes / features)
  • [x] Docs have been added / updated (for bug fixes / features)

If your build fails due to your commit message not passing the build checks, please review the guidelines here: https://github.com/edgexfoundry/developer-scripts/blob/master/.github/Contributing.md.

What is the current behavior?

There is currently no security-bootstrapper service.

Issue Number: #349

What is the new behavior?

New security-bootstrapper service with service-gating mechanism in installation phase.

Does this PR introduce a breaking change?

  • [ ] Yes
  • [x] No

Specific Instructions

Are there any specific instructions or things that should be known prior to reviewing?
In draft mode: this won't function properly until edgex-go's implementation PR is merged into the master branch.

Other information

Pending until the PR that gets the ARM64 tests working is resolved: edgexfoundry/edgex-go#3082

Collaborator

@bnevis-i bnevis-i left a comment

LGTM

@jim-wang-intel jim-wang-intel force-pushed the add-new-security-bootstrapper branch 3 times, most recently from 860089c to 35c0917 Compare January 12, 2021 22:02
@jim-wang-intel jim-wang-intel force-pushed the add-new-security-bootstrapper branch from 35c0917 to 3730bc9 Compare January 13, 2021 23:04
@lenny-goodell
Member

@jim-wang-intel, I merged my PR. Please rebase and address any changes needed in the new partial compose files for services that are expected to run in secure mode.

@jim-wang-intel jim-wang-intel force-pushed the add-new-security-bootstrapper branch 2 times, most recently from ce08efb to 30748b5 Compare January 20, 2021 21:04
compose-builder/add-security.yml (3 outdated review threads, resolved)
@jim-wang-intel jim-wang-intel force-pushed the add-new-security-bootstrapper branch from ef02557 to 32f731f Compare January 21, 2021 18:31
Member

@lenny-goodell lenny-goodell left a comment

Looks good, just one minor name suggestion.

compose-builder/README.md (outdated review thread, resolved)
@jim-wang-intel jim-wang-intel force-pushed the add-new-security-bootstrapper branch from 32f731f to 89cfe0e Compare January 21, 2021 21:51
@jim-wang-intel jim-wang-intel marked this pull request as ready for review January 21, 2021 22:35
Member

@lenny-goodell lenny-goodell left a comment

LGTM, but want to test in my environment first.

@jim-wang-intel
Contributor Author

LGTM, but want to test in my environment first.

ok

@cherrycl

@jim-wang-intel I found the log below on edgex-security-bootstrapper when running the TAF tests. It seems to be trying to connect to Consul. Does it need a depends_on entry for Consul?

level=INFO ts=2021-01-22T09:40:23.436782771Z app=edgex-security-bootstrapper source=client.go:50 msg="TCP server edgex-core-consul:54324 may be not ready yet, retry in 1 second"

When running the TAF tests, I found that edgex-proxy-setup cannot connect to edgex-security-bootstrapper in time, because edgex-security-bootstrapper finishes starting up after the edgex-proxy-setup timeout has expired.
That also causes the TAF script to fail.

@jim-wang-intel
Contributor Author

jim-wang-intel commented Jan 22, 2021

@jim-wang-intel I found the log below on edgex-security-bootstrapper when running the TAF tests. It seems to be trying to connect to Consul. Does it need a depends_on entry for Consul?
level=INFO ts=2021-01-22T09:40:23.436782771Z app=edgex-security-bootstrapper source=client.go:50 msg="TCP server edgex-core-consul:54324 may be not ready yet, retry in 1 second"

@cherrycl That is expected. The essence of security-bootstrapper is to gate the right startup sequence for services. So in this particular case, it waits for the Consul service to be ready.
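
The retry behaviour visible in that log line can be sketched roughly as follows. This is only an illustration in shell, not the actual security-bootstrapper code; the helper name `wait_until` and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of security-bootstrapper: retry a probe
# command about once per second until it succeeds or the retry budget
# (given in seconds) is used up.
# usage: wait_until <timeout-seconds> <command> [args...]
wait_until() {
    timeout="$1"
    shift
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "gave up after ${timeout}s: $*" >&2
            return 1
        fi
        echo "not ready yet, retry in 1 second: $*" >&2
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

For example, `wait_until 60 nc -z edgex-core-consul 54324` would poll the port from the log for about 60 seconds, assuming an `nc` that supports `-z` is available in the container.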

When running the TAF tests, I found that edgex-proxy-setup cannot connect to edgex-security-bootstrapper in time, because edgex-security-bootstrapper finishes starting up after the edgex-proxy-setup timeout has expired.
That also causes the TAF script to fail.

@cherrycl I'm not sure what this test case is. edgex-proxy-setup is gated until edgex-security-bootstrapper reaches the ready-to-run state. To reach that state, all security-related services have to be up and running. These include vault, secretstore-setup, consul, the redis database, and kong-db. If the gating time is not enough for these services in the TAF tests, we can change the value of the env var SECTY_BOOTSTRAP_GATING_TIMEOUT_DURATION in the common-sec-stage-gate.env file, where it is currently set to 60s.

Also, edgex-proxy-setup waits for Kong to start up as well, because it uses Kong to set up the API gateway route.
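
As a sketch of raising the gating timeout mentioned above, the helper below rewrites one KEY=VALUE line in a local copy of an env file. The function name `set_env_var` and the 120s value are illustrative assumptions; only the variable name, the file name, and the 60s setting come from this discussion.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in an env file, replacing an existing
# assignment or appending one if the key is absent.
# Assumes simple values that contain no '|' or '&' characters.
# usage: set_env_var <file> <key> <value>
set_env_var() {
    file="$1" key="$2" value="$3"
    if grep -q "^${key}=" "$file"; then
        # Rewrite through a temp file so this works with any POSIX sed.
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" &&
            mv "${file}.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

It could be called as `set_env_var common-sec-stage-gate.env SECTY_BOOTSTRAP_GATING_TIMEOUT_DURATION 120s` before bringing the stack up.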

@lenny-goodell
Member

@cherrycl, I think the root cause may be that the TAF scripts use docker run to start each service individually, rather than using docker-compose -p edgex -f <compose-file> up as the compose files are designed for. This may be causing the delays and failures you are seeing. It also requires the TAF scripts to know the service names explicitly, which creates maintenance work whenever the names change, as we are seeing now.
https://github.com/edgexfoundry/edgex-taf/blob/master/TAF/utils/scripts/docker/deploy-edgex.sh
https://github.com/edgexfoundry/edgex-taf/blob/master/TAF/utils/scripts/docker/deploy-services.sh
https://github.com/edgexfoundry/edgex-taf/blob/master/TAF/utils/scripts/docker/deploy-device-service.sh

Can't the TAF scripts just depend on docker-compose -p edgex -f <compose-file> up to start all the required services based on the appropriate compose file that was downloaded?

@cherrycl

@lenny-intel May we keep the current deploy steps? The security test has recently been failing because of the following vault-worker error. We found that this PR seems to fix the error. Is this PR ready to merge?
/usr/local/bin/entrypoint.sh: line 39: /edgex-init/security-bootstrapper: not found

I will investigate starting all services at once in TAF later.

@lenny-goodell
Member

recheck

@lenny-goodell
Member

We found this PR seems to fix the error. Is this PR ready to merge?

Yes, once it passes smoke tests.

I will investigate starting all services at once in TAF later.

Yes, that is fine.

@lenny-goodell
Member

recheck

@jim-wang-intel
Contributor Author

Hi @cherrycl, the TAF tests still failed, probably due to the recent V2 DTO event changes. Please update that, and hopefully the TAF tests will all pass.

@cherrycl

@jim-wang-intel Ginny is helping with the app-service failure.
But the ARM security test also failed. Vault failed to start and produced the following log.

Script for waiting security bootstrapping on Vault
Tue Jan 26 03:15:50 UTC 2021 VAULT_LOCAL_CONFIG: 
listener "tcp" { 
               address = "edgex-vault:8200" 
               tls_disable = "1" 
               cluster_address = "edgex-vault:8201" 
           } 
          backend "file" {
               path = "/vault/file"
           } 
           default_lease_ttl = "168h" 
           max_lease_ttl = "720h"
 
Tue Jan 26 03:15:50 UTC 2021 Executing dockerize on vault server with waiting on     tcp://edgex-security-bootstrapper:54321
/edgex-init/dockerize: line 1: syntax error: unexpected word (expecting ")")

Do you have any idea what causes the error?

@cherrycl

@jim-wang-intel The failed test for app-service is fixed by edgex-taf PR #282.

@lenny-goodell
Member

recheck

@jim-wang-intel
Contributor Author

@jim-wang-intel Ginny is helping with the app-service failure.
But the ARM security test also failed. Vault failed to start and produced the following log.

Script for waiting security bootstrapping on Vault
Tue Jan 26 03:15:50 UTC 2021 VAULT_LOCAL_CONFIG: 
listener "tcp" { 
               address = "edgex-vault:8200" 
               tls_disable = "1" 
               cluster_address = "edgex-vault:8201" 
           } 
          backend "file" {
               path = "/vault/file"
           } 
           default_lease_ttl = "168h" 
           max_lease_ttl = "720h"
 
Tue Jan 26 03:15:50 UTC 2021 Executing dockerize on vault server with waiting on     tcp://edgex-security-bootstrapper:54321
/edgex-init/dockerize: line 1: syntax error: unexpected word (expecting ")")

Do you have any idea what causes the error?

Hi @cherrycl, yes, this is a defect: the current Dockerfile of security-bootstrapper only supports the amd64 architecture. We will add ARM64 support to that Dockerfile. Thanks!
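
The "syntax error: unexpected word" message is consistent with a shell trying to interpret a binary built for another architecture as a script. One way to check this, sketched below under assumptions, is to read the target architecture from the ELF header of the file. The function name `elf_arch` is hypothetical and the check is not part of the EdgeX images.

```shell
#!/bin/sh
# Hypothetical diagnostic: print the CPU architecture an ELF binary targets.
# Reads the 2-byte e_machine field at offset 18 of the ELF header.
# Assumes a little-endian ELF (true for amd64 and arm64 Linux builds) and
# does not validate the ELF magic bytes.
# usage: elf_arch <path-to-binary>
elf_arch() {
    # od prints the two bytes as decimal values, low byte first.
    set -- $(od -An -tu1 -j18 -N2 "$1")
    case "$(( ${1:-0} + ${2:-0} * 256 ))" in
        62)  echo amd64 ;;    # EM_X86_64
        183) echo arm64 ;;    # EM_AARCH64
        *)   echo unknown ;;
    esac
}
```

Running `elf_arch /edgex-init/dockerize` inside the failing container and comparing the result with `uname -m` (which reports aarch64 on ARM64 hosts and x86_64 on amd64 hosts) would show whether the binary matches the host.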

@lenny-goodell
Member

recheck

@lenny-goodell
Member

recheck

@cherrycl

cherrycl commented Jan 27, 2021

@jim-wang-intel Thanks. The vault error no longer occurs, but kong shows a similar error and fails to start.

Script for waiting security bootstrapping on Kong
Wed Jan 27 01:04:06 UTC 2021 Executing dockerize with waiting on tcp://edgex-security-bootstrapper:54329
/edgex-init/kong_wait_install.sh: line 31: /edgex-init/dockerize: No such file or directory

@jim-wang-intel
Contributor Author

@jim-wang-intel Thanks. The vault error no longer occurs, but kong shows a similar error and fails to start.

Script for waiting security bootstrapping on Kong
Wed Jan 27 01:04:06 UTC 2021 Executing dockerize with waiting on tcp://edgex-security-bootstrapper:54329
/edgex-init/kong_wait_install.sh: line 31: /edgex-init/dockerize: No such file or directory

@cherrycl This is a different type of error, and a very weird one. /edgex-init/dockerize should be there since the volume mount is shared. Not sure why this is happening.

@jim-wang-intel
Contributor Author

@cherrycl Maybe the way the TAF tests use docker run creates a timing issue among the containers with respect to the Docker volume they share. I think the better way to resolve this may be @lenny-intel's suggestion: use docker-compose up instead of individual docker run commands. The original docker run approach was put in place because the startup sequence of several containers was not guarded in the past, but now security-bootstrapper enforces the right startup sequence.

@jinlinGuan
Contributor

@jim-wang-intel I got the same error when running docker-compose -f docker-compose-nexus-arm64.yml up -d.

@jim-wang-intel
Contributor Author

recheck

Docker-compose deploys with a new security-bootstrapper service which controls the security bootstrapping steps for various phases.
The details are summarized in ADR secure bootstrapping.

Add command env addition and command overrides for core-services

Now the environment vars for security-stage-gate are in an env file and are overridden when necessary.

Update asc-http-export-secure and asc-mqtt-export-secure to be gated by security-bootstrapper.

Added common-sec-stage-gate.env description

Updated some typos in the document

Standardize the security-bootstrapper env file and naming of some docker containers
eg vault-worker -> secretstore-setup, edgex-proxy -> proxy-setup

Use the name common-sec-stage-gate.env as the env file for security-bootstrapper

Fix the problem of missing common.env and env quotation

Standardize the naming for appservice to app-service

Closes: edgexfoundry#349, edgexfoundry#237

Signed-off-by: Jim Wang <[email protected]>
@jim-wang-intel jim-wang-intel force-pushed the add-new-security-bootstrapper branch from 0012330 to 0cd276f Compare January 27, 2021 19:09
@jim-wang-intel
Contributor Author

jim-wang-intel commented Jan 27, 2021

Rebased and squashed. Ready to be merged once the build is done.

Labels
3-high priority denoting release-blocking issues ireland security-services
5 participants