component startup ordering #23

Open
xiang90 opened this issue Jul 3, 2019 · 5 comments

Comments

@xiang90
Contributor

xiang90 commented Jul 3, 2019

The components in the operationalConfiguration are a flat array right now. In some cases, we might want to express dependencies between components. Component A might need to start before Component B, or else Component B might fail.

We could work around this problem by blindly retrying during the components' startup phase, but ideally the ordering requirement would be expressed explicitly.
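To make the idea of explicit ordering concrete, here is a minimal sketch of how a runtime might derive a start order from declared dependencies. Everything in it is an assumption for illustration: the spec has no dependency field today, and the `depends_on` mapping, the component names, and the `startup_order` function are hypothetical.

```python
from collections import deque


def startup_order(depends_on):
    """Return component names so that each one follows its dependencies.

    `depends_on` is a hypothetical mapping from a component name to the
    names that must be started before it. Raises ValueError on a cycle.
    """
    # Collect every name, including ones that only appear as a dependency.
    names = set(depends_on)
    for deps in depends_on.values():
        names.update(deps)

    # For each component: which dependencies are still not started.
    remaining = {name: set(depends_on.get(name, ())) for name in names}
    # Reverse edges: which components wait on a given component.
    dependents = {name: [] for name in names}
    for name, deps in remaining.items():
        for dep in deps:
            dependents[dep].append(name)

    # Start with components that depend on nothing (sorted for stable output).
    ready = deque(sorted(n for n, deps in remaining.items() if not deps))
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for child in sorted(dependents[current]):
            remaining[child].discard(current)
            if not remaining[child]:
                ready.append(child)

    # Anything left unscheduled is part of a dependency cycle.
    if len(order) != len(names):
        raise ValueError("dependency cycle detected")
    return order


print(startup_order({"frontend": ["api"], "api": ["database"], "database": []}))
```

With the example mapping, the database comes first, then the API, then the frontend. A cycle is rejected rather than silently ordered, which is one reason an explicit declaration can be easier to reason about than retries alone.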

@vturecek
Member

vturecek commented Jul 9, 2019

It might help to break this down into two scenarios:

  1. The first is a situation where I need a start-up task to do some initialization before other components can run. This would be a run-to-completion task that needs to finish before creating any other components.

  2. The second is a component dependency, where component A must be up and running for component B to work. In other words, component B is useless without component A.

The first scenario reminds me of environment setup tasks that require elevated privileges. I don't think the application model should define this. There is a separate role defined for these types of tasks, the "infrastructure operator", who is responsible for preparing the operating environment using the native language of the platform that the application runs on. This keeps infrastructure-related operations out of the application model and lets operators use the underlying platform's RBAC mechanism as well.

For the second scenario, startup order isn't going to help, and could even be harmful if application code depends on it. Startup ordering won't guarantee execution timing and readiness once components are started, and a component can always fail, time out, or restart at any time after the initial deployment. Applications should always expect that a dependent component may be unavailable at any time.

That said, there is an opportunity to provide something better than blind retries in application code without the need to carefully order startup operations. A combination of health probes and service mesh functionality should be able to provide the kind of fault tolerance needed.
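As a rough illustration of the probe half of that combination, the sketch below polls a readiness check with exponential back-off instead of retrying at a fixed rate. It is not taken from any platform or service mesh API: `wait_until_ready`, its parameters, and the default delays are all hypothetical, and `probe` stands in for whatever health check the platform exposes.

```python
import time


def wait_until_ready(probe, attempts=5, base_delay=0.5, max_delay=8.0,
                     sleep=time.sleep):
    """Poll `probe` until it returns True, backing off between tries.

    Returns True once the probe passes, or False after `attempts`
    failed tries. `sleep` is injectable so the function can be tested
    without real waiting.
    """
    delay = base_delay
    for attempt in range(attempts):
        if probe():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay)
            delay = min(delay * 2, max_delay)
    return False
```

Because the probe can be re-run at any time, the same check covers a dependency that restarts after the initial deployment, which startup ordering alone would not handle.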

@xiang90
Contributor Author

xiang90 commented Jul 9, 2019

> For the second scenario, startup order isn't going to help, and could even be harmful if application code depends on it. Startup ordering won't guarantee execution timing and readiness once components are started, and a component can always fail, time out, or restart at any time after the initial deployment. Applications should always expect that a dependent component may be unavailable at any time.

Applications usually still need retries to be reliable. But expressing the order explicitly solves the blind-retry problem: a retry is triggered only when a real failure happens, which dramatically reduces startup time for applications that contain many components (we have some use cases).

@vturecek
Member

vturecek commented Jul 9, 2019

I suspect we'd have to go a few steps further than just startup ordering to significantly reduce the chance of a communication failure during deployment and upgrade, and even then, it would only reduce the chance of failure, not eliminate it. I think a combination of readiness probes and traffic management traits (retries and back-offs) could handle this in a broader and safer manner.

To my mind, reducing application startup time is a more compelling reason to have ordering. What are some of the use cases you've found where the startup time is reduced dramatically?

@resouer
Member

resouer commented Jul 28, 2019

> For the second scenario, startup order isn't going to help, and could even be harmful if application code depends on it. Startup ordering won't guarantee execution timing and readiness once components are started, and a component can always fail, time out, or restart at any time after the initial deployment. Applications should always expect that a dependent component may be unavailable at any time.

@vturecek It reads like we are mixing "cause" with "effect" here.

Start order is a hint that gives a Component a way to learn the application topology, if it cares.

Once a Component has this information, it can check the readiness gates of its dependencies, retry with back-off, or do whatever else its implementation requires.

But before any of that, the Component needs a way to know whether it should start before or after some other Component. That is what's missing in the current spec.
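A small sketch of what treating the order as a hint could look like from the Component's side: it reads its declared dependencies and asks which of them are not ready yet, then decides for itself what to do about it. The `depends_on` mapping, the `is_ready` callback, and the component names are hypothetical and do not come from the spec.

```python
def unready_dependencies(component, depends_on, is_ready):
    """Return the declared dependencies of `component` that are not ready.

    `depends_on` is a hypothetical mapping of component name to the names
    it should start after; `is_ready` is any callable that reports whether
    a named component currently passes its readiness check.
    """
    return [dep for dep in depends_on.get(component, []) if not is_ready(dep)]


# Example: the API declares two dependencies, and only one is up so far.
topology = {"api": ["database", "cache"]}
ready_now = {"database"}
print(unready_dependencies("api", topology, ready_now.__contains__))
```

The declaration only supplies the topology. Whether the Component waits, retries, or starts in a degraded mode remains its own implementation detail.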

@technosophos
Contributor

I am not convinced that introducing sophisticated dependency diagrams into Hydra as a first-class concept is a good idea. I don't think the purpose of Hydra is to create an alternative to Terraform or Ansible. It's to define an application model that encourages following specific practices for cloud native and microservice development.

I think it is reasonable to do a sequential release of N components. I think anything beyond that is outside of the scope of the tool, and should be accomplished by using multiple operational configurations and an external tool like Ansible or Terraform.
