Add scale to SPI and tests #234
Conversation
 * @param id the app deployment id, as returned by {@link #deploy}
 * @param replicas the desired number of replicas
 */
void scale(String id, int replicas);
@markpollack suggests void scale(String deploymentId, int desiredCount)
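For reference, a minimal sketch of how that suggested signature would look on the deployer SPI; the interface name and the surrounding methods are assumptions here, not part of this diff:

public interface AppDeployer {

    // existing deploy/undeploy/status methods omitted for brevity

    /**
     * Scale the application identified by the given deployment id
     * to the desired number of instances.
     */
    void scale(String deploymentId, int desiredCount);
}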
I wonder if we should support (instead, or in addition) a +/- N option... otherwise it's the user's responsibility to know the current state before calling this, and that seems to create a race condition.
@markfisher I'm not sure I follow. If the user wants to scale, don't they already know the current count? This follows the protocol of k8s and PCF: just tell me how many instances you want.
@markfisher - I agree with your suggestion now that we are revisiting this, especially for autoscaling use cases.
I'm not so sure about my own suggestion after reading this: https://hackernoon.com/level-triggering-and-reconciliation-in-kubernetes-1f17fe30333d
Requesting a new count is the easiest way to get started, since you don't get race conditions. We had scaling in our good ol' YARN deployer, and at least in the Hadoop world you had to set a new count.
Kubernetes is the same (as described in the edge vs level triggering article I posted above), and that avoids the race conditions which are exacerbated by the fact that scaling can take some time, during which multiple requests for changes may occur.
True, a new count is idempotent.
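A quick sketch of that point, assuming deployer is the AppDeployer instance carrying the scale method from this PR (the relative "+1" variant mentioned in the comments is hypothetical and shown only for contrast):

void illustrateIdempotence(AppDeployer deployer, String deploymentId) {
    // Absolute count: repeating the call converges to the same state,
    // so retries and concurrent callers requesting the same level are harmless.
    deployer.scale(deploymentId, 4);
    deployer.scale(deploymentId, 4);

    // A relative "+1" style API (not in this PR) would force the caller to read
    // the current count first; two concurrent callers can both observe 3 instances
    // and both request 4, losing one of the intended increments.
}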
@markfisher, thanks for sharing this article! Imo, we should stick with the level approach. It would be enough if the scale function is idempotent - usually available OOTB in the underlying platform such as k8s.
Avoid keeping state (e.g. changes relative to the previous state) or trying to be too smart. It is impossible to protect against all concurrency or stability issues from within the scale function alone. I used to be good at tuning PID controllers and solving Laplace transforms to stabilize control systems. It's all gone now, but it makes me aware that one cannot avoid stability issues without considering all components in the control loop (in our case, the auto-scaling chain).
So keep scale simple, without integral or derivative functionality ;) If needed, this can be added externally.
Having said this, we can still have some configurable guards, such as a max-instance count.
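One possible shape for such a guard, as a sketch only: BoundedScaler and the maxInstances property are made up here, and the point is just that the guard stays stateless and only clamps the requested level before delegating to the SPI.

import org.springframework.cloud.deployer.spi.app.AppDeployer;

public class BoundedScaler {

    private final AppDeployer deployer;
    private final int maxInstances;

    public BoundedScaler(AppDeployer deployer, int maxInstances) {
        this.deployer = deployer;
        this.maxInstances = maxInstances;
    }

    public void scale(String deploymentId, int desiredCount) {
        // Clamp the requested level to [0, maxInstances]; no history, no derivative logic.
        deployer.scale(deploymentId, Math.min(Math.max(desiredCount, 0), maxInstances));
    }
}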
@dturanski, here is a snippet from my s1p demo. A web hook is configured to receive 2 types of (json) alerts coming from a Prometheus AlertManager: (1) ThroughtputDeltaHigh - the throughput between the source and the processor (e.g. fraud-detection) apps is higher than a given threshold; (2) ThroughtputDeltaZero - the throughputs of the source and the processor apps are the same.
When the delta is high, the fraud-detection processor is scaled to 4 instances, and reduced back to 1 otherwise. I believe this is aligned with the level approach: we express the desired state/level and let the platform reach it eventually. Most of the PID control logic is done through the AlertManager configuration.
The scale API looks like scale(streamName, appName, count).
@RequestMapping(value = "/alert", method = RequestMethod.POST)
public void alertMessage(@RequestBody List<Map<String, Object>> alerts) {
    for (Map<String, Object> alert : alerts) {
        // Extract status, alert name and stream name from the alert payload;
        // the exact keys ("status", "labels", "alertname", "stream") are
        // assumed here for illustration of the demo's payload layout.
        String alertStatus = (String) alert.get("status");
        Map<String, Object> labels = (Map<String, Object>) alert.get("labels");
        String alertName = (String) labels.get("alertname");
        String streamName = (String) labels.get("stream");
        if ("firing".equals(alertStatus)) {
            if ("ThroughtputDeltaHigh".equals(alertName)) {
                // High throughput delta: scale the processor out to 4 instances.
                kubeScaleService.scale(streamName, "fraud-detection", 4);
            } else if ("ThroughtputDeltaZero".equals(alertName)) {
                // Delta is back to zero: scale down to a single instance.
                kubeScaleService.scale(streamName, "fraud-detection", 1);
            }
        }
    }
}
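KubeScaleService is specific to the demo above, but a minimal sketch of how such a service could sit on top of the SPI method from this PR might look like the following; the mapping of streamName and appName to a deployment id is an assumption for illustration, since in practice the id is whatever AppDeployer#deploy returned for the app.

public class KubeScaleService {

    private final AppDeployer deployer;

    public KubeScaleService(AppDeployer deployer) {
        this.deployer = deployer;
    }

    public void scale(String streamName, String appName, int count) {
        // Hypothetical naming convention for the deployment id, used only as an example.
        String deploymentId = streamName + "." + appName;
        deployer.scale(deploymentId, count);
    }
}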
We will revisit this in the context of Spinnaker integration in the future.
Resolves #62