This repository has been archived by the owner on Oct 9, 2023. It is now read-only.
Commit
* Scale out with propeller manager and workflow sharding (#351)
* added 'manager' command
  Signed-off-by: Daniel Rammer <[email protected]>
* using go routine and timer for manager loop
  Signed-off-by: Daniel Rammer <[email protected]>
* moved manager loop out of cmd and into pkg directory
  Signed-off-by: Daniel Rammer <[email protected]>
* detecting missing replicas
  Signed-off-by: Daniel Rammer <[email protected]>
* moved extracting replica from pod name to new function
  Signed-off-by: Daniel Rammer <[email protected]>
* creating managed flytepropeller pods
  Signed-off-by: Daniel Rammer <[email protected]>
* refactored configuration
  Signed-off-by: Daniel Rammer <[email protected]>
* removed regex parsing for replica - checking for existence with fully qualified pod name
  Signed-off-by: Daniel Rammer <[email protected]>
* mocked out shard strategy abstraction
  Signed-off-by: Daniel Rammer <[email protected]>
* adding arguments to podspec for ConsistentHashingShardStrategy
  Signed-off-by: Daniel Rammer <[email protected]>
* updated import naming
  Signed-off-by: Daniel Rammer <[email protected]>
* moved manager to a top-level package
  Signed-off-by: Daniel Rammer <[email protected]>
* added shard strategy to manager configuration
  Signed-off-by: Daniel Rammer <[email protected]>
* setting shard key label selector on managed propeller instances
  Signed-off-by: Daniel Rammer <[email protected]>
* fixed random lint issues
  Signed-off-by: Daniel Rammer <[email protected]>
* split pod name generation into a separate function to ease future auto-scaler implementation
  Signed-off-by: Daniel Rammer <[email protected]>
* cleaned up pod label selector
  Signed-off-by: Daniel Rammer <[email protected]>
* delete pods on shutdown
  Signed-off-by: Daniel Rammer <[email protected]>
* added prometheus metric reporting
  Signed-off-by: Daniel Rammer <[email protected]>
* updated manager run loop to use k8s wait.UntilWithContext
  Signed-off-by: Daniel Rammer <[email protected]>
* moved getKubeConfig into a shared package
  Signed-off-by: Daniel Rammer <[email protected]>
* assigning shard and namespace labels on FlyteWorkflow
  Signed-off-by: Daniel Rammer <[email protected]>
* implement NamespaceShardStrategy
  Signed-off-by: Daniel Rammer <[email protected]>
* implemented NamespaceShardStrategy
  Signed-off-by: Daniel Rammer <[email protected]>
* fixed shard label
  Signed-off-by: Daniel Rammer <[email protected]>
* added comments
  Signed-off-by: Daniel Rammer <[email protected]>
* checking for existing pods on startup
  Signed-off-by: Daniel Rammer <[email protected]>
* handling delete of non-existent pod
  Signed-off-by: Daniel Rammer <[email protected]>
* changed ConsistentHashing name to Random - because that's what it really is
  Signed-off-by: Daniel Rammer <[email protected]>
* implemented EnableUncoveredReplica configuration option
  Signed-off-by: Daniel Rammer <[email protected]>
* added leader election to manager using existing propeller config
  Signed-off-by: Daniel Rammer <[email protected]>
* fixed disable leader election in managed propeller pods
  Signed-off-by: Daniel Rammer <[email protected]>
* removed listPods function
  Signed-off-by: Daniel Rammer <[email protected]>
* added leader election to mitigate concurrent modification issues
  Signed-off-by: Daniel Rammer <[email protected]>
* enabled pprof to profile resource metrics
  Signed-off-by: Daniel Rammer <[email protected]>
* added 'manager' target to Makefile to start manager in development mode (similar to existing server)
  Signed-off-by: Daniel Rammer <[email protected]>
* added shard strategy test for computing key ranges
  Signed-off-by: Daniel Rammer <[email protected]>
* fixed key range computation
  Signed-off-by: Daniel Rammer <[email protected]>
* implemented project and domain shard types
  Signed-off-by: Daniel Rammer <[email protected]>
* returning error on out of range podIndex during UpdatePodSpec call on shard strategy
  Signed-off-by: Daniel Rammer <[email protected]>
* fixed random lint issues
  Signed-off-by: Daniel Rammer <[email protected]>
* added manager tests
  Signed-off-by: Daniel Rammer <[email protected]>
* fixed lint issues
  Signed-off-by: Daniel Rammer <[email protected]>
* added doc comments on exported types and functions
  Signed-off-by: Daniel Rammer <[email protected]>
* exporting ComputeKeyRange function and renamed addLabelSelector to addLabelSelectorIfExists to better reflect functionality
  Signed-off-by: Daniel Rammer <[email protected]>
* adding pod template resource version and shard config hash annotations to fuel automatic pod management on updates
  Signed-off-by: Daniel Rammer <[email protected]>
* removed pod deletion on manager shutdown
  Signed-off-by: Daniel Rammer <[email protected]>
* cleaned up unit tests and lint
  Signed-off-by: Daniel Rammer <[email protected]>
* updated getContainer function to retrieve flytepropeller container from pod spec using container name instead of command
  Signed-off-by: Daniel Rammer <[email protected]>
* removed addLabelSelectorIfExists function call
  Signed-off-by: Daniel Rammer <[email protected]>
* changed bytes.Buffer from a var to declaring with new
  Signed-off-by: Daniel Rammer <[email protected]>
* created a new shardstrategy package
  Signed-off-by: Daniel Rammer <[email protected]>
* generating mocks for ShardStrategy to decouple manager package tests from shardstrategy package tests
  Signed-off-by: Daniel Rammer <[email protected]>
* fixed lint issues
  Signed-off-by: Daniel Rammer <[email protected]>
* changed shard configuration definitions and added support for wildcard id in EnvironmentShardStrategy
  Signed-off-by: Daniel Rammer <[email protected]>
* updated documentation
  Signed-off-by: Daniel Rammer <[email protected]>
* fixed lint issues
  Signed-off-by: Daniel Rammer <[email protected]>
* setting managed pod owner references
  Signed-off-by: Daniel Rammer <[email protected]>
* updated documentation
  Signed-off-by: Daniel Rammer <[email protected]>
* fixed a few nits
  Signed-off-by: Daniel Rammer <[email protected]>
* delete pods with failed state
  Signed-off-by: Daniel Rammer <[email protected]>
* changed ShardType type to int instead of string
  Signed-off-by: Daniel Rammer <[email protected]>
* removed default values in manager config
  Signed-off-by: Daniel Rammer <[email protected]>
* updated config_flags with pflags generation
  Signed-off-by: Daniel Rammer <[email protected]>
  Signed-off-by: Haytham Abuelfutuh <[email protected]>
* Create codeql-analysis.yml
  Signed-off-by: Haytham Abuelfutuh <[email protected]>
* Handle code quality issue
  Signed-off-by: Haytham Abuelfutuh <[email protected]>
* check boundaries
  Signed-off-by: Haytham Abuelfutuh <[email protected]>
* 0 is ok
  Signed-off-by: Haytham Abuelfutuh <[email protected]>
* Use ParseUint instead
  Signed-off-by: Haytham Abuelfutuh <[email protected]>
* bump for DCO
  Signed-off-by: Haytham Abuelfutuh <[email protected]>

Co-authored-by: Dan Rammer <[email protected]>
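Several entries above mention computing key ranges for shards and returning an error for an out-of-range podIndex. The commit message does not show how that is done, so the following is only a minimal sketch of one plausible approach: splitting a keyspace into contiguous ranges, one per pod. The names `keyRange`, `splitKeyspace`, and `rangeForPod` are hypothetical and are not taken from flytepropeller; the actual `ComputeKeyRange` may behave differently.

```go
package main

import (
	"errors"
	"fmt"
)

// keyRange is a half-open interval [Start, End) of shard keys.
// Hypothetical type for illustration only.
type keyRange struct {
	Start, End int
}

// splitKeyspace divides keys 0..keyspaceSize-1 into shardCount contiguous
// ranges whose sizes differ by at most one key.
func splitKeyspace(keyspaceSize, shardCount int) ([]keyRange, error) {
	if shardCount <= 0 || keyspaceSize < shardCount {
		return nil, errors.New("shardCount must be between 1 and keyspaceSize")
	}
	base, extra := keyspaceSize/shardCount, keyspaceSize%shardCount
	ranges := make([]keyRange, 0, shardCount)
	start := 0
	for i := 0; i < shardCount; i++ {
		size := base
		if i < extra {
			size++ // the first `extra` shards absorb the remainder
		}
		ranges = append(ranges, keyRange{Start: start, End: start + size})
		start += size
	}
	return ranges, nil
}

// rangeForPod returns the key range owned by podIndex, or an error when the
// index does not correspond to any shard.
func rangeForPod(ranges []keyRange, podIndex int) (keyRange, error) {
	if podIndex < 0 || podIndex >= len(ranges) {
		return keyRange{}, fmt.Errorf("pod index %d out of range [0, %d)", podIndex, len(ranges))
	}
	return ranges[podIndex], nil
}

func main() {
	ranges, err := splitKeyspace(32, 3)
	if err != nil {
		panic(err)
	}
	for i := range ranges {
		r, _ := rangeForPod(ranges, i)
		fmt.Printf("pod %d owns keys [%d, %d)\n", i, r.Start, r.End)
	}
}
```

Using half-open ranges keeps adjacent shards from overlapping and makes it easy to check that the ranges cover the whole keyspace exactly once.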