Identify the action items for adapting SageMaker containers in Flyte #453
Comments
While there is no way to explicitly inject an env var into an SM job through SM's k8s operator, we can inject a variable into a standalone training job, or into the training job underlying an HPO job, by implicitly injecting it via the For example,
SageMaker writes a summarized map of hyperparameters and values (including the variable you want to inject) to the path /opt/ml/input/config/hyperparameters.json inside your container. Its wrapper script then parses that file and passes the hyperparameters to the user script as command-line arguments. See the following example log from SageMaker:
So you only need to prepare your |
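To make the mechanism above concrete, here is a minimal sketch of what a wrapper could do with that file: read the JSON map and flatten it into command-line arguments. This is not SageMaker's actual wrapper code; the function names and the `--name value` flag style are assumptions for illustration. Only the file path comes from the comment above.

```python
import json
from pathlib import Path

# Path where SageMaker places the hyperparameter map inside the container.
HYPERPARAMETERS_PATH = Path("/opt/ml/input/config/hyperparameters.json")


def load_hyperparameters(path=HYPERPARAMETERS_PATH):
    """Return the hyperparameter map, or an empty dict if the file is absent."""
    if not path.exists():
        return {}
    with path.open() as f:
        return json.load(f)


def to_cli_args(hyperparameters):
    """Flatten {"name": "value"} into ["--name", "value", ...].

    The "--name value" style is an assumption; keys are sorted only to make
    the resulting argument order deterministic.
    """
    args = []
    for name, value in sorted(hyperparameters.items()):
        args.extend([f"--{name}", str(value)])
    return args


if __name__ == "__main__":
    # Outside a SageMaker container the file is missing, so this prints [].
    print(to_cli_args(load_hyperparameters()))
```

Because the injected variable travels in the same map as the real hyperparameters, the user script sees it as just another argument and has to pick it out by name.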
Related PRs:
https://github.com/lyft/flyteplayground/pull/61
flyteorg/flytekit#156
Verify we can pass Env Vars to the SM Job when submitting it. (We can either set the SAGEMAKER_PROGRAM env var to a generated script for the SM Task inside the image, or to a generic flytekit-provided script that then reads another flyte-specific env var to know which python func to run.)
How do we reconcile Spark Entrypoint with this behavior?
Looks like this might be easy because SM no longer requires an entrypoint to be set.
How do we reconcile FlyteKit commands with this behavior? Related: Implement pyflyte-exec-alternative as the alternative entrypoints for SageMaker Custom Training tasks #479
SM will invoke docker run <image> train
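As a rough illustration of the "generic flytekit-provided script" option in the first item, the sketch below reads a flyte-specific env var and dispatches to the Python function it names. The variable name `FLYTE_SM_TARGET` and the `module:function` format are hypothetical, invented for this example; neither flytekit nor SageMaker defines them, and this is not the existing pyflyte entrypoint.

```python
import importlib
import os
import sys

# Hypothetical env var name; not defined by flytekit or SageMaker.
TARGET_ENV_VAR = "FLYTE_SM_TARGET"


def resolve_target(spec):
    """Resolve a "package.module:function" string into a callable."""
    module_name, sep, func_name = spec.partition(":")
    if not sep or not module_name or not func_name:
        raise ValueError(f"expected 'module:function', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


def main(argv=None, environ=None):
    """Look up the target named in the env var and call it with argv."""
    environ = os.environ if environ is None else environ
    argv = sys.argv[1:] if argv is None else argv
    spec = environ.get(TARGET_ENV_VAR)
    if spec is None:
        raise SystemExit(f"{TARGET_ENV_VAR} is not set")
    return resolve_target(spec)(*argv)


# Example call with an explicit environment instead of the real one:
# main(argv=["a", "b"], environ={TARGET_ENV_VAR: "posixpath:join"})
```

With this shape, one script could serve every SM Task in the image, and only the env var would change per task. The generated-script alternative would trade that indirection for one file per task.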