Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identify the action items for adapting SageMaker containers in Flyte #453

Closed
4 tasks done
bnsblue opened this issue Aug 5, 2020 · 1 comment
Closed
4 tasks done
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@bnsblue
Copy link
Contributor

bnsblue commented Aug 5, 2020

Related PRs:
https://github.com/lyft/flyteplayground/pull/61
flyteorg/flytekit#156

  • Verify we can pass Env Vars to the SM Job when submitting it. (We can either set SAGEMAKER_PROGRAM env var to a generated script for the SM Task inside the image or to a generic flytekit-provided script that then reads another flyte- specific env var to know which python func to run)

  • How do we reconcile Spark Entrypoint with this behavior
    Looks like this might be easy because SM No longer requires an entrypoint to be set.

  • How do we reconcile FlyteKit commands with this behavior. Related: Implement pyflyte-exec-alternative as the alternative entrypoints for SageMaker Custom Training tasks #479

SM will invoke docker run <image> train

  • flyte_sagemaker_runner.py as the entry point to pass through the pyflyte execute command
@EngHabu EngHabu added this to the 0.7.0 milestone Aug 5, 2020
@bnsblue bnsblue added the enhancement New feature or request label Aug 5, 2020
@bnsblue
Copy link
Contributor Author

bnsblue commented Aug 10, 2020

While there's no way to explicitly inject an env var to a SM job via SM's k8s operator, we can inject a variable to a standalone training job or a training job underlying an hpo job by implicitly injecting it via the hyperParameters field of the TrainingJob CRD or the trainingJobDefinition.staticHyperParameters field of the HyperparameterTuningJob CRD.

For example,

...
 trainingJobDefinition:
    staticHyperParameters:
      - name: __FLYTE_ENTRYPOINT_SELECTOR__
        value: "SAGEMAKER"
...

What SageMaker does is that it will put a summarized map of hyperparameters and values (which includes the variable you want to inject) to the path /opt/ml/input/config/hyperparameters.json inside your container, and their wrapper script parses that file and passes the hyperparameters to the user script as command-line arguments. See the following example log from SageMaker:

SM_HP___FLYTE_ENTRYPOINT_SELECTOR__=SAGEMAKER
...

Invoking script with the following command:
/usr/bin/python3 flyte_entrypoint_selector.py --__FLYTE_ENTRYPOINT_SELECTOR__ SAGEMAKER

So you only need to prepare your entrypoint_selector script in a way that it is ready takes the injected variable from the command line.

@bnsblue bnsblue assigned EngHabu and unassigned bnsblue and kumare3 Aug 11, 2020
@kumare3 kumare3 modified the milestones: 0.7.0, 0.8.0 Sep 2, 2020
@EngHabu EngHabu closed this as completed Sep 2, 2020
@EngHabu EngHabu modified the milestones: 0.8.0, 0.7.0 Sep 2, 2020
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Dec 6, 2022
* Add deck_uri in NodeExecutionEvent

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* lint fix

Signed-off-by: Kevin Su <[email protected]>

* Updated idl

Signed-off-by: Kevin Su <[email protected]>

* wip

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* Fix tests

Signed-off-by: Kevin Su <[email protected]>

* Fix tests

Signed-off-by: Kevin Su <[email protected]>

* Fix tests

Signed-off-by: Kevin Su <[email protected]>

* Fix tests

Signed-off-by: Kevin Su <[email protected]>

* Fix tests

Signed-off-by: Kevin Su <[email protected]>

* Updated idl

Signed-off-by: Kevin Su <[email protected]>

* address comment

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* updates

Signed-off-by: Kevin Su <[email protected]>

* updates

Signed-off-by: Kevin Su <[email protected]>

* More cleanup

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* lint

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* PR comments

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Unit tests

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* lint

Signed-off-by: Haytham Abuelfutuh <[email protected]>

Co-authored-by: Kevin Su <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Dec 6, 2022
Signed-off-by: Flyte-Bot <[email protected]>

Co-authored-by: flyte-bot <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Dec 20, 2022
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Aug 9, 2023
* Add deck_uri in NodeExecutionEvent

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* lint fix

Signed-off-by: Kevin Su <[email protected]>

* Updated idl

Signed-off-by: Kevin Su <[email protected]>

* wip

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* Fix tests

Signed-off-by: Kevin Su <[email protected]>

* Fix tests

Signed-off-by: Kevin Su <[email protected]>

* Fix tests

Signed-off-by: Kevin Su <[email protected]>

* Fix tests

Signed-off-by: Kevin Su <[email protected]>

* Fix tests

Signed-off-by: Kevin Su <[email protected]>

* Updated idl

Signed-off-by: Kevin Su <[email protected]>

* address comment

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* updates

Signed-off-by: Kevin Su <[email protected]>

* updates

Signed-off-by: Kevin Su <[email protected]>

* More cleanup

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* lint

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* PR comments

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Unit tests

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* lint

Signed-off-by: Haytham Abuelfutuh <[email protected]>

Co-authored-by: Kevin Su <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Aug 21, 2023
Signed-off-by: Flyte-Bot <[email protected]>

Co-authored-by: flyte-bot <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Apr 30, 2024
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Apr 30, 2024
austin362667 pushed a commit to austin362667/flyte that referenced this issue May 7, 2024
robert-ulbrich-mercedes-benz pushed a commit to robert-ulbrich-mercedes-benz/flyte that referenced this issue Jul 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants