Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement or refactor the SageMaker flyteplugin to enable Custom Training Job #451

Closed
6 tasks done
bnsblue opened this issue Aug 5, 2020 · 0 comments
Closed
6 tasks done
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@bnsblue
Copy link
Contributor

bnsblue commented Aug 5, 2020

Related PRs

flyteorg/flyteplugins#113

- [ ] Inject an env var into hyperparameter to tell the selector script which entrypoint to choose

  • Inject the entire command passed in from Flytekit as one env var
  • Change the image lookup logic to take runnable container into consideration
  • Change the Output Path prefix to RawOutputPrefix
  • Build a separate CustomTrainingJob plugin, in which we need to process each literal in the input literal map, and inject inputs properly: inject blob-like inputs to the input channels, and inject others as command-line arguments
    (We currently do not know how properly pass complicated data type such as Generic (structs or maps) to SageMaker. An easy way might be using JSON string and let pyflyte-execute-alt use JSON parser when the input is a Generic type.)
  • Let Flytekit take care of the output handling for now. Do not write anything to the output channel (i.e., do not write to the /opt/ml/model path in the container)
  • Testing end2end

Handling the flyte container's args and environment

The only way to pass non-blob-type inputs to our container via SageMaker is to use the hyperparameter field of the SageMaker CRD. With that, the inputs will be passed in as command-line arguments. That means we have to pass in the container's args and environment via command line.

To pass in

env1=val1 env2=val2 service_venv pyflyte-execute --task-module blah --task-name bloh --output-prefix s3://fake-bucket --inputs s3://fake-bucket

Our flyteplugin need to rewrite it into and our runner script needs to parse it from the following format:

 --__FLYTE_ENV_VAR_env1__ val1 --__FLYTE_ENV_VAR_env2__ val2
 --__FLYTE_CMD_0_service_venv__ __FLYTE_CMD_DUMMY_VALUE__
 --__FLYTE_CMD_1_pyflyte-execute__ __FLYTE_CMD_DUMMY_VALUE__
 --__FLYTE_CMD_2_--task-module__ __FLYTE_CMD_DUMMY_VALUE__
 --__FLYTE_CMD_3_blah__ __FLYTE_CMD_DUMMY_VALUE__
 --__FLYTE_CMD_4_--task-name__ __FLYTE_CMD_DUMMY_VALUE__
 --__FLYTE_CMD_5_bloh__ __FLYTE_CMD_DUMMY_VALUE__
 --__FLYTE_CMD_6_--output-prefix__ __FLYTE_CMD_DUMMY_VALUE__
 --__FLYTE_CMD_7_s3://fake-bucket__ __FLYTE_CMD_DUMMY_VALUE__
 --__FLYTE_CMD_8_--inputs__ __FLYTE_CMD_DUMMY_VALUE__
 --__FLYTE_CMD_9_s3://fake-bucket__ __FLYTE_CMD_DUMMY_VALUE__

We added the prefix and suffix in order to prepare for the future hyperparameter jobs.

@EngHabu EngHabu added this to the 0.7.0 milestone Aug 5, 2020
@bnsblue bnsblue added the enhancement New feature or request label Aug 5, 2020
@kumare3 kumare3 modified the milestones: 0.7.0, 0.8.0 Sep 2, 2020
@bnsblue bnsblue closed this as completed Sep 3, 2020
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Dec 6, 2022
* updated flytestdlib

Signed-off-by: Daniel Rammer <[email protected]>

* set flytestdlib to new release

Signed-off-by: Daniel Rammer <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Dec 6, 2022
…yteorg#451)

Setting different values in multiple parts of the configs applied resulted in the least specific spec being applied
Added extra tests verifying mainly interruptible override fallbacks including application configuration

Signed-off-by: Nick Müller <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Dec 20, 2022
…#451)

* chore(tsc): exclude test/stories/mock files only from build

Signed-off-by: Nastya Rusina <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Aug 9, 2023
* updated flytestdlib

Signed-off-by: Daniel Rammer <[email protected]>

* set flytestdlib to new release

Signed-off-by: Daniel Rammer <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Aug 21, 2023
…yteorg#451)

Setting different values in multiple parts of the configs applied resulted in the least specific spec being applied
Added extra tests verifying mainly interruptible override fallbacks including application configuration

Signed-off-by: Nick Müller <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants