[Question] Dynamic parsing without .parse_known_args() #133
It would be nicer if you took inspiration from LightningCLI rather than from the old way that used plain argparse. In particular I don't like the add_argparse_arguments() methods.
This is what subcommands are for, and I don't really get your concern. Subcommands are a common approach for many command line tools, so much so that your CLI's users will most likely need less effort to learn it if you follow the subcommand approach than if you do something different. To me it seems an unfounded concern to think that the subcommand will be overlooked; this mostly depends on your design and on what names you give to the subcommands. Take for example git, which works like git <subcommand> [options].

If you do use subcommands, your example with jsonargparse would be convenient to structure as follows:

from jsonargparse import ArgumentParser
class A:
    def __init__(self, op0: int = 1, op1: str = 'op1 default'):
        ...

class B:
    def __init__(self, op0: float = 0.2, op2: int = 3, op3: bool = False):
        ...

def main():
    parser_a = ArgumentParser()
    parser_a.add_class_arguments(A)

    parser_b = ArgumentParser()
    parser_b.add_class_arguments(B)

    parser = ArgumentParser()
    parser.add_argument("--gop1")
    subcommands = parser.add_subcommands()
    subcommands.add_subcommand('select_A', parser_a)
    subcommands.add_subcommand('select_B', parser_b)
    config = parser.parse_args()

Doing this, you would then run it by giving the subcommand name followed by that subcommand's options, for example select_A --op0 2 --op1 value.
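To make the behaviour concrete, here is a rough sketch of how the parsed result can be consumed; it assumes the classes and parser construction from the snippet above (taken out of main() for brevity) and uses made-up argument values:

config = parser.parse_args(['--gop1', 'x', 'select_A', '--op0', '5'])

print(config.subcommand)     # 'select_A'  (which subcommand was chosen)
print(config.gop1)           # 'x'         (top-level option)
print(config.select_A.op0)   # 5           (options are nested under the subcommand name)
print(config.select_A.op1)   # 'op1 default'

if config.subcommand == 'select_A':
    a = A(**config.select_A.as_dict())  # as_dict() turns the jsonargparse Namespace into a plain dict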
Thank you for your very extensive reply! You are completely right on the add_argparse_arguments() methods, I also planned to remove them.

In my case, the classes are two nn.Modules (let's call them ModelA and ModelB). I would now like to select the correct model with --model ModelA or --model ModelB, as I think this makes the calls clearer than just writing ModelA as a standalone argument. In your example with the git commands you are definitely right, but in my case I would prefer to dynamically select the correct model with --model ModelA and then be able to further configure the model with --model.layers X --model.ensemble_size Y. Looking through LightningCLI it seems possible there, so is my best bet to switch directly to LightningCLI, or is this also possible in pure jsonargparse? Thanks
Without subcommands, what you want is not possible: which arguments are available cannot depend on the value that another argument has.

If you are using pytorch-lightning, it is certainly better to use LightningCLI. There the selection of models is not done through subcommands. Achieving the same with plain jsonargparse is a bit more complex, but with LightningCLI it is already implemented, so it will be less effort for you.
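For context on that last point: the non-subcommand model selection in LightningCLI is built on jsonargparse's subclass type support, which current jsonargparse versions also expose directly. A minimal sketch under the assumption that ModelA and ModelB share nn.Module as a base class (class and option names are illustrative, not from this issue):

from jsonargparse import ArgumentParser
from torch import nn

class ModelA(nn.Module):
    def __init__(self, layers: int = 2, ensemble_size: int = 1):
        super().__init__()

class ModelB(nn.Module):
    def __init__(self, layers: int = 4, dropout: float = 0.1):
        super().__init__()

parser = ArgumentParser()
# accept any nn.Module subclass for --model; the chosen class determines
# which --model.init_args.* options are valid
parser.add_subclass_arguments(nn.Module, "model")

cfg = parser.parse_args([
    "--model=__main__.ModelA",       # class path; __main__ assumes the classes live in the executed script
    "--model.init_args.layers=8",
])
init = parser.instantiate_classes(cfg)
model = init.model  # an instance of ModelA with layers=8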
I know this issue is closed, but I have a related question. I'm creating a CLI application for a reinforcement learning model that accepts an arbitrary list of policies/models. I'd like for the script to be compatible with AWS SageMaker Hyperparameter tuning. I considered a script that accepted arguments like this:

train.py \
  --policy_ids='["red", "blue"]' \
  --policies.hidden_layer_size='{"red": 128, "blue": 512}'  # tunable parameters

train.py \
  --policy_ids='["alien", "predator", "marine"]' \
  --policies.hidden_layer_size='{"alien": 128, "predator": 512, "marine": 256}'  # tunable parameters

But the above doesn't work because SageMaker tunable parameters can't output JSON. It can only output integers, floats, and categorical strings. Given that constraint, I considered a script like this instead:

train.py \
  --policy_ids='["red", "blue"]' \
  --red_policy.hidden_layer_size=128 \
  --blue_policy.hidden_layer_size=512

train.py \
  --policy_ids='["alien", "predator", "marine"]' \
  --alien_policy.hidden_layer_size=128 \
  --predator_policy.hidden_layer_size=512 \
  --marine_policy.hidden_layer_size=256

But this requires an initial …

But this is suboptimal. It requires allocating a fixed number of arguments. It also pollutes the help text and requires a lot of boilerplate to convert the hardcoded arguments into a list. I was wondering if you have any suggestions (other than abandoning SageMaker 😆).
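One possible workaround for the boilerplate would be to generate the per-policy arguments instead of hardcoding them, assuming --policy_ids can be pre-scanned from sys.argv before the parser is built; a rough sketch with illustrative helper names:

import json
import sys
from typing import List
from jsonargparse import ArgumentParser

def peek_policy_ids(argv):
    # illustrative helper: find --policy_ids=... before building the parser
    for arg in argv:
        if arg.startswith("--policy_ids="):
            return json.loads(arg.split("=", 1)[1])
    return []

def build_parser(policy_ids):
    parser = ArgumentParser()
    parser.add_argument("--policy_ids", type=List[str], default=[])
    for pid in policy_ids:
        # dotted option names become nested keys in the parsed result
        parser.add_argument(f"--{pid}_policy.hidden_layer_size", type=int, default=128)
    return parser

if __name__ == "__main__":
    ids = peek_policy_ids(sys.argv[1:])
    cfg = build_parser(ids).parse_args()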
@psirenny I haven't used AWS SageMaker Hyperparameter tuning. Can you please explain a bit how it works? Do you give it a template of the command it should run?
It takes configuration for a tuning job like this:

sagemaker_client.create_hyper_parameter_tuning_job(
    HyperParameterTuningJobName="red_blue_tuning_job",
    HyperParameterTuningJobConfig={
        "HyperParameterTuningJobObjective": {
            "MetricName": "episode_reward_mean",
            "Type": "Maximize"
        },
        "ParameterRanges": {
            "IntegerParameterRanges": [
                {
                    "Name": "red_policy.hidden_layer_size",
                    "MinValue": 32,
                    "MaxValue": 2048,
                },
                {
                    "Name": "blue_policy.hidden_layer_size",
                    "MinValue": 32,
                    "MaxValue": 2048,
                },
            ],
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": 100,
            "MaxParallelTrainingJobs": 5,
        },
        "Strategy": "Bayesian",
    },
    TrainingJobDefinition={
        "AlgorithmSpecification": {
            "TrainingImage": docker_image,
            "TrainingInputMode": "File"
        },
        "ResourceConfig": {
            "InstanceCount": 1,
            "InstanceType": "ml.c4.2xlarge",
        },
        "StaticHyperParameters": {
            "batch_size": 2000,
            "learning_rate": 0.00005,
            "policy_ids": "'[\"red\", \"blue\"]'",
        },
        "StoppingCondition": {
            "MaxRuntimeInSeconds": 60 * 60
        },
        # …
    },
)

SageMaker then spawns multiple training jobs from that configuration. Each training job invokes a dockerized CLI program with arguments serialized from the above parameters. Kind of like this:

docker run <docker_image> train \
  --batch_size=2000 \
  --learning_rate=0.00005 \
  --policy_ids='["red", "blue"]' \
  --red_policy.hidden_layer_size=49 \
  --blue_policy.hidden_layer_size=63
  # the two hidden_layer_size values are the parameters being tuned

I suppose the real issue is in the way that SageMaker training jobs serialize parameters to CLI arguments. As I type this, it occurs to me that SageMaker also writes the parameters to a JSON file, and that JSON file is available to the docker container. Perhaps the best way forward is to create a CLI that loads the JSON file instead.
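As a rough sketch of that idea: jsonargparse can read a whole configuration from a file via ActionConfigFile, so a container entrypoint could point it at the JSON file that SageMaker writes. The file path and the exact argument set below are assumptions, not something prescribed by jsonargparse:

from typing import Dict, List
from jsonargparse import ActionConfigFile, ArgumentParser

parser = ArgumentParser()
# --config lets the whole configuration be supplied from a JSON/YAML file
parser.add_argument("--config", action=ActionConfigFile)
parser.add_argument("--batch_size", type=int, default=2000)
parser.add_argument("--learning_rate", type=float, default=0.00005)
parser.add_argument("--policy_ids", type=List[str], default=[])
parser.add_argument("--policies.hidden_layer_size", type=Dict[str, int], default={})

# hypothetical container entrypoint; SageMaker typically writes the tuning
# job's hyperparameters to /opt/ml/input/config/hyperparameters.json
cfg = parser.parse_args()
# e.g. invoked as: train.py --config /opt/ml/input/config/hyperparameters.json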
Several features have been added to allow configuring complex type hints with individual command line arguments, see for example List append and Class type and sub-classes#command-line in the docs. There is a similar pending feature, to add individual items to an argument with type Dict, which would allow something like:

train.py \
  --policy_ids='["red", "blue"]' \
  --policy_hidden_layer_size.red=128 \
  --policy_hidden_layer_size.blue=512
I have implemented (commit 2397f90) the support for individual dict items from the command line, see docs-latest#dict-items. This is not yet released, but you could try it out if you install from source.
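For illustration, a minimal sketch of what that dict-items support looks like in use, assuming an argument typed as Dict[str, int] (see docs-latest#dict-items for the authoritative behaviour; the argument names mirror the example above):

from typing import Dict, List
from jsonargparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument("--policy_ids", type=List[str], default=[])
parser.add_argument("--policy_hidden_layer_size", type=Dict[str, int], default={})

# individual dict items can be set with dotted keys on the command line
cfg = parser.parse_args([
    '--policy_ids=["red", "blue"]',
    '--policy_hidden_layer_size.red=128',
    '--policy_hidden_layer_size.blue=512',
])
print(cfg.policy_hidden_layer_size)  # {'red': 128, 'blue': 512}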
Hello, I love your library; however, I struggle with your design decision of removing .parse_known_args() from the parser.
My current setup (in simplified pseudocode form) is like this:
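As a rough, hypothetical sketch of the kind of two-stage parse_known_args setup being described (not the actual code; plain argparse is used here only to illustrate the pattern, and any names beyond the options mentioned in this post are made up):

import argparse

# first pass: parse the common options and the selector
parser = argparse.ArgumentParser()
parser.add_argument("--op4", action="store_true")
parser.add_argument("--sub_select", choices=["A", "B"])
args, remaining = parser.parse_known_args()

# second pass: the available sub-options depend on the value of --sub_select
sub_parser = argparse.ArgumentParser()
if args.sub_select == "A":
    sub_parser.add_argument("--sub_op0", action="store_true")
    sub_parser.add_argument("--sub_op1", action="store_true")
elif args.sub_select == "B":
    sub_parser.add_argument("--sub_op2", action="store_true")
sub_args = sub_parser.parse_args(remaining)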
This could now be called with:

python bla.py --op4 --sub_select A --sub_op0 --sub_op1

As the fix with the underscores was really ugly, I was now hoping that I could easily restructure this so that sub_opX could be specified as --sub.opX (with nested dicts after parse_args()).
So a call should now look like this:

python bla.py --sub_select A --sub.op0 --sub.op1

or, with something like subcommands:

python bla.py --sub_select A --op0 --op1
How can I achieve such a behavior without having access to .parse_known_args()?
As far as I understand, with subcommands all I could achieve would be something like this:
python bla.py --op4 A --op0 --op1
However, I would like this dynamic parsing to depend on the evaluated value of the argument --sub_select and not on a standalone argument A, which could sit somewhere in a list of arguments and be easily overlooked.
My inspiration for this is the PyTorch Lightning Guideline.
Thanks