Support attribute docstrings #150

Closed
psirenny opened this issue Jul 25, 2022 · 4 comments
Labels: enhancement (New feature or request)

Comments

@psirenny
Contributor

psirenny commented Jul 25, 2022

🚀 Feature request

Support attribute docstrings, as SimpleParsing does. For example:

train.py

from dataclasses import dataclass
from jsonargparse import ArgumentParser
from typing import Literal

@dataclass
class Hyperparameters:
    """Values that control the learning process and determine the values of model parameters."""

    attention_head_count: int = 1
    """Number of attention heads within the MultiHeadAttention units."""

    attention_size: int = 64
    """Input and output size of each transformer unit."""

    batch_size: int = 512
    """Size of the training batch."""

    entropy_regularization: float = 0.001
    """Coefficient used in entropy regularization."""

    fcn_activation_function: Literal["relu", "swish", "tanh"] = "relu"
    """Activation function of each fully connected layer."""

    fcn_layer_count: int = 2
    """Number of fully connected layers."""

    fcn_layer_size: int = 256
    """Size of each fully connected layer."""

    gae_gamma: float = 0.99
    """How much the future is discounted at each timestep."""

    gae_lambda: float = 1.0
    """Smoothing parameter to reduce variance."""

    kl_divergence_initial_coefficient: float = 0.2
    """Initial coefficient used in KL Divergence."""

    kl_divergence_target: float = 0.01
    """Target value of KL Divergence."""

    learning_rate: float = 0.0001
    """Rate at which network weights are adjusted by the loss gradient."""

    loss_clipping_epsilon: float = 0.3
    """Epsilon value in the epsilon clipped surrogate loss function."""

    # …


@dataclass
class TrainOptions:
    """Training options."""

    hyperparameters: Hyperparameters
    """Model hyperparameters."""

    # …


def train(options: TrainOptions):
    print('training with…', options)


def main():
    argument_parser = ArgumentParser()
    argument_parser.add_argument("--options", type=TrainOptions)
    arguments = argument_parser.parse_args()
    train(arguments.options)

if __name__ == "__main__":
    main()

Motivation

The goal is to colocate attributes with their docstrings and benefit from less typing, less scrolling, and shorter git diffs 😄.

psirenny added the enhancement label on Jul 25, 2022
@mauvilsa
Member

Sounds good! Though note that, as far as I know, the only non-rejected PEP in which attribute docstrings are mentioned is PEP 257, and the only acceptable form is a literal string right after the attribute, as in the example above. It does not cover comments on the same or previous line, which SimpleParsing also supports.
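
For illustration, the two conventions look like this (minimal hypothetical classes; only the first form is sanctioned by PEP 257):

from dataclasses import dataclass

# PEP 257 form: a string literal immediately after the attribute.
@dataclass
class Pep257Style:
    batch_size: int = 512
    """Size of the training batch."""

# Comment-based form supported by SimpleParsing (same or previous line),
# but not mentioned in any non-rejected PEP.
@dataclass
class CommentStyle:
    batch_size: int = 512  # Size of the training batch.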

Regarding the implementation, this shouldn't be part of jsonargparse. It should be contributed as a new feature to docstring_parser, and jsonargparse then updated to make use of it. Also, ideally the implementation would use ast instead of manually parsing the source the way simple_parsing/docstring.py does.

I will create a feature request in docstring_parser to hear their thoughts. From what I have seen, they are very open to receiving new contributions.
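
To sketch what the ast-based approach could look like (an illustration only, assuming simple annotated assignments; attribute_docstrings is a hypothetical helper, not part of either library):

import ast
import inspect
import textwrap

def attribute_docstrings(cls):
    """Map attribute names to the string literal that follows their assignment."""
    source = textwrap.dedent(inspect.getsource(cls))
    class_def = ast.parse(source).body[0]  # the ClassDef node
    docs = {}
    previous = None
    for node in class_def.body:
        # A candidate attribute docstring is a bare string expression...
        is_string = (
            isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
        )
        # ...that immediately follows an annotated assignment to a plain name.
        if is_string and isinstance(previous, ast.AnnAssign) and isinstance(previous.target, ast.Name):
            docs[previous.target.id] = node.value.value
        previous = node
    return docs

# For example, attribute_docstrings(Hyperparameters)["batch_size"]
# would return "Size of the training batch."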

@psirenny
Contributor Author

Though note that, as far as I know, the only non-rejected PEP in which attribute docstrings are mentioned is PEP 257, and the only acceptable form is a literal string right after the attribute, as in the example above.

Great, that also happens to be my preference 😄.

I will create a feature request in docstring_parser to hear their thoughts.

Awesome, I'll keep an eye out for it!

@mauvilsa
Member

Created https://github.com/rr-/docstring_parser/issues/71

mauvilsa added a commit that referenced this issue Sep 13, 2022
- Added way to configure parsing docstrings with a single style.
@mauvilsa
Member

mauvilsa commented Sep 13, 2022

I have added the support in commit 36a6a3f; see the documentation changes there. It is necessary to parse attribute docstrings just to figure out whether a class has them, since there is no way of knowing this a priori. This adds overhead for all classes, so attribute docstring parsing is disabled by default. To enable it, add:

from jsonargparse import set_docstring_parse_options
set_docstring_parse_options(attribute_docstrings=True)
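
In the train.py script above, this would presumably go before the dataclass type is added to the parser, since that is when its docstrings are inspected (an assumption about ordering, not stated in the comment):

from jsonargparse import ArgumentParser, set_docstring_parse_options

def main():
    # Assumed ordering: enable attribute docstring parsing before
    # add_argument registers the TrainOptions dataclass.
    set_docstring_parse_options(attribute_docstrings=True)
    argument_parser = ArgumentParser()
    argument_parser.add_argument("--options", type=TrainOptions)
    arguments = argument_parser.parse_args()
    train(arguments.options)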

With that in place, the output of python train.py --help is:

...

Training options:
  --options CONFIG      Path to a configuration file.

Model hyperparameters:
  --options.hyperparameters CONFIG
                        Path to a configuration file.
  --options.hyperparameters.attention_head_count ATTENTION_HEAD_COUNT
                        Number of attention heads within the MultiHeadAttention units. (type: int, default: 1)
  --options.hyperparameters.attention_size ATTENTION_SIZE
                        Input and output size of each transformer unit. (type: int, default: 64)
  --options.hyperparameters.batch_size BATCH_SIZE
                        Size of the training batch. (type: int, default: 512)
  --options.hyperparameters.entropy_regularization ENTROPY_REGULARIZATION
                        Coefficient used in entropy regularization. (type: float, default: 0.001)
  --options.hyperparameters.fcn_activation_function {relu,swish,tanh}
                        Activation function of each fully connected layer. (type: Literal['relu', 'swish', 'tanh'], default: relu)
  --options.hyperparameters.fcn_layer_count FCN_LAYER_COUNT
                        Number of fully connected layers. (type: int, default: 2)
  --options.hyperparameters.fcn_layer_size FCN_LAYER_SIZE
                        Size of each fully connected layer. (type: int, default: 256)
  --options.hyperparameters.gae_gamma GAE_GAMMA
                        How much the future is discounted at each timestep. (type: float, default: 0.99)
  --options.hyperparameters.gae_lambda GAE_LAMBDA
                        Smoothing parameter to reduce variance. (type: float, default: 1.0)
  --options.hyperparameters.kl_divergence_initial_coefficient KL_DIVERGENCE_INITIAL_COEFFICIENT
                        Initial coefficient used in KL Divergence. (type: float, default: 0.2)
  --options.hyperparameters.kl_divergence_target KL_DIVERGENCE_TARGET
                        Target value of KL Divergence. (type: float, default: 0.01)
  --options.hyperparameters.learning_rate LEARNING_RATE
                        Rate at which network weights are adjusted by the loss gradient. (type: float, default: 0.0001)
  --options.hyperparameters.loss_clipping_epsilon LOSS_CLIPPING_EPSILON
                        Epsilon value in the epsilon clipped surrogate loss function. (type: float, default: 0.3)
