
This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →


RFC: Asteroid CLI design #201

Closed · jonashaag opened this issue Aug 13, 2020 · 12 comments

@jonashaag (Collaborator) commented Aug 13, 2020

Here's my draft for the Asteroid CLI design. I guess it's a radical change from what we have at the moment...

Let's discuss only the design here, not the implementation. I have already given the implementation some thought and have a prototype for some parts of the design, but let's agree on a design first.

Please don't be afraid to criticize what you don't like. It is likely that I forgot, or did not know of, some use cases when coming up with this design.


Design goals

  • Separate models, datasets, and experiments (= a model trained on a dataset) from each other.
  • Deduplicate common code.
  • Provide a consistent and convenient interface for users.

API design

Starting from scratch

Assume you start with an empty hard disk and want to train a model from scratch.

Steps:

  • Install Asteroid
  • Create dataset config
  • Create model config
  • Run training
  • Run evaluation

Create dataset config (Download and prepare dataset)

Prepare = Create mixtures, create JSON files, etc.

Download dataset from official URL:

$ asteroid data librimix download
Downloading LibriMix dataset to /tmp/librimix-raw...

Prepare the dataset, if necessary. Some datasets don't need preparation; for those, the prepare command is absent.

$ asteroid data librimix prepare --n-speakers 2 --raw /tmp/librimix-raw --target ~/asteroid-datasets/librimix2
Found LibriMix dataset in /tmp/librimix-raw.
Creating LibriMix2 (16 kHz) in ~/asteroid-datasets/librimix2...  # "prepare" never modifies the raw downloads; always creates a copy.
Wrote dataset config to ~/asteroid-datasets/librimix2/dataset.yml.

Generated dataset.yml:

dataset: "asteroid.data.LibriMix"
n_speakers: 2
train_dir: data/tr
val_dir: data/cv
...
sample_rate: 16000

Pass options to prepare:

$ asteroid data librimix prepare --n-speakers 3 --sample-rate 8000 --raw /tmp/librimix-raw --target ~/asteroid-datasets/librimix3
Found LibriMix dataset in /tmp/librimix-raw.
Creating LibriMix3 (8 kHz) in ~/asteroid-datasets/librimix3...  # "prepare" never modifies the raw downloads; always creates a copy.
Wrote dataset config to ~/asteroid-datasets/librimix3/dataset.yml.

dataset.yml:

dataset: "asteroid.data.LibriMix"
n_speakers: 3
sample_rate: 8000
train_dir: data/tr
val_dir: data/cv
...

Create model config

Models have a separate config from datasets (and from experiments, see below). Create one with configure:

$ asteroid model convtasnet configure > ~/asteroid-models/convtasnet-default.yml
$ asteroid model convtasnet configure --n-filters 1337 > ~/asteroid-models/convtasnet-larger.yml

Generated convtasnet-default.yml:

n_filters: 512
kernel_size: 16
...

Run training

$ asteroid train --model ~/asteroid-models/convtasnet-default.yml --data ~/asteroid-datasets/librimix2/dataset.yml
Saving training parameters to exp/train_convtasnet_exp1/experiment.yml
Training epoch 0/100...

The generated experiment.yml (an experiment is a train or eval run) contains model info, dataset info, and training info:

data:
  # (Copy of dataset.yml)
  dataset: "asteroid.data.librimix"
  n_speakers: 3
  sample_rate: 8000
  train_dir: data/tt
  val_dir: data/cv
  ...
model:
  # (Copy of convtasnet-default.yml)
  model: "asteroid.models.ConvTasNet"
  n_filters: 512
  kernel_size: 16
  ...
training:
  optim:
    optimizer: "adam"
    ...
  batch_size: 5
  max_epochs: 100
  ...
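
Everything needed to reproduce a run thus lives in one file. As an illustration (this loader is not part of the proposal; yaml is PyYAML), reading it back and splitting it into its three parts could look like:

# Hypothetical illustration: load experiment.yml and split it up again.
import yaml

with open("exp/train_convtasnet_exp1/experiment.yml") as f:
    experiment = yaml.safe_load(f)

data_conf = experiment["data"]          # dataset class + its parameters
model_conf = experiment["model"]        # model class + its parameters
training_conf = experiment["training"]  # optimizer, batch size, max epochs, ...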

Change model, dataset, or training params in place:

$ asteroid train --model ~/asteroid-models/convtasnet-default.yml --data ~/asteroid-datasets/librimix2/dataset.yml --n-filters 1234 --sample-rate 8000 --batch-size 5 --max-epochs 50
Saving training parameters to exp/train_convtasnet_exp2/experiment.yml
Warning: Resampling dataset to 8 kHz.
Training epoch 0/50...

Continue training from checkpoint:

$ asteroid train --continue exp/train_convtasnet_exp1/
Creating experiment folder exp/train_convtasnet_exp3/...
Saving training parameters to exp/train_convtasnet_exp3/experiment.yml
Continuing training from checkpoint 42.
Training epoch 43/100...

Run evaluation

$ asteroid eval --experiment exp/train_convtasnet_exp3/
Saving experiment parameters to exp/eval_convtasnet_exp4/experiment.yml
Evaluating ConvTasNet on LibriMix2...

Can change training params for eval:

$ asteroid eval --experiment exp/train_convtasnet_exp3/ --batch-size 10
Saving experiment parameters to exp/eval_convtasnet_exp5/experiment.yml
Evaluating ConvTasNet on LibriMix2...

Eval on different dataset:

$ asteroid eval --experiment exp/train_convtasnet_exp3/ --data ~/asteroid-datasets/wsj0
Saving experiment parameters to exp/eval_convtasnet_exp6/experiment.yml
Evaluating ConvTasNet on WSJ0...

Starting from pretrained

$ asteroid download-pretrained "mpariente/DPRNN-LibriMix2-2020-08-13"
Downloading DPRNN trained on LibriMix2 to exp/pretrained_dprnn_exp7...
$ ls exp/pretrained_dprnn_exp7
- dprnn_best.pth
- experiment.yml
...

Eval pretrained:

$ asteroid eval --experiment exp/pretrained_dprnn_exp7/ --data ~/asteroid-datasets/wsj0
Saving experiment parameters to exp/eval_dprnn_exp8/experiment.yml
Evaluating DPRNN on WSJ0...

Finetune pretrained on custom dataset:

$ asteroid train --continue exp/pretrained_dprnn_exp7 --data /my/dataset.yml --batch-size 123
...
@popcornell (Collaborator)

I think we will also add an inference engine, with a CLI interface something like:

$ asteroid infer --experiment exp/train_convtasnet_exp7/ --folder /media/sam/separate --output /folder/separated --window-size 32000

The idea is to parse everything recursively, then use overlap-add to separate every file and save the results to an output folder.
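
To make this concrete, here is a rough sketch of what that loop could look like. Everything below is an assumption made for illustration (none of it exists in Asteroid yet), and it assumes mono wav files and a model mapping (batch, time) to (batch, n_src, time):

# Hypothetical sketch of the proposed inference engine.
from pathlib import Path

import soundfile as sf
import torch


def overlap_add_separate(model, mixture, window_size, hop_size):
    """Separate a long 1-D mixture window by window and overlap-add the estimates."""
    n_samples = mixture.shape[-1]
    window = torch.hann_window(window_size)
    out, norm = None, torch.zeros(n_samples)
    for start in range(0, n_samples, hop_size):
        chunk = mixture[start:start + window_size]
        pad = window_size - chunk.shape[-1]
        if pad > 0:  # zero-pad the last, shorter window
            chunk = torch.nn.functional.pad(chunk, (0, pad))
        with torch.no_grad():
            est = model(chunk.unsqueeze(0)).squeeze(0)  # (n_src, window_size)
        if out is None:
            out = torch.zeros(est.shape[0], n_samples)
        valid = min(window_size, n_samples - start)
        out[:, start:start + valid] += est[:, :valid] * window[:valid]
        norm[start:start + valid] += window[:valid]
    return out / norm.clamp(min=1e-8)  # renormalize the overlapping windows


def separate_folder(model, in_dir, out_dir, window_size):
    """Recursively separate every wav file under in_dir, mirroring the tree into out_dir."""
    for path in Path(in_dir).rglob("*.wav"):
        mix, sr = sf.read(path, dtype="float32")
        est = overlap_add_separate(model, torch.from_numpy(mix),
                                   window_size, window_size // 2)
        target = Path(out_dir) / path.relative_to(in_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        for i, src in enumerate(est):  # one output file per estimated source
            sf.write(target.with_name(f"{target.stem}_src{i}.wav"), src.numpy(), sr)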

@popcornell (Collaborator)

Are you going to use click for the CLI interface?

@jonashaag (Collaborator, Author) commented Aug 13, 2020

Let's discuss implementation later, but yes: my current prototype involves dataclasses (Python 3.7+), click, autoclick (optional; to derive CLI options from dataclasses), and Lightning's DataModules.
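
For illustration only, here's roughly what deriving CLI options from a dataclass could look like if done by hand (autoclick automates this; every name below is made up for the sketch):

# Hypothetical sketch: turn dataclass fields into click options.
from dataclasses import dataclass, fields

import click


@dataclass
class ConvTasNetConfig:
    # Single source of truth for the config schema, as in the RFC.
    n_filters: int = 512
    kernel_size: int = 16


def options_from_dataclass(config_cls):
    """Add one click option per dataclass field."""
    def decorator(func):
        for field in reversed(fields(config_cls)):  # reversed keeps declaration order
            flag = "--" + field.name.replace("_", "-")
            func = click.option(flag, type=field.type, default=field.default,
                                show_default=True)(func)
        return func
    return decorator


@click.command()
@options_from_dataclass(ConvTasNetConfig)
def configure(**params):
    """Emit the config as YAML, like `asteroid model convtasnet configure`."""
    for key, value in params.items():
        click.echo(f"{key}: {value}")


if __name__ == "__main__":
    configure()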

@mpariente (Collaborator)

First, thanks a lot for the detailed design explanation, and the time and effort you put into it!

I really love the idea of having a powerful CLI in asteroid, and I'm sure it can be beneficial.
I'm convinced that having a CLI for dataset preparation and inference will only be beneficial; I'm not 100% sure about evaluation, and I have more than mixed feelings about training. I can give more detail tomorrow, but it has to do with the genericity/expressivity compromise.

But first, I have a few questions. In your opinion:

  • Would this design replace the recipes? Or live next to the recipes?
  • Who is this interface intended for? What type of users?
  • What would be the steps to include a new model?
  • Is the underlying training script shared by all the training experiments?

Actually, it all boils down to listing use cases that we want to support in asteroid and what type of users we are "targeting", and this is not an easy question.

@jonashaag (Collaborator, Author) commented Aug 14, 2020

(I added the dataset: and model: keys to the YAML files above, which I had forgotten.)

It would replace most recipes (all that we can convert to the new design).

Users: People running inference on pretrained models; people training their own models; people working on new models to contribute to Asteroid; people working on new private models/datasets. So I guess everyone :-)

In the design there isn't really a training script anymore. The logic has been split entirely between dataset and model. Maybe this assumption doesn't work well with some use cases; if so, can you give an example?

Below are some ideas on what the code can look like.

Steps to include new model

Create a new Python module (either in asteroid.* or somewhere else*) with the Torch code; a model config definition (what params the model expects); a bit of "registration" code to make it available to the CLI; done. Sketch:

class MyModel(...):
    name = "mymodel"

    def __init__(self, config: MyModelConfig):
        self.config = config

    def forward(self, ...):
        ...


class MyModelConfig:
    # Note: This is also the single source of truth for what the model's config schema is.
    # CLI and YAML format is derived from this
    def __init__(self, n_filters=64, ...):
        self.n_filters = n_filters
        ...


asteroid.register_model(MyModel, MyModelConfig)

*) If outside of asteroid.*, I'm not sure yet how we'd make the CLI find the module in the first place. Maybe something like asteroid --load-module myprivatestuff.models train --model-name "myprivatestuff.models.MyModel" ...
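
For illustration, register_model could be backed by a simple module-level registry; this is only a guess at the mechanics, not a finished design:

# Hypothetical sketch of the registration machinery.
_MODEL_REGISTRY = {}


def register_model(model_cls, config_cls):
    """Make a model and its config schema known to the CLI under model_cls.name."""
    _MODEL_REGISTRY[model_cls.name] = (model_cls, config_cls)


def build_model(name, **overrides):
    """Instantiate a registered model from parsed CLI/YAML parameters,
    e.g. build_model("mymodel", n_filters=128)."""
    model_cls, config_cls = _MODEL_REGISTRY[name]
    return model_cls(config_cls(**overrides))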

Steps to include new dataset

Same as with models: create a new module with the Torch code; a dataset config definition (e.g. number of speakers); registration code; optionally preparation code (download + mix); optionally additional CLI commands that are specific to the dataset. Sketch:

class MyDataset(...):
    name = "mydata"

    def __init__(self, config: MyDatasetConfig):
        self.config = config

    # Required for all datasets
    def get_datamodule(self) -> LightningDataModule:
        ...

    # Optional but implemented by Asteroid
    def download(self):
        ...

    # Custom and unknown to Asteroid
    def custom_print_stats(self):
        ...


class MyDatasetConfig:
    def __init__(self, n_speakers=2, ...):
        self.n_speakers = n_speakers
        ...


cli = asteroid.register_dataset(MyDataset, MyDatasetConfig)

# Register custom additional command
@cli.command()
def print_stats(...):
    """Can be run with: asteroid data mydata print-stats"""
    dataset = asteroid_load_dataset_from_cli_args()
    dataset.custom_print_stats()

@mpariente (Collaborator)

This sounds really cool!

In the design there isn't really a training script anymore. The logic has been split entirely between dataset and model. Maybe this assumption doesn't work well with some use cases; if so, can you give an example?

I didn't mean a training script per se, but we need to do the training at some point. Where does that live? In a general, configurable class in asteroid, I guess?

Just to be clear, I see the advantage of having generic tools that enable training any model on any dataset with no code duplication; I'm just a bit worried about how generality might hurt flexibility and the ease of getting into the code. So I'm trying to find the edge cases where having such a general framework would hurt us.

  • How would you handle a custom training loop?
  • Datasets can have any number of outputs: some have to be passed to the model, some to the loss, others might control the training loop, others are for logging. How do you pass these to the right objects?
  • How do you differentiate between single- and multi-channel models? Between single- and multi-channel datasets?
  • How would you handle the TwoStep recipe, for example? Is it in the training config so that every model can use it? Or is it not supported because it's too specific?

@jonashaag (Collaborator, Author)

Good questions. I'll try to come up with solutions for these.

@mpariente (Collaborator) commented Aug 14, 2020

I'm afraid of becoming the Keras of source separation, where research gets difficult and it's harder and harder to get into the source code.

I understand that the current code has a lot of duplication, but it has one advantage: it's pretty easy to get into, because the organization is trivial and not too abstracted, so you can easily change things from inside the recipes to make new ones.

Yes, training TasNet, ConvTasNet, or DPRNNTasNet on wsj0-mix, WHAM, WHAMR, or LibriMix doesn't require large or complex code, because the recipes are really all the same, and writing a CLI for it would be very nice for users.
But how would this core code grow with time? How easy would it be to modify?
I'm a bit afraid of creating an API which is far from Modules and Datasets, because it takes time to learn.

In my opinion, having a CLI for "easy" use cases would be great, but it will be hard to make it general, flexible, and easily modifiable.

If we are able to achieve this, with easy strategies to add custom datasets/models/losses/training loops, we can replace all the recipes with the CLI.

If not, how many of the recipes can we translate? Does it make sense to have two different modes of operation for Asteroid (verbose recipes with duplicated code, which we can also try to reduce, vs. CLI recipes)? How much effort would it require?

@jonashaag (Collaborator, Author) commented Aug 14, 2020

  • How would you handle a custom training loop?

Register a custom CLI command from a custom module and use that. It may reuse some parts of the default training loop. For example, if the "official" train command is defined as:

# asteroid.cli.default_commands

def train(model: ModelConfig, data: DataConfig, continue_from: Experiment = None, ...):
    ...

asteroid.register_cmd("train")

You can define your own:

# myprivatestuff.cli_commands
import typing

TwoStepStep = typing.Literal["filterbank", "separator"]

def mytrain(step: TwoStepStep, ...):
    """Use like this: asteroid train-twostep --step filterbank|separator ..."""

asteroid.register_cmd("train-twostep", mytrain)

Btw, I don't insist on doing this via a CLI. Instead of asteroid data librimix download, we could also say that users should run import asteroid.data.librimix; asteroid.data.librimix.download("/target/path") in a Python shell. But then I'm not sure how to easily change params from other scripts. Pass them via env variables? Tell users to roll their own CLI parsing?

  • Datasets can have any number of outputs: some have to be passed to the model, some to the loss, others might control the training loop, others are for logging. How do you pass these to the right objects?
  • How do you differentiate between single- and multi-channel models? Between single- and multi-channel datasets?

I think we can cover the common cases in the general training code; special cases need to be special-cased in the recipes.

  • How would you handle the TwoStep recipe, for example? Is it in the training config so that every model can use it? Or is it not supported because it's too specific?

I think things like that should live in their own recipes. But those recipes could be written in a way that allows for any dataset to be used.

For TwoStep in particular, another idea would be to split it into two models, since it essentially is a two-model approach:

$ asteroid model twostep-filterbank configure --n-filters 1234 > twostep-filterbank.yml
$ asteroid train --model twostep-filterbank.yml --data ~/asteroid-datasets/librimix2/dataset.yml
Creating experiment folder exp/blabla1/...
Training TwoStep filterbank on LibriMix2...
...
Wrote filterbank to exp/blabla1/filterbank.
$ asteroid model twostep-separator configure --filterbank-path exp/blabla1/filterbank > twostep-separator.yml
$ asteroid train --model twostep-separator.yml --data ~/asteroid-datasets/librimix2/dataset.yml
Creating experiment folder exp/blabla2/...
Training TwoStep separator on LibriMix2...

@mpariente (Collaborator)

But then I'm not sure how to easily change params from other scripts. Pass them via env variables? Tell users to roll their own CLI parsing?

What do you mean by that?

I think we can cover the common cases in the general training code; special cases need to be special-cased in the recipes.

This is also what I believe. While having a CLI for the common cases will be extremely useful, I think we cannot completely replace the recipes.

So, should the CLI and the recipes share the same abstractions? I.e., should recipes reuse the config organization you mentioned?

I would suggest starting with:

  • Models

    • TasNet
    • ConvTasNet
    • DPRNN
    • Sudormrf
    • DPTNet
  • Datasets

    • wsj0-mix
    • wham
    • whamr
    • LibriMix
    • DNS Challenge's dataset
    • VoiceBank + DEMAND
    • Maybe musdb?
    • Maybe FUSS?

@jonashaag (Collaborator, Author) commented Aug 14, 2020

What do you mean by that?

I was just thinking out loud about how users would implement a convenient way to change params when they write recipes without using the new Asteroid CLI "framework" (i.e. recipes in the current form, as simple scripts), like changing the number of filters in an entirely custom recipe. Usually you'd use CLI params for that, but that means you'll have to use a CLI parser, ... so as soon as you start adding a non-trivial number of options to your recipe, you'll want to use the Asteroid CLI "framework".

Anyway, I think we agree enough for me to come up with a first implementation that covers some of the common use cases, models, and datasets. My goal would be to make these use cases really simple and the "framework" reasonably extensible; anything that's too complex to integrate into the "framework" right now we'll simply keep as is. (Maybe we can move some duplicated code from those special recipes into the Asteroid "core" anyway, without otherwise changing them.)

@mpariente (Collaborator)

Thanks for the clarification.
I do agree with you and cannot wait to see the first implementation!
