Custom Data Training #313
@shyama-20 lower your batch_size, you might even have to use batch_size: 1 with 4 GB of VRAM
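A minimal sketch of that change, assuming batch_size is a top-level key in the config.yaml used for training (the placement of the key should be verified against your own config):

    batch_size: 1   # smallest possible batch; often needed to fit training on a 4 GB GPU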
Thank you @KimberleyJensen, the training is running now. Can anyone please tell me approximately how much RAM my GPU should have in the case of several GB of training data?
The issue is not so much the size of your training data, but really the optimization and loading parameters like batch size, model size, or training segment length. The default segment size is 10 seconds; you can lower that a bit, but not too much (in general this will lower quality), setting for instance something like the snippet below.
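A sketch of such a change, assuming the segment length is exposed as dset.segment in the training config; the value 6 is only illustrative:

    dset:
      segment: 6   # default is 10 seconds; shorter segments reduce memory use but can hurt quality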
You can also try to train a smaller model, lowering the number of channels a bit:
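As a sketch, assuming the model width is controlled by a channels key in the config (depending on the model variant it may sit under a model-specific group, e.g. demucs); the value 48 is only illustrative:

    channels: 48   # fewer channels gives a smaller model and lower memory use, at some cost in separation quality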
I don't remember the memory usage, but you can adjust the batch size, segment length, or number of channels a bit until it fits.
Thank you so much for your time @adefossez!
❓ Question
Hello,
I have a custom dataset of very small size (only hundreds of MB) and I have been trying to train on it using
dora run dset=my_dset
(as mentioned in #221), which led to the PyTorch process being "Killed".
i) Does this command run the training on the CPU by default?
ii) If it is running on the CPU, why does the process get killed?
I'm using a system with an Intel Core i7-10750H CPU @ 2.60GHz × 12, running Ubuntu 20.04, with 16 GB of RAM (currently 12 GB free). This is the my_dset.yaml I used (as mentioned in #300). In the config.yaml file I have changed dset.sources, epochs, batch_size, and augment.
I tried using
dora run -d dset=my_dset
for training on the GPU (NVIDIA GeForce GTX 1650, 4 GB), but ran into a "CUDA out of memory" error. I hope I'm following the steps correctly for training a custom dataset.
Can I get a solution for this issue?
Thanks in advance!