
[BUG🐛] Size mismatch in converted xttsv2 models #43

Closed
scruffynerf opened this issue Dec 18, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@scruffynerf

Bug Description

```
[rank0]:   File "mypath/lib/python3.10/site-packages/auralis/core/tts.py", line 85, in _load_model
[rank0]:     return MODEL_REGISTRY[config['model_type']].from_pretrained(model_name_or_path, **kwargs)
[rank0]:   File "mypath/lib/python3.10/site-packages/auralis/models/xttsv2/XTTSv2.py", line 299, in from_pretrained
[rank0]:     model.load_state_dict(hifigan_state)
[rank0]:   File "mypath/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2584, in load_state_dict
[rank0]:     raise RuntimeError(
[rank0]: RuntimeError: Error(s) in loading state_dict for XTTSv2Engine:
[rank0]: 	size mismatch for text_embedding.weight: copying a param with shape torch.Size([6153, 1024]) from checkpoint, the shape in current model is torch.Size([6681, 1024]).
[rank0]: 	size mismatch for text_head.weight: copying a param with shape torch.Size([6153, 1024]) from checkpoint, the shape in current model is torch.Size([6681, 1024]).
[rank0]: 	size mismatch for text_head.bias: copying a param with shape torch.Size([6153]) from checkpoint, the shape in current model is torch.Size([6681]).
```

## Minimal Reproducible Example

Use the current converter script with either
HF's drewThomasson/Morgan_freeman_xtts_model
or
HF's scruffynerf/xtts-vincent

(both of these work, and were trained using https://github.com/daswer123/xtts-finetune-webui)

then try to load the resulting converted files.
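
For reference, you can see the version skew directly by reading the first dimension of text_embedding.weight out of the fine-tuned checkpoint. A minimal sketch, assuming a plain torch checkpoint (the file name and key nesting here are guesses, not anything Auralis exposes):

```python
import torch

# Hypothetical path: point this at the fine-tuned checkpoint you converted.
state = torch.load("model.pth", map_location="cpu")
# Some Coqui checkpoints nest the weights under a "model" key.
if isinstance(state, dict) and "model" in state:
    state = state["model"]

# The first dimension of text_embedding.weight is the text-token count the
# GPT section was trained with: 6153 for v2.0.2, 6681 for v2.0.3.
key = next(k for k in state if k.endswith("text_embedding.weight"))
print(key, tuple(state[key].shape))
```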

@scruffynerf scruffynerf added the bug Something isn't working label Dec 18, 2024
@scruffynerf
Author

scruffynerf commented Dec 18, 2024

Ah ha, figured it out.

Coqui XTTS-v2 v2.0.2 differs from v2.0.3 in the number of text tokens:

https://huggingface.co/coqui/XTTS-v2/commit/6b8036b35d787cf43d18d640587956b9db8fd1b8

The above models were trained on v2.0.2.

The converter script needs to be aware of this: any version difference breaks the converted model, since the stock config no longer matches the actual trained GPT section.

Correct me if I'm wrong, but basically this means either the GPT config must be adjusted in this case (since it no longer matches the stock one), or the converter should just fail and complain that only v2.0.3 models can be converted.
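
Something along these lines would cover both options. Just a sketch, not the converter's actual code, and the state-dict key and config field names are assumptions based on the traceback:

```python
EXPECTED_V203_TOKENS = 6681  # stock v2.0.3 text-token count

def reconcile_text_tokens(state_dict, gpt_config, strict=False):
    # Read the vocabulary size straight out of the trained weights.
    key = next(k for k in state_dict if k.endswith("text_embedding.weight"))
    actual = state_dict[key].shape[0]
    if actual != EXPECTED_V203_TOKENS:
        if strict:
            # Option B: refuse to convert anything that isn't v2.0.3.
            raise ValueError(
                f"Checkpoint has {actual} text tokens; only v2.0.3 models "
                f"({EXPECTED_V203_TOKENS} tokens) can be converted."
            )
        # Option A: adjust the config to match the trained weights.
        gpt_config["number_text_tokens"] = actual  # field name assumed
    return gpt_config
```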

@C00reNUT
Contributor

C00reNUT commented Dec 19, 2024

Same issue here with a model trained on version 2.0.0. This might also explain the difference in quality/output in #27 when I convert a Coqui 2.0.0 model using the provided script...

@elvinzade

Error

```
[rank0]: raise RuntimeError(
[rank0]: RuntimeError: Error(s) in loading state_dict for XTTSv2Engine:
[rank0]: size mismatch for text_embedding.weight: copying a param with shape torch.Size([6681, 1024]) from checkpoint, the shape in current model is torch.Size([8155, 1024]).
[rank0]: size mismatch for text_head.weight: copying a param with shape torch.Size([6681, 1024]) from checkpoint, the shape in current model is torch.Size([8155, 1024]).
[rank0]: size mismatch for text_head.bias: copying a param with shape torch.Size([6681]) from checkpoint, the shape in current model is torch.Size([8155]).
```

Explanation

Same issue here. I trained for a new language; when I run the checkpoint_converter.py script, it downloads the JSON files for the XTTS config and the GPT tokenizer config, and I then update the JSON file with the new language's vocabulary. But I still get the error above.

@mlinmg
Copy link
Contributor

mlinmg commented Dec 23, 2024

Cool, I didn't know about this; I'll look into it.
@C00reNUT do you still see a quality difference with the new model conversion script? There was a typo that caused it to overwrite the converted checkpoints with the default ones; we think that was the cause of the error.

@C00reNUT
Contributor

@mlinmg I have tried the new conversion script, but after conversion I had to manually replace the tokenizer from version 2.0.3 with the one from 2.0.0, which is the model version I used for fine-tuning, and adjust the settings to match the 2.0.0 repo. The quality is much better, but still worse than the original...

I couldn't figure out why; maybe you calculate the latents from the reference differently, or I am missing some settings that differ between the implementations...

@elvinzade you also need to change the tokenizer size in config.json and in the Python files where it is referenced; just search for the number in all files and change it to the value corresponding to your Coqui model repo version on Hugging Face (see the sketch below).
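
A rough sketch of that search-and-replace, assuming the converted model lives in a converted_model/ directory (the token counts come from the tracebacks in this thread, so substitute the ones for your versions):

```python
from pathlib import Path

OLD, NEW = "6681", "6153"  # v2.0.3 count -> v2.0.2 count; adjust for your repos

for path in Path("converted_model").rglob("*"):
    if path.is_file() and path.suffix in {".json", ".py"}:
        text = path.read_text()
        if OLD in text:
            path.write_text(text.replace(OLD, NEW))
            print(f"patched {path}")
```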

@scruffynerf
Author

I don't believe it's quite a drop-in 'downgrade' to go backwards.

@elvinzade

@C00reNUT thanks, I appreciate the suggestion regarding the Coqui model. However, I trained a new language that is not among the languages in the official Coqui model repository, and in the process I extended the tokenizer vocabulary to accommodate it. The model should work seamlessly with the extended tokenizer and support the newly added language; if I shrink the tokenizer back to the stock size, I lose the newly added language.
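
To illustrate the problem: the only purely mechanical way to reconcile the sizes is to pad or truncate the three token-dependent tensors, and the sketch below shows why that can't recover the new language, since any padded rows are untrained. Key names follow the tracebacks above; this is not part of Auralis or the converter.

```python
import torch

def resize_text_rows(state, new_vocab):
    """Pad or truncate the token-dependent tensors to a new vocabulary size."""
    for key in ("text_embedding.weight", "text_head.weight", "text_head.bias"):
        old = state[key]
        new = torch.zeros((new_vocab, *old.shape[1:]), dtype=old.dtype)
        rows = min(old.shape[0], new_vocab)
        # Copy over the trained rows; anything beyond them stays zero,
        # i.e. carries no information about the newly added tokens.
        new[:rows] = old[:rows]
        state[key] = new
    return state
```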
