Error with the huggingface hf-speech-bench #1623

Open
valaofficial opened this issue Oct 30, 2023 · 2 comments

@valaofficial

I followed this blog post to fine-tune the Whisper model on a custom dataset, but after training, when I run this command

trainer.push_to_hub(**kwargs)

it throws this error

HTTPError                                 Traceback (most recent call last)
 
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
    260     try:
--> 261         response.raise_for_status()
    262     except HTTPError as e:

 9 frames
HTTPError: 400 Client Error: Bad Request for url: https://huggingface.co/api/models/valacodes/whisper-small-hausa/commit/main

The above exception was the direct cause of the following exception:

BadRequestError                           Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
    297                 f"\n\nBad request for {endpoint_name} endpoint:" if endpoint_name is not None else "\n\nBad request:"
    298             )
--> 299             raise BadRequestError(message, response=response) from e
    300 
    301         # Convert HTTPError into a HfHubHTTPError to display request information

BadRequestError:  (Request ID: Root=1-65271724-79a6b33830e49217395944e2;736a08e9-3998-4e6f-b43e-86df049f04ed)

Bad request for commit endpoint:
"model-index[0].results[0].dataset.config" must be a string

and visiting the hf-speech-bench webpage shows this

```
TypeError: string indices must be integers
Traceback:
File "/home/user/.local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
File "/home/user/app/app.py", line 143, in <module>
    dataframe = get_data()
File "/home/user/.local/lib/python3.10/site-packages/streamlit/runtime/legacy_caching/caching.py", line 715, in wrapped_func
    return get_or_create_cached_value()
File "/home/user/.local/lib/python3.10/site-packages/streamlit/runtime/legacy_caching/caching.py", line 696, in get_or_create_cached_value
    return_value = non_optional_func(*args, **kwargs)
File "/home/user/app/app.py", line 107, in get_data
    for row in parse_metrics_rows(meta):
File "/home/user/app/app.py", line 72, in parse_metrics_rows
    lang = result["dataset"]["args"]["language"]
```
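For anyone reading the second traceback: the last frame indexes `result["dataset"]["args"]["language"]`, and a `TypeError` about string indices is what Python raises when a `str` is indexed with a string key. So presumably one of those levels (possibly `args`) is a plain string in the model card metadata rather than a dict. The app's real code is not shown here beyond that one line, so the following is only a minimal sketch of a defensive lookup; `safe_language` and the sample values are hypothetical.

```python
from typing import Any, Optional


def safe_language(result: Any) -> Optional[str]:
    """Return result["dataset"]["args"]["language"], or None if any level is not a dict."""
    node = result
    for key in ("dataset", "args", "language"):
        # Stop as soon as a level is missing or is not a mapping (e.g. a plain string).
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node if isinstance(node, str) else None


# Illustrative metadata shapes (assumed, not taken from the real model card).
good = {"dataset": {"args": {"language": "ha"}}}
bad = {"dataset": {"args": "config: ha, split: test"}}  # "args" is a string here

print(safe_language(good))
print(safe_language(bad))
```

With the `bad` shape, the direct lookup `bad["dataset"]["args"]["language"]` raises a `TypeError`, while the helper returns `None` so a row like that could be skipped instead of crashing the page.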
valaofficial changed the title from "Error with the whisper ASR model" to "Error with the huggingface hf-speech-bench" on Oct 30, 2023
@ahmed8047762

I am having the same issue.

@thiagobarbosa

thiagobarbosa commented Jan 15, 2024

I was having the same issue, and the cause was something that looked unrelated: apparently there's a bug when fetching the dataset metadata if you use multiple processes in any function applied to the dataset (like map or filter).
That ended up breaking how push_to_hub parses the dataset information.
The discussion happened here, and there's a PR on the way.

In the meantime, it's working for me when I use num_proc=1 on my maps and filters. I'm not sure whether it also works if you remove the dataset information altogether from the **kwargs passed to trainer.push_to_hub(**kwargs), so you can give that a shot as well.
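A rough sketch of that second idea, for anyone who wants to try it. The key names below (`dataset`, `dataset_tags`, `dataset_args`) are my assumption about which model-card fields the kwargs carry, and the values are made up; `without_dataset_metadata` is a hypothetical helper, not part of any library. Whether dropping these fields actually avoids the 400 error has not been confirmed in this thread.

```python
from typing import Any, Dict


def without_dataset_metadata(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs without the dataset-related model-card fields."""
    dataset_keys = {"dataset", "dataset_tags", "dataset_args"}  # assumed key names
    return {k: v for k, v in kwargs.items() if k not in dataset_keys}


# Hypothetical kwargs, for illustration only.
kwargs = {
    "dataset_tags": "my-org/my-dataset",
    "dataset": "My Custom Dataset",
    "dataset_args": "config: ha, split: test",
    "language": "ha",
    "model_name": "Whisper Small Hausa",
    "finetuned_from": "openai/whisper-small",
    "tasks": "automatic-speech-recognition",
}

slim_kwargs = without_dataset_metadata(kwargs)
print(sorted(slim_kwargs))

# Then, with the trainer from your own training run:
# trainer.push_to_hub(**slim_kwargs)
```

The original `kwargs` dict is left untouched, so it is easy to compare a push with and without the dataset fields.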
