Add support for decoding input with ffmpeg (Linux) #2133

Merged: 1 commit, May 21, 2024

WilliamTambellini
Contributor

WIP: for early review only. Do not merge.

  • add cmake option to build with ffmpeg
  • search for ffmpeg at cmake time
  • include ffmpeg/libav headers in main.cpp

Remaining todos:

  • if needed, convert the input file to 16 kHz WAV in main.cpp

@WilliamTambellini
Contributor Author

@ggerganov @slaren could you please do an early review before I move on? Best, WT.

@ggerganov
Owner

We can add the FindFFmpeg.cmake script in the cmake folder and use it to find ffmpeg libs

Probably the conversion functionality should be implemented in common.cpp so that it can be reused by all examples, not just main

@WilliamTambellini
Contributor Author

WilliamTambellini commented May 13, 2024

> We can add the FindFFmpeg.cmake script in the cmake folder and use it to find ffmpeg libs

done

> Probably the conversion functionality should be implemented in common.cpp so that it can be reused by all examples, not just main

ok

@WilliamTambellini
Contributor Author

@ggerganov
reready for review, tks

@WilliamTambellini
Contributor Author

@petterreinholdtsen review please

@WilliamTambellini changed the title from "Add support for decoding input with ffmpeg in main (Linux)" to "Add support for decoding input with ffmpeg (Linux)" on May 16, 2024
@WilliamTambellini
Contributor Author

@arthw review please

CMakeLists.txt (outdated, resolved)
examples/CMakeLists.txt (outdated, resolved)
examples/common.cpp (outdated, resolved)
Owner

Change the underscore in the filename to a dash for consistency: ffmpeg-transcode.cpp

Contributor Author

done

Owner

Hm, still uses underscore -> ffmpeg_transcode.cpp

@WilliamTambellini
Contributor Author

@ggerganov retouched. Reready for final review.

@ggerganov
Owner

Hm, I think you didn't push the correct revision - I don't see any changes since last time

- search for ffmpeg libs/headers at cmake time
- added ffmpeg-transcode.cpp into libcommon if ffmpeg on
- hooked ffmpeg transcoding in common read_wav(...)
- passed test:
./main -m ggml-base.en.bin -f samples/jfk.mp3
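
The commit message above says the ffmpeg transcoder is hooked into the shared read_wav(...) and only compiled in when ffmpeg is enabled. One plausible shape for that decision is sketched below. It is not the code merged in this PR: `has_tag`, `looks_like_wav`, `AudioRoute` and `choose_route` are hypothetical names, and the merged code may pick its path differently (for instance by trying the WAV parser first and falling back when it fails). Only the RIFF/WAVE magic bytes are standard.

```cpp
#include <cstddef>

// Hypothetical names throughout; this is an illustration, not the PR's code.

// True when the 4 bytes at p match the 4-character tag.
constexpr bool has_tag(const char * p, const char (&tag)[5]) {
    return p[0] == tag[0] && p[1] == tag[1] && p[2] == tag[2] && p[3] == tag[3];
}

// A RIFF/WAVE file starts with "RIFF", a 4-byte chunk size, then "WAVE".
constexpr bool looks_like_wav(const char * data, std::size_t size) {
    return size >= 12 && has_tag(data, "RIFF") && has_tag(data + 8, "WAVE");
}

enum class AudioRoute {
    NativeWav,        // parse the buffer directly as WAV
    FfmpegTranscode,  // hand the file to the ffmpeg-based transcoder first
    Rejected,         // not WAV, and ffmpeg support was not compiled in
};

// ffmpeg_enabled stands in for the compile-time switch set by the CMake option.
constexpr AudioRoute choose_route(const char * data, std::size_t size, bool ffmpeg_enabled) {
    return looks_like_wav(data, size) ? AudioRoute::NativeWav
         : ffmpeg_enabled             ? AudioRoute::FfmpegTranscode
                                      : AudioRoute::Rejected;
}
```

With this shape, an input such as samples/jfk.mp3 would only reach the transcoder in a build configured with ffmpeg support, and existing WAV inputs would keep their current path.
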
@WilliamTambellini
Contributor Author

WilliamTambellini commented May 21, 2024

Oops, indeed @ggerganov. Just pushed the latest retouches.

@ggerganov ggerganov merged commit 1b51fdf into ggerganov:master May 21, 2024
48 checks passed
@WilliamTambellini
Contributor Author

Thanks @ggerganov.
Any chance of a new minor release soon (e.g. 1.6.1)?

@ggerganov
Owner

done

@data-man

Unfortunately, this cannot be built with FFmpeg 7.0.

jiahansu pushed a commit to WiseSync/whisper.cpp that referenced this pull request May 28, 2024
…nov#2133)

- search for ffmpeg libs/headers at cmake time
- added ffmpeg-transcode.cpp into libcommon if ffmpeg on
- hooked ffmpeg transcoding in common read_wav(...)
- passed test:
./main -m ggml-base.en.bin -f samples/jfk.mp3
@clort81

clort81 commented Jun 8, 2024

1b51fdf170714dcdd8fb9cfd02dcee684aac6150 is the first bad commit
commit 1b51fdf170714dcdd8fb9cfd02dcee684aac6150
Author: William Tambellini <[email protected]>
Date:   Tue May 21 08:31:41 2024 -0700

    examples : add support for decoding input with ffmpeg (Linux) (#2133)
    
/pr/Neural/Voice_Recognition_Whispr_GGML/good-whisper.cpp/examples/ffmpeg-transcode.cpp: In function ‘int decode_audio(audio_buffer*, s16**, int*)’:
/pr/Neural/Voice_Recognition_Whispr_GGML/good-whisper.cpp/examples/ffmpeg-transcode.cpp:207:5: error: ‘av_register_all’ was not declared in this scope
  207 |     av_register_all(); // from avformat. Still a must-have call for ffmpeg v3! (can be skipped for later versions)
      |     ^~~~~~~~~~~~~~~

ffmpeg                                        7:5.1.4-0+deb12u1 

devuan linux

@Displacer

1b51fdf170714dcdd8fb9cfd02dcee684aac6150 is the first bad commit
commit 1b51fdf170714dcdd8fb9cfd02dcee684aac6150
Author: William Tambellini <[email protected]>
Date:   Tue May 21 08:31:41 2024 -0700

    examples : add support for decoding input with ffmpeg (Linux) (#2133)
    
/pr/Neural/Voice_Recognition_Whispr_GGML/good-whisper.cpp/examples/ffmpeg-transcode.cpp: In function ‘int decode_audio(audio_buffer*, s16**, int*)’:
/pr/Neural/Voice_Recognition_Whispr_GGML/good-whisper.cpp/examples/ffmpeg-transcode.cpp:207:5: error: ‘av_register_all’ was not declared in this scope
  207 |     av_register_all(); // from avformat. Still a must-have call for ffmpeg v3! (can be skipped for later versions)
      |     ^~~~~~~~~~~~~~~

ffmpeg                                        7:5.1.4-0+deb12u1 

devuan linux

can probably be fixed with:

#if LIBAVFORMAT_VERSION_MAJOR < 56
av_register_all(); // from avformat. Still a must-have call for ffmpeg v3! (can be skipped for later versions)
#endif

Change 56 to the correct libavformat major version. media-video/ffmpeg-4.4.4 seems to ship version 56 (but I am not sure).
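
For picking the threshold in a guard like the one above: as far as the FFmpeg API change log goes, av_register_all() was deprecated in libavformat 58.9.100 (FFmpeg 4.0) and removed in libavformat 59 (FFmpeg 5.0), which matches the build failure reported against ffmpeg 5.1.4. A minimal sketch of the comparison follows. `pack_version`, `kRegisterAllDeprecated` and `needs_register_all` are hypothetical helpers written for illustration; only `AV_VERSION_INT`, `LIBAVFORMAT_VERSION_INT` and `av_register_all` are FFmpeg names, and they appear only in the comment.

```cpp
// Hypothetical stand-in for FFmpeg's AV_VERSION_INT(a, b, c) packing:
// major << 16 | minor << 8 | micro.
constexpr int pack_version(int major, int minor, int micro) {
    return (major << 16) | (minor << 8) | micro;
}

// libavformat version in which av_register_all() became a deprecated no-op.
constexpr int kRegisterAllDeprecated = pack_version(58, 9, 100);

// True when a libavformat of the given version still requires the call.
constexpr bool needs_register_all(int major, int minor, int micro) {
    return pack_version(major, minor, micro) < kRegisterAllDeprecated;
}

// The equivalent preprocessor guard in a file that includes
// <libavformat/avformat.h> would read:
//
//   #if LIBAVFORMAT_VERSION_INT < AV_VERSION_INT(58, 9, 100)
//       av_register_all();
//   #endif

static_assert(needs_register_all(57, 0, 0),  "libavformat 57 (FFmpeg 3.x) needs the call");
static_assert(!needs_register_all(58, 9, 100), "deprecated from this version on");
static_assert(!needs_register_all(59, 0, 0), "libavformat 59 (FFmpeg 5.x) no longer declares it");
```

Comparing the full packed version rather than only the major number keeps the call for early libavformat 58 snapshots that predate the deprecation. A plain `LIBAVFORMAT_VERSION_MAJOR < 58` check would also compile on FFmpeg 5 and later.
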

bygreencn added a commit to bygreencn/whisper.cpp that referenced this pull request Aug 9, 2024
* tag 'v1.6.2':
  release : v1.6.2
  Revert "whisper : remove extra backend instance (huh?)" (ggerganov#2182)
  server : fix typo (ggerganov#2181)
  ruby : update bindings (ggerganov#2154)
  release : v1.6.1
  examples : add support for decoding input with ffmpeg (Linux) (ggerganov#2133)
  node : add flash_attn param (ggerganov#2170)
  ci: Update build.yml to suppress warnings about node.js versions (ggerganov#2166)
  release : v1.6.0
  whisper : use flash attention (ggerganov#2152)
  talk-llama : reject runs without required arguments (ggerganov#2153)
  sync : ggml
  metal : support FA without mask + add asserts (llama/7278)
  ggml : add RPC backend (llama/6829)
  rm wait() (llama/7233)
  CUDA: add FP32 FlashAttention vector kernel (llama/7188)
  scripts : sync ggml-rpc
iThalay pushed a commit to iThalay/whisper.cpp that referenced this pull request Sep 23, 2024
…nov#2133)

- search for ffmpeg libs/headers at cmake time
- added ffmpeg-transcode.cpp into libcommon if ffmpeg on
- hooked ffmpeg transcoding in common read_wav(...)
- passed test:
./main -m ggml-base.en.bin -f samples/jfk.mp3