
Fix DTW memory access #2012

Merged 2 commits on Apr 9, 2024

Conversation

@bradmurray-dt (Contributor) commented Apr 1, 2024

The current release with DTW support can result in memory corruption due to out-of-bounds reads and writes. This is an attempt to correct that.

To verify the problem, enable AddressSanitizer (-fsanitize=address) in both the compiler and linker flags, then run the main example with DTW enabled. I tested using the small model: ./main -m ggml-small.bin -dtw small samples/jfk.wav

With no code or compiler changes (compiled using make on an M1 MacBook Pro), the command runs but ends in a segfault.

With the address sanitizer enabled, execution stops before transcription:

==74546==ERROR: AddressSanitizer: heap-use-after-free on address 0x000107303bd4 at pc 0x0001030a32a4 bp 0x00016cf944f0 sp 0x00016cf944e8
WRITE of size 4 at 0x000107303bd4 thread T0
    #0 0x1030a32a0 in whisper_init_state+0x2158 (main:arm64+0x10023b2a0)
    #1 0x1030bce34 in whisper_init_from_file_with_params+0xe4 (main:arm64+0x100254e34)
    #2 0x102e7e3c4 in main+0xe30 (main:arm64+0x1000163c4)
    #3 0x1861320dc  (<unknown module>)

0x000107303bd4 is located 20 bytes inside of 340-byte region [0x000107303bc0,0x000107303d14)

With this change and the address sanitizer enabled, the transcription completes and there are no errors.

@denersc Can you confirm that this fix matches your intent here?

@denersc (Contributor) commented Apr 1, 2024

Hey @bradmurray-dt, thanks for the catch! I was able to reproduce it here. I'm surprised that multiplication didn't break my previous tests.

Indeed, that line is wrong. I think the correction you made might also be incorrect: when we do mask_data.data() + pos, mask_data.data() is a pointer to float, so pos gets multiplied by sizeof(float) one more time during the sum because of pointer arithmetic.

I've reworked the code segment to remove this not-so-clear pointer arithmetic and to avoid memcpy, since we are only setting one float at a time anyway. I ran it with the sanitizer and it seems OK. Could you apply this diff on your end and see if it fixes things for you?

diff --git a/whisper.cpp b/whisper.cpp
index d50c788..2f810d1 100644
--- a/whisper.cpp
+++ b/whisper.cpp
@@ -1148,26 +1148,31 @@ static bool aheads_masks_init(
     }
 
     // Set data on mask tensors
-    // Since this must be backend agnostic, we get tensor data with
-    // ggml_backend_tensor_get, copy our desired values and send it back
-    // to backend with ggml_backend_tensor_set
+    // Since this must be backend agnostic, we write our desired values on mask_data,
+    // and send it to backend with ggml_backend_tensor_set.
+    // Each mask in N_HEADS*N_ALIGNMENT_HEADS, one per text layer containing alignment
+    // heads. Each row of the mask "marks" one alignment head. E.g. if some text layer
+    // has a total of 10 heads and of those, heads 0,5,6 are alignment heads, the mask
+    // should read:
+    // 1 0 0 0 0 0 0 0 0 0
+    // 0 0 0 0 0 1 0 0 0 0
+    // 0 0 0 0 0 0 1 0 0 0
     std::vector<float> mask_data;
     for (int64_t il = 0; il < n_text_layer; ++il) {
         if (aheads_masks.m[il] != nullptr) {
             auto aheads = get_alignment_heads_by_layer(cparams, il, n_text_layer, n_head);
 
-            size_t data_size = aheads_masks.m[il]->ne[0] * aheads_masks.m[il]->ne[1] * sizeof(float);
+            size_t data_size = aheads_masks.m[il]->ne[0] * aheads_masks.m[il]->ne[1];
+            size_t data_size_bytes = data_size * sizeof(float);
             mask_data.resize(data_size);
-            ggml_backend_tensor_get(aheads_masks.m[il], mask_data.data(), 0, data_size);
-            memset(mask_data.data(), 0, data_size);
 
+            std::fill(mask_data.begin(), mask_data.end(), 0);
             for (size_t ih = 0; ih < aheads.size(); ++ih) {
-                size_t pos = (aheads[ih] + (ih * aheads_masks.m[il]->ne[0] * aheads[ih]));
-                float v = 1.0f;
-                memcpy(mask_data.data() + pos, &v, sizeof(float));
+                size_t pos = (aheads[ih] + (ih * aheads_masks.m[il]->ne[0]));
+                mask_data[pos] = 1.0f;
             }
 
-            ggml_backend_tensor_set(aheads_masks.m[il], mask_data.data(), 0, data_size);
+            ggml_backend_tensor_set(aheads_masks.m[il], mask_data.data(), 0, data_size_bytes);
         }
     }
 

I've additionally added some more comments about those masks (how they should look), so if anybody needs to tinker with this in the future, it will be clearer what that piece of code is trying to achieve.

@denersc (Contributor) commented Apr 1, 2024

Just adding the same diff as a commit that can be applied to your branch, since I pasted the previous one against whisper.cpp main:

0001-Fix-mistakes-on-aheads_masks_init-that-caused-memory.txt

@bradmurray-dt (Contributor, Author)

Just adding the same diff as a commit that can be applied to your branch, since I pasted the previous one against whisper.cpp main:

0001-Fix-mistakes-on-aheads_masks_init-that-caused-memory.txt

Thanks! I'm going to run some tests with the above change and I'll let you know if I find anything.

@bradmurray-dt (Contributor, Author)

I ran a fair bit of audio through this with address sanitization enabled. I'm not seeing any memory errors, and the results look better in a few places. @denersc If it works for you, I'll merge in your version and mark this as ready for review.

@denersc (Contributor) commented Apr 3, 2024

Happy to know!

If it works for you, I'll merge in your version and mark this as ready for review.

Sure, go ahead 😀

@rotemdan (Contributor) commented Apr 5, 2024

I'm currently testing the --dtw option on Windows 11 x64.

For the larger models like small and medium, I'm experiencing intermittent termination of the program during startup (before anything is loaded), but apparently only when --dtw is specified.

There is no error message. I don't know if this is related to this particular pull request.

The version I'm using is a debug build of a clone of the current repository state, compiled with Visual Studio 2022.

Here's the command line that triggers the problem:

main --debug-mode --file timit1.wav --model models/ggml-medium.en.bin --dtw medium.en

Here's how far the output gets some of the time (other times it continues and succeeds):

whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-medium.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head  = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1024
whisper_model_load: n_text_head   = 16
whisper_model_load: n_text_layer  = 24
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 4 (medium)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:      CPU total size =  1533.14 MB
whisper_model_load: model size    = 1533.14 MB
whisper_init_state: kv self size  =  132.12 MB
whisper_init_state: kv cross size =  147.46 MB

And here it terminates abruptly.

There's no visible error message, which is strange even with a debug build and debug mode enabled.

For that reason I thought I'd post it here instead of opening a new issue, also because it sometimes works fine and sometimes doesn't. I'm seeing the same issue with the large and small models, but more rarely with base and tiny.

@bradmurray-dt (Contributor, Author)

@rotemdan This is extremely likely to be related to the memory issues in the current release. I'll push the previous changes and convert this to a regular PR in the morning. I have been running a large number of tests with multiple models, with this update and address sanitization enabled, and the issue appears solved.

@rotemdan (Contributor) commented Apr 6, 2024

I've applied the patch and am now continuously testing it with various models and inputs.

So far I have not seen the issue return.

I've confirmed that, for the same models and inputs, the problem often occurs without the patch and stops occurring with it.

bradmurray-dt marked this pull request as ready for review April 6, 2024 15:00
@ggerganov (Owner) left a comment

Thank you guys for analysing and fixing this!

@ggerganov ggerganov merged commit 5275074 into ggerganov:master Apr 9, 2024
jiahansu pushed a commit to WiseSync/whisper.cpp that referenced this pull request Apr 17, 2024
* Fix DTW memory access

* Memory fix - Apply changes from denersc
viktor-silakov pushed a commit to viktor-silakov/whisper_node_mic.cpp that referenced this pull request May 11, 2024
* Fix DTW memory access

* Memory fix - Apply changes from denersc
iThalay pushed a commit to iThalay/whisper.cpp that referenced this pull request Sep 23, 2024
* Fix DTW memory access

* Memory fix - Apply changes from denersc
4 participants