Fix DTW memory access #2012
Conversation
Hey @bradmurray-dt, thanks for the catch! I was able to reproduce it here. Surprised that multiplication didn't break my previous tests. Indeed that line is wrong. I think the correction you placed might also be incorrect, so I've reworked the code segment to remove that not-so-clear pointer arithmetic and to stop using memcpy, since we are only setting one float at a time anyway. I ran with the sanitizer and it seems OK. Could you apply this diff on your end and see if it seems fixed?

diff --git a/whisper.cpp b/whisper.cpp
index d50c788..2f810d1 100644
--- a/whisper.cpp
+++ b/whisper.cpp
@@ -1148,26 +1148,31 @@ static bool aheads_masks_init(
}
// Set data on mask tensors
- // Since this must be backend agnostic, we get tensor data with
- // ggml_backend_tensor_get, copy our desired values and send it back
- // to backend with ggml_backend_tensor_set
+ // Since this must be backend agnostic, we write our desired values on mask_data,
+ // and send it to backend with ggml_backend_tensor_set.
+ // Each mask is N_HEADS*N_ALIGNMENT_HEADS, one per text layer containing alignment
+ // heads. Each row of the mask "marks" one alignment head. E.g. if some text layer
+ // has a total of 10 heads and of those, heads 0,5,6 are alignment heads, the mask
+ // should read:
+ // 1 0 0 0 0 0 0 0 0 0
+ // 0 0 0 0 0 1 0 0 0 0
+ // 0 0 0 0 0 0 1 0 0 0
std::vector<float> mask_data;
for (int64_t il = 0; il < n_text_layer; ++il) {
if (aheads_masks.m[il] != nullptr) {
auto aheads = get_alignment_heads_by_layer(cparams, il, n_text_layer, n_head);
- size_t data_size = aheads_masks.m[il]->ne[0] * aheads_masks.m[il]->ne[1] * sizeof(float);
+ size_t data_size = aheads_masks.m[il]->ne[0] * aheads_masks.m[il]->ne[1];
+ size_t data_size_bytes = data_size * sizeof(float);
mask_data.resize(data_size);
- ggml_backend_tensor_get(aheads_masks.m[il], mask_data.data(), 0, data_size);
- memset(mask_data.data(), 0, data_size);
+ std::fill(mask_data.begin(), mask_data.end(), 0);
for (size_t ih = 0; ih < aheads.size(); ++ih) {
- size_t pos = (aheads[ih] + (ih * aheads_masks.m[il]->ne[0] * aheads[ih]));
- float v = 1.0f;
- memcpy(mask_data.data() + pos, &v, sizeof(float));
+ size_t pos = (aheads[ih] + (ih * aheads_masks.m[il]->ne[0]));
+ mask_data[pos] = 1.0f;
}
- ggml_backend_tensor_set(aheads_masks.m[il], mask_data.data(), 0, data_size);
+ ggml_backend_tensor_set(aheads_masks.m[il], mask_data.data(), 0, data_size_bytes);
}
}
I've additionally added some more comments about those masks (how they should look), so if anybody needs to tinker with this in the future it will be clearer what that piece of code is trying to achieve.
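For anyone skimming, here is a minimal standalone sketch of the corrected indexing (illustrative code, not whisper.cpp itself; the head count and alignment-head indices are invented to match the 10-head example in the comment above):

```cpp
#include <cstdio>
#include <vector>

int main() {
    // Hypothetical text layer: 10 attention heads in total,
    // of which heads 0, 5 and 6 are alignment heads.
    const size_t n_head = 10;                     // mask width, ne[0] in the diff
    const std::vector<size_t> aheads = {0, 5, 6}; // one mask row per alignment head

    // Zero-initialized row-major mask: aheads.size() rows of n_head floats.
    std::vector<float> mask_data(aheads.size() * n_head, 0.0f);

    for (size_t ih = 0; ih < aheads.size(); ++ih) {
        // Corrected indexing: row ih, column aheads[ih].
        // The buggy version also multiplied the row stride by aheads[ih],
        // which runs past the end of the buffer for larger head indices.
        const size_t pos = aheads[ih] + ih * n_head;
        mask_data[pos] = 1.0f;
    }

    // Print the mask; it should match the layout in the comment above.
    for (size_t ih = 0; ih < aheads.size(); ++ih) {
        for (size_t j = 0; j < n_head; ++j) {
            std::printf("%.0f ", mask_data[ih * n_head + j]);
        }
        std::printf("\n");
    }
    return 0;
}
```

The key change is that the row offset is just ih * ne[0]; the earlier code also multiplied it by aheads[ih], which put pos out of bounds whenever the alignment-head index was large enough.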
Just adding the same diff as a commit that can be applied to your branch, since I pasted the previous one from whisper.cpp main: 0001-Fix-mistakes-on-aheads_masks_init-that-caused-memory.txt
Thanks! I'm going to run some tests with the above change and I'll let you know if I find anything.
I ran a fair bit of audio through this with address sanitization enabled. I'm not seeing any memory errors, and the results look better in a few places. @denersc If it works for you, I'll merge in your version and mark this as ready for review.
Happy to know!
Sure, go ahead 😀
I'm currently testing the DTW timestamps. For the larger models, the run sometimes terminates abruptly, and there is no error message. I don't know if this is related to this particular pull request. The version I'm using is a debug build of a clone of the current repository state, compiled with Visual Studio 2022. Here's the command line that triggers the problem:
And here's how much output I'm getting part of the time (sometimes it continues and succeeds):
And here it terminates abruptly, with no visible error message, which is strange even for a debug build with debug mode enabled. For that reason, and because it sometimes works fine and sometimes doesn't, I thought I'd post it here instead of opening a new issue. I'm seeing the same issue with the … model.
@rotemdan This is extremely likely to be related to the memory issues in the current release. I'll push the previous changes and convert this to a regular PR in the morning. I have been running a large number of tests with multiple models with this update and address sanitization enabled, and the issue appears solved.
I've applied the patch and am now continuously testing it with various models and inputs. So far I have not seen the issue return. I've confirmed that, for the same models and inputs, the problem often occurs without the patch and stops occurring with it.
Thank you guys for analysing and fixing this!
* Fix DTW memory access
* Memory fix - Apply changes from denersc
The current release with DTW support can result in memory corruption due to out-of-bounds reading/writing. This is an attempt to correct this.

To verify the problem, enable address sanitization (-fsanitize=address) in both compilers and linkers, and run the main example with DTW enabled. I tested using the small model:

./main -m ggml-small.bin -dtw small samples/jfk.wav

With no code changes or compiler changes (compiled using make on an M1 MBP), the command runs but ends in a segfault. With the address sanitizer enabled, execution stops before transcription. With this change and the address sanitizer enabled, the transcription completes and there are no errors.
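As a sketch of those verification steps on a Unix-like setup (the flag wiring is an assumption; depending on the project's Makefile you may need to append the flags there, or use a CMake sanitizer option instead):

```sh
# Rebuild with AddressSanitizer enabled for both compiling and linking.
# Note: overriding CFLAGS/CXXFLAGS on the command line may drop the
# Makefile's own flags; appending inside the Makefile is the safer route.
make clean
make CFLAGS="-fsanitize=address -g" CXXFLAGS="-fsanitize=address -g" LDFLAGS="-fsanitize=address"

# Run the main example with DTW enabled, as in the description above.
./main -m ggml-small.bin -dtw small samples/jfk.wav
```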
@denersc Can you confirm that this fix matches your intent here?