Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fixed bad memory access exception on ios 17 #3527

Closed
wants to merge 9 commits into from

Conversation

l3utterfly
Copy link
Contributor

On iOS 17 (tested on iPhone 15 Pro), I keep getting Bad Memory Access exception everywhere.

Digging deeper, the changes in this PR fixes it.

To be honest, I have no idea why it broke in the first place. It seems we can't use the struct constructor shorthand?

Writing the construction out by manually allocating memory, then zero-ing out the arrays fixes this. Tested on iPhone 12-15.

Perhaps someone more knowledgeable than me on memory management can shed some light on this.

@staviq
Copy link
Contributor

staviq commented Oct 7, 2023

( @cebtenzzre )

That looks like a magic number to me:

struct ggml_tensor * allocated_tensors[1024];

Whether it's related or not, I don't think it's supposed to be there, in this form.

@slaren
Copy link
Collaborator

slaren commented Oct 7, 2023

@staviq that buffer is only used for debugging, and it is perfectly fine to leave it at a fixed size.

Regarding this PR, I don't think that this is a reasonable approach to solve this. We need to understand what is the cause of the issue.

@l3utterfly
Copy link
Contributor Author

The only thing I can think of right now is the original method uses a compound literal assignment. This creates a temporary variable on the stack (maybe? see: https://stackoverflow.com/questions/70689138/compound-literals-in-c-do-they-create-duplicate-copies).

When debugging through the code, I see the fixed arrays are relatively large (hash table is fixed at 32771 length, with each element containing a ggml_tensor). Could an implementation detail on iOS stops the optimisation and makes it so a temporary variable is created on the stack first then copied to the heap?

Stack size is limited on iPhone per thread, so I can see some issues if this was duplicated on the stack first.

@slaren
Copy link
Collaborator

slaren commented Oct 7, 2023

That looks like could be the case. I would expect the compiler to optimize away the copies in the stack and write directly to the heap, but we cannot rely on that. The source of the issue is the increase of GGML_MAX_NODES to support training, but I think we need a better way to do this. There was some discussion about possible solutions for supporting larger graphs in ggerganov/ggml#467.

@jhen0409
Copy link
Collaborator

jhen0409 commented Oct 7, 2023

I found the problem is also happened on the server example with -O0 (LLAMA_DEBUG=1), so I think it is not a iOS specific issue.

@staviq
Copy link
Contributor

staviq commented Oct 8, 2023

I found the problem is also happened on the server example with -O0 (LLAMA_DEBUG=1), so I think it is not a iOS specific issue.

You mean it happens with server example on not iOS? I mean, people run debug builds all the time, I wouldn't dismiss platform specific problem just yet.

Can you post a bit more context for this server problem ?

@jhen0409
Copy link
Collaborator

jhen0409 commented Oct 8, 2023

You mean it happens with server example on not iOS? I mean, people run debug builds all the time, I wouldn't dismiss platform specific problem just yet.

Can you post a bit more context for this server problem ?

To be clarify, the problem should be -O0 within a thread (It should be similar to what was discussed above), so I mentioned this can also reproduce in server example on Mac. The swift package currently should use -O0 (default) as well, so I think this is why it was first defined as an iOS-specific issue.

We should used -O3 -DNDEBUG by default on make/cmake, not debug builds.

@l3utterfly
Copy link
Contributor Author

l3utterfly commented Oct 8, 2023 via email

@jhen0409
Copy link
Collaborator

jhen0409 commented Oct 8, 2023

This issue was observed even on TestFlight build. Those should be compiled with the release flags.

Is it possible to override the swift package unsafe flags by project? or have you already used -O3 on a fork? Otherwise it should always use -O0 even on release builds.

@l3utterfly
Copy link
Contributor Author

l3utterfly commented Oct 8, 2023 via email

@l3utterfly
Copy link
Contributor Author

The swift package currently should use -O0 (default) as well

Can you show me where this is defined? I looked in the swift package, and my xcode project settings, the release build has the normal -Os optimisations enabled.

@jhen0409
Copy link
Collaborator

jhen0409 commented Oct 8, 2023

The swift package currently should use -O0 (default) as well

Can you show me where this is defined? I looked in the swift package, and my xcode project settings, the release build has the normal -Os optimisations enabled.

The O0 is a default build setting, and currently the swift package is not defined any unsafe flag for O. As I know the root project compile flags will not apply to the package dependencies, but I'm not very sure.

We can add -O3 unsafe flag to the swift package and I think it should be.

EDIT: Also, I tested add llama_save_session_file in server.cpp, it also have this problem even with -O3. (The problem with -O0 is for llama_decode)

l3utterfly added a commit to l3utterfly/llama.cpp that referenced this pull request Oct 12, 2023
@l3utterfly
Copy link
Contributor Author

Further testing confirms adding -O3 flags fixes the memory issues in ggml.c and ggml-alloc.. However, save/load session still does not work.

This PR's changes to llama.cpp fixes the issue (tested on iPhone 15, iOS 17.0.3).

@ggerganov
Copy link
Owner

However, save/load session still does not work.

What is the error that you get?

@l3utterfly
Copy link
Contributor Author

l3utterfly commented Nov 3, 2023

However, save/load session still does not work.

What is the error that you get?

@ggerganov The app crashes with no errors. This could be due to the fact it's built in release mode (with the -O3 flag).

This PR is tested and confirmed to fix the issue.

The changes in this PR tentatively points to the fact that using the default constructor for ggml_cgraph constructs the variable on the stack because of the compile time defined array sizes in the struct fields (which are too big for the iOS). Re-writing it to manually alloc on the heap seems to fix this issue.

@ggerganov
Copy link
Owner

Can you test #3912 and see if it works there? The cgraph stack allocation has been reimplemented to work on the heap. If it works, then you'll just have to wait to get #3912 merged

@l3utterfly
Copy link
Contributor Author

Sure, will check. Looking at the code in the #3912, it should fix this issue. So I'll wait for the PR to be merged.

@l3utterfly l3utterfly closed this Nov 30, 2023
@l3utterfly l3utterfly deleted the mem-fix-ios17 branch November 30, 2023 07:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants