[SYCL] Segmentation fault after #5411 (#5469)
Comments
Can confirm that I too got a segfault when built with -DLLAMA_SYCL_F16=ON. I will rebuild with it OFF and report if it fails too.
Segfaults even without that option.
@akarshanbiswas @chsasank could you please re-try with this branch: https://github.com/abhilash1910/llama.cpp/tree/fix_sycl_arc (branch: fix_sycl_arc) and let me know if this addresses the issue?
@abhilash1910 Nope. Still crashing with segmentation fault with or without -DLLAMA_SYCL_F16=ON. Here is what I can get:
sycl support is broken otherwise. See upstream issue: ggerganov/llama.cpp#5469 Signed-off-by: Ettore Di Giacinto <[email protected]>
I can confirm here. JFYI, pinning to commit f026f81 seems to work for me (tested with an Intel Arc A770).
@abhilash1910 that commit fails here with a core dump.
Got better backtrace this time:
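As an aside, one way to capture a readable backtrace from such a crash is to open the core dump in gdb and print the full trace. A minimal sketch, assuming systemd-coredump is collecting dumps and the binary was built with debug symbols:

# open the most recent core dump for the crashed binary
coredumpctl gdb main
# inside gdb, print the full backtrace with local variables
(gdb) bt full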
Thanks for the traceback. As @mudler confirmed, f026f81 seems to build correctly and it already includes #5411. For the time being I would recommend rolling back to that commit until a fix is applied.
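Rolling back amounts to checking out that commit before rebuilding; a rough sketch (remote name and build steps depend on the local setup):

# fetch and pin the working tree to the last known-good commit
git fetch origin
git checkout f026f81
# then rebuild llama.cpp as usual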
@abhilash1910 Yes, it builds correctly but ends up in a segfault with or without -DLLAMA_SYCL_F16=ON. If you need any help in testing, please do ping me.
That is quite weird. It actually works here, it does not only build. If you want to try to reproduce, this is the LocalAI container image with llama.cpp pinned at f026f81. You can run phi-2 configured for SYCL (f32) with:

docker run -e DEBUG=true -ti -v $PWD/models:/build/models -p 8080:8080 -v /dev/dri:/dev/dri quay.io/go-skynet/local-ai@sha256:c6b5dfaff64c24a02f1be8f8e1cb5c0837b130b438753e49b349d70e3d6d1916 https://gist.githubusercontent.com/mudler/103de2576a8fd4b583f9bd53f4e4cefd/raw/9181d4add553326806b8fdbf4ff0cd65d2145bff/phi-2-sycl.yaml

To test it:

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "phi-2",
  "messages": [{"role": "user", "content": "How are you doing?", "temperature": 0.1}]
}'

To double-check the version you can run in the container:
I am actually running this in Kubernetes; any images from master are pinned to that commit. Leaving my deployment here for reference as well:

apiVersion: v1
kind: Namespace
metadata:
  name: local-ai
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: models-pvc
  namespace: local-ai
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: local-ai
  namespace: local-ai
  labels:
    app: local-ai
spec:
  selector:
    matchLabels:
      app: local-ai
  replicas: 1
  template:
    metadata:
      labels:
        app: local-ai
      name: local-ai
    spec:
      containers:
        - env:
            - name: DEBUG
              value: "true"
          name: local-ai
          args:
            # phi-2 configuration
            - https://gist.githubusercontent.com/mudler/103de2576a8fd4b583f9bd53f4e4cefd/raw/9181d4add553326806b8fdbf4ff0cd65d2145bff/phi-2-sycl.yaml
          image: quay.io/go-skynet/local-ai:master-sycl-f32-core
          imagePullPolicy: Always
          resources:
            limits:
              gpu.intel.com/i915: 1
          volumeMounts:
            - name: models-volume
              mountPath: /build/models
      volumes:
        - name: models-volume
          persistentVolumeClaim:
            claimName: models-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: local-ai
  namespace: local-ai
spec:
  selector:
    app: local-ai
  type: NodePort
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
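As a usage note, the manifests above can be applied with the standard kubectl workflow (the filename below is just a placeholder):

# apply the Namespace, PVC, Deployment and Service, then watch the pod start
kubectl apply -f local-ai-sycl.yaml
kubectl -n local-ai get pods -w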
@mudler could you please try with this branch and let me know if it fixes the segfault issue? If not, then there are changes in the ggml backend which may have caused this.
@mudler Yes, it is possible. I tried running with -ngl 0 and it worked. From what I deduce from the backtrace, the error seems to happen because the buft object or its associated context (ctx) is not properly initialized or contains invalid data, which leads to the segfault. Honestly, my knowledge in these areas is rusty; @abhilash1910 dada may know better. :)
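For reference, that test amounts to passing -ngl 0 (--n-gpu-layers 0) so no layers are offloaded to the SYCL device; a rough sketch, with the model path and prompt as placeholders:

# run entirely on the CPU by offloading zero layers to the GPU
./build/bin/main -m models/phi-2.Q4_0.gguf -p "How are you doing?" -ngl 0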
I too am getting the segfault. I don't know how to help though... I can test things if there is something to try.
@akarshanbiswas @mudler @channeladam Could you please try to build from the latest master and see if that works? Thanks.
@abhilash1910 Still fails at the same location:
Thanks @akarshanbiswas, could you again try building with: https://github.com/abhilash1910/llama.cpp/tree/fix_sycl_arc
Core dump backtrace (of your fork fix_sycl_arc):
Thanks @channeladam @akarshanbiswas, could you re-try the same branch if possible? Thanks.
@abhilash1910 This time the build failed
Yes @akarshanbiswas, the recent build error was an obvious mistake on my part; could you re-try with the latest commit on the branch? Thanks. It might throw some other exception, but the results shown above rule out the possibility of it arising in the SYCL code. It might be coming from another forced typecast inside the core headers.
@abhilash1910 It failed again with a different error this time.
@akarshanbiswas @channeladam please try #5624 and let us know if the issue persists.
Works for me.
Actually, with #5624 I am getting another crash now upon usage (the previous crash was on startup).
Backtrace:
Seems unrelated to SYCL (although symbols aren't properly loaded here). Please open a new issue.
I can confirm that it works here too, just tested with my Arc A770 against 201294a - thanks @abhilash1910 @airMeng!
Does master have these fixes now or do I still need to use this specific commit?
It's in the master already. For the new crash, you can open a new issue.
System: Arch Linux
CPU: Intel i3 12th gen
GPU: Intel Arc A750
RAM: 16GB
llama.cpp version: b2134
Previously the build was failing with -DLLAMA_SYCL_F16=ON, which has been fixed in #5411. Upon running this build, it crashes with a segmentation fault. Logs:
The build without -DLLAMA_SYCL_F16=ON works.
Confirmed: this crash started happening after #5411.
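For context, the two configurations being compared were built roughly as follows (a sketch based on the SYCL build instructions of that period; exact flags and compiler names may differ for your setup):

# set up the oneAPI environment first
source /opt/intel/oneapi/setvars.sh
mkdir -p build && cd build
# crashing configuration: SYCL with FP16 enabled
cmake .. -DLLAMA_SYCL=ON -DLLAMA_SYCL_F16=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
# working configuration: SYCL without FP16
cmake .. -DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
cmake --build . --config Release -j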