Aborted (core dumped) in jt.randn/rand/random #612

x0w3n opened this issue Dec 3, 2024 · 0 comments

x0w3n commented Dec 3, 2024

Describe the bug

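A segfault followed by "Aborted (core dumped)" is triggered when a tensor created by jt.randn, jt.rand, or jt.random with shape (sys.maxsize,) is evaluated, for example by printing it. The stack trace points to jittor::RandomOp::jit_run on the CPU path.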

Full Log

[i 1203 09:33:28.211986 80 compiler.py:956] Jittor(1.3.9.10) src: /home/miniconda3/envs/jittor/lib/python3.9/site-packages/jittor
[i 1203 09:33:28.220052 80 compiler.py:957] g++ at /usr/bin/g++(11.2.0)
[i 1203 09:33:28.220155 80 compiler.py:958] cache_path: /home/.cache/jittor/jt1.3.9/g++11.2.0/py3.9.12/Linux-5.15.0-1x30/INTELRXEONRGOLx51/ef26/default
[i 1203 09:33:28.287008 80 install_cuda.py:93] cuda_driver_version: [12, 2]
[i 1203 09:33:28.300922 80 __init__.py:412] Found /home/.cache/jittor/jtcuda/cuda12.2_cudnn8_linux/bin/nvcc(12.2.140) at /home/.cache/jittor/jtcuda/cuda12.2_cudnn8_linux/bin/nvcc.
[i 1203 09:33:28.711060 80 __init__.py:412] Found gdb(12.0.90) at /usr/bin/gdb.
[i 1203 09:33:28.713648 80 __init__.py:412] Found addr2line(2.38) at /usr/bin/addr2line.
[i 1203 09:33:28.975811 80 compiler.py:1013] cuda key:cu12.2.140_sm_89
[i 1203 09:33:29.862029 80 __init__.py:227] Total mem: 251.50GB, using 16 procs for compiling.
[i 1203 09:33:30.053263 80 jit_compiler.cc:28] Load cc_path: /usr/bin/g++
[i 1203 09:33:30.288767 80 init.cc:63] Found cuda archs: [89,]
Caught segfault at address 0x7fc58fc00000, thread_name: '', flush log...
[i 1203 09:33:31.123382 80 tracer.cc:149] stack trace for pid= 412266
[New LWP 412511]
[New LWP 412512]
[New LWP 412513]
[New LWP 412515]
[New LWP 412516]
[New LWP 412517]
[New LWP 412518]
[New LWP 412595]
[New LWP 412596]
[New LWP 412597]
[New LWP 412598]
[New LWP 412599]
[New LWP 412600]
[New LWP 412601]
[New LWP 412602]
[New LWP 412603]
[New LWP 412604]
[New LWP 412605]
[New LWP 412606]
[New LWP 412607]
[New LWP 412608]
[New LWP 412609]
[New LWP 412610]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fc5cef8e49f in __GI___wait4 (pid=412611, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30      ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
Traceback (most recent call last):
  File "/usr/share/gdb/auto-load/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30-gdb.py", line 60, in <module>
    from libstdcxx.v6 import register_libstdcxx_printers
ModuleNotFoundError: No module named 'libstdcxx'
[Current thread is 1 (Thread 0x7fc5cee9f740 (LWP 412266))]
#0  0x00007fc5cef8e49f in __GI___wait4 (pid=412611, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
#1  0x00007fc5cd6db7f7 in jittor::print_trace() () from /home/.cache/jittor/jt1.3.9/g++11.2.0/py3.9.12/Linux-5.15.0-1x30/INTELRXEONRGOLx51/ef26/default/jit_utils_core.cpython-39-x86_64-linux-gnu.so
#2  0x00007fc5cd6d8a65 in jittor::segfault_sigaction(int, siginfo_t*, void*) () from /home/.cache/jittor/jt1.3.9/g++11.2.0/py3.9.12/Linux-5.15.0-1x30/INTELRXEONRGOLx51/ef26/default/jit_utils_core.cpython-39-x86_64-linux-gnu.so
#3  <signal handler called>
#4  0x00007fc5bd846244 in jittor::RandomOp::jit_run (this=0x55c257fe96b0) at /home/.cache/jittor/jt1.3.9/g++11.2.0/py3.9.12/Linux-5.15.0-1x30/INTELRXEONRGOLx51/ef26/default/cu12.2.140_sm_89/jit/random__T_float32__R_uniform__JIT_1__JIT_cpu_1__index_t_int64_hash_e4e703e57e756a2_op.cc:31
#5  0x00007fc5c6be4520 in jittor::Profiler::record_and_run(void (*)(jittor::Op*), jittor::Op*, char const*) () from /home/.cache/jittor/jt1.3.9/g++11.2.0/py3.9.12/Linux-5.15.0-1x30/INTELRXEONRGOLx51/ef26/default/cu12.2.140_sm_89/jittor_core.cpython-39-x86_64-linux-gnu.so
#6  0x00007fc5c6bec12c in jittor::Executor::run_sync(std::vector<jittor::Var*, std::allocator<jittor::Var*> >, bool, bool) () from /home/.cache/jittor/jt1.3.9/g++11.2.0/py3.9.12/Linux-5.15.0-1x30/INTELRXEONRGOLx51/ef26/default/cu12.2.140_sm_89/jittor_core.cpython-39-x86_64-linux-gnu.so
#7  0x00007fc5c6ad6414 in jittor::sync(std::vector<jittor::VarHolder*, std::allocator<jittor::VarHolder*> > const&, bool, bool) () from /home/.cache/jittor/jt1.3.9/g++11.2.0/py3.9.12/Linux-5.15.0-1x30/INTELRXEONRGOLx51/ef26/default/cu12.2.140_sm_89/jittor_core.cpython-39-x86_64-linux-gnu.so
#8  0x00007fc5c6ad66f3 in jittor::VarHolder::sync(bool, bool) () from /home/.cache/jittor/jt1.3.9/g++11.2.0/py3.9.12/Linux-5.15.0-1x30/INTELRXEONRGOLx51/ef26/default/cu12.2.140_sm_89/jittor_core.cpython-39-x86_64-linux-gnu.so
#9  0x00007fc5c6ad6b20 in jittor::VarHolder::fetch_sync() () from /home/.cache/jittor/jt1.3.9/g++11.2.0/py3.9.12/Linux-5.15.0-1x30/INTELRXEONRGOLx51/ef26/default/cu12.2.140_sm_89/jittor_core.cpython-39-x86_64-linux-gnu.so
Undefined command: "py-bt".  Try "help".
[Inferior 1 (process 412266) detached]
Segfault, exit
terminate called without an active exception
Aborted (core dumped)

Minimal Reproduce

import sys  # sys.maxsize is used below, so import sys explicitly

import jittor as jt
from jittor import *
x = jt.array([1, 2, 3]).float()

arrays = [
    jt.randn((sys.maxsize,))  
]

# arrays = [
#     jt.rand((sys.maxsize,)) 
# ]

# arrays = [
#     jt.random((sys.maxsize,))  
# ]

for array in arrays:
    print(array)
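
For scale, the arithmetic behind the requested shape can be sketched as follows. This is only an illustration, not Jittor code: `required_bytes` and `check_shape` are hypothetical helpers, the 4-byte element size is taken from the `T_float32` tag in the stack trace, and the memory limit is taken from the "Total mem: 251.50GB" log line (assumed here to mean GiB). On a 64-bit CPython build, where `sys.maxsize` is `2**63 - 1`, the request works out to roughly 32 EiB.

```python
import sys

FLOAT32_BYTES = 4  # element size, per the T_float32 tag in the stack trace


def required_bytes(shape, itemsize=FLOAT32_BYTES):
    """Return the number of bytes a dense tensor of this shape would need."""
    count = 1
    for dim in shape:
        count *= dim
    return count * itemsize


def check_shape(shape, mem_limit_bytes):
    """Hypothetical guard: raise a Python exception for oversized requests."""
    need = required_bytes(shape)
    if need > mem_limit_bytes:
        raise MemoryError(
            f"shape {shape} needs {need} bytes, limit is {mem_limit_bytes}"
        )
    return need


# "Total mem: 251.50GB" from the log, assumed to be GiB.
total_mem = int(251.50 * 1024**3)

need = required_bytes((sys.maxsize,))
print(f"requested: {need / 1024**6:.1f} EiB, available: {total_mem / 1024**3:.2f} GiB")

try:
    check_shape((sys.maxsize,), total_mem)
except MemoryError as exc:
    print("would be rejected:", exc)
```

A check of this kind in user code before calling jt.randn would turn the request into an ordinary Python exception; whether Jittor itself should perform such a check is for the maintainers to decide.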

Expected behavior
