===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: /usr/lib/x86_64-linux-gnu/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 115
CUDA SETUP: Loading binary /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda115.so...
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'.
The class this function is called from is 'LlamaTokenizer'.
./lora-Vicuna/checkpoint-final
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 33/33 [00:13<00:00, 2.37it/s]
Running on local URL: http://0.0.0.0:4321
To create a public link, set `share=True` in `launch()`.
2 error log++++++++++++++++++
The server console prints an error message when I submit a request from the browser:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 1199, in run_node
return nnmodule(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/sparse.py", line 162, in forward
return F.embedding(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 2210, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_stats.py", line 20, in wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_subclasses/fake_tensor.py", line 987, in __torch_dispatch__
return self.dispatch(func, types, args, kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_subclasses/fake_tensor.py", line 1066, in dispatch
args, kwargs = self.validate_and_convert_non_fake_tensors(
File "/usr/local/lib/python3.10/dist-packages/torch/_subclasses/fake_tensor.py", line 1220, in validate_and_convert_non_fake_tensors
return tree_map_only(
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_pytree.py", line 266, in tree_map_only
return tree_map(map_only(ty)(fn), pytree)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_pytree.py", line 196, in tree_map
return tree_unflatten([fn(i) for i in flat_args], spec)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_pytree.py", line 196, in <listcomp>
return tree_unflatten([fn(i) for i in flat_args], spec)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_pytree.py", line 247, in inner
return f(x)
File "/usr/local/lib/python3.10/dist-packages/torch/_subclasses/fake_tensor.py", line 1212, in validate
raise Exception(
Exception: Please convert all Tensors to FakeTensors first or instantiate FakeTensorMode with 'allow_non_fake_inputs'. Found in aten.embedding.default(*(Parameter containing:
tensor([[ 9.8884e-05, -2.3329e-04, 5.8460e-04, ..., -3.4237e-04,
5.9724e-05, -1.1957e-04],
[ 1.5289e-02, -1.2154e-02, 1.2512e-02, ..., 1.3092e-02,
7.2174e-03, -6.8045e-04],
[ 1.7433e-03, 1.7633e-03, -1.4465e-02, ..., -1.1444e-02,
-1.2665e-02, 3.7289e-04],
...,
[-9.0179e-03, 3.0807e-02, -1.6708e-02, ..., -1.2680e-02,
1.0437e-02, 4.2343e-03],
[-1.1368e-02, -1.4801e-02, -3.5667e-03, ..., 6.5308e-03,
-2.2263e-02, -6.1455e-03],
[-1.3992e-02, 1.6985e-03, -2.1469e-02, ..., 1.3527e-02,
2.8290e-02, -8.9111e-03]], device='cuda:0', dtype=torch.float16), FakeTensor(FakeTensor(..., device='meta', size=(1, 24), dtype=torch.int64), cuda:0), 31999), **{})
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 1152, in get_fake_value
return wrap_fake_exception(
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 808, in wrap_fake_exception
return fn()
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 1153, in <lambda>
lambda: run_node(tx.output, node, args, kwargs, nnmodule)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 1206, in run_node
raise RuntimeError(
RuntimeError: Failed running call_module self_embed_tokens(*(FakeTensor(FakeTensor(..., device='meta', size=(1, 24), dtype=torch.int64), cuda:0),), **{}):
Please convert all Tensors to FakeTensors first or instantiate FakeTensorMode with 'allow_non_fake_inputs'. Found in aten.embedding.default(*(Parameter containing:
tensor([[ 9.8884e-05, -2.3329e-04, 5.8460e-04, ..., -3.4237e-04,
5.9724e-05, -1.1957e-04],
[ 1.5289e-02, -1.2154e-02, 1.2512e-02, ..., 1.3092e-02,
7.2174e-03, -6.8045e-04],
[ 1.7433e-03, 1.7633e-03, -1.4465e-02, ..., -1.1444e-02,
-1.2665e-02, 3.7289e-04],
...,
[-9.0179e-03, 3.0807e-02, -1.6708e-02, ..., -1.2680e-02,
1.0437e-02, 4.2343e-03],
[-1.1368e-02, -1.4801e-02, -3.5667e-03, ..., 6.5308e-03,
-2.2263e-02, -6.1455e-03],
[-1.3992e-02, 1.6985e-03, -2.1469e-02, ..., 1.3527e-02,
2.8290e-02, -8.9111e-03]], device='cuda:0', dtype=torch.float16), FakeTensor(FakeTensor(..., device='meta', size=(1, 24), dtype=torch.int64), cuda:0), 31999), **{})
(scroll up for backtrace)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/good/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 382, in __call__
result = fn(*args, **kwargs)
File "/home/good/Chinese-Vicuna/tools/Alpaca-LoRA-Serve/gen.py", line 117, in _infer
return model_fn(**kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 82, in forward
return self.dynamo_ctx(self._orig_mod.forward)(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 209, in _fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 160, in new_forward
args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in <graph break in new_forward>
output = old_forward(*args, **kwargs)
File "/home/good/.local/lib/python3.10/site-packages/peft/peft_model.py", line 575, in forward
return self.base_model(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 160, in new_forward
args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in <graph break in new_forward>
output = old_forward(*args, **kwargs)
File "/home/good/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 688, in forward
outputs = self.model(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 160, in new_forward
args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in <graph break in new_forward>
output = old_forward(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 337, in catch_errors
return callback(frame, cache_size, hooks)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 404, in _convert_frame
result = inner_convert(frame, cache_size, hooks)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 104, in _fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 262, in _convert_frame_assert
return _compile(
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 163, in time_wrapper
r = func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 324, in _compile
out_code = transform_code_object(code, transform)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/bytecode_transformation.py", line 445, in transform_code_object
transformations(instructions, code_options)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 311, in transform
tracer.run()
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 1726, in run
super().run()
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 576, in run
and self.step()
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 540, in step
getattr(self, inst.opname)(inst)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 342, in wrapper
return inner_fn(self, inst)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 965, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 474, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/variables/nn_module.py", line 203, in call_function
return wrap_fx_proxy(
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/variables/builder.py", line 754, in wrap_fx_proxy
return wrap_fx_proxy_cls(
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/variables/builder.py", line 789, in wrap_fx_proxy_cls
example_value = get_fake_value(proxy.node, tx)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 1173, in get_fake_value
raise TorchRuntimeError() from e
torch._dynamo.exc.TorchRuntimeError:
from user code:
File "/home/good/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 531, in forward
inputs_embeds = self.embed_tokens(input_ids)
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/gradio/routes.py", line 393, in run_predict
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1108, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 929, in call_function
prediction = await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 490, in async_iteration
return next(iterator)
File "/home/good/Chinese-Vicuna/./tools/Alpaca-LoRA-Serve/app.py", line 53, in chat_stream
for tokens in bot_response:
File "/home/good/Chinese-Vicuna/tools/Alpaca-LoRA-Serve/gen.py", line 82, in __call__
for tokens in self.generate(
File "/home/good/Chinese-Vicuna/tools/Alpaca-LoRA-Serve/gen.py", line 217, in generate
outputs = self._infer(
File "/home/good/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 289, in wrapped_f
return self(f, *args, **kw)
File "/home/good/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 379, in __call__
do = self.iter(retry_state=retry_state)
File "/home/good/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 326, in iter
raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x7f678691f6a0 state=finished raised TorchRuntimeError>]
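The frames in torch/_dynamo/eval_frame.py (`self._orig_mod.forward`) suggest that the model was wrapped with `torch.compile` before the request was served, and the TorchDynamo message above offers a fallback to eager mode. A minimal sketch of making that wrapping opt-in, so the server runs in eager mode unless compilation is explicitly requested. This is an assumption about the cause and not a confirmed fix; `maybe_compile` and the `USE_TORCH_COMPILE` environment variable are hypothetical names, not part of Alpaca-LoRA-Serve.

```python
import os


def maybe_compile(model, env_var="USE_TORCH_COMPILE"):
    """Return torch.compile(model) only when explicitly requested.

    `env_var` is a hypothetical switch introduced for this sketch; it is
    not an existing option of the serving script.
    """
    if os.environ.get(env_var, "0") != "1":
        # Default path: keep the model in eager mode, so TorchDynamo
        # never traces the forward pass.
        return model

    import torch  # imported lazily; the eager path does not need it here

    if not hasattr(torch, "compile"):
        # torch < 2.0 has no torch.compile, so stay in eager mode.
        return model
    return torch.compile(model)
```

With this in place, the spot that currently compiles the model would call `model = maybe_compile(model)` instead. The alternative named in the log, `torch._dynamo.config.suppress_errors = True`, keeps compilation enabled and only falls back to eager when tracing fails.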
1 OS info++++++++++++++++++
Ubuntu 22.04
1 start log++++++++++++++++++
$ bash alpaca-serve.sh