Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The server console prints an error message when I submit a request from the browser #109

Closed
ImGoodBai opened this issue Apr 24, 2023 · 2 comments

Comments

@ImGoodBai
Copy link

ImGoodBai commented Apr 24, 2023

1 start log++++++++++++++++++

$ bash alpaca-serve.sh

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: /usr/lib/x86_64-linux-gnu/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 115
CUDA SETUP: Loading binary /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda115.so...
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'. 
The class this function is called from is 'LlamaTokenizer'.
./lora-Vicuna/checkpoint-final
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 33/33 [00:13<00:00,  2.37it/s]
Running on local URL:  http://0.0.0.0:4321

To create a public link, set `share=True` in `launch()`.

2 error log++++++++++++++++++


The server console prints an error message when I submit a request from the browser**

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 1199, in run_node
    return nnmodule(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/sparse.py", line 162, in forward
    return F.embedding(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_stats.py", line 20, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_subclasses/fake_tensor.py", line 987, in __torch_dispatch__
    return self.dispatch(func, types, args, kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_subclasses/fake_tensor.py", line 1066, in dispatch
    args, kwargs = self.validate_and_convert_non_fake_tensors(
  File "/usr/local/lib/python3.10/dist-packages/torch/_subclasses/fake_tensor.py", line 1220, in validate_and_convert_non_fake_tensors
    return tree_map_only(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_pytree.py", line 266, in tree_map_only
    return tree_map(map_only(ty)(fn), pytree)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_pytree.py", line 196, in tree_map
    return tree_unflatten([fn(i) for i in flat_args], spec)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_pytree.py", line 196, in <listcomp>
    return tree_unflatten([fn(i) for i in flat_args], spec)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_pytree.py", line 247, in inner
    return f(x)
  File "/usr/local/lib/python3.10/dist-packages/torch/_subclasses/fake_tensor.py", line 1212, in validate
    raise Exception(
Exception: Please convert all Tensors to FakeTensors first or instantiate FakeTensorMode with 'allow_non_fake_inputs'. Found in aten.embedding.default(*(Parameter containing:
tensor([[ 9.8884e-05, -2.3329e-04,  5.8460e-04,  ..., -3.4237e-04,
          5.9724e-05, -1.1957e-04],
        [ 1.5289e-02, -1.2154e-02,  1.2512e-02,  ...,  1.3092e-02,
          7.2174e-03, -6.8045e-04],
        [ 1.7433e-03,  1.7633e-03, -1.4465e-02,  ..., -1.1444e-02,
         -1.2665e-02,  3.7289e-04],
        ...,
        [-9.0179e-03,  3.0807e-02, -1.6708e-02,  ..., -1.2680e-02,
          1.0437e-02,  4.2343e-03],
        [-1.1368e-02, -1.4801e-02, -3.5667e-03,  ...,  6.5308e-03,
         -2.2263e-02, -6.1455e-03],
        [-1.3992e-02,  1.6985e-03, -2.1469e-02,  ...,  1.3527e-02,
          2.8290e-02, -8.9111e-03]], device='cuda:0', dtype=torch.float16), FakeTensor(FakeTensor(..., device='meta', size=(1, 24), dtype=torch.int64), cuda:0), 31999), **{}) 

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 1152, in get_fake_value
    return wrap_fake_exception(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 808, in wrap_fake_exception
    return fn()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 1153, in <lambda>
    lambda: run_node(tx.output, node, args, kwargs, nnmodule)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 1206, in run_node
    raise RuntimeError(
RuntimeError: Failed running call_module self_embed_tokens(*(FakeTensor(FakeTensor(..., device='meta', size=(1, 24), dtype=torch.int64), cuda:0),), **{}):
Please convert all Tensors to FakeTensors first or instantiate FakeTensorMode with 'allow_non_fake_inputs'. Found in aten.embedding.default(*(Parameter containing:
tensor([[ 9.8884e-05, -2.3329e-04,  5.8460e-04,  ..., -3.4237e-04,
          5.9724e-05, -1.1957e-04],
        [ 1.5289e-02, -1.2154e-02,  1.2512e-02,  ...,  1.3092e-02,
          7.2174e-03, -6.8045e-04],
        [ 1.7433e-03,  1.7633e-03, -1.4465e-02,  ..., -1.1444e-02,
         -1.2665e-02,  3.7289e-04],
        ...,
        [-9.0179e-03,  3.0807e-02, -1.6708e-02,  ..., -1.2680e-02,
          1.0437e-02,  4.2343e-03],
        [-1.1368e-02, -1.4801e-02, -3.5667e-03,  ...,  6.5308e-03,
         -2.2263e-02, -6.1455e-03],
        [-1.3992e-02,  1.6985e-03, -2.1469e-02,  ...,  1.3527e-02,
          2.8290e-02, -8.9111e-03]], device='cuda:0', dtype=torch.float16), FakeTensor(FakeTensor(..., device='meta', size=(1, 24), dtype=torch.int64), cuda:0), 31999), **{}) 
(scroll up for backtrace)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/good/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "/home/good/Chinese-Vicuna/tools/Alpaca-LoRA-Serve/gen.py", line 117, in _infer
    return model_fn(**kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 82, in forward
    return self.dynamo_ctx(self._orig_mod.forward)(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 209, in _fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 160, in new_forward
    args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in <graph break in new_forward>
    output = old_forward(*args, **kwargs)
  File "/home/good/.local/lib/python3.10/site-packages/peft/peft_model.py", line 575, in forward
    return self.base_model(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 160, in new_forward
    args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in <graph break in new_forward>
    output = old_forward(*args, **kwargs)
  File "/home/good/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 688, in forward
    outputs = self.model(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 160, in new_forward
    args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in <graph break in new_forward>
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 337, in catch_errors
    return callback(frame, cache_size, hooks)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 404, in _convert_frame
    result = inner_convert(frame, cache_size, hooks)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 104, in _fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 262, in _convert_frame_assert
    return _compile(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 163, in time_wrapper
    r = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 324, in _compile
    out_code = transform_code_object(code, transform)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/bytecode_transformation.py", line 445, in transform_code_object
    transformations(instructions, code_options)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 311, in transform
    tracer.run()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 1726, in run
    super().run()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 576, in run
    and self.step()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 540, in step
    getattr(self, inst.opname)(inst)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 342, in wrapper
    return inner_fn(self, inst)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 965, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 474, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/variables/nn_module.py", line 203, in call_function
    return wrap_fx_proxy(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/variables/builder.py", line 754, in wrap_fx_proxy
    return wrap_fx_proxy_cls(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/variables/builder.py", line 789, in wrap_fx_proxy_cls
    example_value = get_fake_value(proxy.node, tx)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 1173, in get_fake_value
    raise TorchRuntimeError() from e
torch._dynamo.exc.TorchRuntimeError: 

from user code:
   File "/home/good/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 531, in forward
    inputs_embeds = self.embed_tokens(input_ids)

Set torch._dynamo.config.verbose=True for more information


You can suppress this exception and fall back to eager by setting:
    torch._dynamo.config.suppress_errors = True


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/gradio/routes.py", line 393, in run_predict
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1108, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 929, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 490, in async_iteration
    return next(iterator)
  File "/home/good/Chinese-Vicuna/./tools/Alpaca-LoRA-Serve/app.py", line 53, in chat_stream
    for tokens in bot_response:
  File "/home/good/Chinese-Vicuna/tools/Alpaca-LoRA-Serve/gen.py", line 82, in __call__
    for tokens in self.generate(
  File "/home/good/Chinese-Vicuna/tools/Alpaca-LoRA-Serve/gen.py", line 217, in generate
    outputs = self._infer(
  File "/home/good/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/home/good/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/home/good/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 326, in iter
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x7f678691f6a0 state=finished raised TorchRuntimeError>]



1 OS info++++++++++++++++++

ubuntu 2204

@ImGoodBai
Copy link
Author

thanks.

@Facico
Copy link
Owner

Facico commented Apr 26, 2023

We no longer provide maintenance for alpaca-serve, you can go to their repository and ask questions about it. Or you can just use our t script.

@Facico Facico closed this as completed Jun 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants