openai api extension can use long context model: if the model context is 16k, the api also can use this 16k context #3668
Comments
See #3153 for a workaround with openai |
Thanks for your reply. I changed config.yml to set truncate_length=8192 and the longer context seems to be activated, but the completion call gets an error: |
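(Editor's note: before picking a truncate_length, it helps to measure how many tokens the assembled RAG prompt actually uses. A minimal sketch, assuming the Hugging Face tokenizer for the model named later in this thread; the THUDM/chatglm2-6b repo name is an assumption:)

# Sketch: count prompt tokens with the model's own tokenizer before choosing
# truncate_length. Repo name is an assumption (chatglm2-6b per the thread);
# chatglm2 tokenizers need trust_remote_code=True.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
prompt = "..."  # the full system + retrieved context + question string
n_tokens = len(tok.encode(prompt))
print(n_tokens)  # must stay below truncate_length minus the reply's max_tokens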
Can you include the server logs for this error? It should have a full stack trace. Ideally, please enable the OPENEDAI_DEBUG=1 environment variable too.
|
the log prints below:
Traceback (most recent call last):
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
  File "/home/elven/finreport/finapp/app.py", line 182, in <module>
    main()
  File "/home/elven/finreport/finapp/app.py", line 120, in main
    handle_userinput(user_question)
  File "/home/elven/finreport/finapp/app.py", line 82, in handle_userinput
    response = st.session_state.conversation({'question': user_question})
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chains/base.py", line 282, in __call__
    raise e
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chains/base.py", line 276, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chains/conversational_retrieval/base.py", line 141, in _call
    answer = self.combine_docs_chain.run(
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chains/base.py", line 480, in run
    return self(kwargs, callbacks=callbacks, tags=tags, metadata=metadata)[
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chains/base.py", line 282, in __call__
    raise e
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chains/base.py", line 276, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chains/combine_documents/base.py", line 105, in _call
    output, extra_return_dict = self.combine_docs(
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 171, in combine_docs
    return self.llm_chain.predict(callbacks=callbacks, **inputs), {}
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chains/llm.py", line 255, in predict
    return self(kwargs, callbacks=callbacks)[self.output_key]
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chains/base.py", line 282, in __call__
    raise e
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chains/base.py", line 276, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chains/llm.py", line 91, in _call
    response = self.generate([inputs], run_manager=run_manager)
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chains/llm.py", line 101, in generate
    return self.llm.generate_prompt(
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chat_models/base.py", line 414, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chat_models/base.py", line 309, in generate
    raise e
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chat_models/base.py", line 299, in generate
    self._generate_with_cache(
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chat_models/base.py", line 446, in _generate_with_cache
    return self._generate(
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chat_models/openai.py", line 345, in _generate
    response = self.completion_with_retry(
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chat_models/openai.py", line 278, in completion_with_retry
    return _completion_with_retry(**kwargs)
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/tenacity/__init__.py", line 325, in iter
    raise retry_exc.reraise()
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/tenacity/__init__.py", line 158, in reraise
    raise self.last_attempt.result()
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/langchain/chat_models/openai.py", line 276, in _completion_with_retry
    return self.client.create(**kwargs)
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/openai/api_resources/chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/openai/api_requestor.py", line 298, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/openai/api_requestor.py", line 700, in _interpret_response
    self._interpret_response_line(
  File "/home/elven/miniconda3/envs/finreport/lib/python3.10/site-packages/openai/api_requestor.py", line 765, in _interpret_response_line
    raise self.handle_error_response(
openai.error.APIError: UnboundLocalError("local variable 'tokens' referenced before assignment")
{"error": {"message": "UnboundLocalError(\"local variable 'tokens' referenced before assignment\")", "code": 500, "type": "OpenAIError", "param": ""}}
500 {'error': {'message': 'UnboundLocalError("local variable \'tokens\' referenced before assignment")', 'code': 500, 'type': 'OpenAIError', 'param': ''}}
{'Connection': 'close', 'Content-Length': '150', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Headers': 'Origin, Accept, X-Requested-With, Content-Type, Access-Control-Request-Method, Access-Control-Request-Headers, Authorization', 'Access-Control-Allow-Methods': 'GET,HEAD,OPTIONS,POST,PUT', 'Access-Control-Allow-Origin': '*', 'Content-Type': 'application/json', 'Date': 'Mon, 28 Aug 2023 08:14:40 GMT', 'Server': 'BaseHTTP/0.6 Python/3.10.10'}
|
I meant the error logs from the server with OPENEDAI_DEBUG=1 set, can you provide that?
|
Ignore the message output; from the error printed, it looks like the issue is the context being too long plus a CUDA OOM, even though the GPU seems to have enough free memory:
127.0.0.1 - - [31/Aug/2023 15:03:02] "POST /v1/chat/completions HTTP/1.1" 500 -
{'messages': [{'role': 'system', 'content': "Use the following pieces of context to answer the users question. \nIf you don't know the answer, just say that you don't know, don't try to make up an answer.\n----------------\n XXXXXX(ignore the long content)'}], 'model': 'chatglm2-6b', 'max_tokens': None, 'stream': False, 'n': 1, 'temperature': 0.0}
Traceback (most recent call last):
  File "/ssd_data01/text-generation-webui/modules/callbacks.py", line 56, in gentask
    ret = self.mfunc(callback=_callback, *args, **self.kwargs)
  File "/ssd_data01/text-generation-webui/modules/text_generation.py", line 321, in generate_with_callback
    shared.model.generate(**kwargs)
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/transformers/generation/utils.py", line 1642, in generate
    return self.sample(
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/transformers/generation/utils.py", line 2724, in sample
    outputs = self(
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 809, in forward
    outputs = self.model(
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 697, in forward
    layer_outputs = decoder_layer(
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 413, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 335, in forward
    attn_weights = torch.matmul(query_states, key_states.transpose(2, 3)) / math.sqrt(self.head_dim)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 13.61 GiB (GPU 0; 23.65 GiB total capacity; 5.44 GiB already allocated; 8.93 GiB free; 6.77 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Output generated in 0.25 seconds (0.00 tokens/s, 0 tokens, context 15109, seed 867992534)
OpenAIError UnboundLocalError("local variable 'tokens' referenced before assignment")
Traceback (most recent call last):
  File "/ssd_data01/text-generation-webui/extensions/openai/script.py", line 101, in wrapper
    func(self)
  File "/ssd_data01/text-generation-webui/extensions/openai/script.py", line 172, in do_POST
    response = OAIcompletions.chat_completions(body, is_legacy=is_legacy)
  File "/ssd_data01/text-generation-webui/extensions/openai/completions.py", line 295, in chat_completions
    completion_token_count = len(encode(answer)[0])
  File "/ssd_data01/text-generation-webui/modules/text_generation.py", line 113, in encode
    input_ids = shared.tokenizer.encode(str(prompt), return_tensors='pt', add_special_tokens=add_special_tokens)
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2373, in encode
    encoded_inputs = self.encode_plus(
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2781, in encode_plus
    return self._encode_plus(
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 656, in _encode_plus
    first_ids = get_input_ids(text)
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 623, in get_input_ids
    tokens = self.tokenize(text, **kwargs)
  File "/home/elven/miniconda3/envs/tgweb/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama.py", line 208, in tokenize
    if tokens[0] == SPIECE_UNDERLINE and tokens[1] in self.all_special_tokens:
UnboundLocalError: local variable 'tokens' referenced before assignment
127.0.0.1 - - [31/Aug/2023 11:48:15] "POST /v1/chat/completions HTTP/1.1" 500 -
b'{"error": {"message": "UnboundLocalError(\"local variable 'tokens' referenced before assignment\")", "code": 500, "type": "OpenAIError", "param": ""}}
Another thing: CUDA seems to have enough free resources at the moment the API returns the error.
|
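(Editor's note, not from the thread: a back-of-the-envelope check of why this allocation fails even with VRAM free. The eager attention at modeling_llama.py line 335 materializes a full [heads, seq, seq] score matrix in one tensor. Assuming 32 attention heads, a llama-style model per the trace, and fp16, the 15109-token context from the log reproduces the reported number exactly:)

# Sketch: attention-score memory for one layer at the logged context length.
# 32 heads and fp16 (2 bytes/element) are assumptions consistent with the trace.
heads, seq_len, bytes_per_el = 32, 15109, 2
scores_gib = heads * seq_len * seq_len * bytes_per_el / 2**30
print(f"{scores_gib:.2f} GiB")  # -> 13.61 GiB, matching "Tried to allocate 13.61 GiB"

So a single attention layer transiently needs ~13.6 GiB on top of weights and KV cache, which is why a 15k-token request can die even when nvidia-smi shows memory free before the call.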
Yes, it looks like CUDA OOM is the real problem. Try freeing up more space (you can see what is using it with the nvidia-smi command) and restart the server.
|
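(Editor's note on the 500 body: the UnboundLocalError appears to be a downstream symptom, not the root cause. The OOM aborts generation, yielding "0 tokens" in the log, the extension then token-counts the resulting empty answer at completions.py line 295, and this transformers version's llama tokenizer fails on that input. A defensive sketch of that call site, as an illustration rather than the extension's actual fix:)

def safe_completion_token_count(encode, answer: str) -> int:
    # Guard the count from completions.py line 295 so an empty answer
    # (e.g. after generation aborts with CUDA OOM) never reaches the tokenizer.
    if not answer:
        return 0
    return len(encode(answer)[0])  # original call from the traceback above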
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment. |
Description
When using a long-context model, the API extension only supports a 4k max-token window; 16k max tokens are needed. If the model's context is 16k, the API should be able to use the full 16k context. Please add this feature.
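(Editor's note: for concreteness, a hypothetical example of the request this feature would enable, using the openai 0.x SDK seen in the tracebacks above. The base URL, placeholder key, and model name are assumptions:)

import openai

openai.api_base = "http://127.0.0.1:5001/v1"  # assumed local extension address
openai.api_key = "dummy"  # placeholder; assumed not validated by the local extension

long_context = "..."  # stand-in for ~15k tokens of retrieved documents

resp = openai.ChatCompletion.create(
    model="chatglm2-6b",
    messages=[{"role": "user", "content": long_context + "\n\nSummarize the above."}],
    max_tokens=512,
)
print(resp["choices"][0]["message"]["content"])

With only a 4k window, the context above is truncated before generation; with this feature, the extension would honor the model's full 16k window.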