ZeroDivisionError: Weights sum to zero, can't be normalized #3

mrwadepro · 2023-08-13T16:58:51Z

First off, thanks for taking the time to post this package. I get this error when I ask a question after uploading a PDF.

Using embedded DuckDB without persistence: data will be transient
Traceback (most recent call last):
  File "/Users/john_appleseed/Documents/Pdf-GPT/venv/lib/python3.11/site-packages/gradio/routes.py", line 401, in run_predict
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/venv/lib/python3.11/site-packages/gradio/blocks.py", line 1302, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/venv/lib/python3.11/site-packages/gradio/blocks.py", line 1039, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/venv/lib/python3.11/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/venv/lib/python3.11/site-packages/gradio/utils.py", line 491, in async_iteration
    return next(iterator)
           ^^^^^^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/app.py", line 80, in get_response
    chain = app(file)
            ^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/app.py", line 46, in __call__
    self.chain = self.build_chain(file)
                 ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/app.py", line 69, in build_chain
    pdfsearch = Chroma.from_documents(documents, embeddings, collection_name= file_name,)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/venv/lib/python3.11/site-packages/langchain/vectorstores/chroma.py", line 347, in from_documents
    return cls.from_texts(
           ^^^^^^^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/venv/lib/python3.11/site-packages/langchain/vectorstores/chroma.py", line 315, in from_texts
    chroma_collection.add_texts(texts=texts, metadatas=metadatas, ids=ids)
  File "/Users/john_appleseed/Documents/Pdf-GPT/venv/lib/python3.11/site-packages/langchain/vectorstores/chroma.py", line 121, in add_texts
    embeddings = self._embedding_function.embed_documents(list(texts))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/venv/lib/python3.11/site-packages/langchain/embeddings/openai.py", line 228, in embed_documents
    return self._get_len_safe_embeddings(texts, engine=self.deployment)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/venv/lib/python3.11/site-packages/langchain/embeddings/openai.py", line 189, in _get_len_safe_embeddings
    average = np.average(results[i], axis=0, weights=lens[i])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/john_appleseed/Documents/Pdf-GPT/venv/lib/python3.11/site-packages/numpy/lib/function_base.py", line 550, in average
    raise ZeroDivisionError(
ZeroDivisionError: Weights sum to zero, can't be normalized
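
Judging only from the last frames of the traceback, `np.average` is called with `weights=lens[i]`, and that call fails when the weights sum to zero. One plausible cause is a chunk with no tokens, for example a PDF page with no extractable text. Below is a minimal sketch of a guard under that assumption. `Doc` and `drop_empty_documents` are hypothetical names for illustration, not part of LangChain or this repo.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a loaded PDF page; only page_content matters here."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def drop_empty_documents(documents):
    """Keep only documents whose text is non-empty after stripping whitespace."""
    return [d for d in documents if d.page_content and d.page_content.strip()]


# Example: two of the four pages carry no usable text.
pages = [Doc("Introduction"), Doc(""), Doc("   \n"), Doc("Results")]
usable = drop_empty_documents(pages)

if not usable:
    # Assumption: a PDF made only of scanned images would end up here.
    raise ValueError("No extractable text found in the PDF")

print(len(usable))
```

If this is the cause, filtering `documents` this way before the `Chroma.from_documents(...)` call in `build_chain` should avoid the empty input. A PDF that yields no text at all would need OCR instead.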


akmcax · 2023-08-18T07:22:55Z

Hello Sunil Kumar ji,

Thanks for this excellent Git repo.
While testing your code I get the error below. What could be the cause?

Using embedded DuckDB without persistence: data will be transient
Traceback (most recent call last):
  File "/home/rtx/akm/lib/python3.8/site-packages/gradio/routes.py", line 401, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/rtx/akm/lib/python3.8/site-packages/gradio/blocks.py", line 1302, in process_api
    result = await self.call_function(
  File "/home/rtx/akm/lib/python3.8/site-packages/gradio/blocks.py", line 1039, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/rtx/akm/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/rtx/akm/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/rtx/akm/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/rtx/akm/lib/python3.8/site-packages/gradio/utils.py", line 491, in async_iteration
    return next(iterator)
  File "/tmp/ipykernel_29949/1995911808.py", line 85, in get_response
    chain = app(file)
  File "/tmp/ipykernel_29949/1995911808.py", line 44, in __call__
    self.chain = self.build_chain(file)
  File "/tmp/ipykernel_29949/1995911808.py", line 74, in build_chain
    pdfsearch = Chroma.from_documents(documents, embeddings, collection_name= file_name,)
  File "/home/rtx/akm/lib/python3.8/site-packages/langchain/vectorstores/chroma.py", line 613, in from_documents
    return cls.from_texts(
  File "/home/rtx/akm/lib/python3.8/site-packages/langchain/vectorstores/chroma.py", line 568, in from_texts
    chroma_collection = cls(
  File "/home/rtx/akm/lib/python3.8/site-packages/langchain/vectorstores/chroma.py", line 126, in __init__
    self._collection = self._client.get_or_create_collection(
  File "/home/rtx/akm/lib/python3.8/site-packages/chromadb/api/local.py", line 79, in get_or_create_collection
    return self.create_collection(name, metadata, embedding_function, get_or_create=True)
  File "/home/rtx/akm/lib/python3.8/site-packages/chromadb/api/local.py", line 66, in create_collection
    check_index_name(name)
  File "/home/rtx/akm/lib/python3.8/site-packages/chromadb/api/local.py", line 41, in check_index_name
    raise ValueError(msg)
ValueError: Expected collection name that (1) contains 3-63 characters, (2) starts and ends with an alphanumeric character, (3) otherwise contains only alphanumeric characters, underscores or hyphens (-), (4) contains no two consecutive periods (..) and (5) is not a valid IPv4 address

Note that the OpenAI API key was set when I ran the code. Also, the file name is only 10 characters long.

sunilkumardash9 · 2023-09-25T17:55:23Z

Hi @akmcax, this probably has to do with the name of the Chroma collection, which is taken from the file name. Check that it complies with the naming rules listed in the error message. Your collection name might have an underscore or hyphen at the end.
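
As a rough sketch, the file name could be cleaned up before it is passed as `collection_name`. The rules below are taken only from the `ValueError` message above; `sanitize_collection_name` and the fallback name are hypothetical, not part of Chroma or this repo.

```python
import re


def sanitize_collection_name(name: str, fallback: str = "pdf-collection") -> str:
    """Turn an arbitrary file name into a name that satisfies the rules in the error."""
    # Replace anything other than letters, digits, underscores and hyphens.
    # This also removes periods, so the result cannot be an IPv4 address.
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "-", name)
    # The name must start and end with an alphanumeric character.
    cleaned = re.sub(r"^[^A-Za-z0-9]+|[^A-Za-z0-9]+$", "", cleaned)
    # Enforce the 63-character limit, then re-trim in case the cut ends on '-' or '_'.
    cleaned = re.sub(r"[^A-Za-z0-9]+$", "", cleaned[:63])
    # Names shorter than 3 characters are rejected, so use the fallback instead.
    return cleaned if len(cleaned) >= 3 else fallback


print(sanitize_collection_name("report_"))         # report
print(sanitize_collection_name("my report_.pdf"))  # my-report_-pdf
```

Note that two different file names can map to the same sanitized name, so a real fix may want to append a short hash or counter.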
