Privacy and security concerns #142

giachat · 2023-04-06T16:53:09Z

Great project! I have a question regarding privacy and security. Is uploaded PDF data sandboxed or protected in any way? Has anyone thought about how to implement this for proprietary company data?

Thanks!

bschleter · 2023-04-12T05:51:37Z

Correct me if I'm wrong, anybody, but the only data exposed is whatever you publish in your Git repository for training, or whatever is sent in questions/prompts to OpenAI or another LLM. OpenAI's data policy for the API is private in the sense that API data is not used to train the model, and it appears to be kept for only 30 days, for criminal-proceedings/legal purposes.

Theoretically, your text data is also stored in a vector database like Pinecone, but it is stored as vectors with IDs, so it is not actual text until converted back, and converting it back would need an LLM agent and some other information. This would only be the chain data used to build the model, though I assume that is the proprietary data in the first place. As far as I know, text is converted to vectors before it reaches Pinecone.

2 replies

gzajko May 9, 2023

It looks like the full text is actually stored in the Pinecone database, bundled as "metadata" with the vector. Or am I wrong?
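To illustrate the point, here is a minimal sketch of a vector record that carries its source text alongside the embedding, and one way the text could be split off before the record is stored remotely. The record layout (`id`, `values`, `metadata`) follows the common shape of a vector-store record; the `text` metadata key, the sample values, and the helper `split_text_from_record` are assumptions for illustration, not this project's actual code.

```python
from typing import Any, Dict, Optional, Tuple


def split_text_from_record(
    record: Dict[str, Any], text_key: str = "text"
) -> Tuple[Dict[str, Any], Optional[str]]:
    """Return a copy of the record without the raw text, plus the removed text.

    Hypothetical helper: the input record is left unmodified.
    """
    metadata = dict(record.get("metadata", {}))  # shallow copy of metadata
    text = metadata.pop(text_key, None)          # remove the raw text if present
    stripped = {**record, "metadata": metadata}
    return stripped, text


# Illustrative record: the embedding plus the original chunk as metadata.
record = {
    "id": "doc-1-chunk-0",
    "values": [0.12, -0.03, 0.88],
    "metadata": {"text": "example proprietary paragraph", "source": "report.pdf"},
}

stripped, text = split_text_from_record(record)

# The raw text stays in a store you control, keyed by the record ID.
local_store = {stripped["id"]: text}
```

With this split, only `stripped` would go to the hosted vector database, and the retrieved IDs would be resolved against `local_store` to recover the text for the prompt. Any text placed in the prompt still reaches the LLM provider.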

3koozy Aug 12, 2023

And by definition all embeddings should be universal, i.e. anyone with an embedding model (like OpenAI's) can transform them back to text.
