Create batch files with a maximum of 500 files in batch (#12)
* Create batch files with a maximum of 500 files in batch for vector store upload. Update the README.md file with the Apify's badge. Update dependencies to the latest version
jirispilka authored Nov 26, 2024
1 parent 8ab66ba commit 8b01858
Showing 17 changed files with 5,842 additions and 1,160 deletions.
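
The batching code itself is not visible in the hunks shown on this page, so the following is only a minimal sketch of the idea in the commit message: splitting a list of file IDs into groups of at most 500 before they are attached to a vector store. The names `batch_file_ids` and `MAX_FILES_PER_BATCH` are hypothetical, not taken from the repository, and the actual upload call is deliberately left out.

```python
from collections.abc import Iterator, Sequence

# Limit taken from the commit message; the constant name is hypothetical.
MAX_FILES_PER_BATCH = 500


def batch_file_ids(
    file_ids: Sequence[str], batch_size: int = MAX_FILES_PER_BATCH
) -> Iterator[list[str]]:
    """Yield consecutive slices of file_ids, each holding at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(file_ids), batch_size):
        yield list(file_ids[start : start + batch_size])


# Example: 1,200 placeholder IDs are split into three batches.
batches = list(batch_file_ids([f"file-{i}" for i in range(1200)]))
batch_sizes = [len(batch) for batch in batches]
```

Each yielded batch could then be passed to whatever upload routine the Actor uses, one call per batch, so that no single request exceeds the limit.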
2 changes: 1 addition & 1 deletion .actor/actor.json
@@ -3,7 +3,7 @@
"name": "openai-assistant-files-integration",
"title": "OpenAI Assistant Files Integration",
"description": "The Apify OpenAI Assistant Actor allows to dynamically update the AI Assistant files.",
"version": "0.1",
"version": "0.1.1",
"meta": {
"templateId": "python-beautifulsoup"
},
2 changes: 1 addition & 1 deletion .actor/input_schema.json
@@ -47,7 +47,7 @@
"saveCrawledFiles": {
"title": "Save crawled files (docs, pdf, pptx) to OpenAI File Store",
"type": "boolean",
"description": "Save files from Apify's key-value store to OpenAI's file store. Useful when utilizing Apify’s website content crawler with the 'saveFiles' option, allowing the found files to be directly store and used in the assistant.",
"description": "Save files from Apify's key-value store to OpenAI's file store. Useful when utilizing Apify’s website content crawler with the 'saveFiles' option, allowing the found files to be directly stored.",
"default": true
},
"datasetId": {
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,11 @@
# Change Log

## 0.2.3 (2024-11-26)

- Create batch files with a maximum of 500 files in batch for vector store upload.
- Update the README.md file with the Apify's badge.
- Update dependencies to the latest version

## 0.2.2 (2024-10-09)

- Add emojis to the README.md file.
3 changes: 3 additions & 0 deletions Makefile
@@ -22,6 +22,9 @@ format:
poetry run ruff check --fix $(DIRS_WITH_CODE)
poetry run ruff format $(DIRS_WITH_CODE)

test:
poetry run pytest --with-integration --vcr-record=none

pydantic-model:
datamodel-codegen --input .actor/input_schema.json --output $(DIRS_WITH_CODE)/input_model.py --input-file-type jsonschema --field-constraints

8 changes: 5 additions & 3 deletions README.md
@@ -1,10 +1,12 @@
# OpenAI Vector Store Integration (OpenAI Assistant)

[![OpenAI Vector Store Integration](https://apify.com/actor-badge?actor=jiri.spilka/openai-vector-store-integration)](https://apify.com/jiri.spilka/openai-vector-store-integration)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://github.com/jirispilka/openai-vector-store-integration/blob/main/LICENSE)
[![Build & Unit Tests](https://github.com/jirispilka/openai-vector-store-integration/actions/workflows/main.yml/badge.svg?branch=main)](https://github.com/jirispilka/openai-vector-store-integration/actions/workflows/main.yml)


The Apify [OpenAI Vector Store integration](https://apify.com/jiri.spilka/openai-vector-store-integration) uploads data from Apify Actors to the OpenAI Vector Store (connected to the OpenAI Assistant).
It assumes that you have already created an [OpenAI Assistant](https://platform.openai.com/docs/assistants/overview/agents) and [OpenAI Vector Store](https://platform.openai.com/docs/assistants/tools/file-search/vector-stores) and you need to regularly update the files to provide up-to-date responses.
It assumes that you have already created a [OpenAI Vector Store](https://platform.openai.com/docs/assistants/tools/file-search/vector-stores) and you need to regularly update the files to provide up-to-date responses.

💡 **Note**: This Actor is meant to be used together with other Actors' integration sections.
For instance, if you are using the [Website Content Crawler](https://apify.com/apify/website-content-crawler), you can activate Vector Store Files integration to save web content (including docx, pptx, pdf and other [files](https://platform.openai.com/docs/assistants/tools/file-search/supported-files)) for your OpenAI assistant.
@@ -26,7 +28,7 @@ The following image illustrates the Apify-OpenAI Vector Store integration:
The integration process includes:
- Loading data from an Apify Actor
- Processing the data to comply with OpenAI Assistant limits (max. 1000 files, max 5,000,000 tokens)
- Creating OpenAI files [OpenAI Files](https://platform.openai.com/docs/api-reference/files)
- Creating [OpenAI Files](https://platform.openai.com/docs/api-reference/files)
- _[Optional]_ Removing existing files from the Vector Store (specified by `fileIdsToDelete` and/or `filePrefix`)
- Adding the newly created files to the vector store.
- _[Optional]_ Deleting existing files from the OpenAI files (specified by `fileIdsToDelete` and/or `filePrefix`)
@@ -44,7 +46,7 @@ To use this integration, ensure you have:

- An OpenAI account and an `OpenAI API KEY`. Create a free account at [OpenAI](https://beta.openai.com/).
- Created an [OpenAI Vector Store](https://platform.openai.com/docs/assistants/tools/file-search/vector-stores). You will need `vectorStoreId` to run this integration.
- Created an [OpenAI Assistant](https://platform.openai.com/docs/assistants/overview).
- _[Optional]_ Created an [OpenAI Assistant](https://platform.openai.com/docs/assistants/overview).

## ➡️ Inputs
