diff --git a/argilla/docs/how_to_guides/index.md b/argilla/docs/how_to_guides/index.md index d1141d2efc..c590a730b0 100644 --- a/argilla/docs/how_to_guides/index.md +++ b/argilla/docs/how_to_guides/index.md @@ -82,6 +82,22 @@ These guides provide step-by-step instructions for common scenarios, including d
+- __Use webhooks to respond to server events__ + + --- + + Learn how to use Argilla webhooks to receive notifications about events in your Argilla Server. + + [:octicons-arrow-right-24: How-to guide](webhooks.md) + +- __Webhooks internals__ + + --- + + Learn how Argilla webhooks are implemented under the hood and the structure of the different events. + + [:octicons-arrow-right-24: How-to guide](webhooks_internals.md) + - __Use Markdown to format rich content__ --- @@ -98,4 +114,4 @@ These guides provide step-by-step instructions for common scenarios, including d [:octicons-arrow-right-24: How-to guide](migrate_from_legacy_datasets.md) -
\ No newline at end of file + diff --git a/argilla/docs/how_to_guides/webhooks.md b/argilla/docs/how_to_guides/webhooks.md new file mode 100644 index 0000000000..0b0ea0f214 --- /dev/null +++ b/argilla/docs/how_to_guides/webhooks.md @@ -0,0 +1,160 @@ +--- +description: In this section, we will provide a step-by-step guide to create a webhook in Argilla. +--- + +# Use Argilla webhooks + +This guide provides an overview of how to create and use webhooks in Argilla. + +A **webhook** allows an application to submit real-time information to other applications whenever a specific event occurs. Unlike traditional APIs, you won’t need to poll for data very frequently in order to get it in real time. This makes webhooks much more efficient for both the provider and the consumer. + +## Creating a webhook listener in Argilla + +The Python SDK provides a simple way to create a webhook in Argilla. It allows you to focus on the use case of the webhook and not on the implementation details. You only need to create your event handler function with the `webhook_listener` decorator. + +```python +import argilla as rg + +from datetime import datetime +from argilla import webhook_listener + +@webhook_listener(events="dataset.created") +async def my_webhook_handler(dataset: rg.Dataset, type: str, timestamp: datetime): + print(dataset, type, timestamp) +``` + +In the example above, we have created a webhook that listens to the `dataset.created` event. +> You can find the list of events in the [Events](#events) section. + +The Python SDK will automatically create a webhook in Argilla and listen to the specified event. When the event is triggered, +the `my_webhook_handler` function will be called with the event data. The SDK will also parse the incoming webhook event into +a proper resource object (`rg.Dataset`, `rg.Record`, or `rg.Response`). The SDK will also take care of request authentication and error handling. 
+ + ## Running the webhook server + + Under the hood, the SDK uses the `FastAPI` framework to create the webhook server and the POST endpoint to receive the webhook events. + + To run the webhook, you need to define the webhook server in your code and start it using the `uvicorn` command. + + ```python + # my_webhook.py file + from argilla import get_webhook_server + + server = get_webhook_server() + ``` + + ```bash + uvicorn my_webhook:server + ``` + + You can use the Swagger UI to explore your defined webhooks by visiting `http://localhost:8000/docs`. + + + The `uvicorn` command will start the webhook server on the default port `8000`. + + By default, the Python SDK will register the webhook using the server URL `http://127.0.0.1:8000/`. If you want to use a different server URL, you can set the `WEBHOOK_SERVER_URL` environment variable. + + ```bash + export WEBHOOK_SERVER_URL=http://my-webhook-server.com + ``` + + All incoming webhook events will be sent to the specified server URL. + + ## Webhooks management + + The Python SDK provides a simple way to manage webhooks in Argilla. You can create, list, update, and delete webhooks using the SDK. + + ### Create a webhook + + To create a new webhook in Argilla, you can define it in the `Webhook` class and then call the `create` method. + + ```python + import argilla as rg + + client = rg.Argilla(api_url="", api_key="") + + webhook = rg.Webhook( + url="http://127.0.0.1:8000", + events=["dataset.created"], + description="My webhook" + ) + + webhook.create() + + ``` + + ### List webhooks + + You can list all the existing webhooks in Argilla by accessing the `webhooks` attribute on the Argilla class and iterating over them. + + ```python + import argilla as rg + + client = rg.Argilla(api_url="", api_key="") + + for webhook in client.webhooks: + print(webhook) + + ``` + + ### Update a webhook + + You can update a webhook using the `update` method. 
+ +```python +import argilla as rg + +client = rg.Argilla(api_url="", api_key="") + +webhook = rg.Webhook( + url="http://127.0.0.1:8000", + events=["dataset.created"], + description="My webhook" +).create() + +webhook.events = ["dataset.updated"] +webhook.update() + +``` +> You should use IP address instead of localhost since the webhook validation expect a Top Level Domain (TLD) in the URL. + +### Delete a webhook + +You can delete a webhook using the `delete` method. + +```python +import argilla as rg + +client = rg.Argilla(api_url="", api_key="") + +for webhook in client.webhooks: + webhook.delete() + +``` + +## Deploying a webhooks server in a Hugging Face Space + +You can deploy your webhook in a Hugging Face Space. You can visit this [link](https://huggingface.co/spaces/argilla/argilla-webhooks/tree/main) to explore an example of a webhook server deployed in a Hugging Face Space. + + +## Events + +The following is a list of events that you can listen to in Argilla, grouped by resource type. + +### Dataset events + +- `dataset.created`: The Dataset resource was created. +- `dataset.updated`: The Dataset resource was updated. +- `dataset.deleted`: The Dataset resource was deleted. +- `dataset.published`: The Dataset resource was published. + +### Record events +- `record.created`: The Record resource was created. +- `record.updated`: The Record resource was updated. +- `record.deleted`: The Record resource was deleted. +- `record.completed`: The Record resource was completed (status="completed"). + +### Response events +- `response.created`: The Response resource was created. +- `response.updated`: The Response resource was updated. +- `response.deleted`: The Response resource was deleted. 
diff --git a/argilla/docs/how_to_guides/webhooks_internals.md b/argilla/docs/how_to_guides/webhooks_internals.md new file mode 100644 index 0000000000..180d9a0e28 --- /dev/null +++ b/argilla/docs/how_to_guides/webhooks_internals.md @@ -0,0 +1,1863 @@ +# Webhooks internals + +Argilla Webhooks implements [Standard Webhooks](https://www.standardwebhooks.com) to facilitate the integration of Argilla with listeners written in any language and ensure consistency and security. If you need to do a custom integration with Argilla webhooks, take a look at the [specs](https://github.com/standard-webhooks/standard-webhooks/blob/main/spec/standard-webhooks.md) to have a better understanding of how to implement such an integration. + +## Events payload + +The payload is the core part of every webhook. It is the actual data being sent as part of the webhook, and usually consists of important information about the event and related information. + +The payloads sent by Argilla webhooks will be a POST request with a JSON body with the following structure: + +```json +{ + "type": "example.event", + "version": 1, + "timestamp": "2022-11-03T20:26:10.344522Z", + "data": { + "foo": "bar" + } +} +``` + +Your listener must return any `2XX` status code value to indicate to Argilla that the webhook message has been successfully received. If a different status code is returned, Argilla will retry up to 3 times. You have up to 20 seconds to give a response to an Argilla webhook request. + +The payload attributes are: + +* `type`: a full-stop delimited type string associated with the event. The type identifies the kind of event being sent 
(e.g. `"dataset.created"` or `"record.completed"`). It also indicates the schema of the payload (passed in the `data` attribute). The following values can be present in this attribute: + * `dataset.created` + * `dataset.updated` + * `dataset.deleted` + * `dataset.published` + * `record.created` + * `record.updated` + * `record.deleted` + * `record.completed` + * `response.created` + * `response.updated` + * `response.deleted` +* `version`: an integer with the version of the webhook payload sent. Right now we only support version `1`. +* `timestamp`: the timestamp of when the event occurred. +* `data`: the actual event data associated with the event. + +## Events payload examples + +In this section we will show payload examples for all the events emitted by Argilla webhooks. + +### Dataset events + +#### Created + +```json +{ + "type": "dataset.created", + "version": 1, + "timestamp": "2024-09-26T14:17:20.488053Z", + "data": { + "id": "3d673549-ad31-4485-97eb-31f9dcd0df71", + "name": "fineweb-edu-min", + "guidelines": null, + "allow_extra_metadata": true, + "status": "draft", + "distribution": { + "strategy": "overlap", + "min_submitted": 1 + }, + "workspace": { + "id": "350bc020-2cd2-4a67-8b23-37a15c4d8139", + "name": "argilla", + "inserted_at": "2024-09-05T11:39:20.377192", + "updated_at": "2024-09-05T11:39:20.377192" + }, + "questions": [], + "fields": [], + "metadata_properties": [], + "vectors_settings": [], + "last_activity_at": "2024-09-26T14:17:20.477163", + "inserted_at": "2024-09-26T14:17:20.477163", + "updated_at": "2024-09-26T14:17:20.477163" + } +} +``` + +#### Updated + +```json +{ + "type": "dataset.updated", + "version": 1, + "timestamp": "2024-09-26T14:17:20.504483Z", + "data": { + "id": "3d673549-ad31-4485-97eb-31f9dcd0df71", + "name": "fineweb-edu-min", + "guidelines": null, + "allow_extra_metadata": false, + "status": "draft", + "distribution": { + "strategy": "overlap", + "min_submitted": 1 + }, + "workspace": { + "id": 
"350bc020-2cd2-4a67-8b23-37a15c4d8139", + "name": "argilla", + "inserted_at": "2024-09-05T11:39:20.377192", + "updated_at": "2024-09-05T11:39:20.377192" + }, + "questions": [], + "fields": [ + { + "id": "77578693-9925-4c3d-a921-8c964cdd7acd", + "name": "text", + "title": "text", + "required": true, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-26T14:17:20.528738", + "updated_at": "2024-09-26T14:17:20.528738" + } + ] + "metadata_properties": [], + "vectors_settings": [], + "last_activity_at": "2024-09-26T14:17:20.497343", + "inserted_at": "2024-09-26T14:17:20.477163", + "updated_at": "2024-09-26T14:17:20.497343" + } +} +``` + +#### Deleted + +```json +{ + "type": "dataset.deleted", + "version": 1, + "timestamp": "2024-09-26T14:21:44.261872Z", + "data": { + "id": "3d673549-ad31-4485-97eb-31f9dcd0df71", + "name": "fineweb-edu-min", + "guidelines": null, + "allow_extra_metadata": false, + "status": "ready", + "distribution": { + "strategy": "overlap", + "min_submitted": 1 + }, + "workspace": { + "id": "350bc020-2cd2-4a67-8b23-37a15c4d8139", + "name": "argilla", + "inserted_at": "2024-09-05T11:39:20.377192", + "updated_at": "2024-09-05T11:39:20.377192" + }, + "questions": [ + { + "id": "80069251-4792-49e7-b58a-69a6117e8d32", + "name": "int_score", + "title": "Rate the quality of the text", + "description": null, + "required": true, + "settings": { + "type": "rating", + "options": [ + { + "value": 0 + }, + { + "value": 1 + }, + { + "value": 2 + }, + { + "value": 3 + }, + { + "value": 4 + }, + { + "value": 5 + } + ] + }, + "inserted_at": "2024-09-26T14:17:20.541716", + "updated_at": "2024-09-26T14:17:20.541716" + }, + { + "id": "5e7b45c3-b863-48c8-a1e8-2caa279b71e7", + "name": "comments", + "title": "Comments:", + "description": null, + "required": false, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-26T14:17:20.551750", + "updated_at": "2024-09-26T14:17:20.551750" + } + ], + "fields": [ 
+ { + "id": "77578693-9925-4c3d-a921-8c964cdd7acd", + "name": "text", + "title": "text", + "required": true, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-26T14:17:20.528738", + "updated_at": "2024-09-26T14:17:20.528738" + } + ], + "metadata_properties": [ + { + "id": "284945d9-4bda-4fde-9ca0-b3928282ce83", + "name": "dump", + "title": "dump", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.560704", + "updated_at": "2024-09-26T14:17:20.560704" + }, + { + "id": "5b8f17e5-1be5-4d99-b3d3-567cfaf33fe3", + "name": "url", + "title": "url", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.570162", + "updated_at": "2024-09-26T14:17:20.570162" + }, + { + "id": "a18c60ca-0212-4b22-b1f4-ab3e0fc5ae95", + "name": "language", + "title": "language", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.578088", + "updated_at": "2024-09-26T14:17:20.578088" + }, + { + "id": "c5f6d407-87b7-4678-9c7b-28cd002fcefb", + "name": "language_score", + "title": "language_score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.585319", + "updated_at": "2024-09-26T14:17:20.585319" + }, + { + "id": "ed3ee682-5d12-4c58-91a2-b1cca89fe62b", + "name": "token_count", + "title": "token_count", + "settings": { + "min": null, + "max": null, + "type": "integer" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.593545", + "updated_at": "2024-09-26T14:17:20.593545" + }, + { + "id": "c807d5dd-3cf0-47b9-b07e-bcf03176115f", + "name": "score", + "title": "score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.601316", + 
"updated_at": "2024-09-26T14:17:20.601316" + } + ], + "vectors_settings": [], + "last_activity_at": "2024-09-26T14:17:20.675364", + "inserted_at": "2024-09-26T14:17:20.477163", + "updated_at": "2024-09-26T14:17:20.675364" + } +} +``` + +#### Published + +```json +{ + "type": "dataset.published", + "version": 1, + "timestamp": "2024-09-26T14:17:20.680921Z", + "data": { + "id": "3d673549-ad31-4485-97eb-31f9dcd0df71", + "name": "fineweb-edu-min", + "guidelines": null, + "allow_extra_metadata": false, + "status": "ready", + "distribution": { + "strategy": "overlap", + "min_submitted": 1 + }, + "workspace": { + "id": "350bc020-2cd2-4a67-8b23-37a15c4d8139", + "name": "argilla", + "inserted_at": "2024-09-05T11:39:20.377192", + "updated_at": "2024-09-05T11:39:20.377192" + }, + "questions": [ + { + "id": "80069251-4792-49e7-b58a-69a6117e8d32", + "name": "int_score", + "title": "Rate the quality of the text", + "description": null, + "required": true, + "settings": { + "type": "rating", + "options": [ + { + "value": 0 + }, + { + "value": 1 + }, + { + "value": 2 + }, + { + "value": 3 + }, + { + "value": 4 + }, + { + "value": 5 + } + ] + }, + "inserted_at": "2024-09-26T14:17:20.541716", + "updated_at": "2024-09-26T14:17:20.541716" + }, + { + "id": "5e7b45c3-b863-48c8-a1e8-2caa279b71e7", + "name": "comments", + "title": "Comments:", + "description": null, + "required": false, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-26T14:17:20.551750", + "updated_at": "2024-09-26T14:17:20.551750" + } + ], + "fields": [ + { + "id": "77578693-9925-4c3d-a921-8c964cdd7acd", + "name": "text", + "title": "text", + "required": true, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-26T14:17:20.528738", + "updated_at": "2024-09-26T14:17:20.528738" + } + ], + "metadata_properties": [ + { + "id": "284945d9-4bda-4fde-9ca0-b3928282ce83", + "name": "dump", + "title": "dump", + "settings": { + "type": "terms", + "values": 
null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.560704", + "updated_at": "2024-09-26T14:17:20.560704" + }, + { + "id": "5b8f17e5-1be5-4d99-b3d3-567cfaf33fe3", + "name": "url", + "title": "url", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.570162", + "updated_at": "2024-09-26T14:17:20.570162" + }, + { + "id": "a18c60ca-0212-4b22-b1f4-ab3e0fc5ae95", + "name": "language", + "title": "language", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.578088", + "updated_at": "2024-09-26T14:17:20.578088" + }, + { + "id": "c5f6d407-87b7-4678-9c7b-28cd002fcefb", + "name": "language_score", + "title": "language_score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.585319", + "updated_at": "2024-09-26T14:17:20.585319" + }, + { + "id": "ed3ee682-5d12-4c58-91a2-b1cca89fe62b", + "name": "token_count", + "title": "token_count", + "settings": { + "min": null, + "max": null, + "type": "integer" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.593545", + "updated_at": "2024-09-26T14:17:20.593545" + }, + { + "id": "c807d5dd-3cf0-47b9-b07e-bcf03176115f", + "name": "score", + "title": "score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.601316", + "updated_at": "2024-09-26T14:17:20.601316" + } + ], + "vectors_settings": [], + "last_activity_at": "2024-09-26T14:17:20.675364", + "inserted_at": "2024-09-26T14:17:20.477163", + "updated_at": "2024-09-26T14:17:20.675364" + } +} +``` + +### Record events + +#### Created + +```json +{ + "type": "record.created", + "version": 1, + "timestamp": "2024-09-26T14:17:43.078165Z", + "data": { + "id": "49e0acda-df13-4f65-8137-2274b3e33c9c", + 
"status": "pending", + "fields": { + "text": "Taking Play Seriously\nBy ROBIN MARANTZ HENIG\nPublished: February 17, 2008\nOn a drizzly Tuesday night in late January, 200 people came out to hear a psychiatrist talk rhapsodically about play -- not just the intense, joyous play of children, but play for all people, at all ages, at all times." + }, + "metadata": { + "dump": "CC-MAIN-2013-20", + "url": "http://query.nytimes.com/gst/fullpage.html?res=9404E7DA1339F934A25751C0A96E9C8B63&scp=2&sq=taking%20play%20seriously&st=cse", + "language": "en", + "language_score": 0.9614589214324951, + "token_count": 1055, + "score": 2.5625 + }, + "external_id": "", + "dataset": { + "id": "3d673549-ad31-4485-97eb-31f9dcd0df71", + "name": "fineweb-edu-min", + "guidelines": null, + "allow_extra_metadata": false, + "status": "ready", + "distribution": { + "strategy": "overlap", + "min_submitted": 1 + }, + "workspace": { + "id": "350bc020-2cd2-4a67-8b23-37a15c4d8139", + "name": "argilla", + "inserted_at": "2024-09-05T11:39:20.377192", + "updated_at": "2024-09-05T11:39:20.377192" + }, + "questions": [ + { + "id": "80069251-4792-49e7-b58a-69a6117e8d32", + "name": "int_score", + "title": "Rate the quality of the text", + "description": null, + "required": true, + "settings": { + "type": "rating", + "options": [ + { + "value": 0 + }, + { + "value": 1 + }, + { + "value": 2 + }, + { + "value": 3 + }, + { + "value": 4 + }, + { + "value": 5 + } + ] + }, + "inserted_at": "2024-09-26T14:17:20.541716", + "updated_at": "2024-09-26T14:17:20.541716" + }, + { + "id": "5e7b45c3-b863-48c8-a1e8-2caa279b71e7", + "name": "comments", + "title": "Comments:", + "description": null, + "required": false, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-26T14:17:20.551750", + "updated_at": "2024-09-26T14:17:20.551750" + } + ], + "fields": [ + { + "id": "77578693-9925-4c3d-a921-8c964cdd7acd", + "name": "text", + "title": "text", + "required": true, + "settings": { + "type": 
"text", + "use_markdown": false + }, + "inserted_at": "2024-09-26T14:17:20.528738", + "updated_at": "2024-09-26T14:17:20.528738" + } + ], + "metadata_properties": [ + { + "id": "284945d9-4bda-4fde-9ca0-b3928282ce83", + "name": "dump", + "title": "dump", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.560704", + "updated_at": "2024-09-26T14:17:20.560704" + }, + { + "id": "5b8f17e5-1be5-4d99-b3d3-567cfaf33fe3", + "name": "url", + "title": "url", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.570162", + "updated_at": "2024-09-26T14:17:20.570162" + }, + { + "id": "a18c60ca-0212-4b22-b1f4-ab3e0fc5ae95", + "name": "language", + "title": "language", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.578088", + "updated_at": "2024-09-26T14:17:20.578088" + }, + { + "id": "c5f6d407-87b7-4678-9c7b-28cd002fcefb", + "name": "language_score", + "title": "language_score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.585319", + "updated_at": "2024-09-26T14:17:20.585319" + }, + { + "id": "ed3ee682-5d12-4c58-91a2-b1cca89fe62b", + "name": "token_count", + "title": "token_count", + "settings": { + "min": null, + "max": null, + "type": "integer" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.593545", + "updated_at": "2024-09-26T14:17:20.593545" + }, + { + "id": "c807d5dd-3cf0-47b9-b07e-bcf03176115f", + "name": "score", + "title": "score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-26T14:17:20.601316", + "updated_at": "2024-09-26T14:17:20.601316" + } + ], + "vectors_settings": [], + "last_activity_at": "2024-09-26T14:17:20.675364", + 
"inserted_at": "2024-09-26T14:17:20.477163", + "updated_at": "2024-09-26T14:17:20.675364" + }, + "inserted_at": "2024-09-26T14:17:43.026852", + "updated_at": "2024-09-26T14:17:43.026852" + } +} +``` + +#### Updated + +```json +{ + "type": "record.updated", + "version": 1, + "timestamp": "2024-09-26T14:05:30.231988Z", + "data": { + "id": "88654411-4eec-4d17-ad73-e5baf59d0efb", + "status": "completed", + "fields": { + "text": "Throughout life there are many times when outside influences change or influence decision-making. The young child has inner motivation to learn and explore, but as he matures, finds outside sources to be a motivating force for development, as well." + }, + "metadata": { + "dump": "CC-MAIN-2013-20", + "url": "http://www.funderstanding.com/category/child-development/brain-child-development/", + "language": "en", + "language_score": 0.9633054733276367, + "token_count": 1062, + "score": 3.8125 + }, + "external_id": "", + "dataset": { + "id": "ae2961f0-18a4-49d5-ba0c-40fa863fc8f2", + "name": "fineweb-edu-min", + "guidelines": null, + "allow_extra_metadata": false, + "status": "ready", + "distribution": { + "strategy": "overlap", + "min_submitted": 1 + }, + "workspace": { + "id": "350bc020-2cd2-4a67-8b23-37a15c4d8139", + "name": "argilla", + "inserted_at": "2024-09-05T11:39:20.377192", + "updated_at": "2024-09-05T11:39:20.377192" + }, + "questions": [ + { + "id": "faeea416-5390-4721-943c-de7d0212ba20", + "name": "int_score", + "title": "Rate the quality of the text", + "description": null, + "required": true, + "settings": { + "type": "rating", + "options": [ + { + "value": 0 + }, + { + "value": 1 + }, + { + "value": 2 + }, + { + "value": 3 + }, + { + "value": 4 + }, + { + "value": 5 + } + ] + }, + "inserted_at": "2024-09-20T09:39:20.481193", + "updated_at": "2024-09-20T09:39:20.481193" + }, + { + "id": "0e14a758-a6d0-43ff-af5b-39f4e4d031ab", + "name": "comments", + "title": "Comments:", + "description": null, + "required": false, + "settings": { + 
"type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-20T09:39:20.490851", + "updated_at": "2024-09-20T09:39:20.490851" + } + ], + "fields": [ + { + "id": "a4e81325-7d11-4dcf-af23-d3c867c75c9c", + "name": "text", + "title": "text", + "required": true, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-20T09:39:20.468254", + "updated_at": "2024-09-20T09:39:20.468254" + } + ], + "metadata_properties": [ + { + "id": "1259d700-2ff6-4315-a3c7-703bce3d65d7", + "name": "dump", + "title": "dump", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.499466", + "updated_at": "2024-09-20T09:39:20.499466" + }, + { + "id": "9d135f00-5a51-4506-a607-bc463dce1c2f", + "name": "url", + "title": "url", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.507944", + "updated_at": "2024-09-20T09:39:20.507944" + }, + { + "id": "98eced0d-d92f-486c-841c-a55085c7538b", + "name": "language", + "title": "language", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.517551", + "updated_at": "2024-09-20T09:39:20.517551" + }, + { + "id": "b9f9a3b9-7186-4e23-9147-b5aa52d0d045", + "name": "language_score", + "title": "language_score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.526219", + "updated_at": "2024-09-20T09:39:20.526219" + }, + { + "id": "0585c420-5885-4fce-9757-82c5199304bc", + "name": "token_count", + "title": "token_count", + "settings": { + "min": null, + "max": null, + "type": "integer" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.534559", + "updated_at": "2024-09-20T09:39:20.534559" + }, + { + "id": "ae31acb5-f198-4f0b-8d6c-13fcc80d10d1", + "name": "score", + "title": 
"score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.544562", + "updated_at": "2024-09-20T09:39:20.544562" + } + ], + "vectors_settings": [], + "last_activity_at": "2024-09-26T14:05:30.129734", + "inserted_at": "2024-09-20T09:39:20.433798", + "updated_at": "2024-09-26T14:05:30.130662" + }, + "inserted_at": "2024-09-20T09:39:23.148539", + "updated_at": "2024-09-26T14:05:30.224076" + } +} +``` + +#### Deleted + +```json +{ + "type": "record.deleted", + "version": 1, + "timestamp": "2024-09-26T14:45:30.464503Z", + "data": { + "id": "5b285767-18c9-46ab-a4ec-5e0ee4e26de9", + "status": "pending", + "fields": { + "text": "This tutorial shows how to send modifications of code in the right way: by using patches.\nThe word developer is used here for someone having a KDE SVN account.\nWe suppose that you have modified some code in KDE and that you are ready to share it. First a few important points:\nNow you have the modification as a source file. Sending the source file will not be helpful, as probably someone else has done other modifications to the original file in the meantime. So your modified file could not replace it." 
+ }, + "metadata": { + "dump": "CC-MAIN-2013-20", + "url": "http://techbase.kde.org/index.php?title=Contribute/Send_Patches&oldid=40759", + "language": "en", + "language_score": 0.9597765207290649, + "token_count": 2482, + "score": 3.0625 + }, + "external_id": "", + "dataset": { + "id": "ae2961f0-18a4-49d5-ba0c-40fa863fc8f2", + "name": "fineweb-edu-min", + "guidelines": null, + "allow_extra_metadata": false, + "status": "ready", + "distribution": { + "strategy": "overlap", + "min_submitted": 1 + }, + "workspace": { + "id": "350bc020-2cd2-4a67-8b23-37a15c4d8139", + "name": "argilla", + "inserted_at": "2024-09-05T11:39:20.377192", + "updated_at": "2024-09-05T11:39:20.377192" + }, + "questions": [ + { + "id": "faeea416-5390-4721-943c-de7d0212ba20", + "name": "int_score", + "title": "Rate the quality of the text", + "description": null, + "required": true, + "settings": { + "type": "rating", + "options": [ + { + "value": 0 + }, + { + "value": 1 + }, + { + "value": 2 + }, + { + "value": 3 + }, + { + "value": 4 + }, + { + "value": 5 + } + ] + }, + "inserted_at": "2024-09-20T09:39:20.481193", + "updated_at": "2024-09-20T09:39:20.481193" + }, + { + "id": "0e14a758-a6d0-43ff-af5b-39f4e4d031ab", + "name": "comments", + "title": "Comments:", + "description": null, + "required": false, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-20T09:39:20.490851", + "updated_at": "2024-09-20T09:39:20.490851" + } + ], + "fields": [ + { + "id": "a4e81325-7d11-4dcf-af23-d3c867c75c9c", + "name": "text", + "title": "text", + "required": true, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-20T09:39:20.468254", + "updated_at": "2024-09-20T09:39:20.468254" + } + ], + "metadata_properties": [ + { + "id": "1259d700-2ff6-4315-a3c7-703bce3d65d7", + "name": "dump", + "title": "dump", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.499466", 
+ "updated_at": "2024-09-20T09:39:20.499466" + }, + { + "id": "9d135f00-5a51-4506-a607-bc463dce1c2f", + "name": "url", + "title": "url", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.507944", + "updated_at": "2024-09-20T09:39:20.507944" + }, + { + "id": "98eced0d-d92f-486c-841c-a55085c7538b", + "name": "language", + "title": "language", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.517551", + "updated_at": "2024-09-20T09:39:20.517551" + }, + { + "id": "b9f9a3b9-7186-4e23-9147-b5aa52d0d045", + "name": "language_score", + "title": "language_score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.526219", + "updated_at": "2024-09-20T09:39:20.526219" + }, + { + "id": "0585c420-5885-4fce-9757-82c5199304bc", + "name": "token_count", + "title": "token_count", + "settings": { + "min": null, + "max": null, + "type": "integer" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.534559", + "updated_at": "2024-09-20T09:39:20.534559" + }, + { + "id": "ae31acb5-f198-4f0b-8d6c-13fcc80d10d1", + "name": "score", + "title": "score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.544562", + "updated_at": "2024-09-20T09:39:20.544562" + } + ], + "vectors_settings": [], + "last_activity_at": "2024-09-26T14:15:11.139023", + "inserted_at": "2024-09-20T09:39:20.433798", + "updated_at": "2024-09-26T14:15:11.141067" + }, + "inserted_at": "2024-09-20T09:39:23.148687", + "updated_at": "2024-09-20T09:39:23.148687" + } +} +``` + +#### Completed + +```json +{ + "type": "record.completed", + "version": 1, + "timestamp": "2024-09-26T14:05:30.236958Z", + "data": { + "id": "88654411-4eec-4d17-ad73-e5baf59d0efb", + "status": 
"completed", + "fields": { + "text": "Throughout life there are many times when outside influences change or influence decision-making. The young child has inner motivation to learn and explore, but as he matures, finds outside sources to be a motivating force for development, as well." + }, + "metadata": { + "dump": "CC-MAIN-2013-20", + "url": "http://www.funderstanding.com/category/child-development/brain-child-development/", + "language": "en", + "language_score": 0.9633054733276367, + "token_count": 1062, + "score": 3.8125 + }, + "external_id": "", + "dataset": { + "id": "ae2961f0-18a4-49d5-ba0c-40fa863fc8f2", + "name": "fineweb-edu-min", + "guidelines": null, + "allow_extra_metadata": false, + "status": "ready", + "distribution": { + "strategy": "overlap", + "min_submitted": 1 + }, + "workspace": { + "id": "350bc020-2cd2-4a67-8b23-37a15c4d8139", + "name": "argilla", + "inserted_at": "2024-09-05T11:39:20.377192", + "updated_at": "2024-09-05T11:39:20.377192" + }, + "questions": [ + { + "id": "faeea416-5390-4721-943c-de7d0212ba20", + "name": "int_score", + "title": "Rate the quality of the text", + "description": null, + "required": true, + "settings": { + "type": "rating", + "options": [ + { + "value": 0 + }, + { + "value": 1 + }, + { + "value": 2 + }, + { + "value": 3 + }, + { + "value": 4 + }, + { + "value": 5 + } + ] + }, + "inserted_at": "2024-09-20T09:39:20.481193", + "updated_at": "2024-09-20T09:39:20.481193" + }, + { + "id": "0e14a758-a6d0-43ff-af5b-39f4e4d031ab", + "name": "comments", + "title": "Comments:", + "description": null, + "required": false, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-20T09:39:20.490851", + "updated_at": "2024-09-20T09:39:20.490851" + } + ], + "fields": [ + { + "id": "a4e81325-7d11-4dcf-af23-d3c867c75c9c", + "name": "text", + "title": "text", + "required": true, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-20T09:39:20.468254", + 
"updated_at": "2024-09-20T09:39:20.468254" + } + ], + "metadata_properties": [ + { + "id": "1259d700-2ff6-4315-a3c7-703bce3d65d7", + "name": "dump", + "title": "dump", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.499466", + "updated_at": "2024-09-20T09:39:20.499466" + }, + { + "id": "9d135f00-5a51-4506-a607-bc463dce1c2f", + "name": "url", + "title": "url", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.507944", + "updated_at": "2024-09-20T09:39:20.507944" + }, + { + "id": "98eced0d-d92f-486c-841c-a55085c7538b", + "name": "language", + "title": "language", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.517551", + "updated_at": "2024-09-20T09:39:20.517551" + }, + { + "id": "b9f9a3b9-7186-4e23-9147-b5aa52d0d045", + "name": "language_score", + "title": "language_score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.526219", + "updated_at": "2024-09-20T09:39:20.526219" + }, + { + "id": "0585c420-5885-4fce-9757-82c5199304bc", + "name": "token_count", + "title": "token_count", + "settings": { + "min": null, + "max": null, + "type": "integer" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.534559", + "updated_at": "2024-09-20T09:39:20.534559" + }, + { + "id": "ae31acb5-f198-4f0b-8d6c-13fcc80d10d1", + "name": "score", + "title": "score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.544562", + "updated_at": "2024-09-20T09:39:20.544562" + } + ], + "vectors_settings": [], + "last_activity_at": "2024-09-26T14:05:30.129734", + "inserted_at": "2024-09-20T09:39:20.433798", + "updated_at": "2024-09-26T14:05:30.130662" + }, 
+ "inserted_at": "2024-09-20T09:39:23.148539", + "updated_at": "2024-09-26T14:05:30.224076" + } +} +``` + +### Response events + +#### Created + +```json +{ + "type": "response.created", + "version": 1, + "timestamp": "2024-09-26T14:05:30.182364Z", + "data": { + "id": "7164a58e-3611-4b0a-98cc-9184bc92dc5a", + "values": { + "int_score": { + "value": 3 + } + }, + "status": "submitted", + "record": { + "id": "88654411-4eec-4d17-ad73-e5baf59d0efb", + "status": "pending", + "fields": { + "text": "Throughout life there are many times when outside influences change or influence decision-making. The young child has inner motivation to learn and explore, but as he matures, finds outside sources to be a motivating force for development, as well." + }, + "metadata": { + "dump": "CC-MAIN-2013-20", + "url": "http://www.funderstanding.com/category/child-development/brain-child-development/", + "language": "en", + "language_score": 0.9633054733276367, + "token_count": 1062, + "score": 3.8125 + }, + "external_id": "", + "dataset": { + "id": "ae2961f0-18a4-49d5-ba0c-40fa863fc8f2", + "name": "fineweb-edu-min", + "guidelines": null, + "allow_extra_metadata": false, + "status": "ready", + "distribution": { + "strategy": "overlap", + "min_submitted": 1 + }, + "workspace": { + "id": "350bc020-2cd2-4a67-8b23-37a15c4d8139", + "name": "argilla", + "inserted_at": "2024-09-05T11:39:20.377192", + "updated_at": "2024-09-05T11:39:20.377192" + }, + "questions": [ + { + "id": "faeea416-5390-4721-943c-de7d0212ba20", + "name": "int_score", + "title": "Rate the quality of the text", + "description": null, + "required": true, + "settings": { + "type": "rating", + "options": [ + { + "value": 0 + }, + { + "value": 1 + }, + { + "value": 2 + }, + { + "value": 3 + }, + { + "value": 4 + }, + { + "value": 5 + } + ] + }, + "inserted_at": "2024-09-20T09:39:20.481193", + "updated_at": "2024-09-20T09:39:20.481193" + }, + { + "id": "0e14a758-a6d0-43ff-af5b-39f4e4d031ab", + "name": "comments", + "title": 
"Comments:", + "description": null, + "required": false, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-20T09:39:20.490851", + "updated_at": "2024-09-20T09:39:20.490851" + } + ], + "fields": [ + { + "id": "a4e81325-7d11-4dcf-af23-d3c867c75c9c", + "name": "text", + "title": "text", + "required": true, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-20T09:39:20.468254", + "updated_at": "2024-09-20T09:39:20.468254" + } + ], + "metadata_properties": [ + { + "id": "1259d700-2ff6-4315-a3c7-703bce3d65d7", + "name": "dump", + "title": "dump", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.499466", + "updated_at": "2024-09-20T09:39:20.499466" + }, + { + "id": "9d135f00-5a51-4506-a607-bc463dce1c2f", + "name": "url", + "title": "url", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.507944", + "updated_at": "2024-09-20T09:39:20.507944" + }, + { + "id": "98eced0d-d92f-486c-841c-a55085c7538b", + "name": "language", + "title": "language", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.517551", + "updated_at": "2024-09-20T09:39:20.517551" + }, + { + "id": "b9f9a3b9-7186-4e23-9147-b5aa52d0d045", + "name": "language_score", + "title": "language_score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.526219", + "updated_at": "2024-09-20T09:39:20.526219" + }, + { + "id": "0585c420-5885-4fce-9757-82c5199304bc", + "name": "token_count", + "title": "token_count", + "settings": { + "min": null, + "max": null, + "type": "integer" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.534559", + "updated_at": "2024-09-20T09:39:20.534559" + }, + { + "id": 
"ae31acb5-f198-4f0b-8d6c-13fcc80d10d1", + "name": "score", + "title": "score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.544562", + "updated_at": "2024-09-20T09:39:20.544562" + } + ], + "vectors_settings": [], + "last_activity_at": "2024-09-26T14:05:30.129734", + "inserted_at": "2024-09-20T09:39:20.433798", + "updated_at": "2024-09-23T11:08:30.392833" + }, + "inserted_at": "2024-09-20T09:39:23.148539", + "updated_at": "2024-09-20T09:39:23.148539" + }, + "user": { + "id": "df114042-958d-42c6-9f03-ab49bd451c6c", + "first_name": "", + "last_name": null, + "username": "argilla", + "role": "owner", + "inserted_at": "2024-09-05T11:39:20.376463", + "updated_at": "2024-09-05T11:39:20.376463" + }, + "inserted_at": "2024-09-26T14:05:30.128332", + "updated_at": "2024-09-26T14:05:30.128332" + } +} +``` + +#### Updated + +```json +{ + "type": "response.updated", + "version": 1, + "timestamp": "2024-09-26T14:13:22.256501Z", + "data": { + "id": "38e4d537-c768-4ced-916e-31b74b220c36", + "values": { + "int_score": { + "value": 5 + } + }, + "status": "discarded", + "record": { + "id": "54b137ae-68a4-4aa4-ab2f-ef350ca96a6b", + "status": "completed", + "fields": { + "text": "Bolivia: Coca-chewing protest outside US embassy\nIndigenous activists in Bolivia have been holding a mass coca-chewing protest as part of campaign to end an international ban on the practice.\nHundreds of people chewed the leaf outside the US embassy in La Paz and in other cities across the country." 
+ }, + "metadata": { + "dump": "CC-MAIN-2013-20", + "url": "http://www.bbc.co.uk/news/world-latin-america-12292661", + "language": "en", + "language_score": 0.9660392999649048, + "token_count": 484, + "score": 2.703125 + }, + "external_id": "", + "dataset": { + "id": "ae2961f0-18a4-49d5-ba0c-40fa863fc8f2", + "name": "fineweb-edu-min", + "guidelines": null, + "allow_extra_metadata": false, + "status": "ready", + "distribution": { + "strategy": "overlap", + "min_submitted": 1 + }, + "workspace": { + "id": "350bc020-2cd2-4a67-8b23-37a15c4d8139", + "name": "argilla", + "inserted_at": "2024-09-05T11:39:20.377192", + "updated_at": "2024-09-05T11:39:20.377192" + }, + "questions": [ + { + "id": "faeea416-5390-4721-943c-de7d0212ba20", + "name": "int_score", + "title": "Rate the quality of the text", + "description": null, + "required": true, + "settings": { + "type": "rating", + "options": [ + { + "value": 0 + }, + { + "value": 1 + }, + { + "value": 2 + }, + { + "value": 3 + }, + { + "value": 4 + }, + { + "value": 5 + } + ] + }, + "inserted_at": "2024-09-20T09:39:20.481193", + "updated_at": "2024-09-20T09:39:20.481193" + }, + { + "id": "0e14a758-a6d0-43ff-af5b-39f4e4d031ab", + "name": "comments", + "title": "Comments:", + "description": null, + "required": false, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-20T09:39:20.490851", + "updated_at": "2024-09-20T09:39:20.490851" + } + ], + "fields": [ + { + "id": "a4e81325-7d11-4dcf-af23-d3c867c75c9c", + "name": "text", + "title": "text", + "required": true, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-20T09:39:20.468254", + "updated_at": "2024-09-20T09:39:20.468254" + } + ], + "metadata_properties": [ + { + "id": "1259d700-2ff6-4315-a3c7-703bce3d65d7", + "name": "dump", + "title": "dump", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.499466", + "updated_at": 
"2024-09-20T09:39:20.499466" + }, + { + "id": "9d135f00-5a51-4506-a607-bc463dce1c2f", + "name": "url", + "title": "url", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.507944", + "updated_at": "2024-09-20T09:39:20.507944" + }, + { + "id": "98eced0d-d92f-486c-841c-a55085c7538b", + "name": "language", + "title": "language", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.517551", + "updated_at": "2024-09-20T09:39:20.517551" + }, + { + "id": "b9f9a3b9-7186-4e23-9147-b5aa52d0d045", + "name": "language_score", + "title": "language_score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.526219", + "updated_at": "2024-09-20T09:39:20.526219" + }, + { + "id": "0585c420-5885-4fce-9757-82c5199304bc", + "name": "token_count", + "title": "token_count", + "settings": { + "min": null, + "max": null, + "type": "integer" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.534559", + "updated_at": "2024-09-20T09:39:20.534559" + }, + { + "id": "ae31acb5-f198-4f0b-8d6c-13fcc80d10d1", + "name": "score", + "title": "score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.544562", + "updated_at": "2024-09-20T09:39:20.544562" + } + ], + "vectors_settings": [], + "last_activity_at": "2024-09-26T14:13:22.204670", + "inserted_at": "2024-09-20T09:39:20.433798", + "updated_at": "2024-09-26T14:07:09.788573" + }, + "inserted_at": "2024-09-20T09:39:23.148505", + "updated_at": "2024-09-26T14:06:06.726296" + }, + "user": { + "id": "df114042-958d-42c6-9f03-ab49bd451c6c", + "first_name": "", + "last_name": null, + "username": "argilla", + "role": "owner", + "inserted_at": "2024-09-05T11:39:20.376463", + "updated_at": 
"2024-09-05T11:39:20.376463" + }, + "inserted_at": "2024-09-26T14:06:06.672138", + "updated_at": "2024-09-26T14:13:22.206179" + } +} +``` + +#### Deleted + +```json +{ + "type": "response.deleted", + "version": 1, + "timestamp": "2024-09-26T14:15:11.138363Z", + "data": { + "id": "7164a58e-3611-4b0a-98cc-9184bc92dc5a", + "values": { + "int_score": { + "value": 3 + } + }, + "status": "submitted", + "record": { + "id": "88654411-4eec-4d17-ad73-e5baf59d0efb", + "status": "completed", + "fields": { + "text": "Throughout life there are many times when outside influences change or influence decision-making. The young child has inner motivation to learn and explore, but as he matures, finds outside sources to be a motivating force for development, as well." + }, + "metadata": { + "dump": "CC-MAIN-2013-20", + "url": "http://www.funderstanding.com/category/child-development/brain-child-development/", + "language": "en", + "language_score": 0.9633054733276367, + "token_count": 1062, + "score": 3.8125 + }, + "external_id": "", + "dataset": { + "id": "ae2961f0-18a4-49d5-ba0c-40fa863fc8f2", + "name": "fineweb-edu-min", + "guidelines": null, + "allow_extra_metadata": false, + "status": "ready", + "distribution": { + "strategy": "overlap", + "min_submitted": 1 + }, + "workspace": { + "id": "350bc020-2cd2-4a67-8b23-37a15c4d8139", + "name": "argilla", + "inserted_at": "2024-09-05T11:39:20.377192", + "updated_at": "2024-09-05T11:39:20.377192" + }, + "questions": [ + { + "id": "faeea416-5390-4721-943c-de7d0212ba20", + "name": "int_score", + "title": "Rate the quality of the text", + "description": null, + "required": true, + "settings": { + "type": "rating", + "options": [ + { + "value": 0 + }, + { + "value": 1 + }, + { + "value": 2 + }, + { + "value": 3 + }, + { + "value": 4 + }, + { + "value": 5 + } + ] + }, + "inserted_at": "2024-09-20T09:39:20.481193", + "updated_at": "2024-09-20T09:39:20.481193" + }, + { + "id": "0e14a758-a6d0-43ff-af5b-39f4e4d031ab", + "name": "comments", + 
"title": "Comments:", + "description": null, + "required": false, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-20T09:39:20.490851", + "updated_at": "2024-09-20T09:39:20.490851" + } + ], + "fields": [ + { + "id": "a4e81325-7d11-4dcf-af23-d3c867c75c9c", + "name": "text", + "title": "text", + "required": true, + "settings": { + "type": "text", + "use_markdown": false + }, + "inserted_at": "2024-09-20T09:39:20.468254", + "updated_at": "2024-09-20T09:39:20.468254" + } + ], + "metadata_properties": [ + { + "id": "1259d700-2ff6-4315-a3c7-703bce3d65d7", + "name": "dump", + "title": "dump", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.499466", + "updated_at": "2024-09-20T09:39:20.499466" + }, + { + "id": "9d135f00-5a51-4506-a607-bc463dce1c2f", + "name": "url", + "title": "url", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.507944", + "updated_at": "2024-09-20T09:39:20.507944" + }, + { + "id": "98eced0d-d92f-486c-841c-a55085c7538b", + "name": "language", + "title": "language", + "settings": { + "type": "terms", + "values": null + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.517551", + "updated_at": "2024-09-20T09:39:20.517551" + }, + { + "id": "b9f9a3b9-7186-4e23-9147-b5aa52d0d045", + "name": "language_score", + "title": "language_score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.526219", + "updated_at": "2024-09-20T09:39:20.526219" + }, + { + "id": "0585c420-5885-4fce-9757-82c5199304bc", + "name": "token_count", + "title": "token_count", + "settings": { + "min": null, + "max": null, + "type": "integer" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.534559", + "updated_at": "2024-09-20T09:39:20.534559" + }, + { 
+ "id": "ae31acb5-f198-4f0b-8d6c-13fcc80d10d1", + "name": "score", + "title": "score", + "settings": { + "min": null, + "max": null, + "type": "float" + }, + "visible_for_annotators": true, + "inserted_at": "2024-09-20T09:39:20.544562", + "updated_at": "2024-09-20T09:39:20.544562" + } + ], + "vectors_settings": [], + "last_activity_at": "2024-09-26T14:13:22.204670", + "inserted_at": "2024-09-20T09:39:20.433798", + "updated_at": "2024-09-26T14:13:22.207478" + }, + "inserted_at": "2024-09-20T09:39:23.148539", + "updated_at": "2024-09-26T14:05:30.224076" + }, + "user": { + "id": "df114042-958d-42c6-9f03-ab49bd451c6c", + "first_name": "", + "last_name": null, + "username": "argilla", + "role": "owner", + "inserted_at": "2024-09-05T11:39:20.376463", + "updated_at": "2024-09-05T11:39:20.376463" + }, + "inserted_at": "2024-09-26T14:05:30.128332", + "updated_at": "2024-09-26T14:05:30.128332" + } +} +``` + +## How to implement a listener + +Argilla webhooks implement [Standard Webhooks](https://www.standardwebhooks.com), so you can verify webhook events coming from Argilla with one of the Standard Webhooks libraries, which are available in many different languages. + +The following example is a simple listener written in Ruby, using [sinatra](https://sinatrarb.com) and the [standardwebhooks Ruby library](https://github.com/standard-webhooks/standard-webhooks/tree/main/libraries/ruby): + +```ruby +require "sinatra" +require "standardwebhooks" + +post "/webhook" do + wh = StandardWebhooks::Webhook.new("YOUR_SECRET") + + headers = { + "webhook-id" => env["HTTP_WEBHOOK_ID"], + "webhook-signature" => env["HTTP_WEBHOOK_SIGNATURE"], + "webhook-timestamp" => env["HTTP_WEBHOOK_TIMESTAMP"], + } + + puts wh.verify(request.body.read.to_s, headers) +end +``` + +In the previous snippet we are creating a [sinatra](https://sinatrarb.com) application that listens for `POST` requests on the `/webhook` endpoint. 
We are using the [standardwebhooks Ruby library](https://github.com/standard-webhooks/standard-webhooks/tree/main/libraries/ruby) to verify the incoming webhook event and printing the verified payload on the console. diff --git a/argilla/docs/reference/argilla/SUMMARY.md b/argilla/docs/reference/argilla/SUMMARY.md index cfe33198e5..49d0ce459d 100644 --- a/argilla/docs/reference/argilla/SUMMARY.md +++ b/argilla/docs/reference/argilla/SUMMARY.md @@ -15,4 +15,5 @@ * [rg.Vector](records/vectors.md) * [rg.Metadata](records/metadata.md) * [rg.Query](search.md) +* [Webhooks](webhooks.md) * [rg.markdown](markdown.md) diff --git a/argilla/docs/reference/argilla/webhooks.md b/argilla/docs/reference/argilla/webhooks.md new file mode 100644 index 0000000000..3f71a4fb32 --- /dev/null +++ b/argilla/docs/reference/argilla/webhooks.md @@ -0,0 +1,61 @@ +--- +hide: footer +--- + +# `argilla.webhooks` + +Webhooks are a way for web applications to notify each other when something happens. For example, you might want to be +notified when a new dataset is created in Argilla. 
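For readers who want to see what a listener checks on each delivery, the signature scheme described by the Standard Webhooks specification can be sketched with the Python standard library alone. This is an illustration, not Argilla's implementation: the function name `verify_standard_webhook` and its `tolerance_seconds` and `now` parameters are hypothetical, and it assumes the webhook secret follows the specification's convention of a base64-encoded key with an optional `whsec_` prefix. In real code, the `standardwebhooks` library is likely the safer choice.

```python
import base64
import hashlib
import hmac
import time


def verify_standard_webhook(secret, headers, payload, tolerance_seconds=300, now=None):
    """Return True if `payload` carries a valid Standard Webhooks "v1" signature.

    Hypothetical helper for illustration only; not part of the Argilla SDK.
    """
    msg_id = headers["webhook-id"]
    timestamp = headers["webhook-timestamp"]
    signatures = headers["webhook-signature"]

    # Reject deliveries whose timestamp is too far from the current time
    # (protects against replayed requests).
    current = time.time() if now is None else now
    try:
        if abs(current - int(timestamp)) > tolerance_seconds:
            return False
    except ValueError:
        return False

    # Assumption: the secret is base64 encoded, optionally prefixed with "whsec_".
    key = base64.b64decode(secret.removeprefix("whsec_"))

    # The signed content is "<id>.<timestamp>.<raw body>", signed with HMAC-SHA256.
    signed_content = f"{msg_id}.{timestamp}.{payload}".encode()
    expected = base64.b64encode(
        hmac.new(key, signed_content, hashlib.sha256).digest()
    )

    # The header may hold several space-separated "version,signature" pairs.
    for candidate in signatures.split(" "):
        version, _, signature = candidate.partition(",")
        if version == "v1" and hmac.compare_digest(signature.encode(), expected):
            return True
    return False
```

The raw request body must be passed unchanged as `payload`; re-serializing the JSON before verification would usually alter the bytes and make the check fail.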
+ +## Usage Examples + +To listen for incoming webhooks, you can use the `webhook_listener` decorator function to register a function to be called +when a webhook is received: + +```python +from argilla.webhooks import webhook_listener + +@webhook_listener(events="dataset.created") +async def my_webhook_listener(dataset): + print(dataset) +``` + +To manually create a new webhook, instantiate the `Webhook` object with the target URL, the events to listen for, and an optional description: + +```python +webhook = rg.Webhook( + url="https://somehost.com/webhook", + events=["dataset.created"], + description="My webhook" +) +webhook.create() +``` + +To retrieve a list of existing webhooks, iterate over the `client.webhooks` attribute: + +```python +for webhook in client.webhooks: + print(webhook) +``` + +--- + +::: src.argilla.webhooks._resource.Webhook + +::: src.argilla.webhooks._helpers.webhook_listener + +::: src.argilla.webhooks._helpers.get_webhook_server + +::: src.argilla.webhooks._helpers.set_webhook_server + +::: src.argilla.webhooks._handler.WebhookHandler + +::: src.argilla.webhooks._event.WebhookEvent + +::: src.argilla.webhooks._event.DatasetEvent + +::: src.argilla.webhooks._event.RecordEvent + +::: src.argilla.webhooks._event.UserResponseEvent + + diff --git a/argilla/mkdocs.yml b/argilla/mkdocs.yml index 9ce3d019b6..efa8a8e4b0 100644 --- a/argilla/mkdocs.yml +++ b/argilla/mkdocs.yml @@ -173,6 +173,7 @@ nav: - Query and filter records: how_to_guides/query.md - Import and export datasets: how_to_guides/import_export.md - Advanced: + - Use webhooks to respond to server events: how_to_guides/webhooks.md - Use Markdown to format rich content: how_to_guides/use_markdown_to_format_rich_content.md - Migrate users, workspaces and datasets to Argilla V2: how_to_guides/migrate_from_legacy_datasets.md - Tutorials: diff --git a/argilla/pdm.lock b/argilla/pdm.lock index a7bdf0b856..4c9ea2939e 100644 --- a/argilla/pdm.lock +++ b/argilla/pdm.lock @@ -5,7 +5,7 @@ groups = ["default", "dev"] strategy = ["inherit_metadata"] 
lock_version = "4.5.0" -content_hash = "sha256:6f49f28d670ea9bd2cddad3e1b380bad7b6b236a6d15a8ea443bca0c3882b19c" +content_hash = "sha256:344f869c491801601ba6165c094fea325ce823713cd0761164744a94431abf60" [[metadata.targets]] requires_python = ">=3.9,<3.13" @@ -611,7 +611,7 @@ name = "deprecated" version = "1.2.14" requires_python = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*" summary = "Python @deprecated decorator to deprecate old python classes, functions or methods." -groups = ["dev"] +groups = ["default", "dev"] dependencies = [ "wrapt<2,>=1.10", ] @@ -1860,7 +1860,7 @@ name = "pillow" version = "10.4.0" requires_python = ">=3.8" summary = "Python Imaging Library (Fork)" -groups = ["dev"] +groups = ["default", "dev"] files = [ {file = "pillow-10.4.0-cp310-cp310-macosx_10_10_x86_64.whl", hash = "sha256:4d9667937cfa347525b319ae34375c37b9ee6b525440f3ef48542fcf66f2731e"}, {file = "pillow-10.4.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:543f3dc61c18dafb755773efc89aae60d06b6596a63914107f75459cf984164d"}, @@ -2753,6 +2753,24 @@ files = [ {file = "stack_data-0.6.3.tar.gz", hash = "sha256:836a778de4fec4dcd1dcd89ed8abff8a221f58308462e1c4aa2a3cf30148f0b9"}, ] +[[package]] +name = "standardwebhooks" +version = "1.0.0" +requires_python = ">=3.6" +summary = "Standard Webhooks" +groups = ["default"] +dependencies = [ + "Deprecated", + "attrs>=21.3.0", + "httpx>=0.23.0", + "python-dateutil", + "types-Deprecated", + "types-python-dateutil", +] +files = [ + {file = "standardwebhooks-1.0.0.tar.gz", hash = "sha256:d94b99c0dcea84156e03adad94f8dba32d5454cc68e12ec2c824051b55bb67ff"}, +] + [[package]] name = "tinycss2" version = "1.3.0" @@ -2839,6 +2857,28 @@ files = [ {file = "typer-0.9.4.tar.gz", hash = "sha256:f714c2d90afae3a7929fcd72a3abb08df305e1ff61719381384211c4070af57f"}, ] +[[package]] +name = "types-deprecated" +version = "1.2.9.20240311" +requires_python = ">=3.8" +summary = "Typing stubs for Deprecated" +groups = ["default"] +files = [ + {file = 
"types-Deprecated-1.2.9.20240311.tar.gz", hash = "sha256:0680e89989a8142707de8103f15d182445a533c1047fd9b7e8c5459101e9b90a"}, + {file = "types_Deprecated-1.2.9.20240311-py3-none-any.whl", hash = "sha256:d7793aaf32ff8f7e49a8ac781de4872248e0694c4b75a7a8a186c51167463f9d"}, +] + +[[package]] +name = "types-python-dateutil" +version = "2.9.0.20240906" +requires_python = ">=3.8" +summary = "Typing stubs for python-dateutil" +groups = ["default"] +files = [ + {file = "types-python-dateutil-2.9.0.20240906.tar.gz", hash = "sha256:9706c3b68284c25adffc47319ecc7947e5bb86b3773f843c73906fd598bc176e"}, + {file = "types_python_dateutil-2.9.0.20240906-py3-none-any.whl", hash = "sha256:27c8cc2d058ccb14946eebcaaa503088f4f6dbc4fb6093d3d456a49aef2753f6"}, +] + [[package]] name = "typing-extensions" version = "4.12.2" @@ -2963,7 +3003,7 @@ name = "wrapt" version = "1.14.1" requires_python = "!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,>=2.7" summary = "Module for decorators, wrappers and monkey patching." -groups = ["dev"] +groups = ["default", "dev"] files = [ {file = "wrapt-1.14.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:80bb5c256f1415f747011dc3604b59bc1f91c6e7150bd7db03b19170ee06b320"}, {file = "wrapt-1.14.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:07f7a7d0f388028b2df1d916e94bbb40624c59b48ecc6cbc232546706fac74c2"}, diff --git a/argilla/pyproject.toml b/argilla/pyproject.toml index c0bda31039..47678f11ed 100644 --- a/argilla/pyproject.toml +++ b/argilla/pyproject.toml @@ -18,6 +18,7 @@ dependencies = [ "rich>=10.0.0", "datasets>=2.0.0", "pillow>=9.5.0", + "standardwebhooks>=1.0.0", ] legacy = [ diff --git a/argilla/src/argilla/__init__.py b/argilla/src/argilla/__init__.py index 02614d336f..737818f92d 100644 --- a/argilla/src/argilla/__init__.py +++ b/argilla/src/argilla/__init__.py @@ -22,3 +22,4 @@ from argilla.responses import * # noqa from argilla.records import * # noqa from argilla.vectors import * # noqa +from argilla.webhooks import * # noqa diff --git 
a/argilla/src/argilla/_api/_webhooks.py b/argilla/src/argilla/_api/_webhooks.py index f09ef800fe..868abeb1c7 100644 --- a/argilla/src/argilla/_api/_webhooks.py +++ b/argilla/src/argilla/_api/_webhooks.py @@ -14,28 +14,13 @@ __all__ = ["WebhooksAPI"] -from typing import List, Optional +from typing import List import httpx -from pydantic import ConfigDict, Field from argilla._api._base import ResourceAPI from argilla._exceptions import api_error_handler -from argilla._models import ResourceModel - - -class WebhookModel(ResourceModel): - url: str - events: List[str] - enabled: bool = True - description: Optional[str] = None - - secret: Optional[str] = Field(None, description="Webhook secret. Read-only.") - - model_config = ConfigDict( - validate_assignment=True, - str_strip_whitespace=True, - ) +from argilla._models._webhook import WebhookModel class WebhooksAPI(ResourceAPI[WebhookModel]): diff --git a/argilla/src/argilla/_helpers/_resource_repr.py b/argilla/src/argilla/_helpers/_resource_repr.py index 1da5f67f89..0982e63653 100644 --- a/argilla/src/argilla/_helpers/_resource_repr.py +++ b/argilla/src/argilla/_helpers/_resource_repr.py @@ -26,6 +26,7 @@ # "len_column": "datasets", }, "User": {"columns": ["username", "id", "role", "updated_at"], "table_name": "Users"}, + "Webhook": {"columns": ["url", "id", "events", "enabled", "updated_at"], "table_name": "Webhooks"}, } diff --git a/argilla/src/argilla/_models/__init__.py b/argilla/src/argilla/_models/__init__.py index 0e3f21ded0..a6e14bd919 100644 --- a/argilla/src/argilla/_models/__init__.py +++ b/argilla/src/argilla/_models/__init__.py @@ -63,3 +63,4 @@ IntegerMetadataPropertySettings, ) from argilla._models._settings._vectors import VectorFieldModel +from argilla._models._webhook import WebhookModel, EventType diff --git a/argilla/src/argilla/_models/_webhook.py b/argilla/src/argilla/_models/_webhook.py new file mode 100644 index 0000000000..747162aec9 --- /dev/null +++ b/argilla/src/argilla/_models/_webhook.py @@ 
-0,0 +1,72 @@ +# Copyright 2024-present, Argilla, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from enum import Enum +from typing import List, Optional + +from pydantic import Field, ConfigDict + +from argilla._models._base import ResourceModel + + +class EventType(str, Enum): + dataset_created = "dataset.created" + dataset_updated = "dataset.updated" + dataset_deleted = "dataset.deleted" + dataset_published = "dataset.published" + + record_created = "record.created" + record_updated = "record.updated" + record_deleted = "record.deleted" + record_completed = "record.completed" + + response_created = "response.created" + response_updated = "response.updated" + response_deleted = "response.deleted" + + @property + def resource(self) -> str: + """ + Get the instance type of the event. + + Returns: + str: The instance type. It can be "dataset", "record", or "response". + + """ + return self.split(".")[0] + + @property + def action(self) -> str: + """ + Get the action type of the event. + + Returns: + str: The action type. It can be "created", "updated", "deleted", "published", or "completed". + + """ + return self.split(".")[1] + + +class WebhookModel(ResourceModel): + url: str + events: List[EventType] + enabled: bool = True + description: Optional[str] = None + + secret: Optional[str] = Field(None, description="Webhook secret. 
Read-only.") + + model_config = ConfigDict( + validate_assignment=True, + str_strip_whitespace=True, + ) diff --git a/argilla/src/argilla/client.py b/argilla/src/argilla/client.py index 94a997b7ce..0585468fb1 100644 --- a/argilla/src/argilla/client.py +++ b/argilla/src/argilla/client.py @@ -22,13 +22,14 @@ from argilla import _api from argilla._api._base import ResourceAPI from argilla._api._client import DEFAULT_HTTP_CONFIG +from argilla._api._webhooks import WebhookModel from argilla._exceptions import ArgillaError, NotFoundError from argilla._helpers import GenericIterator from argilla._helpers._resource_repr import ResourceHTMLReprMixin from argilla._models import DatasetModel, ResourceModel, UserModel, WorkspaceModel if TYPE_CHECKING: - from argilla import Dataset, User, Workspace + from argilla import Dataset, User, Workspace, Webhook __all__ = ["Argilla"] @@ -87,6 +88,11 @@ def users(self) -> "Users": """A collection of users on the server.""" return Users(client=self) + @property + def webhooks(self) -> "Webhooks": + """A collection of webhooks on the server.""" + return Webhooks(client=self) + @cached_property def me(self) -> "User": from argilla.users import User @@ -395,6 +401,69 @@ def _from_model(self, model: DatasetModel) -> "Dataset": return Dataset.from_model(model=model, client=self._client) +class Webhooks(Sequence["Webhook"], ResourceHTMLReprMixin): + """A webhooks class. It can be used to create a new webhook or to get an existing one.""" + + class _Iterator(GenericIterator["Webhook"]): + pass + + def __init__(self, client: "Argilla") -> None: + self._client = client + self._api = client.api.webhooks + + def __call__(self, id: Union[UUID, str]) -> Optional["Webhook"]: + """Get a webhook by id if exists. 
Otherwise, returns `None`""" + + model = _get_model_by_id(self._api, id) + if model: + return self._from_model(model) # noqa + warnings.warn(f"Webhook with id {id!r} not found") + + def __iter__(self): + return self._Iterator(self.list()) + + @overload + @abstractmethod + def __getitem__(self, index: int) -> "Webhook": ... + + @overload + @abstractmethod + def __getitem__(self, index: slice) -> Sequence["Webhook"]: ... + + def __getitem__(self, index) -> "Webhook": + model = self._api.list()[index] + return self._from_model(model) + + def __len__(self) -> int: + return len(self._api.list()) + + def add(self, webhook: "Webhook") -> "Webhook": + """Add a new webhook to the Argilla platform. + Args: + webhook: Webhook object. + + Returns: + Webhook: The created webhook. + """ + webhook._client = self._client + return webhook.create() + + def list(self) -> List["Webhook"]: + return [self._from_model(model) for model in self._api.list()] + + ############################ + # Private methods + ############################ + + def _repr_html_(self) -> str: + return self._represent_as_html(resources=self.list()) + + def _from_model(self, model: WebhookModel) -> "Webhook": + from argilla.webhooks import Webhook + + return Webhook.from_model(client=self._client, model=model) + + def _get_model_by_id(api: ResourceAPI, resource_id: Union[UUID, str]) -> Optional[ResourceModel]: """Get a resource model by id if found. 
Otherwise, `None`.""" try: diff --git a/argilla/src/argilla/responses.py b/argilla/src/argilla/responses.py index 2e4915e2f9..807627f624 100644 --- a/argilla/src/argilla/responses.py +++ b/argilla/src/argilla/responses.py @@ -189,6 +189,16 @@ def record(self, record: "Record") -> None: """Sets the record associated with the response""" self._record = record + @property + def record(self) -> "Record": + """Returns the record associated with the UserResponse""" + return self._record + + @record.setter + def record(self, record: "Record") -> None: + """Sets the record associated with the UserResponse""" + self._record = record + @classmethod def from_model(cls, model: UserResponseModel, record: "Record") -> "UserResponse": """Creates a UserResponse from a ResponseModel""" diff --git a/argilla/src/argilla/webhooks/__init__.py b/argilla/src/argilla/webhooks/__init__.py new file mode 100644 index 0000000000..4055cfb96b --- /dev/null +++ b/argilla/src/argilla/webhooks/__init__.py @@ -0,0 +1,43 @@ +# Copyright 2024-present, Argilla, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import TYPE_CHECKING + +from argilla.webhooks._event import RecordEvent, DatasetEvent, UserResponseEvent, WebhookEvent +from argilla.webhooks._handler import WebhookHandler +from argilla.webhooks._helpers import ( + webhook_listener, + get_webhook_server, + set_webhook_server, + start_webhook_server, + stop_webhook_server, +) +from argilla.webhooks._resource import Webhook + +if TYPE_CHECKING: + pass + +__all__ = [ + "Webhook", + "WebhookHandler", + "RecordEvent", + "DatasetEvent", + "UserResponseEvent", + "WebhookEvent", + "webhook_listener", + "get_webhook_server", + "set_webhook_server", + "start_webhook_server", + "stop_webhook_server", +] diff --git a/argilla/src/argilla/webhooks/_event.py b/argilla/src/argilla/webhooks/_event.py new file mode 100644 index 0000000000..8c329e22e7 --- /dev/null +++ b/argilla/src/argilla/webhooks/_event.py @@ -0,0 +1,179 @@ +# Copyright 2024-present, Argilla, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from datetime import datetime +from typing import TYPE_CHECKING, Union +from uuid import UUID + +from pydantic import BaseModel, ConfigDict + +from argilla import Dataset, Record, UserResponse, Workspace +from argilla._exceptions import ArgillaAPIError +from argilla._models import RecordModel, UserResponseModel, WorkspaceModel, EventType + +if TYPE_CHECKING: + from argilla import Argilla + +__all__ = ["RecordEvent", "DatasetEvent", "UserResponseEvent", "WebhookEvent"] + + +class RecordEvent(BaseModel): + """ + A parsed record event. + + Attributes: + type (EventType): The type of the event. + timestamp (datetime): The timestamp of the event. + record (Record): The record of the event. + """ + + type: EventType + timestamp: datetime + record: Record + + model_config = ConfigDict(arbitrary_types_allowed=True) + + +class DatasetEvent(BaseModel): + """ + A parsed dataset event. + + Attributes: + type (EventType): The type of the event. + timestamp (datetime): The timestamp of the event. + dataset (Dataset): The dataset of the event. + """ + + type: EventType + timestamp: datetime + dataset: Dataset + + model_config = ConfigDict(arbitrary_types_allowed=True) + + +class UserResponseEvent(BaseModel): + """ + A parsed user response event. + + Attributes: + type (EventType): The type of the event. + timestamp (datetime): The timestamp of the event. + response (UserResponse): The user response of the event. + """ + + type: EventType + timestamp: datetime + response: UserResponse + + model_config = ConfigDict(arbitrary_types_allowed=True) + + +class WebhookEvent(BaseModel): + """ + A webhook event. + + Attributes: + type (EventType): The type of the event. + timestamp (datetime): The timestamp of the event. + data (dict): The data of the event. + """ + + type: EventType + timestamp: datetime + data: dict + + def parsed(self, client: "Argilla") -> Union[RecordEvent, DatasetEvent, UserResponseEvent, "WebhookEvent"]: + """ + Parse the webhook event. 
+ + Args: + client: The Argilla client. + + Returns: + Event: The parsed event. + + """ + resource = self.type.resource + data = self.data or {} + + if resource == "dataset": + dataset = self._parse_dataset_from_webhook_data(data, client) + return DatasetEvent( + type=self.type, + timestamp=self.timestamp, + dataset=dataset, + ) + + elif resource == "record": + record = self._parse_record_from_webhook_data(data, client) + return RecordEvent( + type=self.type, + timestamp=self.timestamp, + record=record, + ) + + elif resource == "response": + user_response = self._parse_response_from_webhook_data(data, client) + return UserResponseEvent( + type=self.type, + timestamp=self.timestamp, + response=user_response, + ) + + return self + + @classmethod + def _parse_dataset_from_webhook_data(cls, data: dict, client: "Argilla") -> Dataset: + workspace = Workspace.from_model(WorkspaceModel.model_validate(data["workspace"]), client=client) + # TODO: Parse settings from the data + # settings = Settings._from_dict(data) + + dataset = Dataset(name=data["name"], workspace=workspace, client=client) + dataset.id = UUID(data["id"]) + + try: + dataset.get() + except ArgillaAPIError as _: + # TODO: Show notification + pass + finally: + return dataset + + @classmethod + def _parse_record_from_webhook_data(cls, data: dict, client: "Argilla") -> Record: + dataset = cls._parse_dataset_from_webhook_data(data["dataset"], client) + + record = Record.from_model(RecordModel.model_validate(data), dataset=dataset) + try: + record.get() + except ArgillaAPIError as _: + # TODO: Show notification + pass + finally: + return record + + @classmethod + def _parse_response_from_webhook_data(cls, data: dict, client: "Argilla") -> UserResponse: + record = cls._parse_record_from_webhook_data(data["record"], client) + + # TODO: Link the user resource to the response + user_response = UserResponse.from_model( + model=UserResponseModel(**data, user_id=data["user"]["id"]), + record=record, + ) + + return 
user_response + + +Event = Union[RecordEvent, DatasetEvent, UserResponseEvent, WebhookEvent] diff --git a/argilla/src/argilla/webhooks/_handler.py b/argilla/src/argilla/webhooks/_handler.py new file mode 100644 index 0000000000..ca6ca9a915 --- /dev/null +++ b/argilla/src/argilla/webhooks/_handler.py @@ -0,0 +1,78 @@ +# Copyright 2024-present, Argilla, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Callable, TYPE_CHECKING + +from argilla.webhooks._event import WebhookEvent + +if TYPE_CHECKING: + from fastapi import Request + from argilla.webhooks._resource import Webhook + + +class WebhookHandler: + """ + The `WebhookHandler` class is used to handle incoming webhook requests. This class handles the + request verification and event object creation. + + Attributes: + webhook (Webhook): The webhook object. + """ + + def __init__(self, webhook: "Webhook"): + self.webhook = webhook + + def handle(self, func: Callable, raw_event: bool = False) -> Callable: + """ + This method handles the incoming webhook requests and calls the provided function. + + Parameters: + func (Callable): The function to be called when a webhook event is received. + raw_event (bool): Whether to pass the raw event object to the function. 
+ + Returns: + Callable: The async request handler that verifies each request and calls `func`. + """ + from fastapi import Request + + async def request_handler(request: Request): + event = await self._verify_request(request) + if event.type not in self.webhook.events: + return + + if raw_event: + return await func(event) + + return await func(**event.parsed(self.webhook._client).model_dump()) + + return request_handler + + async def _verify_request(self, request: "Request") -> WebhookEvent: + """ + Verify the request signature and return the event object. + + Arguments: + request (Request): The request object. + + Returns: + WebhookEvent: The event object. + """ + + from standardwebhooks.webhooks import Webhook + + body = await request.body() + headers = dict(request.headers) + + json = Webhook(whsecret=self.webhook.secret).verify(body, headers) + return WebhookEvent.model_validate(json) diff --git a/argilla/src/argilla/webhooks/_helpers.py b/argilla/src/argilla/webhooks/_helpers.py new file mode 100644 index 0000000000..f25c834d55 --- /dev/null +++ b/argilla/src/argilla/webhooks/_helpers.py @@ -0,0 +1,202 @@ +# Copyright 2024-present, Argilla, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import os +import time +import warnings +from threading import Thread +from typing import TYPE_CHECKING, Optional, Callable, Union, List + +import argilla as rg +from argilla import Argilla +from argilla.webhooks._handler import WebhookHandler +from argilla.webhooks._resource import Webhook + +if TYPE_CHECKING: + from fastapi import FastAPI + +__all__ = ["webhook_listener", "get_webhook_server", "set_webhook_server", "start_webhook_server", "stop_webhook_server"] + + +def _compute_default_webhook_server_url() -> str: + """ + Compute the webhook server URL. + + Returns: + str: The webhook server URL. If the environment variable `SPACE_HOST` is set, it will return `https://` followed by its value. + Otherwise, it will return the value of the environment variable `WEBHOOK_SERVER_URL` or `http://127.0.0.1:8000`. + + """ + if space_host := os.getenv("SPACE_HOST"): + return f"https://{space_host}" + + return os.getenv("WEBHOOK_SERVER_URL", "http://127.0.0.1:8000") + + +def _webhook_url_for_func(func: Callable) -> str: + """ + Compute the full webhook URL for a given function. + + Parameters: + func (Callable): The function to compute the webhook URL for. + + Returns: + str: The full webhook URL. + + """ + webhook_server_url = _compute_default_webhook_server_url() + + return f"{webhook_server_url}/{func.__name__}" + + +def webhook_listener( + events: Union[str, List[str]], + description: Optional[str] = None, + client: Optional["Argilla"] = None, + server: Optional["FastAPI"] = None, + raw_event: bool = False, +) -> Callable: + """ + Decorator to create a webhook listener for a function. + + Parameters: + events (Union[str, List[str]]): The events to listen to. + description (Optional[str]): The description of the webhook. + client (Optional[Argilla]): The Argilla client to use. Defaults to the default client. + server (Optional[FastAPI]): The FastAPI server to use. Defaults to the default server. + raw_event (bool): Whether to pass the raw event to the function. Defaults to False. 
+ + Returns: + Callable: The decorated function. + + """ + + client = client or rg.Argilla._get_default() + server = server or get_webhook_server() + + if isinstance(events, str): + events = [events] + + def wrapper(func: Callable) -> Callable: + webhook_url = _webhook_url_for_func(func) + + webhook = None + for argilla_webhook in client.webhooks: + if argilla_webhook.url == webhook_url and argilla_webhook.events == events: + warnings.warn(f"Found existing webhook for URL {argilla_webhook.url}: {argilla_webhook}") + webhook = argilla_webhook + webhook.description = description or webhook.description + webhook.enabled = True + webhook.update() + break + + if not webhook: + webhook = Webhook( + url=webhook_url, + events=events, + description=description or f"Webhook for {func.__name__}", + ).create() + + request_handler = WebhookHandler(webhook).handle(func, raw_event) + server.post(f"/{func.__name__}", tags=["Argilla Webhooks"])(request_handler) + + return request_handler + + return wrapper + + +def get_webhook_server() -> "FastAPI": + """ + Get the current webhook server. If it does not exist, it will create one. + + Returns: + FastAPI: The webhook server. + + """ + from fastapi import FastAPI + + global _server + if not _server: + _server = FastAPI() + return _server + + +def set_webhook_server(app: "FastAPI"): + """ + Set the webhook server. This should only be called once. + + Parameters: + app (FastAPI): The webhook server. + + """ + global _server + + if _server: + raise ValueError("Server already set") + + _server = app + + +class _WebhookServerRunner: + """ + Class to run the webhook server in a separate thread. 
+ """ + + def __init__(self, server: "FastAPI"): + import uvicorn + + self._server = uvicorn.Server(uvicorn.Config(app=server)) + self._thread = Thread(target=self._server.run, daemon=True) + + def start(self): + """Start the webhook server""" + self._thread.start() + while not self._server.started and self._thread.is_alive(): + time.sleep(1e-3) + + def stop(self): + """Stop the webhook server""" + self._server.should_exit = True + self._thread.join() + + +def start_webhook_server(): + """Start the webhook runner.""" + + global _server_runner + + if _server_runner: + warnings.warn("Server already started") + else: + server = get_webhook_server() + + _server_runner = _WebhookServerRunner(server) + _server_runner.start() + + +def stop_webhook_server(): + """Stop the webhook runner.""" + + global _server_runner + + if not _server_runner: + warnings.warn("Server not started") + else: + try: + _server_runner.stop() + finally: + _server_runner = None + + +_server: Optional["FastAPI"] = None +_server_runner: Optional[_WebhookServerRunner] = None diff --git a/argilla/src/argilla/webhooks/_resource.py b/argilla/src/argilla/webhooks/_resource.py new file mode 100644 index 0000000000..61c8302b4c --- /dev/null +++ b/argilla/src/argilla/webhooks/_resource.py @@ -0,0 +1,98 @@ +# Copyright 2024-present, Argilla, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+from typing import List, Optional + +from argilla import Argilla +from argilla._api._webhooks import WebhookModel, WebhooksAPI +from argilla._models import EventType +from argilla._resource import Resource + + +class Webhook(Resource): + """ + The `Webhook` resource. It represents a webhook that can be used to receive events from the Argilla Server. + + Args: + url (str): The URL of the webhook endpoint. + events (List[EventType]): The events that the webhook is subscribed to. + description (Optional[str]): The description of the webhook. + _client (Argilla): The client used to interact with the Argilla Server. + + """ + + _model: WebhookModel + _api: WebhooksAPI + + def __init__(self, url: str, events: List[EventType], description: Optional[str] = None, _client: Argilla = None): + client = _client or Argilla._get_default() + api = client.api.webhooks + events = events or [] + + super().__init__(api=api, client=client) + + self._model = WebhookModel(url=url, events=list(events), description=description) + + @property + def url(self) -> str: + """The URL of the webhook.""" + return self._model.url + + @url.setter + def url(self, value: str): + self._model.url = value + + @property + def events(self) -> List[EventType]: + """The events that the webhook is subscribed to.""" + return self._model.events + + @events.setter + def events(self, value: List[EventType]): + self._model.events = value + + @property + def enabled(self) -> bool: + """Whether the webhook is enabled.""" + return self._model.enabled + + @enabled.setter + def enabled(self, value: bool): + self._model.enabled = value + + @property + def description(self) -> Optional[str]: + """The description of the webhook.""" + return self._model.description + + @description.setter + def description(self, value: Optional[str]): + self._model.description = value + + @property + def secret(self) -> str: + """The secret of the webhook.""" + return self._model.secret + + @classmethod + def from_model(cls, model: 
WebhookModel, client: Optional["Argilla"] = None) -> "Webhook": + instance = cls(url=model.url, events=model.events, _client=client) + instance._model = model + + return instance + + def _with_client(self, client: "Argilla") -> "Webhook": + self._client = client + self._api = client.api.webhooks + + return self diff --git a/examples/webhooks/basic-webhooks/README.md b/examples/webhooks/basic-webhooks/README.md new file mode 100644 index 0000000000..89fa305276 --- /dev/null +++ b/examples/webhooks/basic-webhooks/README.md @@ -0,0 +1,31 @@ +## Description + +This is a basic webhook example that shows how to set up webhook listeners using the Argilla SDK. + +## Running the app + +1. Start the Argilla server and the Argilla worker +```bash +pdm server start +pdm worker +``` + +2. Add the `localhost.org` alias in the `/etc/hosts` file to comply with the Top Level Domain URL requirement. +``` +## +# Host Database +# +# localhost is used to configure the loopback interface +# when the system is booting. Do not change this entry. +## +127.0.0.1 localhost localhost.org +``` + +3. Start the app +```bash +uvicorn main:server +``` + +## Testing the app + +You can see traces in the server logs when working with datasets, records and responses in the Argilla server. diff --git a/examples/webhooks/basic-webhooks/main.py b/examples/webhooks/basic-webhooks/main.py new file mode 100644 index 0000000000..7b0050de2c --- /dev/null +++ b/examples/webhooks/basic-webhooks/main.py @@ -0,0 +1,76 @@ +import os +from datetime import datetime + +import argilla as rg + +# Environment variables with defaults +API_KEY = os.environ.get("ARGILLA_API_KEY", "argilla.apikey") +API_URL = os.environ.get("ARGILLA_API_URL", "http://localhost:6900") + +# Initialize Argilla client +client = rg.Argilla(api_key=API_KEY, api_url=API_URL) + +# Show the existing webhooks in the argilla server +for webhook in client.webhooks: + print(webhook.url) + + +# Create a webhook listener using the decorator +# This decorator will: +# 1. 
Create the webhook in the argilla server +# 2. Create a POST endpoint in the server +# 3. Handle the incoming requests to verify the webhook signature +# 4. Ignore events other than the ones specified in the `events` argument +# 5. Parse the incoming request and call the decorated function with the parsed data +# +# Each event will be passed as a keyword argument to the decorated function depending on the event type. +# The event types are: +# - record: created, updated, deleted and completed +# - response: created, updated, deleted +# - dataset: created, updated, published, deleted +# Related resources will be passed as keyword arguments to the decorated function +# (for example the dataset for a record-related event, or the record for a response-related event) +# When a resource is deleted +@rg.webhook_listener(events=["record.created", "record.completed"]) +async def listen_record( + record: rg.Record, dataset: rg.Dataset, type: str, timestamp: datetime +): + print(f"Received record event of type {type} at {timestamp}") + + action = "completed" if type == "record.completed" else "created" + print(f"A record with id {record.id} has been {action} for dataset {dataset.name}!") + + +@rg.webhook_listener(events="response.updated") +async def trigger_something_on_response_updated(response: rg.UserResponse, **kwargs): + print( + f"The user response {response.id} has been updated with the following responses:" + ) + print([response.serialize() for response in response.responses]) + + +@rg.webhook_listener(events=["dataset.created", "dataset.updated", "dataset.published"]) +async def with_raw_payload( + type: str, + timestamp: datetime, + dataset: rg.Dataset, + **kwargs, +): + print(f"Event type {type} at {timestamp}") + print(dataset.settings) + + +@rg.webhook_listener(events="dataset.deleted") +async def on_dataset_deleted( + data: dict, + **kwargs, +): + print(f"Dataset {data} has been deleted!") + + +# Set the webhook server. 
The server is a FastAPI instance, so you need to expose it in order to run it using uvicorn: +# ```bash +# uvicorn main:server --reload +# ``` + +server = rg.get_webhook_server() diff --git a/examples/webhooks/basic-webhooks/requirements.txt b/examples/webhooks/basic-webhooks/requirements.txt new file mode 100644 index 0000000000..11f77bdd21 --- /dev/null +++ b/examples/webhooks/basic-webhooks/requirements.txt @@ -0,0 +1,3 @@ +argilla @ git+https://github.com/argilla-io/argilla.git@feat/argilla/working-with-webhooks#subdirectory=argilla +fastapi +uvicorn[standard]
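
`WebhookHandler._verify_request` in `_handler.py` hands signature checking to the `standardwebhooks` package. The following is a minimal, standard-library-only sketch of what a check in the Standard Webhooks style roughly involves; it is not the library's actual code. The helper names `sign` and `verify` are hypothetical, and the sketch assumes the usual `webhook-id`, `webhook-timestamp` and `webhook-signature` headers and a base64 secret with an optional `whsec_` prefix. It leaves out the timestamp tolerance check that a real verifier would normally also apply.

```python
import base64
import hashlib
import hmac
from typing import Mapping


def sign(secret: str, msg_id: str, timestamp: str, body: bytes) -> str:
    """Hypothetical helper: compute a `v1,<base64>` signature for a payload."""
    # The secret is assumed to be base64 text, optionally prefixed with "whsec_".
    key = base64.b64decode(secret.removeprefix("whsec_"))
    # The signed content is "<id>.<timestamp>.<raw body>".
    signed_content = f"{msg_id}.{timestamp}.".encode() + body
    digest = hmac.new(key, signed_content, hashlib.sha256).digest()
    return "v1," + base64.b64encode(digest).decode()


def verify(secret: str, headers: Mapping[str, str], body: bytes) -> bool:
    """Hypothetical helper: check the request body against the signature header."""
    expected = sign(secret, headers["webhook-id"], headers["webhook-timestamp"], body)
    # The header may carry several space-separated signatures (e.g. during key rotation).
    candidates = headers["webhook-signature"].split(" ")
    # compare_digest avoids leaking information through comparison timing.
    return any(hmac.compare_digest(expected, candidate) for candidate in candidates)
```

A listener built this way would reject any request whose body or headers were altered in transit, which is why the handler verifies the raw request body before parsing it into a `WebhookEvent`.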