Commit

Merge branch 'main' into releases/2.4.1
frascuchon authored Nov 8, 2024
2 parents ad521ae + ae8d2ba commit 4f8ba1f
Showing 7 changed files with 18 additions and 14 deletions.
2 changes: 1 addition & 1 deletion argilla/README.md
@@ -75,7 +75,7 @@ The community uses Argilla to create amazing open-source [datasets](https://hugg
AI teams from companies like [the Red Cross](https://510.global/), [Loris.ai](https://loris.ai/) and [Prolific](https://www.prolific.com/) use Argilla to improve the quality and efficiency of AI projects. They shared their experiences in our [AI community meetup](https://lu.ma/embed-checkout/evt-IQtRiSuXZCIW6FB).

- AI for good: [the Red Cross presentation](https://youtu.be/ZsCqrAhzkFU?feature=shared) showcases how the Red Cross domain experts and AI team collaborated by classifying and redirecting requests from refugees of the Ukrainian crisis to streamline the support processes of the Red Cross.
- Customer support: during [the Loris meetup](https://youtu.be/jWrtgf2w4VU?feature=shared) they showed how their AI team uses unsupervised and few-shot contrastive learning to help them quickly validate and gain labelled samples for a huge amount of multi-label classifiers.
- Customer support: during [the Loris meetup](https://youtu.be/jWrtgf2w4VU?feature=shared) they showed how their AI team uses unsupervised and few-shot contrastive learning to help them quickly validate and gain labeled samples for a huge amount of multi-label classifiers.
- Research studies: [the showcase from Prolific](https://youtu.be/ePDlhIxnuAs?feature=shared) announced their integration with our platform. They use it to actively distribute data collection projects among their annotating workforce. This allows Prolific to quickly and efficiently collect high-quality data for research studies.

## 👨‍💻 Getting started
8 changes: 6 additions & 2 deletions argilla/docs/getting_started/faq.md
@@ -11,11 +11,15 @@ hide: toc

??? Question "Does Argilla cost money?"

No. Argilla is an open-source project and is free to use. You can deploy Argilla on your own infrastructure or use our cloud offering.
No. Argilla is an open-source project and is free to use. You can deploy Argilla on the HF Spaces or your own infrastructure.

??? Question "What data types does Argilla support?"

Text data, mostly. Argilla natively supports textual data, however, we do support rich text, which means you can represent different types of data in Argilla as long as you can convert it to text. For example, you can store images, audio, video, and any other type of data as long as you can convert it to their base64 representation or render them as HTML in for example an IFrame.
Text and images. However, you can use a [custom field](../how_to_guides/custom_fields.md), which means you can represent different types of data in Argilla. For example, you can store audio or video, and any other type of data as long as you can convert it to their base64 representation or render them as HTML in for example an IFrame.
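
As a rough illustration of the base64 route mentioned above, the sketch below encodes raw bytes as a data URI and wraps it in an HTML `<audio>` element. It uses only the Python standard library; the helper names `to_data_uri` and `audio_html` are hypothetical and not part of Argilla, and how the resulting string is passed to a field is left out.

```python
import base64


def to_data_uri(raw: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def audio_html(raw: bytes, mime_type: str = "audio/wav") -> str:
    """Wrap the data URI in an HTML <audio> element (hypothetical helper)."""
    return f'<audio controls src="{to_data_uri(raw, mime_type)}"></audio>'


# Placeholder bytes standing in for the contents of a real audio file.
sample = b"RIFF....WAVEfmt "
html = audio_html(sample)
```

The same pattern should carry over to video or other binary formats by changing the MIME type and the HTML element, assuming the browser can render that type inline.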

??? Question "Does Argilla generate synthetic data?"

No. However, we provide a side library for that: [distilabel](https://github.com/argilla-io/distilabel), a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

??? Question "Does Argilla train models?"

11 changes: 4 additions & 7 deletions argilla/docs/getting_started/quickstart.md
@@ -25,7 +25,7 @@ Argilla is a free, open-source, self-hosted tool. This means you need to deploy
- Leave the default Space owner (your personal account)
- Leave `USERNAME` and `PASSWORD` secrets empty since you'll sign in with your HF user as the Argilla Space `owner`.
- Click create Space to launch Argilla 🚀.
- Once you see the Argilla UI, [go to the Sign in into the Argilla UI section](#sign-in-into-the-argilla-ui). If you see the `Building` message for longer than 2-3 min refresh the page.
- Once you see the Argilla UI, [go to the Sign in into the Argilla UI section](#sign-in-to-the-argilla-ui). If you see the `Building` message for longer than 2-3 min refresh the page.

=== "Python SDK"

@@ -39,7 +39,7 @@ Argilla is a free, open-source, self-hosted tool. This means you need to deploy

Next, we can use the `Argilla.deploy_on_spaces` method, which will create a Space in [the Hugging Face Hub](https://huggingface.co/). This method will automatically do the following:

- Deploy an Argilla Space on the Hugging Face Hub with [OAuth sign-in](#sign-in-into-the-argilla-ui) and a URL like `https://<your-username>-argilla.hf.space`, which takes around 2-3 minutes.
- Deploy an Argilla Space on the Hugging Face Hub with [OAuth sign-in](#sign-in-to-the-argilla-ui) and a URL like `https://<your-username>-argilla.hf.space`, which takes around 2-3 minutes.
- Create a default workspace called `argilla` with an owner called `<your-username>` and an Argilla token set to `api_key`.
- Automatically return the authenticated Argilla client, which can directly be used to interact with your Argilla server.

@@ -49,11 +49,8 @@ Argilla is a free, open-source, self-hosted tool. This means you need to deploy
authenticated_client = rg.Argilla.deploy_on_spaces(api_key="<api_key>")
```

Learn how to [create your first dataset](create-your-first-dataset.md).


!!! tip "Argilla API Key"
Your Argilla API key can be found in the `My Settings` page of your Argilla Space. Take a look at the [sign in to the UI section](#sign-in-into-the-argilla-ui) to learn how to retrieve it.
Your Argilla API key can be found in the `My Settings` page of your Argilla Space. Take a look at the [sign in to the UI section](#sign-in-to-the-argilla-ui) to learn how to retrieve it.

!!! warning "Persistent storage `SMALL`"
Not setting persistent storage to `SMALL` means that **you will lose your data when the Space restarts**. Spaces get restarted due to maintenance, inactivity, and every time you change your Spaces settings. If you want to **use the Space just for testing** you can use `FREE` temporarily.
@@ -88,7 +85,7 @@ Congrats! Your Argilla server is ready to start your first project.

The quickest way to start exploring the tool and create your first dataset is by importing an existing one from the Hugging Face Hub.

To do this, log in to the Argilla UI and in the Home page click on "Import from Hub". You can choose one of the sample datasets or paste a repo id in the input. This will look something like `stanfordnlp/imdb`.
To do this, log in to the Argilla UI and in the Home page click on "Import dataset from Hugging Face". You can choose one of the sample datasets or paste a repo id in the input. This will look something like `stanfordnlp/imdb`.

Argilla will automatically interpret the columns in the dataset to map them to Fields and Questions.

4 changes: 2 additions & 2 deletions argilla/docs/how_to_guides/annotate.md
@@ -93,7 +93,7 @@ If you agree with the suggestions, you just need to click on the `Submit` button
This is the default view to annotate your dataset linearly, displaying one record after another.

!!! tip
You should use this view if you have a large number of required questions or need a strong focus on the record content to be labelled. This is also the recommended view for annotating a dataset sample to avoid potential biases introduced by using filters, search, sorting and bulk labelling.
You should use this view if you have a large number of required questions or need a strong focus on the record content to be labeled. This is also the recommended view for annotating a dataset sample to avoid potential biases introduced by using filters, search, sorting and bulk labeling.

Once you submit your first response, the next record will appear automatically. To see again your submitted response, just click on `Prev`.

@@ -130,7 +130,7 @@ You can also track your own progress in real time expanding the right-bottom pan

## Use search, filters, and sort

The UI offers various features designed for data exploration and understanding. Combining these features with bulk labelling can save you and your team hours of time.
The UI offers various features designed for data exploration and understanding. Combining these features with bulk labeling can save you and your team hours of time.

!!! tip
You should use this when you are familiar with your data and have large volumes to annotate based on verified beliefs and experience.
2 changes: 1 addition & 1 deletion argilla/docs/how_to_guides/custom_fields.md
@@ -175,7 +175,7 @@ When `advanced_mode=True`, you can use the `template` argument to pass a full HT

### Usage example

Let's reproduce example from the [Without advanced mode](#without-advanced-mode) section but this time we will insert the [handlebars syntax engine](https://handlebarsjs.com/) into the template ourselves.
Let's reproduce example from the [Without advanced mode](#using-handlebars-in-your-template) section but this time we will insert the [handlebars syntax engine](https://handlebarsjs.com/) into the template ourselves.

```python
template = """
2 changes: 1 addition & 1 deletion argilla/docs/index.md
@@ -82,5 +82,5 @@ Argilla is a tool that can be used to achieve and keep **high-quality data stand
AI teams from companies like [the Red Cross](https://510.global/), [Loris.ai](https://loris.ai/) and [Prolific](https://www.prolific.com/) use Argilla to **improve the quality and efficiency of AI** projects. They shared their experiences in the [AI community meetup](https://lu.ma/embed-checkout/evt-IQtRiSuXZCIW6FB).

- AI for good: [the Red Cross presentation](https://youtu.be/ZsCqrAhzkFU?feature=shared) showcases **how their experts and AI team collaborate** by classifying and redirecting requests from refugees of the Ukrainian crisis to streamline the support processes of the Red Cross.
- Customer support: during [the Loris meetup](https://youtu.be/jWrtgf2w4VU?feature=shared) they showed how their AI team uses unsupervised and few-shot contrastive learning to help them **quickly validate and gain labelled samples for a huge amount of multi-label classifiers**.
- Customer support: during [the Loris meetup](https://youtu.be/jWrtgf2w4VU?feature=shared) they showed how their AI team uses unsupervised and few-shot contrastive learning to help them **quickly validate and gain labeled samples for a huge amount of multi-label classifiers**.
- Research studies: [the showcase from Prolific](https://youtu.be/ePDlhIxnuAs?feature=shared) announced their integration with Argilla. They use it to actively **distribute data collection projects** among their annotating workforce. This allows them to quickly and **efficiently collect high-quality data** for their research studies.
3 changes: 3 additions & 0 deletions argilla/docs/reference/argilla/client.md
@@ -55,5 +55,8 @@ for dataset in my_workspace.datasets:
---

::: src.argilla.client.Argilla
::: src.argilla.client.Users
::: src.argilla.client.Workspaces
::: src.argilla.client.Datasets

::: src.argilla._helpers._deploy.SpacesDeploymentMixin
