Commit

Merge branch 'main' into develop
frascuchon committed Oct 3, 2024
2 parents f41b643 + 1e54a48 commit 9677435
Showing 20 changed files with 482 additions and 41 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/argilla.yml
Expand Up @@ -21,7 +21,7 @@ jobs:
build:
services:
argilla-server:
image: argilladev/argilla-hf-spaces:pr-5422
image: argilladev/argilla-hf-spaces:develop
ports:
- 6900:6900
env:
Expand Down
2 changes: 2 additions & 0 deletions argilla-frontend/CHANGELOG.md
Expand Up @@ -16,6 +16,8 @@ These are the section headers that we use:

## [Unreleased]()

## [2.3.0](https://github.com/argilla-io/argilla/compare/v2.2.0...v2.3.0)

### Added

- Added new field `CustomField` [#5462](https://github.com/argilla-io/argilla/pull/5462)
Expand Down
4 changes: 4 additions & 0 deletions argilla-server/CHANGELOG.md
Expand Up @@ -16,13 +16,17 @@ These are the section headers that we use:

## [Unreleased]()

## [2.3.0](https://github.com/argilla-io/argilla/compare/v2.2.0...v2.3.0)

### Added

- Added support for `CustomField`. ([#5422](https://github.com/argilla-io/argilla/pull/5422))
- Added helm chart for argilla. ([#5512](https://github.com/argilla-io/argilla/pull/5512))

### Fixed

- Fixed error when creating default user with existing default workspace. ([#5558](https://github.com/argilla-io/argilla/pull/5558))
- Fixed the deployment yaml used to create a new Argilla server in K8s. Added `USERNAME` and `PASSWORD` to the environment variables of pod template. ([#5434](https://github.com/argilla-io/argilla/issues/5434))

## [2.2.0](https://github.com/argilla-io/argilla/compare/v2.1.0...v2.2.0)

Expand Down
24 changes: 18 additions & 6 deletions argilla-server/src/argilla_server/search_engine/commons.py
Expand Up @@ -177,9 +177,7 @@ def es_mapping_for_field(field: Field) -> dict:
elif field.is_custom:
return {
es_field_for_record_field(field.name): {
"type": "object",
"dynamic": True,
"properties": {},
"type": "text",
}
}
elif field.is_image:
Expand Down Expand Up @@ -532,17 +530,19 @@ def _inverse_vector(vector_value: List[float]) -> List[float]:
return [vector_value[i] * -1 for i in range(0, len(vector_value))]

def _map_record_to_es_document(self, record: Record) -> Dict[str, Any]:
dataset = record.dataset

document = {
"id": str(record.id),
"external_id": record.external_id,
"fields": record.fields,
"fields": self._map_record_fields_to_es(record.fields, dataset.fields),
"status": record.status,
"inserted_at": record.inserted_at,
"updated_at": record.updated_at,
}

if record.metadata_:
document["metadata"] = self._map_record_metadata_to_es(record.metadata_, record.dataset.metadata_properties)
document["metadata"] = self._map_record_metadata_to_es(record.metadata_, dataset.metadata_properties)
if record.responses:
document["responses"] = self._map_record_responses_to_es(record.responses)
if record.suggestions:
Expand Down Expand Up @@ -662,7 +662,7 @@ def _build_text_query(dataset: Dataset, text: Optional[Union[TextQuery, str]] =
if field is None:
raise Exception(f"Field {text.field} not found in dataset {dataset.id}")

if field.is_chat or field.is_custom:
if field.is_chat:
field_name = f"{text.field}.*"
else:
field_name = text.field
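
Read together, the two hunks above appear to mean that a custom field is now mapped as plain `text` and queried by its own name, while chat fields keep the `.*` wildcard over their sub-fields. A minimal sketch of that behaviour follows; the function names and the `es_field_name` parameter are hypothetical and are not the helpers used in `commons.py`.

```python
from typing import Dict


def custom_field_mapping(es_field_name: str) -> Dict[str, dict]:
    # After this commit a custom field is indexed as a single text value
    # instead of a dynamic object. `es_field_name` is assumed to be the
    # already-resolved index field name.
    return {es_field_name: {"type": "text"}}


def text_query_field_name(field_name: str, is_chat: bool) -> str:
    # Only chat fields expand to their sub-fields; every other field,
    # custom fields included, is queried by its plain name.
    return f"{field_name}.*" if is_chat else field_name


mapping = custom_field_mapping("my_custom_field")
chat_target = text_query_field_name("conversation", is_chat=True)
custom_target = text_query_field_name("my_custom_field", is_chat=False)
```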
Expand Down Expand Up @@ -833,6 +833,18 @@ def _map_record_response_to_es(response: Response) -> Dict[str, Any]:
},
}

@classmethod
def _map_record_fields_to_es(cls, fields: dict, dataset_fields: List[Field]) -> dict:
for field in dataset_fields:
if field.is_image:
fields[field.name] = None
elif field.is_custom:
fields[field.name] = str(fields.get(field.name, ""))
else:
fields[field.name] = fields.get(field.name, "")

return fields

async def __terms_aggregation(self, index_name: str, field_name: str, query: dict, size: int) -> List[dict]:
aggregation_name = "terms_agg"

Expand Down
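
To make the effect of `_map_record_fields_to_es` easier to follow, here is a self-contained sketch of the same normalization. `FieldStub` is a hypothetical stand-in for the server's `Field` model, and the sample record values are invented for illustration. Unlike the method in the diff, which updates the `fields` dict it receives, this sketch works on a copy.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class FieldStub:
    """Hypothetical stand-in for the server's Field model."""

    name: str
    is_image: bool = False
    is_custom: bool = False


def map_record_fields_to_es(fields: Dict[str, Any], dataset_fields: List[FieldStub]) -> Dict[str, Any]:
    # Copy first so the caller's dict is left untouched
    # (an assumption of this sketch, not the behaviour in the diff).
    mapped = dict(fields)
    for field in dataset_fields:
        if field.is_image:
            # Image content is not indexed.
            mapped[field.name] = None
        elif field.is_custom:
            # Custom values are stringified to fit the "text" mapping.
            mapped[field.name] = str(fields.get(field.name, ""))
        else:
            # Missing values default to an empty string.
            mapped[field.name] = fields.get(field.name, "")
    return mapped


record_fields = {
    "text": "hello",
    "image": "data:image/png;base64,AAAA",
    "custom": {"a": "This is a value", "b": 100},
}
dataset_fields = [
    FieldStub("text"),
    FieldStub("image", is_image=True),
    FieldStub("custom", is_custom=True),
    FieldStub("missing"),
]
es_fields = map_record_fields_to_es(record_fields, dataset_fields)
```

With these inputs the custom value becomes the string form of the dict, the image value becomes `None`, and the absent `missing` field is filled with an empty string.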
42 changes: 38 additions & 4 deletions argilla-server/tests/unit/search_engine/test_commons.py
Expand Up @@ -67,6 +67,7 @@
VectorSettingsFactory,
ImageFieldFactory,
ChatFieldFactory,
CustomFieldFactory,
)


Expand Down Expand Up @@ -623,6 +624,34 @@ async def test_search_for_chat_field(self, search_engine: BaseElasticAndOpenSear
assert len(result.items) == 2
assert result.total == 2

async def test_search_for_custom_field(self, search_engine: BaseElasticAndOpenSearchEngine, opensearch: OpenSearch):
custom_field = await CustomFieldFactory.create(name="field")

dataset = await DatasetFactory.create(fields=[custom_field])

records = await RecordFactory.create_batch(
size=2,
dataset=dataset,
fields={
custom_field.name: {
"a": "This is a value",
"b": 100,
}
},
)

await refresh_dataset(dataset)
await refresh_records(records)

await search_engine.create_index(dataset)
await search_engine.index_records(dataset, records)

for query in ["value", 100]:
result = await search_engine.search(dataset, query=TextQuery(q=query, field=custom_field.name))

assert len(result.items) == 2
assert result.total == 2

@pytest.mark.parametrize(
"statuses, expected_items",
[
Expand Down Expand Up @@ -1064,11 +1093,16 @@ async def test_index_records_with_metadata(
async def test_index_records_with_vectors(
self, search_engine: BaseElasticAndOpenSearchEngine, opensearch: OpenSearch
):
dataset = await DatasetFactory.create()
text_fields = await TextFieldFactory.create_batch(size=5, dataset=dataset)
vectors_settings = await VectorSettingsFactory.create_batch(size=5, dataset=dataset, dimensions=5)
text_fields = await TextFieldFactory.create_batch(size=5)
vectors_settings = await VectorSettingsFactory.create_batch(size=5, dimensions=5)

dataset = await DatasetFactory.create(fields=text_fields, vectors_settings=vectors_settings, questions=[])

records = await RecordFactory.create_batch(
size=5, fields={field.name: f"This is the value for {field.name}" for field in text_fields}, responses=[]
size=5,
fields={field.name: f"This is the value for {field.name}" for field in text_fields},
dataset=dataset,
responses=[],
)

for record in records:
Expand Down
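
One way to see why `test_search_for_custom_field` can match both `"value"` and `100`: once the custom value is stored as a string, a word-level analyzer can produce a token for each of them. The sketch below uses a crude regex tokenizer as a stand-in. It is only an approximation for intuition, not the analyzer the search engine applies, which depends on the index settings.

```python
import re
from typing import Set


def rough_tokens(text: str) -> Set[str]:
    # Crude approximation of word-level analysis:
    # lowercase, then split on anything that is not a letter or digit.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


# The same custom value the test indexes, stringified as the commit does.
stored = str({"a": "This is a value", "b": 100})
tokens = rough_tokens(stored)

# Each query from the test, checked against the approximate token set.
matches = {query: str(query).lower() in tokens for query in ["value", 100]}
```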
10 changes: 8 additions & 2 deletions argilla/CHANGELOG.md
Expand Up @@ -16,19 +16,25 @@ These are the section headers that we use:

## [Unreleased]()

## [2.3.0](https://github.com/argilla-io/argilla/compare/v2.2.2...v2.3.0)

### Added

- Added support for `CustomField`. ([#5422](https://github.com/argilla-io/argilla/pull/5422))
- Added `inserted_at` and `updated_at` to `Resource` model as properties. ([#5540](https://github.com/argilla-io/argilla/pull/5540))
- Added `limit` argument when fetching records. ([#5525](https://github.com/argilla-io/argilla/pull/5525)
- Added similarity search support. ((#5546)[https://github.com/argilla-io/argilla/pull/5546])
- Added similarity search support. ([#5546](https://github.com/argilla-io/argilla/pull/5546))
- Added filter support for `id`, `_server_id`, `inserted_at` and `updated_at` record attributes. ([#5545](https://github.com/argilla-io/argilla/pull/5545))
- Added support to read argilla credentials from colab secrets. ([#5541](https://github.com/argilla-io/argilla/pull/5541)))

### Changed

- Changed the __repr__ method for `SettingsProperties` to display the details of all the properties in `Setting` object. ([#5380](https://github.com/argilla-io/argilla/issues/5380))
- Changed error messages when creating datasets with insufficient permissions. ([#5540](https://github.com/argilla-io/argilla/pull/5554))

### Fixed

- Fixed the deployment yaml used to create a new Argilla server in K8s. Added `USERNAME` and `PASSWORD` to the environment variables of pod template. ([#5434](https://github.com/argilla-io/argilla/issues/5434))
- Fixed serialization of `ChatField` when collecting records from the hub and exporting to `datasets`. ([#5554](https://github.com/argilla-io/argilla/pull/5553))

## [2.2.2](https://github.com/argilla-io/argilla/compare/v2.2.1...v2.2.2)

Expand Down