[PTDT-2863]: Feature schema attributes #1930

Merged: 20 commits, Dec 18, 2024

Tim-Kerr commented Dec 16, 2024

Description

Add feature schema attribute support to the Tool and Classification objects in the SDK.

The SDK now supports feature schema attributes, either through the OntologyBuilder or by passing in raw JSON that contains the attributes.
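
As a rough illustration of the raw JSON path, the sketch below builds a normalized tool dict that carries an `attributes` list, using the same `attributeName` / `attributeValue` keys as the testing script. The helper `make_attribute` is hypothetical and not part of the SDK; it only shows the shape assumed here.

```python
import json
from typing import Dict


def make_attribute(name: str, value: str) -> Dict[str, str]:
    """Hypothetical helper: one attribute in the raw JSON shape assumed here."""
    return {"attributeName": name, "attributeValue": value}


# Normalized tool dict modeled on the create_feature_schema call in the testing script.
normalized_tool = {
    "tool": "rectangle",
    "name": "cat",
    "color": "black",
    "attributes": [make_attribute("auto-ocr", "true")],
}

# Print the payload so the nesting of the attributes list is easy to inspect.
print(json.dumps(normalized_tool, indent=2))
```

A dict of this shape is what the script passes as `normalized=` to `client.create_feature_schema`.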

Test the following endpoints in the client:

  • get_feature_schema
  • get_feature_schemas
  • create_ontology_from_feature_schemas
  • delete_unused_feature_schema
  • update_feature_schema_title
  • upsert_feature_schema
  • create_feature_schema

Test the creation of ontologies using the ontology builder with nodes that contain feature schema attributes.

Testing script:

import os
from labelbox import Client
from labelbox.schema.ontology import OntologyBuilder, Tool, PromptIssueTool
from labelbox.schema.tool_building.classification import Classification, Option
from labelbox.schema.tool_building.types import FeatureSchemaAttribute
from labelbox.schema.media_type import MediaType
from labelbox.schema.ontology_kind import OntologyKind


client = Client(
    api_key=os.environ.get('STAGE_API_KEY'),
    endpoint="https://app.lb-stage.xyz/api/_gql/graphql",
    rest_endpoint="https://app.lb-stage.xyz/api/api/v1")

builder = OntologyBuilder(
  tools=[
    Tool(
      name="Auto OCR",
      tool=Tool.Type.BBOX,
      attributes=[
        FeatureSchemaAttribute(
          attributeName="auto-ocr",
          attributeValue="true"
        )
      ],
      classifications=[
        Classification(
          name="Auto ocr text class value",
          instructions="This is an auto OCR text value classification",
          class_type=Classification.Type.TEXT,
          scope=Classification.Scope.GLOBAL,
          attributes=[
            FeatureSchemaAttribute(
              attributeName="auto-ocr-text-value",
              attributeValue="true"
            )
          ]
        )
      ]
    )
  ]
)

# client.create_ontology("Auto OCR ontology from sdk", builder.asdict(), media_type=MediaType.Document)

builder = OntologyBuilder(
  classifications=[
    Classification(
      name="prompt message scope text classification",
      instructions="This is a prompt message scoped text classification",
      class_type=Classification.Type.TEXT,
      scope=Classification.Scope.INDEX,
      attributes=[
        FeatureSchemaAttribute(
          attributeName="prompt-message-scope",
          attributeValue="true"
        )
      ]
    )
  ]
)

# client.create_ontology('MMC Ontology with prompt message scope class', builder.asdict(), media_type=MediaType.Conversational, ontology_kind=OntologyKind.ModelEvaluation)

builder = OntologyBuilder(
  classifications=[
    Classification(
      name="Requires connection checklist classification",
      instructions="This is a requires connection checklist classification",
      class_type=Classification.Type.CHECKLIST,
      scope=Classification.Scope.GLOBAL,
      attributes=[
        FeatureSchemaAttribute(
          attributeName="requires-connection",
          attributeValue="true"
        )
      ],
      options=[
        Option(
          value='First option'
        ),
        Option(
          value='Second option'
        )
      ]
    )
  ]
)

# client.create_ontology('Image ontology with requires connection classes', builder.asdict(), media_type=MediaType.Image)


# feature_schema = client.upsert_feature_schema(Tool(name='Testing', tool=Tool.Type.BBOX, attributes=[FeatureSchemaAttribute(attributeName='auto-ocr', attributeValue='true')]).asdict())
# print(feature_schema)
# fetched_feature_schema = client.get_feature_schema(feature_schema.uid)
# feature_schemas_with_name = client.get_feature_schemas('Auto OCR')

# # Iterate over the feature schemas
# for schema in feature_schemas_with_name:
#     print(schema)

# ontology = client.create_ontology_from_feature_schemas('Ontology from feature schemas', ['cm4rc1nl90h36070782v9hlpt'])

# feature_schema = client.update_feature_schema_title('cm4rc1nl90h36070782v9hlpt', 'This is a new title - did it remove the feature schema attributes? UPDATED')
# client.delete_unused_feature_schema('cm4rhzhn7026e07wm2az681di')

# feature_schema = client.create_feature_schema(normalized={'tool': 'rectangle',  'name': 'cat', 'color': 'black', 'attributes': [{'attributeName': 'auto-ocr', 'attributeValue': 'true'}]})
# print(feature_schema)

# classification = Classification.from_dict({
#     "name": "Test Classification",
#     "instructions": "Test instructions",
#     "type": "text",  # or "checklist" or other valid Classification.Type values
#     "scope": "index",  # or "global"; see Classification.Scope for valid values
#     "required": False,  # optional
#     "attributes": [  # optional
#         {
#             "attributeName": "prompt-message-scope",
#             "attributeValue": "true"
#         }
#     ],
#     "options": []
# })

# tool = Tool.from_dict({
#     "name": "Test Tool",
#     "type": "rectangle",  # or another valid Tool.Type value
#     "required": False,  # optional
#     "attributes": [  # optional
#         {
#             "attributeName": "auto-ocr",
#             "attributeValue": "true"
#         }
#     ],
#     "options": []
# })
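
The script asks whether `update_feature_schema_title` removed the feature schema attributes. One way to check that offline is to compare the attribute lists of two normalized dicts. This is a minimal sketch: `attribute_pairs` and `attributes_preserved` are hypothetical helpers, and it assumes the before and after dicts have already been obtained (for example from the returned feature schema objects).

```python
from typing import Any, Dict, List, Tuple


def attribute_pairs(normalized: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return sorted (name, value) pairs; a missing or null list counts as empty."""
    attrs = normalized.get("attributes") or []
    return sorted((a["attributeName"], a["attributeValue"]) for a in attrs)


def attributes_preserved(before: Dict[str, Any], after: Dict[str, Any]) -> bool:
    """True when both dicts carry the same attributes, ignoring order."""
    return attribute_pairs(before) == attribute_pairs(after)


before = {"name": "cat", "attributes": [{"attributeName": "auto-ocr", "attributeValue": "true"}]}
renamed = {"name": "new title", "attributes": [{"attributeName": "auto-ocr", "attributeValue": "true"}]}
stripped = {"name": "new title"}  # attributes lost during the update

print(attributes_preserved(before, renamed))   # True
print(attributes_preserved(before, stripped))  # False
```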

Tim-Kerr requested a review from a team as a code owner on December 16, 2024, 20:01

from typing import TypedDict


class FeatureSchemaAttribute(TypedDict):

Review comment (contributor):

Can we make this into a data class? We already use Pydantic and dataclasses, so adding another paradigm feels unnecessary. Also, I’ve found TypedDict has quirks that make it less reliable. Personally, I treat it as a temporary internal type.

Reply from the PR author:

Done 👍
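
For context on the request above, a dataclass version of the attribute type might look like the sketch below. It is not taken from the merged code: the field names come from the snippet under review, while the `asdict` and `from_dict` methods are assumptions modeled on how the testing script serializes objects.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class FeatureSchemaAttribute:
    # Field names match the keys used in the raw JSON payloads.
    attributeName: str
    attributeValue: str

    def asdict(self) -> Dict[str, str]:
        """Serialize to the raw JSON shape (assumed method name)."""
        return {
            "attributeName": self.attributeName,
            "attributeValue": self.attributeValue,
        }

    @classmethod
    def from_dict(cls, dictionary: Dict[str, Any]) -> "FeatureSchemaAttribute":
        """Build an attribute from a raw JSON dict (assumed method name)."""
        return cls(
            attributeName=dictionary["attributeName"],
            attributeValue=dictionary["attributeValue"],
        )


attr = FeatureSchemaAttribute(attributeName="auto-ocr", attributeValue="true")
# Round trip: dataclasses compare by field values, so this holds.
assert FeatureSchemaAttribute.from_dict(attr.asdict()) == attr
```

Unlike a TypedDict, the dataclass gives attribute access and value-based equality at runtime, which is the consistency the reviewer asks for.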

attributeValue: str


FeatureSchemaAttriubte = Annotated[FeatureSchemaAttribute, Field()]

Review comment (contributor):

FeatureSchemaAttriubte - typo?

Reply from the PR author:

Yes ty, fixed

def __post_init__(self):
    if self.attributes is not None:
        warnings.warn(
            "Attributes are an experimental feature and may change in the future."
vbrodsky previously approved these changes Dec 17, 2024
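
The `__post_init__` snippet reviewed above warns whenever attributes are set. A self-contained sketch of that pattern follows. `ExampleNode` is a hypothetical stand-in for Tool or Classification, and since the snippet is truncated before any warning category, the default `UserWarning` used here is an assumption.

```python
import warnings
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ExampleNode:  # hypothetical stand-in, not an SDK class
    name: str
    attributes: Optional[List[Dict[str, str]]] = None

    def __post_init__(self):
        # Warn only when the caller opts in to the experimental field.
        if self.attributes is not None:
            warnings.warn(
                "Attributes are an experimental feature and may change in the future."
            )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure no warning is suppressed
    ExampleNode(name="plain")
    ExampleNode(
        name="with attributes",
        attributes=[{"attributeName": "auto-ocr", "attributeValue": "true"}],
    )

print(len(caught))  # 1
```

Only the node constructed with attributes triggers the warning, so existing callers that never pass `attributes` see no change.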
Tim-Kerr merged commit 00f704d into develop on Dec 18, 2024
15 of 23 checks passed
Tim-Kerr deleted the PTDT-2863 branch on December 18, 2024, 15:44

2 participants