[formrecognizer] initial business cards #14026

Merged 19 commits on Oct 12, 2020

6 changes: 4 additions & 2 deletions sdk/formrecognizer/azure-ai-formrecognizer/CHANGELOG.md
@@ -6,6 +6,8 @@ This version of the SDK defaults to the latest supported API version, which curr

**New features**

- New methods `begin_recognize_business_cards` and `begin_recognize_business_cards_from_url` introduced to the SDK. Use these
methods to recognize data from business cards.
- Recognize receipt methods now take keyword argument `locale` to optionally indicate the locale of the receipt for
improved results
- Added ability to create a composed model from the `FormTrainingClient` by calling method `begin_create_composed_model()`
@@ -59,13 +61,13 @@ Previously `value_data` returned a `FieldData` with all its attributes set to `N
CustomFormModel` and `CustomFormModelInfo`
- `FieldText` has been renamed to `FieldData`
- `FormContent` has been renamed to `FormElement`
- Parameter `include_text_content` has been renamed to `include_field_elements` for
`begin_recognize_receipts`, `begin_recognize_receipts_from_url`, `begin_recognize_custom_forms`, and `begin_recognize_custom_forms_from_url`
- `text_content` has been renamed to `field_elements` on `FieldData` and `FormTableCell`

**Fixes and improvements**

- Fixes a bug where `text_angle` was being returned out of the specified interval (-180, 180]

## 1.0.0b3 (2020-06-10)

56 changes: 47 additions & 9 deletions sdk/formrecognizer/azure-ai-formrecognizer/README.md
@@ -6,6 +6,7 @@ from form documents. It includes the following main functionalities:
* Custom models - Recognize field values and table data from forms. These models are trained with your own data, so they're tailored to your forms.
* Content API - Recognize text and table structures, along with their bounding box coordinates, from documents. Corresponds to the REST service's Layout API.
* Prebuilt receipt model - Recognize data from USA sales receipts using a prebuilt model.
* Prebuilt business card model - Recognize data from business cards using a prebuilt model.

[Source code][python-fr-src] | [Package (PyPI)][python-fr-pypi] | [API reference documentation][python-fr-ref-docs]| [Product documentation][python-fr-product-docs] | [Samples][python-fr-samples]

@@ -62,7 +63,7 @@ az cognitiveservices account create \
```

### Authenticate the client
In order to interact with the Form Recognizer service, you will need to create an instance of a client.
An **endpoint** and **credential** are necessary to instantiate the client object.


@@ -98,17 +99,17 @@ form_recognizer_client = FormRecognizerClient(endpoint, credential)

#### Create the client with an Azure Active Directory credential

`AzureKeyCredential` authentication is used in the examples in this getting started guide, but you can also
authenticate with Azure Active Directory using the [azure-identity][azure_identity] library.
Note that regional endpoints do not support AAD authentication. Create a [custom subdomain][custom_subdomain]
name for your resource in order to use this type of authentication.

To use the [DefaultAzureCredential][default_azure_credential] type shown below, or other credential types provided
with the Azure SDK, please install the `azure-identity` package:

```pip install azure-identity```

You will also need to [register a new AAD application and grant access][register_aad_app] to
Form Recognizer by assigning the `"Cognitive Services User"` role to your service principal.

Once completed, set the values of the client ID, tenant ID, and client secret of the AAD application as environment variables:
@@ -132,6 +133,7 @@ form_recognizer_client = FormRecognizerClient(

- Recognizing form fields and content using custom models trained to recognize your custom forms. These values are returned in a collection of `RecognizedForm` objects.
- Recognizing common fields from US receipts, using a pre-trained receipt model. These fields and metadata are returned in a collection of `RecognizedForm` objects.
- Recognizing common fields from business cards, using a pre-trained business card model. These fields and metadata are returned in a collection of `RecognizedForm` objects.
- Recognizing form content, including tables, lines and words, without the need to train a model. Form content is returned in a collection of `FormPage` objects.

Sample code snippets are provided to illustrate using a FormRecognizerClient [here](#recognize-forms-using-a-custom-model "Recognize Forms Using a Custom Model").
@@ -154,9 +156,9 @@ Long-running operations are operations which consist of an initial request sent
followed by polling the service at intervals to determine whether the operation has completed or failed, and if it has
succeeded, to get the result.

Methods that train models, recognize values from forms, or copy models are modeled as long-running operations.
The client exposes a `begin_<method-name>` method that returns an `LROPoller` or `AsyncLROPoller`. Callers should wait
for the operation to complete by calling `result()` on the poller object returned from the `begin_<method-name>` method.
Sample code snippets are provided to illustrate using long-running operations [below](#examples "Examples").
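
As a minimal sketch of the pattern described above, the helper below starts a long-running operation and then blocks on `result()`. The method name `begin_recognize_business_cards_from_url` comes from the changelog in this PR; the helper name and its parameter names are hypothetical and chosen only for illustration.

```python
def recognize_business_cards_from_url(client, business_card_url):
    """Start a business card recognition operation and wait for its result.

    `client` is assumed to be a FormRecognizerClient, or any object exposing
    the same `begin_recognize_business_cards_from_url` method.
    """
    # begin_* returns a poller right away; the result is not available yet.
    poller = client.begin_recognize_business_cards_from_url(business_card_url)
    # result() blocks until the operation has completed or failed.
    return poller.result()
```

Passing the client in as an argument keeps the helper independent of how the client was authenticated.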


@@ -167,6 +169,7 @@ The following section provides several code snippets covering some of the most c
* [Recognize Forms Using a Custom Model](#recognize-forms-using-a-custom-model "Recognize Forms Using a Custom Model")
* [Recognize Content](#recognize-content "Recognize Content")
* [Recognize Receipts](#recognize-receipts "Recognize receipts")
* [Recognize Business Cards](#recognize-business-cards "Recognize business cards")
* [Train a Model](#train-a-model "Train a model")
* [Manage Your Models](#manage-your-models "Manage Your Models")

@@ -203,7 +206,7 @@ for recognized_form in result:
))
```

Alternatively, a form URL can also be used to recognize custom forms using the `begin_recognize_custom_forms_from_url` method.
The `_from_url` methods exist for all the recognize methods.

```
@@ -268,9 +271,39 @@ for receipt in result:
print("{}: {} has confidence {}".format(name, field.value, field.confidence))
```
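
The changelog in this PR notes that the receipt methods now accept an optional `locale` keyword argument. A minimal sketch of passing it follows; the helper name is hypothetical, and `"en-US"` is an assumed example value, so check the service documentation for the locales it actually supports.

```python
def recognize_receipts_with_locale(client, receipt, locale="en-US"):
    """Recognize receipts while hinting the receipt's locale to the service.

    `client` is assumed to be a FormRecognizerClient-like object and
    `receipt` the receipt document as bytes or a file stream.
    """
    # The locale keyword is optional; it is passed here to improve results.
    poller = client.begin_recognize_receipts(receipt, locale=locale)
    return poller.result()
```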

### Recognize Business Cards
Recognize data from business cards using a prebuilt model. Business card fields recognized by the service can be found [here][service_recognize_business_cards].

```python
from azure.ai.formrecognizer import FormRecognizerClient
from azure.core.credentials import AzureKeyCredential

endpoint = "https://<region>.api.cognitive.microsoft.com/"
credential = AzureKeyCredential("<api_key>")

form_recognizer_client = FormRecognizerClient(endpoint, credential)

with open("<path to your business card>", "rb") as fd:
    business_card = fd.read()

poller = form_recognizer_client.begin_recognize_business_cards(business_card)
result = poller.result()

for business_card in result:
    for name, field in business_card.fields.items():
        if name == "ContactNames":
            print("ContactNames:")
            for items in field.value:
                for item_name, item in items.value.items():
                    print("...{}: {} has confidence {}".format(item_name, item.value, item.confidence))
        else:
            for item in field.value:
                print("{}: {} has confidence {}".format(item.name, item.value, item.confidence))
```
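
If the recognized values are needed as data rather than printed, the same traversal can be collected into rows. This is a sketch under the assumption that `fields` has the shape used in the loop above: each field's `value` is a list, and `ContactNames` items hold a nested dictionary of sub-fields. The function name and the dotted label format are hypothetical.

```python
def flatten_business_card_fields(fields):
    """Return (label, value, confidence) tuples for every recognized item.

    `fields` is assumed to map field names to objects whose `.value` is a
    list of items, as in the README loop above.
    """
    rows = []
    for name, field in fields.items():
        # Treat a missing value as an empty list instead of failing.
        for item in field.value or []:
            if name == "ContactNames":
                # Each contact name item wraps a dict of named sub-fields.
                for sub_name, sub_field in item.value.items():
                    label = "ContactNames." + sub_name
                    rows.append((label, sub_field.value, sub_field.confidence))
            else:
                rows.append((name, item.value, item.confidence))
    return rows
```

It would be called as `flatten_business_card_fields(business_card.fields)` for each form in `result`.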

### Train a model
Train a custom model on your own form type. The resulting model can be used to recognize values from the types of forms it was trained on.
Provide a container SAS URL to your Azure Storage Blob container where you're storing the training documents.
If training files are within a subfolder in the container, use the [prefix][prefix_ref_docs] keyword argument to specify under which folder to train.

More details on setting up a container and required file structure can be found in the [service documentation][training_data].
@@ -397,6 +430,7 @@ These code samples show common scenario operations with the Azure Form Recognize
* Client authentication: [sample_authentication.py][sample_authentication]
* Recognize receipts: [sample_recognize_receipts.py][sample_recognize_receipts]
* Recognize receipts from a URL: [sample_recognize_receipts_from_url.py][sample_recognize_receipts_from_url]
* Recognize business cards: [sample_recognize_business_cards.py][sample_recognize_business_cards]
* Recognize content: [sample_recognize_content.py][sample_recognize_content]
* Recognize custom forms: [sample_recognize_custom_forms.py][sample_recognize_custom_forms]
* Train a model without labels: [sample_train_model_without_labels.py][sample_train_model_without_labels]
@@ -413,6 +447,7 @@ are found under the `azure.ai.formrecognizer.aio` namespace.
* Client authentication: [sample_authentication_async.py][sample_authentication_async]
* Recognize receipts: [sample_recognize_receipts_async.py][sample_recognize_receipts_async]
* Recognize receipts from a URL: [sample_recognize_receipts_from_url_async.py][sample_recognize_receipts_from_url_async]
* Recognize business cards: [sample_recognize_business_cards_async.py][sample_recognize_business_cards_async]
* Recognize content: [sample_recognize_content_async.py][sample_recognize_content_async]
* Recognize custom forms: [sample_recognize_custom_forms_async.py][sample_recognize_custom_forms_async]
* Train a model without labels: [sample_train_model_without_labels_async.py][sample_train_model_without_labels_async]
@@ -466,6 +501,7 @@ This project has adopted the [Microsoft Open Source Code of Conduct][code_of_con
[azure_identity]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/identity/azure-identity
[default_azure_credential]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/identity/azure-identity#defaultazurecredential
[service_recognize_receipt]: https://aka.ms/formrecognizer/receiptfields
[service_recognize_business_cards]: https://aka.ms/formrecognizer/businesscardfields
[sdk_logging_docs]: https://docs.microsoft.com/azure/developer/python/azure-sdk-logging

[cla]: https://cla.microsoft.com
@@ -485,6 +521,8 @@ This project has adopted the [Microsoft Open Source Code of Conduct][code_of_con
[sample_recognize_receipts_from_url_async]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/async_samples/sample_recognize_receipts_from_url_async.py
[sample_recognize_receipts]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/sample_recognize_receipts.py
[sample_recognize_receipts_async]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/async_samples/sample_recognize_receipts_async.py
[sample_recognize_business_cards]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/sample_recognize_business_cards.py
[sample_recognize_business_cards_async]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/async_samples/sample_recognize_business_cards_async.py
[sample_train_model_with_labels]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/sample_train_model_with_labels.py
[sample_train_model_with_labels_async]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/async_samples/sample_train_model_with_labels_async.py
[sample_train_model_without_labels]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/sample_train_model_without_labels.py