Update the Azure APIs to their latest versions #553

dkotter · 2023-08-03T16:03:32Z

Is your enhancement related to a problem? Please describe.

Ideally we should be updating any APIs we use to their latest versions on a regular basis. This issue is focused on the Azure APIs we use. The following is a list of those APIs and their current versions.

Analyze Image v3.0
OCR v3.2
Read v3.2
Generate Thumbnail v3.1
Personalizer v1.0
TTS cognitiveservices/v1

For the Personalizer API, v1.0 is the latest (though there is a v1.1 in preview) so nothing needed there. Same for our Text to Speech API, we are currently using the latest version.

The Analyze Image, OCR, Read and Generate Thumbnail APIs are all under the same service (previously known as Cognitive Services Computer Vision, since renamed to Azure AI Vision). The latest released version of this API is v3.2, while there is a v4.0 public preview API.

Azure is pushing for everyone to use the new v4.0 public preview API, but my research turned up some limitations that may hold us back. For instance, image captioning and smart cropping are only available in a small set of regions in v4.0 (East US, East Asia, France Central, Korea Central, North Europe, Southeast Asia, West Europe, and West US).
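A minimal sketch of how a region check could gate the v4.0-only features. The region list is copied from the paragraph above and may no longer match Azure's documentation; the function names and the normalized region identifiers are hypothetical, not part of ClassifAI.

```python
# Regions named in this issue as supporting captions and smart cropping in v4.0.
# Assumption: taken from the issue text, not from current Azure docs.
V4_CAPTION_REGIONS = {
    "eastus", "eastasia", "francecentral", "koreacentral",
    "northeurope", "southeastasia", "westeurope", "westus",
}


def normalize_region(name: str) -> str:
    """Lowercase and drop spaces so 'East US' matches 'eastus'."""
    return name.replace(" ", "").lower()


def supports_v4_captions(region: str) -> bool:
    """Return True if the region is in the list quoted in this issue."""
    return normalize_region(region) in V4_CAPTION_REGIONS
```

A caller could fall back to v3.2 when `supports_v4_captions()` returns False for the configured resource region.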

There have also been quite a few changes to these APIs in v4.0, so these updates will take some refactoring. For instance, all existing features we use, apart from reading content from PDFs, are now under a single Analyze API in v4.0. Our code will need to change to account for this.

That said, assuming we're okay with the region limitations, I'd like to pursue updating all of those to v4.0. If we're not okay with that, I think it would be ideal to get all of those on v3.2 (which means updating just Analyze Image and Generate Thumbnail, since OCR and Read are already on v3.2).

I tried updating to v3.2 of the Analyze Image API. The results seem good, but the confidence scores, at least for image captions, are lower, so we would need to determine how best to handle that (judging by the Vision Studio tool, this seems to have been fixed in v4.0). Their docs even mention:

In general, we advise a confidence threshold of 0.4 for the Image Analysis 3.2 API and of 0.0 for the Image Analysis 4.0 API (preview).
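As a rough illustration of the quoted advice, the sketch below picks a threshold per API version and filters captions with it. The thresholds come from the quote above; the caption shape (a list of dicts with `text` and `confidence` keys) and the function name are assumptions for illustration only.

```python
# Thresholds quoted from the Azure docs excerpt above.
RECOMMENDED_THRESHOLDS = {"3.2": 0.4, "4.0": 0.0}


def filter_captions(captions, api_version):
    """Keep captions whose confidence meets the version's advised threshold.

    `captions` is assumed to be a list of dicts with "text" and
    "confidence" keys (hypothetical shape for this sketch).
    """
    threshold = RECOMMENDED_THRESHOLDS[api_version]
    return [c for c in captions if c["confidence"] >= threshold]
```

Keying the threshold on the version keeps the cutoff in one place if the API version later changes.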

If we decide to update to v4.0, here are the tasks as I see them:

Update Analyze Image API to v4.0 and address any issues there. I believe we'll need to update how we send data and how we parse the received response
Update how we handle OCR to use this new API
Update how we handle generating thumbnails to use this new API
Investigate the Read API. It seems like this functionality moved to a new API (Document Intelligence). We should investigate what it would mean to use that API instead. We may find it's not worth the effort and we leave this on the current v3.2 API
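To make the "single Analyze API" point concrete, here is a hedged sketch of building one v4.0 request URL that covers captions, OCR and smart cropping. The path `computervision/imageanalysis:analyze`, the feature names, and the example `api-version` value are my recollection of the v4.0 preview docs and should be verified before use; the function name is hypothetical.

```python
from urllib.parse import urlencode


def build_v4_analyze_url(endpoint, features, api_version):
    """Build a single Analyze URL requesting several features at once.

    Assumptions: the path and query parameter names follow the v4.0
    preview REST docs as remembered; verify against current docs.
    """
    query = urlencode(
        {"api-version": api_version, "features": ",".join(features)},
        safe=",",  # keep the comma-separated feature list readable
    )
    return f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"
```

For example, `build_v4_analyze_url(endpoint, ["caption", "read", "smartCrops"], "2023-02-01-preview")` would replace what are currently three separate v3.x calls.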

If we stick with v3.2, here's what we'll want to do:

Update the Analyze Image API to v3.2 and modify how we handle error responses (this changed in v3.2).
Update how we deal with confidence scores to account for lower scores in v3.2
Update the Generate Thumbnail API to v3.2 and address any issues there
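If the version is embedded in the request path (for example `/vision/v3.0/analyze`), the v3.2 bump could be sketched as a path rewrite like the one below. This assumes the v3.x URL layout `/vision/v<major>.<minor>/<operation>`; the helper name is hypothetical, and it does not cover the error-response or confidence-score changes listed above.

```python
import re

# Matches the version segment of a v3.x Computer Vision path.
_VERSION_SEGMENT = re.compile(r"/vision/v\d+\.\d+/")


def upgrade_vision_url(url, target="v3.2"):
    """Swap the version segment of a Computer Vision URL for `target`.

    URLs without a recognizable version segment are returned unchanged.
    """
    return _VERSION_SEGMENT.sub(f"/vision/{target}/", url, count=1)
```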

Designs

No response

Describe alternatives you've considered

No response

Code of Conduct

I agree to follow this project's Code of Conduct


kmgalanakis · 2023-08-08T15:39:37Z

@jeffpaul what should be our decision here? Move to v4.0 or stick to v3.2?

cc @dkotter

kmgalanakis · 2023-08-10T13:49:56Z

I've created a draft PR for this at #559.

I verified that the confidence scores are lower. In my tests, a threshold between 0.5 and 0.55 worked best. As for how to handle the lower scores, I mostly see it as a matter of personal preference.

As a consequence, I would suggest that we leave the default option value for the scores as is and display a dismissible notification when we detect that the API version is greater than or equal to 3.2 and the selected confidence threshold is above 0.5-0.55.
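The condition for that notification could look roughly like this. It is only a sketch: the function names are hypothetical, the threshold is assumed to be stored on a 0-1 scale, and 0.55 is used as the upper end of the suggested range.

```python
def parse_version(version: str) -> tuple:
    """Turn '3.2' or 'v3.2' into a comparable tuple such as (3, 2)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def should_show_threshold_notice(api_version, threshold, limit=0.55):
    """True when the API is 3.2 or newer and the threshold exceeds `limit`.

    Assumption: `threshold` is on a 0-1 scale, matching the scores
    discussed in this thread.
    """
    return parse_version(api_version) >= (3, 2) and threshold > limit
```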

I tried to create another PR updating the APIs to version 4.0, but I found it too difficult, given that I'm not that familiar with the codebase and that, from what I saw, the endpoints have changed.

jeffpaul · 2023-09-21T21:44:21Z

I received an email from Microsoft Azure that Computer Vision 3.1 API will be retired on 13 September 2026 and to migrate our computer vision workloads to Computer Vision 3.2 API with these benefits:

Improved image captioning, image tagging and object detection
164 language support for OCR including handwritten support for 9 Languages: English, Simplified Chinese, French, German, Italian, Japanese, Korean, Portuguese, and Spanish
Up-to-date documentation and better customer support

Seems like we're well along on that path, but we should continue to stay on top of the APIs we use in ClassifAI and update their versions more regularly to stay as current as feasible.

dkotter added the type:enhancement label Aug 3, 2023

dkotter added this to the Future Release milestone Aug 3, 2023

vikrampm1 added this to Open Source Practice Aug 3, 2023

github-project-automation bot moved this to Incoming in Open Source Practice Aug 3, 2023

kmgalanakis mentioned this issue Aug 10, 2023

Update all Azure APIs to their latest public version #559

Merged

4 tasks

kmgalanakis self-assigned this Aug 10, 2023

kmgalanakis moved this from Incoming to In Review in Open Source Practice Aug 14, 2023

jeffpaul modified the milestones: Future Release, 2.3.0 Aug 14, 2023

dkotter modified the milestones: 2.3.0, 2.4.0 Aug 17, 2023

dkotter modified the milestones: 2.4.0, 2.5.0 Nov 7, 2023

jeffpaul assigned sksaju Nov 28, 2023

jeffpaul modified the milestones: 2.5.0, 2.6.0 Dec 12, 2023

dkotter modified the milestones: 3.1.0, 3.0.0 Feb 1, 2024

jeffpaul moved this from In Review to Review Approved in Open Source Practice Feb 1, 2024

dkotter closed this as completed in #559 Feb 7, 2024

github-project-automation bot moved this from Review Approved to Merged in Open Source Practice Feb 7, 2024

dkotter mentioned this issue Nov 21, 2024

Update Azure AI Vision from 3.2 to 4.0 #827

Closed

1 task
