Update the Azure APIs to their latest versions #553

Closed
1 of 8 tasks
dkotter opened this issue Aug 3, 2023 · 3 comments · Fixed by #559

@dkotter
Collaborator

dkotter commented Aug 3, 2023

Is your enhancement related to a problem? Please describe.

Ideally, we should update any APIs we use to their latest versions on a regular basis. This issue is focused on the Azure APIs we use. The following is a list of those APIs and the version of each we are currently on.

  • Analyze Image v3.0
  • OCR v3.2
  • Read v3.2
  • Generate Thumbnail v3.1
  • Personalizer v1.0
  • TTS cognitiveservices/v1

For the Personalizer API, v1.0 is the latest (though there is a v1.1 in preview) so nothing needed there. Same for our Text to Speech API, we are currently using the latest version.

The Analyze Image, OCR, Read and Generate Thumbnail APIs are all under the same service (previously known as Cognitive Services Computer Vision, since renamed to Azure AI Vision). The latest released version of this API is v3.2, while there is a v4.0 public preview API.

Azure is pushing for everyone to use the new v4.0 public preview API but in researching this, there are currently some limitations that may hold us back. For instance, image captioning and smart cropping are only available in a small set of regions in v4.0 (East US, France Central, Korea Central, North Europe, Southeast Asia, West Europe, West US, and East Asia).

There have also been quite a few changes to these APIs in v4.0, so it will take some refactoring if we pursue these updates. For instance, all existing features we use, outside of reading content from PDFs, are now under a single Analyze API in v4.0. This will require some changes to how our code works to account for this.

That said, assuming we're okay with the region limitations, I'd like to pursue updating all of those to v4.0. If we're not okay with that, I think it would be ideal to get all of those on v3.2 (so only Analyze Image and Generate Thumbnail would need updating, since OCR and Read are already on v3.2).

I tried updating to v3.2 of the Analyze Image API. The results we get seem good, but the confidence scores are lower, at least for image captions, so we would need to determine how best to handle that (judging by the Vision Studio tool, this seems to have been fixed in v4.0). Their docs even mention:

In general, we advise a confidence threshold of 0.4 for the Image Analysis 3.2 API and of 0.0 for the Image Analysis 4.0 API (preview).
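
A minimal sketch of how the quoted thresholds could be applied per API version. The caption shape (`text` and `confidence` keys) and the function name are assumptions for illustration, not ClassifAI code.

```python
# Recommended caption confidence thresholds, taken from the docs quote above.
RECOMMENDED_THRESHOLDS = {"3.2": 0.4, "4.0": 0.0}


def filter_captions(captions, api_version, threshold=None):
    """Keep captions whose confidence meets the threshold.

    `captions` is assumed to be a list of dicts with "text" and
    "confidence" keys (hypothetical shape). If no explicit threshold is
    given, the recommended value for the API version is used.
    """
    if threshold is None:
        threshold = RECOMMENDED_THRESHOLDS[api_version]
    return [c for c in captions if c["confidence"] >= threshold]


captions = [
    {"text": "a dog on a beach", "confidence": 0.52},
    {"text": "a blurry photo", "confidence": 0.31},
]

kept_v32 = filter_captions(captions, "3.2")
kept_v40 = filter_captions(captions, "4.0")
```

With these sample values, only the first caption passes the 0.4 threshold for v3.2, while both pass the 0.0 threshold for v4.0.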

If we decide to update to v4.0, here are the tasks as I see them:

  • Update Analyze Image API to v4.0 and address any issues there. I believe we'll need to update how we send data and how we parse the received response
  • Update how we handle OCR to use this new API
  • Update how we handle generating thumbnails to use this new API
  • Investigate the Read API. It seems like this functionality moved to a new API (Document Intelligence). We should investigate what it would mean to use that API instead. We may find it's not worth the effort and we leave this on the current v3.2 API
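
To illustrate what consolidating onto a single Analyze API might look like, here is a rough sketch that maps our current separate operations to one request URL. The feature names (`caption`, `read`, `smartCrops`), the URL path, and the `api-version` value are assumptions that should be checked against the Azure docs; the operation keys and function name are hypothetical. The Read API for PDFs is left out, as noted in the last task.

```python
from urllib.parse import urlencode

# Assumed mapping from the operations we call today to v4.0 feature names.
LEGACY_TO_V4_FEATURE = {
    "analyze_image": "caption",
    "ocr": "read",
    "generate_thumbnail": "smartCrops",
}


def build_analyze_url(endpoint, operations, api_version="2023-02-01-preview"):
    """Build one Analyze request URL covering several legacy operations.

    The path and the default api-version are assumptions, not values
    confirmed in this issue.
    """
    features = [LEGACY_TO_V4_FEATURE[op] for op in operations]
    query = urlencode(
        {"api-version": api_version, "features": ",".join(features)},
        safe=",",  # keep the comma-separated feature list readable
    )
    return f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"


url = build_analyze_url(
    "https://example.cognitiveservices.azure.com/",
    ["analyze_image", "ocr"],
)
```

The point of the sketch is the design change: instead of one request per operation, the features are combined into a single call and the response is parsed per feature.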

If we stick with v3.2, here's what we'll want to do:

  • Update the Analyze Image API to v3.2 and modify how we handle error responses (this changed in v3.2).
  • Update how we deal with confidence scores to account for lower scores in v3.2
  • Update the Generate Thumbnail API to v3.2 and address any issues there
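
For the error response change, one option is a small helper that tolerates both formats while we migrate. The issue does not show either format, so both shapes below are assumptions for illustration: a flat object with `code` and `message`, and a nested `error` object carrying the same keys.

```python
def extract_error(body):
    """Return (code, message) from a decoded JSON error body.

    Handles two assumed shapes:
      flat:   {"code": "...", "message": "..."}
      nested: {"error": {"code": "...", "message": "..."}}
    Missing keys yield None rather than raising.
    """
    err = body.get("error")
    if isinstance(err, dict):
        # Assumed newer, nested shape.
        return err.get("code"), err.get("message")
    # Assumed older, flat shape.
    return body.get("code"), body.get("message")


flat = {"code": "InvalidImageUrl", "message": "Image URL is badly formatted."}
nested = {"error": {"code": "InvalidRequest", "message": "Image URL is badly formatted."}}

flat_result = extract_error(flat)
nested_result = extract_error(nested)
```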

Designs

No response

Describe alternatives you've considered

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@kmgalanakis
Contributor

@jeffpaul what should be our decision here? Move to v4.0 or stick to v3.2?

cc @dkotter

@kmgalanakis
Contributor

I've created a draft PR for this at #559.

I verified that the confidence scores have been lowered. Judging by my tests, what worked best for me was a score between 0.5 and 0.55. As far as the lowering of the confidence scores is concerned, I mostly see it as a matter of personal preference.

As a consequence, I would suggest that we leave the default option value for the scores as is and display a dismissible notification when we detect that the API version is greater than or equal to 3.2 and the selected confidence threshold is above 0.5-0.55.
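
The condition for showing that notification could be sketched as follows. This is illustrative only: the function names are hypothetical, the threshold is assumed to be stored as a 0-1 float, and the cutoff uses the upper end of the 0.5-0.55 range.

```python
NOTICE_CUTOFF = 0.55  # assumed: upper end of the 0.5-0.55 range


def parse_version(version):
    """Turn "3.2" or "v3.2" into a comparable tuple such as (3, 2)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def should_show_threshold_notice(api_version, threshold, cutoff=NOTICE_CUTOFF):
    """True when the API version is >= 3.2 and the threshold exceeds the cutoff."""
    return parse_version(api_version) >= (3, 2) and threshold > cutoff


show_for_v32 = should_show_threshold_notice("3.2", 0.7)
show_for_v30 = should_show_threshold_notice("3.0", 0.7)
```

Comparing version tuples rather than strings avoids ordering mistakes such as "3.10" sorting before "3.2".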

I tried to create another PR updating the APIs to version 4.0 but found it too difficult, since I'm not that familiar with the codebase and, from what I saw, the endpoints have changed.

@kmgalanakis kmgalanakis self-assigned this Aug 10, 2023
@kmgalanakis kmgalanakis moved this from Incoming to In Review in Open Source Practice Aug 14, 2023
@jeffpaul jeffpaul modified the milestones: Future Release, 2.3.0 Aug 14, 2023
@dkotter dkotter modified the milestones: 2.3.0, 2.4.0 Aug 17, 2023
@jeffpaul
Member

I received an email from Microsoft Azure stating that the Computer Vision 3.1 API will be retired on 13 September 2026 and recommending that we migrate our computer vision workloads to the Computer Vision 3.2 API, which offers these benefits:

  • Improved image captioning, image tagging and object detection
  • 164 language support for OCR including handwritten support for 9 Languages: English, Simplified Chinese, French, German, Italian, Japanese, Korean, Portuguese, and Spanish
  • Up-to-date documentation and better customer support

Seems like we're well along on that path, but we should continue to stay on top of the APIs we're using in ClassifAI and update their versions more regularly so we stay as current as feasibly possible.

@dkotter dkotter modified the milestones: 2.4.0, 2.5.0 Nov 7, 2023
@jeffpaul jeffpaul modified the milestones: 2.5.0, 2.6.0 Dec 12, 2023
@dkotter dkotter modified the milestones: 3.1.0, 3.0.0 Feb 1, 2024
@jeffpaul jeffpaul moved this from In Review to Review Approved in Open Source Practice Feb 1, 2024
@github-project-automation github-project-automation bot moved this from Review Approved to Merged in Open Source Practice Feb 7, 2024