add checkpoint_dir content-type, remove checkpoint variant #70

bertsky · 2022-02-10T16:44:59Z

This removes the older of the two parameterizations (a direct Calamari-like glob expression) in favour of the newer, more OCR-D-like model directory, and adds a proper format and content-type as required by the specs for optimal resmgr handling.
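
For illustration, a minimal sketch of what a directory-based parameter means on the processor side: instead of being handed a glob expression, the processor receives a model directory and expands it itself. The `*.ckpt.json` pattern and the helper name `list_checkpoints` are assumptions made for this sketch, not taken from the diff.

```python
from pathlib import Path


def list_checkpoints(checkpoint_dir, pattern="*.ckpt.json"):
    """Return the sorted checkpoint files found directly in checkpoint_dir.

    Hypothetical helper; the default pattern is an assumption about how
    the checkpoint files in a model directory are named.
    """
    directory = Path(checkpoint_dir)
    if not directory.is_dir():
        raise NotADirectoryError(f"not a model directory: {directory}")
    # Expand the pattern inside the directory (non-recursive).
    checkpoints = sorted(str(path) for path in directory.glob(pattern))
    if not checkpoints:
        raise FileNotFoundError(f"no files matching {pattern} in {directory}")
    return checkpoints
```

With this shape, the user-facing parameter is only ever a directory, which is what lets a resource manager treat the model as a single downloadable unit.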

codecov-commenter · 2022-02-10T16:51:10Z

Codecov Report

Merging #70 (5fddd32) into master (76b34c5) will increase coverage by 0.97%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #70      +/-   ##
==========================================
+ Coverage   88.37%   89.34%   +0.97%     
==========================================
  Files           3        3              
  Lines         172      169       -3     
  Branches       39       38       -1     
==========================================
- Hits          152      151       -1     
+ Misses         11       10       -1     
+ Partials        9        8       -1

Impacted Files	Coverage Δ
ocrd_calamari/recognize.py	`88.95% <100.00%> (+1.00%)`	⬆️

bertsky · 2022-02-11T07:20:19Z

I had to fix the tests as well. We could also use kant_aufklaerung_1784-binarized instead of kant_aufklaerung_1784-page-region-line-word_glyph to get binary input without the need for ad-hoc IM binarization BTW.

mikegerber · 2022-02-16T12:25:04Z

It's a bit unfortunate to drop the former default parameter checkpoint. Is there really no way to keep supporting it for now and phase it out afterwards?

bertsky · 2022-02-21T11:27:58Z

> It's a bit unfortunate to drop the former default parameter checkpoint. Is there really no way to keep supporting it for now and phase it out afterwards?

No, unfortunately I can't see any. We had to come to terms with our file vs. directory resources and the semantics of content-type in an ocrd-tool.json (see the related discussions in core). Unless we make the resource mechanism much more complicated (by specifying which parameters can have which kind of resource), an individual processor should not have both file and directory resources if we want to support directories as a resource type properly.
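
To make the constraint concrete, here is a rough sketch of a check that flags a tool description mixing both kinds of resource parameters. It assumes that directory resources are marked with `content-type: text/directory` and that any other parameter carrying a `content-type` counts as a file resource. The function names and the sample dicts are hypothetical and not part of core.

```python
def resource_kinds(tool):
    """Group resource parameter names into 'file' and 'directory'."""
    kinds = {"file": [], "directory": []}
    for name, spec in tool.get("parameters", {}).items():
        content_type = spec.get("content-type")
        if content_type is None:
            continue  # not a resource parameter under this sketch's assumption
        kind = "directory" if content_type == "text/directory" else "file"
        kinds[kind].append(name)
    return kinds


def mixes_resource_kinds(tool):
    """True if the tool has both file and directory resource parameters."""
    kinds = resource_kinds(tool)
    return bool(kinds["file"]) and bool(kinds["directory"])


# Hypothetical tool descriptions, for illustration only.
mixed = {"parameters": {
    "checkpoint": {"type": "string", "content-type": "application/json"},
    "checkpoint_dir": {"type": "string", "content-type": "text/directory"},
    "voter": {"type": "string"},
}}
directory_only = {"parameters": {
    "checkpoint_dir": {"type": "string", "content-type": "text/directory"},
    "voter": {"type": "string"},
}}

print(mixes_resource_kinds(mixed))           # True
print(mixes_resource_kinds(directory_only))  # False
```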

mikegerber · 2022-09-16T09:27:26Z

I've also removed the checkpoint mention in the README in #80!

PR #70 changed the model download and did not update the README accordingly. Fix the README. Also update the example download to use a single page with existing binarization and segmentation.

bertsky added 2 commits February 10, 2022 17:38

ocrd-tool.json: add model content-type, remove glob variant

11615be

recognize: remove checkpoint param in favour of checkpoint_dir alone

5f23c03

kba approved these changes Feb 10, 2022

bertsky and others added 7 commits February 10, 2022 18:06

adapt to checkpoint_dir only

332d02b

fix deps

13031d5

recognize: delegate to core functions

01312c6

test: fix initLogging

7661662

test: use resmgr for downloading model

59089fb

test: workspace download instead of urllib

1f0252d

test: use other fileGrp to avoid assets#87

5fddd32

mikegerber self-assigned this Feb 23, 2022

mikegerber merged commit 1eb342e into OCR-D:master Feb 23, 2022

mikegerber mentioned this pull request Oct 16, 2023

Review README #95

Closed

3 tasks
