Adding Differential Binarization model from PaddleOCR to Keras3 #1739
Let's split this up. Start with ResNetVD backbone? Some notes...
@gowthamkpr is the PR ready for review?
Thanks for the PR! I have left a comment about reorganizing the code.
For an example of how to structure the code, see https://github.com/keras-team/keras-hub/tree/master/keras_hub/src/models/sam
@@ -0,0 +1,243 @@
# Copyright 2024 The KerasNLP Authors
Rename the folder to differential_binarization and the file to differential_binarization.py.
backbone = backbone

inputs = backbone.input
x = backbone.pyramid_outputs
Please create a file differential_binarization_backbone.py and move the diffbin_fpn_model and backbone code into it. You can rename the backbone you are using in this file to image_encoder in the differential_binarization_backbone file. The task model should contain the preprocessor, backbone, and the task head.
from keras import ops


class DiceLoss:
add test coverage for the losses here
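For reference, a minimal sketch of what such loss tests could check. The `dice_loss` function below is a hypothetical stand-in written in plain Python so it runs without a backend; it is not the `DiceLoss` class from this PR, whose body is not shown here. It assumes the common form 1 - 2 * intersection / (sum of both masks + eps).

```python
def dice_loss(y_true, y_pred, eps=1e-6):
    """Hypothetical stand-in: Dice loss over two flat lists of floats."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    total = sum(y_true) + sum(y_pred) + eps
    return 1.0 - 2.0 * intersection / total


def test_perfect_overlap_gives_near_zero_loss():
    # Identical masks should give a loss very close to zero.
    mask = [0.0, 1.0, 1.0, 0.0]
    assert abs(dice_loss(mask, mask)) < 1e-5


def test_disjoint_masks_give_loss_of_one():
    # Masks that never overlap have zero intersection, so the loss is 1.
    assert dice_loss([1.0, 0.0], [0.0, 1.0]) == 1.0
```

The same two boundary cases (full overlap and no overlap) would carry over to a test of the real loss class, with tensors in place of lists and a tolerance-based comparison.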
Hi @gowthamkpr! Can you please refactor the code to KerasHub style?
I've refactored using SAM as an example.
I've added
I've subclassed
Done. The model is not yet in Kaggle, so I've disabled the presets test for now.
Done. Not sure if there are additional standard test routines other than the ones used in SAM that should be run.
Thanks Gowtham! Left a few comments!
56,
256,
),
run_mixed_precision_check=False,
does the mixed precision check pass?
No. I tried adding an explicit dtype argument, but the problem remains: the mixed precision check inspects each sublayer of the model, and the ResNet backbone, which is instantiated separately, therefore has the wrong dtype.
instance.
head_kernel_list: list of ints. The number of filters for probability
and threshold maps. Defaults to [3, 2, 2].
step_function_k: float. `k` parameter used within the differential
I don't think step_function_k is an arg we want to expose.
Removed
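For context, a minimal sketch of the step function that `k` controls, assuming the model follows the approximate binarization from the original paper, B = 1 / (1 + exp(-k * (P - T))), where P is a probability value and T a threshold value. The function name is hypothetical, and the default of 50 is the value the paper reports using, which is one reason the argument can stay internal.

```python
import math


def differentiable_binarization(p, t, k=50.0):
    """Approximate binary value for one pixel: 1 / (1 + exp(-k * (p - t))).

    `p` and `t` are assumed to lie in [0, 1]; `k` sets how sharp the step is.
    """
    return 1.0 / (1.0 + math.exp(-k * (p - t)))


# A pixel exactly at its threshold sits on the midpoint of the step.
midpoint = differentiable_binarization(0.5, 0.5)  # 0.5

# A small margin above or below the threshold already saturates the output.
above = differentiable_binarization(0.6, 0.5)
below = differentiable_binarization(0.4, 0.5)
```

With a large `k` the function behaves almost like a hard threshold while staying differentiable, which is what allows the threshold map to be trained jointly with the probability map.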
Args:
backbone: A `keras_hub.models.DifferentialBinarizationBackbone`
instance.
head_kernel_list: list of ints. The number of filters for probability
Let's move the head code to the backbone. Rename this class to DifferentialBinarizationOCR and just take in the preprocessor and backbone.
Done.
Thanks for the PR Gowtham! Left a few comments. Can you please also add a demo colab in the PR description to verify the model is working before merging?
pyramid network.

Args:
image_encoder: A `keras_hub.models.ResNetBackbone` instance.
add all args in docstring
def diffbin_fpn_model(inputs, out_channels, dtype=None):
    in2 = layers.Conv2D(
What is in2? Can we rename it to something more readable?
)

outputs = {
    "probability_maps": probability_maps,
Looks like probability_maps and threshold_maps are identical. What is the difference?
@keras_hub_export("keras_hub.layers.DifferentialBinarizationImageConverter")
class DifferentialBinarizationImageConverter(ImageConverter):
    backbone_cls = DifferentialBinarizationBackbone
There should be some resizing/rescaling ops here, right?
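A sketch of the rescaling arithmetic in question, assuming the ported weights expect ImageNet-style normalization of RGB inputs (an assumption; the thread does not confirm the exact statistics). It shows how the usual (x / 255 - mean) / std step folds into a single per-channel x * scale + offset. All names below are illustrative.

```python
# Assumed ImageNet statistics; not confirmed for this model in the thread.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

# Fold (x / 255 - mean) / std into x * scale + offset, per channel.
SCALE = [1.0 / (255.0 * s) for s in STD]
OFFSET = [-m / s for m, s in zip(MEAN, STD)]


def rescale_pixel(rgb):
    """Normalize one RGB pixel given as three values in [0, 255]."""
    return [x * sc + off for x, sc, off in zip(rgb, SCALE, OFFSET)]


def reference_pixel(rgb):
    """The unfolded formula, kept only to check the folded version against."""
    return [(x / 255.0 - m) / s for x, m, s in zip(rgb, MEAN, STD)]
```

If the converter is configured through per-channel scale and offset values, lists like `SCALE` and `OFFSET` are the kind of values that would be supplied, alongside a fixed target image size for the resizing step.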
@keras_hub_export("keras_hub.models.DifferentialBinarizationOCR")
class DifferentialBinarizationOCR(ImageSegmenter):
We need to add a new base class for OCR; I don't think ImageSegmenter is a good one. Do you have a specific reason you chose to subclass ImageSegmenter?
This adds the Differential Binarization model for text detection.
The architecture is implemented based on ResNet50_vd from PaddleOCR, and the weights are ported from there.