Per frame affine transforms #3946

stiepan · 2022-05-31T14:15:03Z

Category:

Description:

Adds support for sequence input to the affine transform and coordinate transform ops. This should facilitate use cases of the warp_affine operator with video. For the same reason, the coord_transform operator is extended to support sequences.
The change required modifying the SequenceOperator so that named arguments may be treated as a source of truth as to whether the operator processes sequences. This helps in two ways. First, an operator may have no positional input but should still produce per-frame output; for example, transforms.rotation may receive per-frame angles as a named argument and should produce per-frame rotation matrices based on them. Second, transforms.rotation may receive input matrices on a per-sample basis (for instance from a Constant op) and then compose them with per-frame matrices produced from per-frame angles.
Corresponding changes were made in the sequence testing utility. Previously, the utility expected sequence-containing batches for the 0-th input of the operator, while the rest of the params were specified as callbacks that the utility used to produce argument inputs matching the shape of that input. Now you can specify which argument the provided batches should be fed to, effectively making an arbitrary argument the "source of truth" for the shape of the input sequences.
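
To make the two cases from the description concrete, here is a minimal pure-Python sketch (not DALI code). It assumes 3x3 homogeneous 2D matrices, a counter-clockwise rotation convention, and that the rotation is applied after the per-sample input matrix; none of these is taken from the DALI implementation. The names `rotation_matrix`, `matmul` and `per_frame_rotation` are hypothetical.

```python
import math
from typing import List, Optional, Sequence

Matrix = List[List[float]]  # 3x3 homogeneous 2D affine matrix (assumption)


def rotation_matrix(angle_deg: float) -> Matrix:
    """Build a 3x3 matrix rotating counter-clockwise by angle_deg degrees."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain 3x3 matrix product a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def per_frame_rotation(angles: Sequence[float],
                       per_sample_matrix: Optional[Matrix] = None) -> List[Matrix]:
    """Return one matrix per frame, driven only by the per-frame angles.

    Case 1: no positional input, so the output is just the rotations.
    Case 2: a single per-sample matrix is broadcast to every frame and
    composed with that frame's rotation (order is an assumption).
    """
    rotations = [rotation_matrix(angle) for angle in angles]
    if per_sample_matrix is None:
        return rotations
    return [matmul(rotation, per_sample_matrix) for rotation in rotations]
```

In both cases the number of output matrices follows the number of angles, which is the sense in which the named argument, rather than a positional input, decides the sequence shape.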

Additional information:

Affected modules and functionalities:

The functionality of the following operators is affected:

coord_transform
transforms.combine
transforms.crop
transforms.rotation
transforms.scale
transforms.shear
transforms.translation

The SequenceOperator is modified to handle transforms. It is already used by the gaussian blur, laplacian, rotate and warp operators, but the change should not affect their functionality.

This PR duplicates the fix from #3958 so that the tests pass. It was rebased on that PR.

Key points relevant for the review:

Tests:

Checklist

Documentation

DALI team only

Requirements

Implements new requirements
Affects existing requirements
N/A

REQ IDs: N/A

JIRA TASK: DALI-2795

stiepan · 2022-06-07T11:48:22Z

!build

dali-automaton · 2022-06-07T11:50:28Z

CI MESSAGE: [5033445]: BUILD STARTED

dali-automaton · 2022-06-07T11:56:20Z

CI MESSAGE: [5033445]: BUILD FAILED

stiepan · 2022-06-07T12:03:52Z

!build

dali-automaton · 2022-06-07T12:05:10Z

CI MESSAGE: [5033491]: BUILD STARTED

dali-automaton · 2022-06-07T13:12:21Z

CI MESSAGE: [5033491]: BUILD FAILED

dali-automaton · 2022-06-07T13:52:37Z

CI MESSAGE: [5033491]: BUILD PASSED

klecki

Mostly small nitpicks and complaints about the tests. I don't see any significant issues in the operator code; in fact, it is really nice that we can add this sequence/per-frame support via expansion with so few changes.

klecki · 2022-06-13T15:27:06Z

dali/pipeline/operator/sequence_operator.h

@@ -361,32 +358,50 @@ class SequenceOperator : public Operator<Backend>, protected SampleBroadcasting<
  void SetupSequenceOperator(const workspace_t<Backend> &ws) {
    auto num_inputs = ws.NumInput();
    input_expand_desc_.clear();
-    input_expand_desc_.reserve(num_inputs);
+    input_expand_desc_.reserve(num_inputs + 1);


Not a fan of input_expand_desc_ having one more element. Why can't we directly use expand_like_ as the source-of-truth desc? Especially since the additional element won't always be used, which can be error prone.
Maybe make it testable as a boolean, so we know that if it's empty we don't expand?

I agree with @klecki about not adding the last element to input_expand_desc_. How about making expand_like_ a unique_ptr that InferReferenceExpandDesc returns? A nullptr would mean there is nothing to expand.

Thanks, I like the unique pointer idea.
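
For readers following the thread, the idea can be sketched in Python terms, with `None` standing in for a null `unique_ptr`. This is only an analogy under stated assumptions: the `ExpandDesc` fields, the `(layout, frames_per_sample)` candidate format and the function names are hypothetical and do not mirror the actual C++ classes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ExpandDesc:
    """Hypothetical stand-in for the reference expand descriptor."""
    num_dims_to_expand: int
    frames_per_sample: Tuple[int, ...]


def infer_reference_expand_desc(
        candidates: List[Tuple[str, List[int]]]) -> Optional[ExpandDesc]:
    """Return a descriptor for the first sequence-like candidate, else None.

    Each candidate is (layout, frames_per_sample); a layout starting with
    "F" is treated as sequence-like (an assumption for this sketch).
    """
    for layout, frames_per_sample in candidates:
        if layout.startswith("F"):
            return ExpandDesc(1, tuple(frames_per_sample))
    return None  # nothing to expand, the analogue of returning nullptr


def should_expand(expand_like: Optional[ExpandDesc]) -> bool:
    """The descriptor doubles as the flag: no extra vector element needed."""
    # A reference descriptor, if present, must describe a real sequence.
    assert expand_like is None or expand_like.num_dims_to_expand > 0
    return expand_like is not None
```

The point of the design is that "is there anything to expand?" and "what should we expand like?" are answered by one nullable value, so no unused trailing element has to be kept in the per-input descriptor vector.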

klecki · 2022-06-13T16:54:37Z

dali/test/python/sequences_test_utils.py

-            source=dummy_source(input_data), layout=input_layout)
-        if device == "gpu":
-            input = input.gpu()
+    def pipeline(input_data: ArgData, args_data: List[ArgData]):


What is the difference between input_data and args_data if args_data can have positional inputs?

They are the same type, maybe those can be concatenated outside of the pipeline and we would have half of the ifs here removed.

Yep, it can. Thanks for catching that. Somewhere along the refactoring I had already tried it, but there were some issues with specifying the device. Now it was no problem, and it did simplify the function.

klecki · 2022-06-13T17:05:21Z

dali/test/python/sequences_test_utils.py

-    return [expand_arg(input_layout, num_expand, arg_has_frames, input_batch, arg_batch)
-            for input_batch, arg_batch in zip(input_data, arg_data)]
-
+def expand_arg_input(input_data: ArgData, arg_data: ArgData):


Is it replicating the arg_data, if it's not per-frame, in a manner corresponding to the input data? I would not mind a small docstring here or there :P

Yes, added the info about it in a docstring.
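
The replication asked about above can be illustrated with a small standalone sketch. It is not the actual `expand_arg_input` from sequences_test_utils.py; the function name, the plain-list data model and the `arg_is_per_frame` flag are assumptions made for illustration.

```python
from typing import Any, List


def broadcast_arg_to_frames(input_batch: List[List[Any]],
                            arg_batch: List[Any],
                            arg_is_per_frame: bool) -> List[Any]:
    """Produce one argument value per frame of the unfolded input batch.

    input_batch: one list of frames per sample (a batch of sequences).
    arg_batch:   one value per sample, or one list of per-frame values
                 per sample when arg_is_per_frame is True.
    """
    expanded: List[Any] = []
    for sequence, arg in zip(input_batch, arg_batch):
        if arg_is_per_frame:
            if len(arg) != len(sequence):
                raise ValueError("per-frame argument must match the frame count")
            expanded.extend(arg)  # already one value per frame
        else:
            # Replicate the per-sample value once for every frame.
            expanded.extend([arg] * len(sequence))
    return expanded
```

For a batch with sequences of 3 and 1 frames, the per-sample arguments `[10, 20]` become `[10, 10, 10, 20]`, which lines up with the four frames a baseline, frame-by-frame run would process.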

klecki · 2022-06-13T17:13:55Z

dali/test/python/test_operator_rotate.py

        if expanded_axis is not None:
            output_size_params += (expanded_axis.data,)
        output_sizes = [
            sequence_batch_output_size(*args)
            for args in zip(*output_size_params)]
-        expanded_params.append(ArgData(ArgDesc("size", False, "cpu"), output_sizes))
+        expanded_params.append(ArgData(ArgDesc("size", "", "cpu"), output_sizes))


Just a suggestion, maybe use names of the arguments in ArgDesc?

Suggested change

-        expanded_params.append(ArgData(ArgDesc("size", "", "cpu"), output_sizes))
+        expanded_params.append(ArgData(
+            ArgDesc(name="size", expandable_prefix="", dest_device="cpu"),
+            output_sizes)
+        )

klecki · 2022-06-13T17:15:55Z

dali/test/python/test_operator_affine_transforms.py

+    class TransformsParamsProvider(ParamsProvider):
+        def unfold_output_layout(self, layout):
+            unfolded = super().unfold_output_layout(layout)
+            if unfolded == "**":


Why do we get "**"?

Transform ops propagate the information about the sequence layout to reduce the need for fn.per_frame in the intermediate steps when composing multiple transforms. So if you specify args per-frame, the output matrices have the F** layout. The base unfold_output_layout unfolds the outputs (which have the F** layout) and drops the sequence prefix of the layout (F** -> **). Usually that is fine, because it amounts to something like FHWC -> HWC, but "**" is meaningless, and the baseline transform op, run with no per-frame input, produces outputs with no layout.
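
A minimal sketch of that layout handling, assuming layouts are plain strings and that the override maps "**" to an empty layout (the diff only shows the comparison, so the returned value is an assumption). The function names are hypothetical, not the ones in the test utility.

```python
def unfold_layout(layout: str, num_expand: int = 1) -> str:
    """Drop the leading sequence dims from a layout, e.g. FHWC -> HWC."""
    return layout[num_expand:]


def unfold_transform_layout(layout: str) -> str:
    """Unfold a transform-matrix layout, treating bare placeholders as no layout."""
    unfolded = unfold_layout(layout)
    # "**" carries no information: the baseline op, run without per-frame
    # inputs, produces matrices with an empty layout, so match that.
    return "" if unfolded == "**" else unfolded
```

With this, `unfold_transform_layout("F**")` yields an empty layout, while an image-like layout such as "FHWC" still unfolds to "HWC".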

klecki · 2022-06-13T17:24:00Z

dali/test/python/test_operator_affine_transforms.py

+    ]
+
+    seq_cases = test_cases + only_with_seq_input_cases
+    yield from sequence_suite_helper(rng, "F", [("F**", mt_seq_input)], seq_cases, num_iters)


Oh, I guess we just remove F in Python test to generate the baseline?

klecki · 2022-06-13T17:27:42Z

dali/test/python/test_operator_affine_transforms.py

+    # transform the test cases to test the transforms with per-frame args but:
+    # 1. with the positional input that does not contain frames
+    # 2. without the positional input
+    for tested_fn, fixed_params, params_provider, devices in test_cases:


FYI, just skimmed this.

klecki · 2022-06-13T17:28:04Z

dali/test/python/test_operator_coord_transform.py

@@ -176,3 +178,61 @@ def _test_empty_input(device):
 def test_empty_input():
    for device in ["cpu", "gpu"]:
        yield _test_empty_input, device
+
+
+def test_sequences():


And I didn't read this yet.

jantonguirao · 2022-06-13T18:03:05Z

dali/pipeline/operator/sequence_operator.h

-    auto expand_like_idx = GetReferenceInputIdx();
-    assert(expand_like_idx == -1 || GetInputExpandDesc(expand_like_idx).NumDimsToExpand() > 0);
-    return expand_like_idx >= 0;
+    assert(expand_like_ == nullptr || expand_like_->NumDimsToExpand() > 0);


Can you explain the reasoning behind this assert?

It stipulates that the input used as a reference for expansion must be sequence-like. It is there just to avoid repeating code of the form: call GetReferentialExpandDesc(); then make sure it is an actual sequence, i.e. that there are some dims to expand.

jantonguirao · 2022-06-13T18:11:50Z

dali/pipeline/operator/sequence_operator.h

-        return input_idx;
+  const ExpandDesc *InferNamedReferenceExpandDesc(const workspace_t<Backend> &ws) {
+    for (const auto &arg_input : ws) {
+      auto &shared_tvec = arg_input.second.tvec;


Suggested change

-      auto &shared_tvec = arg_input.second.tvec;
+      auto *shared_tvec = arg_input.second.tvec;

would read better

It is a shared_ptr. It is taken by reference to avoid unnecessary counter increment.

jantonguirao · 2022-06-13T18:14:09Z

dali/pipeline/operator/sequence_operator.h

@@ -361,32 +358,50 @@ class SequenceOperator : public Operator<Backend>, protected SampleBroadcasting<
  void SetupSequenceOperator(const workspace_t<Backend> &ws) {
    auto num_inputs = ws.NumInput();
    input_expand_desc_.clear();
-    input_expand_desc_.reserve(num_inputs);
+    input_expand_desc_.reserve(num_inputs + 1);


I agree with @klecki about not adding the last element to input_expand_desc_. How about making expand_like_ a unique_ptr that InferReferenceExpandDesc returns? A nullptr would mean there is nothing to expand.

stiepan · 2022-06-17T19:44:31Z

!build

dali-automaton · 2022-06-17T19:50:16Z

CI MESSAGE: [5114301]: BUILD STARTED

dali-automaton · 2022-06-17T21:00:14Z

CI MESSAGE: [5114301]: BUILD PASSED

stiepan · 2022-06-20T09:48:52Z

!build

dali-automaton · 2022-06-20T09:50:05Z

CI MESSAGE: [5129150]: BUILD STARTED

dali-automaton · 2022-06-20T11:11:30Z

CI MESSAGE: [5129150]: BUILD PASSED

klecki · 2022-06-20T15:15:36Z

dali/test/python/test_operator_affine_transforms.py

+        (fn.transforms.shear, {}, TransformsParamsProvider(
+            [ArgCb("angles", shear_angles, True)]), ["cpu"]),
+        (fn.transforms.shear, {}, TransformsParamsProvider(
+            [ArgCb("shear", shift, True)]), ["cpu"]),


How about testing arguments like shift as a proper scalar value (not only as argument input that is per frame or not per frame)?

Probably not important, the baseline implementation after unfolding should handle it.

Added a scalar case (and a few more per-frame / non-per-frame compose cases).

stiepan · 2022-06-20T19:02:20Z

!build

dali-automaton · 2022-06-20T19:12:03Z

CI MESSAGE: [5132897]: BUILD STARTED

dali-automaton · 2022-06-20T20:18:30Z

CI MESSAGE: [5132897]: BUILD PASSED

stiepan force-pushed the per_frame_transforms branch from 72643bc to e26dd87 Compare June 6, 2022 22:15

stiepan marked this pull request as ready for review June 7, 2022 11:38

jantonguirao assigned klecki and jantonguirao Jun 8, 2022

klecki reviewed Jun 13, 2022

jantonguirao approved these changes Jun 13, 2022

stiepan force-pushed the per_frame_transforms branch from b123d27 to 579a2df Compare June 17, 2022 15:21

stiepan added 13 commits June 20, 2022 11:15

Allow named arguments as a reference

bd6a832

Signed-off-by: Kamil Tokarski <[email protected]>

Enable per-frame transforms

3974d15

Signed-off-by: Kamil Tokarski <[email protected]>

Per frame coord transform

999a828

Signed-off-by: Kamil Tokarski <[email protected]>

Fixed misspelling

a93cda0

Signed-off-by: Kamil Tokarski <[email protected]>

Mark transforms supporting per-frame input in schema

7048193

Signed-off-by: Kamil Tokarski <[email protected]>

Add basic coord_transform per-frame tests

d49f697

Signed-off-by: Kamil Tokarski <[email protected]>

Extend sequence test utility to handle other inputs/args as a source of truth of the sequence shape

17b50da

Signed-off-by: Kamil Tokarski <[email protected]>

Add broadcasting test to coord_transform

5c100ee

Signed-off-by: Kamil Tokarski <[email protected]>

Affine transforms tests, make the error message clearer in seq op

4fc9d5f

Signed-off-by: Kamil Tokarski <[email protected]>

Fix linter issues

6026505

Signed-off-by: Kamil Tokarski <[email protected]>

Fix Python lint issues

359be9f

Signed-off-by: Kamil Tokarski <[email protected]>

Do not store positional arg expand desc in the input desc vector

ef59c8a

Signed-off-by: Kamil Tokarski <[email protected]>

Apply tests review remarks

9a22f59

Signed-off-by: Kamil Tokarski <[email protected]>

Revert reserving extra space in input desc vector

754d1d0

Signed-off-by: Kamil Tokarski <[email protected]>

stiepan force-pushed the per_frame_transforms branch from 3e503de to 754d1d0 Compare June 20, 2022 09:43

klecki reviewed Jun 20, 2022

klecki approved these changes Jun 20, 2022

stiepan added 2 commits June 20, 2022 19:39

Add scalar tests, more combine/broadcast tests

01a63ee

Signed-off-by: Kamil Tokarski <[email protected]>

Pass ArgData as main input to sequence suite helper for better readability

65262d7

Signed-off-by: Kamil Tokarski <[email protected]>

stiepan merged commit d3ecce5 into NVIDIA:main Jun 21, 2022

JanuszL mentioned this pull request Jan 11, 2023

DALI 2022 roadmap #3774

Closed
