Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure sample encapsulation in Tensor Vector #3701

Merged

Conversation

klecki
Copy link
Contributor

@klecki klecki commented Feb 23, 2022

Category: New feature, Breaking change

Description:

Add APIs matching TensorList to TensorVector:

  • sample pointer accessors
  • Set/GetMeta

Change operator[] to return [Const]SampleView.
Introduce UnsafeSetSample and UnsafeCopySample
to replace TensorVector[i].ShareData(tensor)
and TensorVector[i].Copy(tensor) - they work with current
code base, but for proper sample-based data structure
more checks should be introduced - intended for follow up.

Adjust code where necessary:

  • where possible use data accessors directly on the TensorVector
    instead of the sample, as it should be faster than create temporary, so:
    tv[i].mutable_data() -> tv.mutable_tensor(i) etc.
  • Using SampleViews is compatible with code that uses view<T>,
    as view<T>(Tensor) is equivalent to view<T>(sample_view(Tensor))

Adjustments:

  • allow views to work with scalar Tensors (they treated them as empty)
  • introduce distinct SampleView and ConstSampleView as they need to
    be returned by value and we need sensible overloads for view<>.
  • allow to access capacity and nbytes of individual samples,
    introduce total_capacity and total_nbytes for sum of them.

Next steps written as TODO in TensorVector dosctring.

Keep the SampleView accessors temporarily with leading underscore
so they can be easily adjusted during review.

Current naming:
The sample view has temporary leading underscores in the API (_raw_data),
that would be removed before merging the PR - it's to facilitate renaming in case
it's requested and to help with verifying all calls are reworked to the
intended tv.raw_tensor(i) etc.

The Unsafe prefix in SetSample and CopySample is intended to temporary stay
there to discourage introduction of new use cases till the followup introduces
remaining checks.

  • Remove prefix _ from sample view naming before merge.

Additional information:

Affected modules and functionalities:

TensorVector, SampleView, SampleWorkspace-based operators, CPU operators,
C API, small bits of executor.

Key points relevant for the review:

Checklist

Tests

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

@klecki klecki changed the title Introduce runtime typed TensorView Ensure sample encapsulation in Tensor Vector Feb 23, 2022
@klecki klecki force-pushed the encapsulate-tensor-vector-operator-subscript branch 2 times, most recently from bd0e2c5 to bc72179 Compare February 25, 2022 18:07
@klecki klecki force-pushed the encapsulate-tensor-vector-operator-subscript branch from d8d5052 to 11ca202 Compare March 9, 2022 18:05
@klecki
Copy link
Contributor Author

klecki commented Mar 9, 2022

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4111684]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4111684]: BUILD FAILED

@klecki
Copy link
Contributor Author

klecki commented Mar 10, 2022

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4118107]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4118107]: BUILD FAILED

@klecki
Copy link
Contributor Author

klecki commented Mar 10, 2022

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4119744]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4119744]: BUILD FAILED

@klecki
Copy link
Contributor Author

klecki commented Mar 14, 2022

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4145105]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4145105]: BUILD FAILED

@klecki
Copy link
Contributor Author

klecki commented Mar 15, 2022

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4151450]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4151450]: BUILD FAILED

@klecki klecki force-pushed the encapsulate-tensor-vector-operator-subscript branch 2 times, most recently from ec135cf to a296052 Compare March 16, 2022 16:12
@klecki klecki marked this pull request as ready for review March 16, 2022 16:13
@klecki klecki mentioned this pull request Mar 16, 2022
23 tasks
@klecki
Copy link
Contributor Author

klecki commented Mar 16, 2022

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4163058]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4163058]: BUILD FAILED

@mzient mzient self-assigned this Mar 17, 2022
@klecki
Copy link
Contributor Author

klecki commented Mar 17, 2022

There is some issue with CPU-only test, I will debug it tomorrow, the rest works fine.


if (fill_in_data) {
for (auto &in_ptr : *data_in) {
auto *ptr = in_ptr->template mutable_data<T>();
for (int sample_id = 0; sample_id < batch_size; sample_id++) {
Copy link
Contributor

@mzient mzient Mar 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (int sample_id = 0; sample_id < batch_size; sample_id++) {
for (int sample_idx = 0; sample_idx < batch_size; sample_idx++) {

id suggest it's something more than an ordinal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@klecki
Copy link
Contributor Author

klecki commented Mar 31, 2022

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4281403]: BUILD STARTED

Signed-off-by: Krzysztof Lecki <[email protected]>
@klecki
Copy link
Contributor Author

klecki commented Mar 31, 2022

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4281513]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4281513]: BUILD PASSED

klecki added 3 commits March 31, 2022 15:29
Signed-off-by: Krzysztof Lecki <[email protected]>
Signed-off-by: Krzysztof Lecki <[email protected]>
Signed-off-by: Krzysztof Lecki <[email protected]>
* @brief Returns the size in bytes of the underlying data chunks
* TODO(klecki): Temporary API to be reworked, do not use.
*/
std::vector<size_t> _chunks_nbytes() const noexcept;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::vector<size_t> _chunks_nbytes() const noexcept;
std::vector<size_t> _chunks_nbytes() const;

After all, a vector is dynamically allocated...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

* @brief Returns the real size of the underlying allocations
* TODO(klecki): Temporary API to be reworked, do not use.
*/
std::vector<size_t> _chunks_capacity() const noexcept;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: Krzysztof Lecki <[email protected]>
Signed-off-by: Krzysztof Lecki <[email protected]>
@klecki
Copy link
Contributor Author

klecki commented Mar 31, 2022

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4283355]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4283355]: BUILD PASSED

@klecki klecki merged commit 0cd1cf1 into NVIDIA:main Mar 31, 2022
@klecki klecki deleted the encapsulate-tensor-vector-operator-subscript branch March 31, 2022 19:24
@klecki klecki restored the encapsulate-tensor-vector-operator-subscript branch April 1, 2022 09:17
cyyever pushed a commit to cyyever/DALI that referenced this pull request May 13, 2022
Add APIs matching TensorList to TensorVector:
* sample pointer accessors
* Set/GetMeta

Change operator[] to return [Const]SampleView.
Introduce UnsafeSetSample and UnsafeCopySample 
to replace TensorVector[i].ShareData(tensor) 
and TensorVector[i].Copy(tensor) - they work with current
code base, but for proper sample-based data structure
more checks should be introduced - intended for follow up.

Adjust code where necessary:
* where possible use data accessors directly on the TensorVector
    instead of the sample, as it should be faster than create temporary, so:
    `tv[i].mutable_data<T>()` -> `tv.mutable_tensor<T>(i)` etc.
* Using SampleViews is compatible with code that uses `view<T>`,
    as `view<T>(Tensor)` is equivalent to `view<T>(sample_view(Tensor))` 

Adjustments:
* allow views to work with scalar Tensors (they treated them as empty)
* introduce distinct SampleView and ConstSampleView as they need to
    be returned by value and we need sensible overloads for `view<>`.
* allow to access `capacity` and `nbytes` of individual samples,
    introduce _chunks_capacity and _chunks_nbytes for that.

Next steps written as TODO in TensorVector dosctring.

Current naming:
The `Unsafe` prefix in SetSample and CopySample is intended to temporary stay
there to discourage introduction of new use cases till the followup introduces
remaining checks. Capacity and nbytes of individual allocations have leading 
underscore as the API is to be reworked and is not intended for new usages.  

Signed-off-by: Krzysztof Lecki <[email protected]>
cyyever pushed a commit to cyyever/DALI that referenced this pull request Jun 7, 2022
Add APIs matching TensorList to TensorVector:
* sample pointer accessors
* Set/GetMeta

Change operator[] to return [Const]SampleView.
Introduce UnsafeSetSample and UnsafeCopySample 
to replace TensorVector[i].ShareData(tensor) 
and TensorVector[i].Copy(tensor) - they work with current
code base, but for proper sample-based data structure
more checks should be introduced - intended for follow up.

Adjust code where necessary:
* where possible use data accessors directly on the TensorVector
    instead of the sample, as it should be faster than create temporary, so:
    `tv[i].mutable_data<T>()` -> `tv.mutable_tensor<T>(i)` etc.
* Using SampleViews is compatible with code that uses `view<T>`,
    as `view<T>(Tensor)` is equivalent to `view<T>(sample_view(Tensor))` 

Adjustments:
* allow views to work with scalar Tensors (they treated them as empty)
* introduce distinct SampleView and ConstSampleView as they need to
    be returned by value and we need sensible overloads for `view<>`.
* allow to access `capacity` and `nbytes` of individual samples,
    introduce _chunks_capacity and _chunks_nbytes for that.

Next steps written as TODO in TensorVector dosctring.

Current naming:
The `Unsafe` prefix in SetSample and CopySample is intended to temporary stay
there to discourage introduction of new use cases till the followup introduces
remaining checks. Capacity and nbytes of individual allocations have leading 
underscore as the API is to be reworked and is not intended for new usages.  

Signed-off-by: Krzysztof Lecki <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants