Ensure sample encapsulation in Tensor Vector #3701
!build
CI MESSAGE: [4111684]: BUILD STARTED
CI MESSAGE: [4111684]: BUILD FAILED
!build
CI MESSAGE: [4118107]: BUILD STARTED
CI MESSAGE: [4118107]: BUILD FAILED
!build
CI MESSAGE: [4119744]: BUILD STARTED
CI MESSAGE: [4119744]: BUILD FAILED
!build
CI MESSAGE: [4145105]: BUILD STARTED
CI MESSAGE: [4145105]: BUILD FAILED
!build
CI MESSAGE: [4151450]: BUILD STARTED
CI MESSAGE: [4151450]: BUILD FAILED
!build
CI MESSAGE: [4163058]: BUILD STARTED
CI MESSAGE: [4163058]: BUILD FAILED
There is some issue with the CPU-only test; I will debug it tomorrow. The rest works fine.
dali/benchmark/operator_bench.h (outdated)

    if (fill_in_data) {
      for (auto &in_ptr : *data_in) {
        auto *ptr = in_ptr->template mutable_data<T>();
        for (int sample_id = 0; sample_id < batch_size; sample_id++) {
Suggested change:

    - for (int sample_id = 0; sample_id < batch_size; sample_id++) {
    + for (int sample_idx = 0; sample_idx < batch_size; sample_idx++) {

`id` suggests it's something more than an ordinal.
done
Signed-off-by: Krzysztof Lecki <[email protected]>
!build
CI MESSAGE: [4281403]: BUILD STARTED
Signed-off-by: Krzysztof Lecki <[email protected]>
!build
CI MESSAGE: [4281513]: BUILD STARTED
CI MESSAGE: [4281513]: BUILD PASSED
Signed-off-by: Krzysztof Lecki <[email protected]>
dali/pipeline/data/tensor_vector.h (outdated)

     * @brief Returns the size in bytes of the underlying data chunks
     * TODO(klecki): Temporary API to be reworked, do not use.
     */
    std::vector<size_t> _chunks_nbytes() const noexcept;
Suggested change:

    - std::vector<size_t> _chunks_nbytes() const noexcept;
    + std::vector<size_t> _chunks_nbytes() const;

After all, a vector is dynamically allocated, so building the return value can throw.
done
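The `noexcept` remark can be checked at compile time. Below is a minimal sketch, assuming a hypothetical `ChunkSizes` class that is not DALI's `TensorVector`: a getter that returns a `std::vector` by value has to allocate, so it is left without `noexcept`, while a getter that only reads an existing member can keep it.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a container that tracks per-chunk sizes.
class ChunkSizes {
 public:
  explicit ChunkSizes(std::vector<std::size_t> sizes) : sizes_(std::move(sizes)) {}

  // Returns a copy, so it allocates and may throw std::bad_alloc.
  // Marking it noexcept would turn that exception into std::terminate.
  std::vector<std::size_t> chunks_nbytes() const { return sizes_; }

  // Only reads an existing member, so noexcept is a safe promise here.
  std::size_t num_chunks() const noexcept { return sizes_.size(); }

 private:
  std::vector<std::size_t> sizes_;
};

// The noexcept operator reports what each declaration promises.
static_assert(!noexcept(std::declval<const ChunkSizes &>().chunks_nbytes()),
              "allocating getter does not promise noexcept");
static_assert(noexcept(std::declval<const ChunkSizes &>().num_chunks()),
              "non-allocating getter promises noexcept");
```

The `static_assert` lines only document the declarations; they do not prove that an allocation will fail, just that the signature no longer claims it cannot.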
dali/pipeline/data/tensor_vector.h (outdated)

     * @brief Returns the real size of the underlying allocations
     * TODO(klecki): Temporary API to be reworked, do not use.
     */
    std::vector<size_t> _chunks_capacity() const noexcept;
Likewise.
done
Signed-off-by: Krzysztof Lecki <[email protected]>
!build
CI MESSAGE: [4283355]: BUILD STARTED
CI MESSAGE: [4283355]: BUILD PASSED
Add APIs matching TensorList to TensorVector:
* sample pointer accessors
* Set/GetMeta

Change operator[] to return [Const]SampleView.

Introduce UnsafeSetSample and UnsafeCopySample to replace TensorVector[i].ShareData(tensor) and TensorVector[i].Copy(tensor). They work with the current code base, but a proper sample-based data structure needs more checks, which are intended for a follow-up.

Adjust code where necessary:
* where possible, use data accessors directly on the TensorVector instead of on the sample, as this should be faster than creating a temporary, so: `tv[i].mutable_data<T>()` -> `tv.mutable_tensor<T>(i)` etc.
* using SampleViews is compatible with code that uses `view<T>`, as `view<T>(Tensor)` is equivalent to `view<T>(sample_view(Tensor))`

Adjustments:
* allow views to work with scalar Tensors (views treated them as empty)
* introduce distinct SampleView and ConstSampleView, as they need to be returned by value and we need sensible overloads for `view<>`
* allow access to `capacity` and `nbytes` of individual samples; introduce _chunks_capacity and _chunks_nbytes for that

Next steps are written as a TODO in the TensorVector docstring.

Current naming:
The `Unsafe` prefix in SetSample and CopySample is intended to stay temporarily, to discourage the introduction of new use cases until the follow-up introduces the remaining checks. Capacity and nbytes of individual allocations have a leading underscore, as the API is to be reworked and is not intended for new usages.

Signed-off-by: Krzysztof Lecki <[email protected]>
Category: New feature, Breaking change
Description:
Add APIs matching TensorList to TensorVector:
* sample pointer accessors
* Set/GetMeta

Change operator[] to return [Const]SampleView.

Introduce UnsafeSetSample and UnsafeCopySample to replace TensorVector[i].ShareData(tensor) and TensorVector[i].Copy(tensor). They work with the current code base, but a proper sample-based data structure needs more checks, which are intended for a follow-up.
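To make the share-versus-copy distinction concrete, here is a minimal sketch. `Buffer` and `Batch` are hypothetical stand-ins, not DALI types; only the method names `UnsafeSetSample`, `UnsafeCopySample` and `raw_tensor` are borrowed from this PR, and their bodies here are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical, simplified buffer: shared ownership of a float array.
struct Buffer {
  std::shared_ptr<std::vector<float>> data;
};

// Hypothetical batch type; NOT DALI's TensorVector.
class Batch {
 public:
  explicit Batch(std::size_t num_samples) : samples_(num_samples) {}

  // Share: the sample aliases the source allocation (ShareData-like).
  void UnsafeSetSample(std::size_t idx, const Buffer &src) {
    assert(idx < samples_.size());
    samples_[idx].data = src.data;
  }

  // Copy: the sample gets its own allocation (Copy-like).
  void UnsafeCopySample(std::size_t idx, const Buffer &src) {
    assert(idx < samples_.size());
    assert(src.data != nullptr);
    samples_[idx].data = std::make_shared<std::vector<float>>(*src.data);
  }

  // Raw pointer to the sample's data, or nullptr if the sample is unset.
  const float *raw_tensor(std::size_t idx) const {
    assert(idx < samples_.size());
    return samples_[idx].data ? samples_[idx].data->data() : nullptr;
  }

 private:
  std::vector<Buffer> samples_;
};
```

In this sketch the only checks are bounds and null checks; the follow-up mentioned above would presumably add further validation before the `Unsafe` prefix can go.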
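Adjust code where necessary:
* where possible, use data accessors directly on the TensorVector instead of on the sample, as this should be faster than creating a temporary, so: `tv[i].mutable_data<T>()` -> `tv.mutable_tensor<T>(i)` etc.
* using SampleViews is compatible with code that uses `view<T>`, as `view<T>(Tensor)` is equivalent to `view<T>(sample_view(Tensor))`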
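The two access paths can be sketched as follows. `SampleViewSketch` and `BatchSketch` are hypothetical and fixed to `float` for brevity; the real accessors are templated and their signatures may differ. The point is only that `operator[]` builds a small view object by value, while the direct accessor returns the pointer without that temporary.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical lightweight view of one sample: pointer plus element count.
template <typename T>
struct SampleViewSketch {
  T *data;
  std::size_t size;
};

// Hypothetical batch of float samples; NOT DALI's TensorVector.
class BatchSketch {
 public:
  explicit BatchSketch(std::vector<std::vector<float>> samples)
      : samples_(std::move(samples)) {}

  // operator[] returns a view by value, never a reference to an owning object.
  SampleViewSketch<float> operator[](std::size_t idx) {
    assert(idx < samples_.size());
    return {samples_[idx].data(), samples_[idx].size()};
  }

  // The const overload hands out a read-only view type.
  SampleViewSketch<const float> operator[](std::size_t idx) const {
    assert(idx < samples_.size());
    return {samples_[idx].data(), samples_[idx].size()};
  }

  // Direct accessor on the batch: no temporary view is created.
  float *mutable_tensor(std::size_t idx) {
    assert(idx < samples_.size());
    return samples_[idx].data();
  }

 private:
  std::vector<std::vector<float>> samples_;
};
```

Both paths reach the same memory, so call sites that only need the pointer can switch to the direct accessor without changing behavior.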
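Adjustments:
* allow views to work with scalar Tensors (views treated them as empty)
* introduce distinct SampleView and ConstSampleView, as they need to be returned by value and we need sensible overloads for `view<>`
* allow access to `capacity` and `nbytes` of individual samples; introduce total_capacity and total_nbytes for their sums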
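The relationship between per-sample sizes and their totals can be sketched like this. `Chunk` and `ChunkedBatch` are hypothetical; the method names follow the ones mentioned in this PR, but the bodies are assumptions and not the actual implementation.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical record of one sample's allocation.
struct Chunk {
  std::size_t nbytes;    // bytes currently in use
  std::size_t capacity;  // bytes actually allocated (at least nbytes)
};

// Hypothetical batch that tracks one allocation per sample.
class ChunkedBatch {
 public:
  explicit ChunkedBatch(std::vector<Chunk> chunks) : chunks_(std::move(chunks)) {}

  // Per-sample sizes; these return vectors, so they are not noexcept.
  std::vector<std::size_t> _chunks_nbytes() const {
    std::vector<std::size_t> result;
    result.reserve(chunks_.size());
    for (const Chunk &c : chunks_) result.push_back(c.nbytes);
    return result;
  }

  std::vector<std::size_t> _chunks_capacity() const {
    std::vector<std::size_t> result;
    result.reserve(chunks_.size());
    for (const Chunk &c : chunks_) result.push_back(c.capacity);
    return result;
  }

  // Totals are plain sums over the samples and do not allocate.
  std::size_t total_nbytes() const noexcept {
    std::size_t sum = 0;
    for (const Chunk &c : chunks_) sum += c.nbytes;
    return sum;
  }

  std::size_t total_capacity() const noexcept {
    std::size_t sum = 0;
    for (const Chunk &c : chunks_) sum += c.capacity;
    return sum;
  }

 private:
  std::vector<Chunk> chunks_;
};
```

For example, samples using 10 and 4 bytes out of allocations of 16 and 8 bytes give a total of 14 bytes in use and 24 bytes allocated.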
Next steps are written as a TODO in the TensorVector docstring.
Keep the SampleView accessors temporarily with a leading underscore so they can be easily adjusted during review.
Current naming:
The sample view has temporary leading underscores in the API (`_raw_data`), which will be removed before merging the PR. This is to facilitate renaming in case it's requested and to help verify that all calls are reworked to the intended `tv.raw_tensor(i)` etc.
The `Unsafe` prefix in SetSample and CopySample is intended to stay temporarily, to discourage the introduction of new use cases until the follow-up introduces the remaining checks.
* Remove `_` from sample view naming before merge.
Additional information:
Affected modules and functionalities:
TensorVector, SampleView, SampleWorkspace-based operators, CPU operators,
C API, small bits of executor.
Key points relevant for the review:
Checklist
Tests
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A