Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fn.experimental.audio_resample GPU #3911

Merged
merged 39 commits into from
May 23, 2022

Conversation

jantonguirao
Copy link
Contributor

Category:

New feature

Description:

  • It adds a new operator fn.experimental.audio_resample for the GPU backend.
  • The operator builds on top of the existing CPU operator counterpart
  • The implementation relies on the GPU signal resampling kernel.

Additional information:

Affected modules and functionalities:

New Op

Key points relevant for the review:

Test correctness?

Checklist

Tests

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: ARES.[01-08]

JIRA TASK: DALI-1445

@jantonguirao jantonguirao changed the title Audio resampling gpu op Add fn.experimental.audio_resample GPU May 17, 2022
@jantonguirao jantonguirao marked this pull request as draft May 17, 2022 11:47
@@ -34,64 +34,76 @@
rates = [ 16000, 22050, 12347 ]
lengths = [ 10000, 54321, 12345 ]

def create_test_files():
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: nose was picking this as a test

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making it "private" (with leading underscore) would help, too.

mzient and others added 15 commits May 17, 2022 15:56
Signed-off-by: Michał Zientkiewicz <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
@jantonguirao jantonguirao force-pushed the audio_resampling_gpu_op branch from aeb5f2d to bd8212a Compare May 17, 2022 13:56
mzient and others added 11 commits May 17, 2022 16:28
Signed-off-by: Michał Zientkiewicz <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
@@ -144,7 +148,8 @@ class ResampleBase : public Operator<Backend> {
ArgValue<float> scale_{"scale", spec_};
ArgValue<int64_t> out_length_{"out_length", spec_};

std::vector<double> scales_;
using Args = kernels::signal::resampling::Args;
SmallVector<Args, 128> args_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plain vector will do just fine. Large SmallVectors (however weird that sounds) are better suited for temporary local buffers, where the difference between (frequent) stack allocation (free) and heap allocation (thousands of cycles) is of essence. Here, the vector will be reallocated a few times per operator lifetime at worst (typically it will be allocated just once) - and chances are, we'll still run over 128.

Suggested change
SmallVector<Args, 128> args_;
std::vector<Args> args_;

Signed-off-by: Joaquin Anton <[email protected]>
Signed-off-by: Joaquin Anton <[email protected]>
@jantonguirao jantonguirao force-pushed the audio_resampling_gpu_op branch from fc49aae to 7a0cc7a Compare May 18, 2022 11:48
Signed-off-by: Joaquin Anton <[email protected]>
@jantonguirao jantonguirao force-pushed the audio_resampling_gpu_op branch from 4e08539 to 9409cc9 Compare May 18, 2022 13:54
@jantonguirao jantonguirao force-pushed the audio_resampling_gpu_op branch from 9409cc9 to d76cd88 Compare May 18, 2022 13:54
@@ -77,13 +77,12 @@ void ResamplerGPU<Out, In>::Run(KernelContext &context, const OutListGPU<Out> &o
desc.out = out_sample.data;
desc.window = window_gpu_;
const auto &in_sh = in_sample.shape;
const auto &out_sh = out_sample.shape;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed this because it was seen as not-used in release builds (therefore a warning)

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4879759]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4879843]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4879843]: BUILD FAILED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4881910]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4881910]: BUILD PASSED

@@ -1,4 +1,4 @@
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
# Copyright (c) 2020-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a separate PR?

@jantonguirao jantonguirao force-pushed the audio_resampling_gpu_op branch from 70b6342 to a5782b4 Compare May 20, 2022 08:02
assert np.allclose(out.at(i), ref, 1e-6, eps)

print("Diff: ", out_arr.astype(np.float) - ref)
assert False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: the original code deliberately repeated the check here, so that the error in nosetests would appear as more than False in non-verbose runs.

Comment on lines 131 to 132
args_[s].in_rate = 1.0;
args_[s].out_rate = in_length ? 1.0 * out_length / in_length : 0.0;
Copy link
Contributor

@mzient mzient May 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick:
Now that we have args, this would increase precision - we use the inverse scale in the kernel (in_rate/out_rate), so performing the other division here and reciprocal there will decrease (albeit very slightly) the precision.

Suggested change
args_[s].in_rate = 1.0;
args_[s].out_rate = in_length ? 1.0 * out_length / in_length : 0.0;
args_[s].in_rate = in_length ? in_length : 1; // avoid division by 0
args_[s].out_rate = out_length ? out_length : 1; // avoid division by 0

Signed-off-by: Joaquin Anton <[email protected]>
@jantonguirao jantonguirao force-pushed the audio_resampling_gpu_op branch from 4d9d2ae to 23525bf Compare May 20, 2022 08:16
@@ -125,7 +128,8 @@ class ResampleBase : public Operator<Backend> {
DALI_FAIL(make_string("Cannot produce a non-empty signal from an empty input.\n"
"Error at sample ", s));
}
scales_[s] = in_length ? 1.0 * out_length / in_length : 0.0;
args_[s].in_rate = in_length ? in_length : 1; // avoid division by 0
args_[s].out_rate = out_length ? out_length : 1; // avoid division by 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
args_[s].out_rate = out_length ? out_length : 1; // avoid division by 0
args_[s].out_rate = out_length ? out_length : 1; // avoid division by 0

the linter will complain

Signed-off-by: Joaquin Anton <[email protected]>
@@ -34,64 +34,73 @@
rates = [ 16000, 22050, 12347 ]
lengths = [ 10000, 54321, 12345 ]

def create_test_files():
def create_files():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be another way, more verbose probably:

Suggested change
def create_files():
from nose.tools import nottest
@nottest
def create_files():

@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4891719]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4891719]: BUILD PASSED

@jantonguirao jantonguirao merged commit 9918cb5 into NVIDIA:main May 23, 2022
cyyever pushed a commit to cyyever/DALI that referenced this pull request Jun 7, 2022
@JanuszL JanuszL mentioned this pull request Jan 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants