
[TOPI] FIFO buffer op, to accelerate sequence modeling with dilated convolutions #4039

Merged 6 commits into apache:master on Oct 10, 2019

Conversation

hcho3 (Contributor) commented on Oct 1, 2019

Motivation. Dilated convolutions have emerged as an effective alternative to recurrent units for sequence modeling. For example, WaveNet [1] uses a stack of dilated convolutional layers to generate raw audio waveforms from text. Snips [2] modifies the WaveNet architecture to detect a keyword in an audio stream.

In order to capture temporal context, the WaveNet architecture feeds a sliding window over the input sequence into the first convolutional layer. As noted in [2] and [3], computing convolution over the sliding window results in redundant computation:
[Image: ring_buffer]

This pull request implements a FIFO buffer operator that caches intermediate outputs from each convolutional layer, so as to eliminate the redundant computation. This is similar to [4], except that here the re-use is explicit and inherent in the model. Note that caching is applicable only at inference time, not during training.
[Image: ring_buffer2]

Semantics. The FIFO buffer op should behave like

concat(buffer, data, axis=axis)
.slice_axis(axis=axis, begin=data.shape[axis], end=data.shape[axis]+buffer.shape[axis])
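
For concreteness, the following is a minimal NumPy sketch of that semantics (illustrative only; fifo_buffer_ref is a hypothetical helper, not part of TOPI). The new data is appended to the buffer along the given axis and the oldest entries are dropped, so the output has the same shape as the buffer:

    import numpy as np

    def fifo_buffer_ref(data, buffer, axis=0):
        # Concatenate the new data onto the buffer along `axis`, then keep
        # the trailing window whose length equals the buffer length.
        combined = np.concatenate([buffer, data], axis=axis)
        begin = data.shape[axis]
        end = begin + buffer.shape[axis]
        return np.take(combined, np.arange(begin, end), axis=axis)

    # Streaming three single-sample chunks through a length-4 buffer:
    buf = np.zeros(4, dtype="float32")
    for t in range(3):
        buf = fifo_buffer_ref(np.array([t + 1.0], dtype="float32"), buf, axis=0)
    print(buf)  # [0. 1. 2. 3.]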

Usage. See topi/tests/python/test_fifo_buffer.py
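
As a rough illustration of how the op is invoked, here is a sketch following the pattern in that test, assuming the 2019-era TVM API (tvm.placeholder, tvm.target.create, topi.generic.schedule_injective); exact module paths and argument order may differ in later releases:

    import numpy as np
    import tvm
    import topi

    buffer_shape = (20,)  # cached window of past samples
    data_shape = (4,)     # chunk of new samples arriving per inference step

    data = tvm.placeholder(data_shape, name="data", dtype="float32")
    buffer = tvm.placeholder(buffer_shape, name="buffer", dtype="float32")
    out = topi.nn.fifo_buffer(data, buffer, axis=0)

    with tvm.target.create("llvm"):
        s = topi.generic.schedule_injective(out)
    f = tvm.build(s, [data, buffer, out], "llvm")

    ctx = tvm.cpu(0)
    buffer_nd = tvm.nd.array(np.zeros(buffer_shape, dtype="float32"), ctx)
    data_nd = tvm.nd.array(np.arange(data_shape[0], dtype="float32"), ctx)
    out_nd = tvm.nd.empty(buffer_shape, dtype="float32", ctx=ctx)
    f(data_nd, buffer_nd, out_nd)
    # out_nd now holds the buffer shifted left by 4, with the new samples at the end.

In a streaming setting, each step feeds the previous output back in as the next buffer argument, so the downstream convolution only ever sees the newly appended samples plus the cached context.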

Limitation. Currently, the buffer op exists only in TOPI. To make it useful, we want to merge it into MXNet and other frameworks. Alternatively, we could conceivably implement a custom pass in Relay so that the user can annotate a stack of convolutional layers.

References
[1] "WaveNet: A Generative Model for Raw Audio." https://arxiv.org/abs/1609.03499
[2] "Efficient keyword spotting using dilated convolutions and gating" https://arxiv.org/abs/1811.07684
[3] "Fast Wavenet Generation Algorithm" https://arxiv.org/abs/1611.09482
[4] "Deep reuse: streamline CNN inference on the fly via coarse-grained computation reuse" https://dl.acm.org/citation.cfm?id=3330384

Special thanks to Thibaud Senechal (Amazon) for initially suggesting the concept of FIFO buffer.

cc @yongwww @wweic @zhiics @kevinthesun @anijain2305

hcho3 (Contributor, Author) commented on Oct 1, 2019

TODO.

  • Create an end-to-end example.
  • Send a pull request to MXNet.

tqchen (Member) commented on Oct 2, 2019

cc @vinx13 @merrymercy, it would be great if you could help comment and review.

anijain2305 (Contributor) left a comment:

Thanks for the contribution. I will have to look into the details to understand the compute, but overall looks good to me. Will do one more round by tomorrow.

python/tvm/relay/op/nn/_nn.py (review comments resolved)
src/relay/op/nn/nn.cc (outdated; review comments resolved)
zhiics (Member) left a comment:

Thanks for the contribution. I left some minor review comments. Otherwise, looks good to me.

python/tvm/relay/frontend/mxnet.py (review comments resolved)
src/relay/op/nn/nn.cc (outdated; review comments resolved)
topi/tests/python/test_fifo_buffer.py (review comments resolved)
yongwww (Member) left a comment:

LGTM

zhiics (Member) left a comment:

LGTM. @vinx13 Can you take another look?

vinx13 merged commit aa42413 into apache:master on Oct 10, 2019
hcho3 deleted the fifo_buffer_op branch on October 11, 2019 at 01:11
anijain2305 pushed a commit to anijain2305/tvm that referenced this pull request Oct 17, 2019
[TOPI] FIFO buffer op, to accelerate sequence modeling with dilated convolutions (apache#4039)

* Add FIFO buffer op to enable explicit computation re-use in convolution

* Add a test

* Add end-to-end test with 1D convolution

* Add a stub in MXNet frontend

* Address reviewer comments

* Add back stub for MXNet frontend
wweic pushed a commit to neo-ai/tvm that referenced this pull request Oct 18, 2019
[TOPI] FIFO buffer op, to accelerate sequence modeling with dilated convolutions (apache#4039)

tqchen unassigned zhiics and vinx13 on Nov 4, 2019