Commit

Support for BPE vocabs + denoising autoencoder in PyTorch Translate (facebookresearch#362)

Summary:
Pull Request resolved: facebookresearch#362

Pull Request resolved: pytorch/translate#254

This change uses the existing fairseq word-index logic, which supports BPE continuation-marker and end-of-word-marker suffixes.

Reviewed By: xianxl

Differential Revision: D12952766

fbshipit-source-id: 35a1bbc38240e4145bec0fc419f2d0a6a73ae2e5
liezl200 authored and facebook-github-bot committed Nov 13, 2018
1 parent 880e7cd commit 7e60d45
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion fairseq/data/noising.py
@@ -83,7 +83,7 @@ def noising(self, x, lengths, dropout_prob=0.1, blank_idx=None):
         assert 0 < dropout_prob < 1

         # be sure to drop entire words
-        word_idx = self._get_bpe_word_idx(x)
+        word_idx = self.get_word_idx(x)
         sentences = []
         modified_lengths = []
         for i in range(lengths.size(0)):
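
For context, the word-index lookup exists so that word dropout removes every subword piece of a word together. The following is a minimal pure-Python sketch of that idea, not the fairseq implementation. The helper names `word_indices` and `drop_words`, the `_EOW` end marker in the usage note, and the list-of-strings input are assumptions made for illustration. The actual code operates on tensors of dictionary indices.

```python
from typing import List, Optional, Set


def word_indices(tokens: List[str],
                 cont_marker: Optional[str] = "@@",
                 end_marker: Optional[str] = None) -> List[int]:
    """Map each subword token to the index of the word it belongs to.

    Hypothetical helper for illustration only.
    """
    indices = []
    word = 0
    for tok in tokens:
        indices.append(word)
        if end_marker is not None:
            # End-marker style: a token carrying the suffix closes the word.
            ends_word = tok.endswith(end_marker)
        elif cont_marker is not None:
            # Continuation style: a token without the suffix closes the word.
            ends_word = not tok.endswith(cont_marker)
        else:
            # No BPE markers: every token is its own word.
            ends_word = True
        if ends_word:
            word += 1
    return indices


def drop_words(tokens: List[str], indices: List[int],
               dropped: Set[int]) -> List[str]:
    """Remove every token whose word index is in `dropped`."""
    return [t for t, w in zip(tokens, indices) if w not in dropped]


tokens = ["he@@", "llo", "wor@@", "ld", "!"]
idx = word_indices(tokens)
kept = drop_words(tokens, idx, {1})
```

Here `idx` groups `he@@ llo` as word 0 and `wor@@ ld` as word 1, so dropping word 1 removes both of its pieces. Passing `end_marker="_EOW"` (an assumed marker string) switches to the end-of-word convention, and passing `None` for both markers treats each token as a word.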