forked from xiaoschannel/pytorch-Deep-Learning
[RU] 12-2, 12-1, 12-3, 12 translation, index fix, [EN] 12-3 fix (Atcold#703)

* 12 week translation started
* 12-1 init
* 12-1 translation fixes
* [RU] translation of 12-1.md
* [RU] index fixed, 15th week added
* [RU] 12-2 translated
* [RU] 12-3 translation
* [EN] 12-3 fixes
* [RU] config fixes

Co-authored-by: Alfredo Canziani <[email protected]>
Showing 7 changed files with 2,009 additions and 2 deletions.
@@ -0,0 +1,30 @@
---
lang: ru
lang-ref: ch.12
title: Неделя 12
translation-date: 01 Dec 2020
translator: Evgeniy Pak
---

<!-- ## Lecture part A -->
## Часть A лекции

<!-- In this section we discuss the various architectures used in NLP applications, beginning with CNNs, RNNs, and eventually covering the state-of-the-art architecture, transformers. We then discuss the various modules that comprise transformers and how they make transformers advantageous for NLP tasks. Finally, we discuss tricks that allow transformers to be trained effectively. -->

В этом разделе мы обсуждаем различные архитектуры, используемые в приложениях обработки естественного языка, начиная с CNNs, RNNs и, в конечном итоге, рассматривая state-of-the-art архитектуру, трансформеры. Затем мы обсуждаем различные модули, из которых состоят трансформеры, и то, как они дают трансформерам преимущество в задачах обработки естественного языка. В итоге мы обсудим приёмы, позволяющие эффективно обучать трансформеры.

<!-- ## Lecture part B -->
## Часть B лекции

<!-- In this section we introduce beam search as a middle ground between greedy decoding and exhaustive search. We consider the case of wanting to sample from the generative distribution (*i.e.* when generating text) and introduce "top-k" sampling. Subsequently, we introduce sequence to sequence models (with a transformer variant) and backtranslation. We then introduce unsupervised learning approaches for learning embeddings and discuss word2vec, GPT, and BERT. -->

В этом разделе мы знакомим с лучевым поиском как золотой серединой между жадным декодированием и полным перебором. Мы рассматриваем случай, когда требуется выборка из порождающего распределения (*т.е.* при генерации текста), и вводим понятие "top-k" выборки. Затем мы знакомим с моделями sequence to sequence (в варианте трансформера) и обратным переводом. После этого мы рассматриваем подходы обучения без учителя к обучению характеристик и обсуждаем word2vec, GPT и BERT.

<!-- ## Practicum -->
## Практикум

<!-- We introduce attention, focusing on self-attention and its hidden layer representations of the inputs. Then, we introduce the key-value store paradigm and discuss how to represent queries, keys, and values as rotations of an input. Finally, we use attention to interpret the transformer architecture, taking a forward pass through a basic transformer, and comparing the encoder-decoder paradigm to sequential architectures. -->

Мы вводим понятие внимания, фокусируясь на self-attention и его представлениях входов на скрытом слое. Затем мы представляем парадигму хранилища ключ-значение и обсуждаем, как представить запросы, ключи и значения в виде поворотов входа. Наконец, мы используем внимание для интерпретации архитектуры трансформер, выполняя прямой проход через базовый трансформер и сравнивая парадигму кодирования-декодирования с последовательными архитектурами.
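
The part-B summary above mentions "top-k" sampling from the generative distribution. As a minimal sketch (not code from this commit; the function name and toy logits are illustrative assumptions), top-k sampling keeps only the k largest logits, renormalizes them with a softmax, and draws a token from the result:

```python
import torch
import torch.nn.functional as F

def top_k_sample(logits: torch.Tensor, k: int = 5) -> int:
    """Sample a token id from the k most probable entries of `logits`.

    Illustrative helper, not part of the course repo: everything outside
    the top k is masked to -inf before the softmax, so only the k best
    candidates can be drawn.
    """
    topk_vals, topk_idx = torch.topk(logits, k)            # k largest logits
    masked = torch.full_like(logits, float("-inf"))
    masked[topk_idx] = topk_vals                            # keep only the top k
    probs = F.softmax(masked, dim=-1)                       # renormalize over top k
    return torch.multinomial(probs, num_samples=1).item()  # sample one token id

# Toy usage with made-up vocabulary logits
logits = torch.tensor([2.0, 1.0, 0.5, -1.0, 3.0, 0.0])
print(top_k_sample(logits, k=3))
```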
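
The practicum summary describes obtaining queries, keys, and values as rotations (linear maps) of the input and combining them with attention. A hedged single-head sketch in PyTorch, with dimensions and parameter names assumed for illustration rather than taken from the course code:

```python
import math
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    Queries, keys, and values come from multiplying the input by learned
    matrices W_q, W_k, W_v -- the "rotations" of the input mentioned in the
    practicum summary.
    """
    def __init__(self, d_model: int):
        super().__init__()
        self.w_q = nn.Linear(d_model, d_model, bias=False)
        self.w_k = nn.Linear(d_model, d_model, bias=False)
        self.w_v = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.w_q(x), self.w_k(x), self.w_v(x)
        # Dot-product scores, scaled by sqrt(d_model) to keep gradients stable
        scores = q @ k.transpose(-2, -1) / math.sqrt(x.size(-1))  # (batch, seq, seq)
        attn = scores.softmax(dim=-1)                              # attention weights
        return attn @ v                                            # weighted sum of values

# Toy forward pass: batch of 2 sequences, length 4, model width 8
x = torch.randn(2, 4, 8)
out = SelfAttention(d_model=8)(x)
print(out.shape)  # torch.Size([2, 4, 8])
```

The 1/sqrt(d_model) scaling is the standard choice from the transformer literature; a multi-head version would split d_model into several such heads and concatenate their outputs.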