From 7b02940ca4c72fd0969a30a8b4d9e31c9b07f46e Mon Sep 17 00:00:00 2001
From: EIFY
Date: Mon, 1 Aug 2022 11:20:22 -0700
Subject: [PATCH] "added by us" placement

It's more common and easier to follow to put the participial phrase after the noun, I think.
---
 train/tr11-176B-ml/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/train/tr11-176B-ml/README.md b/train/tr11-176B-ml/README.md
index 34ebed8a..ef920713 100644
--- a/train/tr11-176B-ml/README.md
+++ b/train/tr11-176B-ml/README.md
@@ -335,7 +335,7 @@ Thus: `sqrt(1/(14336*3)) = 0.00482197968631537`
 
 ### Positional Encoding
 
-We use the added by us AliBi implementation:
+We use the AliBi implementation added by us:
 
 ```
 --position-embedding-type alibi \
@@ -345,7 +345,7 @@ Paper: [Train Short, Test Long: Attention with Linear Biases Enables Input Lengt
 
 ### Embed LayerNorm
 
-We use the added by us embedding layer norm which makes the training more stable at a small training slowdown cost and a tiny additional amount of memory.
+We use the embedding layer norm added by us which makes the training more stable at a small training slowdown cost and a tiny additional amount of memory.
 
 ```
 --embed-layernorm \