From e3acea5e87fab027fbc347beaf7a234b351ee9d3 Mon Sep 17 00:00:00 2001
From: Chenqqian Zhang <100290172+Chengqian-Zhang@users.noreply.github.com>
Date: Fri, 31 May 2024 18:18:21 +0800
Subject: [PATCH] docs: add document equations for `se_atten_v2` (#3828)

Solve issue #3139
`"se_atten_v2"` is inherited from `"se_atten"` with the following
parameter modifications:

```json
      "tebd_input_mode": "strip",
      "smooth_type_embedding": true,
      "set_davg_zero": false
```

This PR adds the equations for the `"tebd_input_mode"` parameter.
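
For reviewers, a minimal NumPy sketch of the two `tebd_input_mode` variants documented in this patch. It is illustrative only and is not the DeePMD-kit implementation: `make_net` is a hypothetical one-layer stand-in for the embedding network, the sizes are arbitrary, and the sketch assumes the radial and type parts in `"strip"` mode go through separate networks (the equations use the same symbol for both).

```python
import numpy as np

rng = np.random.default_rng(0)


def make_net(in_dim, out_dim):
    """Hypothetical one-layer stand-in for an embedding network."""
    w = rng.normal(size=(in_dim, out_dim))
    b = rng.normal(size=out_dim)
    return lambda x: np.tanh(x @ w + b)


# Arbitrary illustrative sizes: neighbors, type-embedding width, output width.
n_nei, tebd_dim, m = 4, 3, 8
s = rng.uniform(size=(n_nei, 1))           # s(r_ij) for each neighbor j
a_i = rng.normal(size=(tebd_dim,))         # type embedding of the central atom
a_j = rng.normal(size=(n_nei, tebd_dim))   # type embeddings of the neighbors
a_i_rows = np.tile(a_i, (n_nei, 1))        # repeat A^i once per neighbor

# "concat": one network sees s(r_ij) and both type embeddings together.
net_concat = make_net(1 + 2 * tebd_dim, m)
g_concat = net_concat(np.concatenate([s, a_i_rows, a_j], axis=1))

# "strip": the radial part and the type part are embedded separately,
# then combined as G = N(s) + N(s) * (N({A^i, A^j}) * s).
net_radial = make_net(1, m)
net_type = make_net(2 * tebd_dim, m)
g_s = net_radial(s)
g_t = net_type(np.concatenate([a_i_rows, a_j], axis=1))
g_strip = g_s + g_s * (g_t * s)
```

Both variants yield one row of width `m` per neighbor; in the `"strip"` form the type contribution is scaled by $s(r_{ij})$, as in the added equations.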

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Documentation**
  - Detailed the default value and functionality of the
    `"tebd_input_mode"` parameter.
  - Highlighted the performance superiority of `"se_atten_v2"` over
    `"se_atten"`.
  - Specified a model compression requirement for `se_atten_v2`.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Han Wang <92130845+wanghan-iapcm@users.noreply.github.com>
---
 doc/model/train-se-atten.md | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/doc/model/train-se-atten.md b/doc/model/train-se-atten.md
index 5bebdf580d..acd1a500a7 100644
--- a/doc/model/train-se-atten.md
+++ b/doc/model/train-se-atten.md
@@ -21,7 +21,11 @@ Attention-based descriptor $\mathcal{D}^i \in \mathbb{R}^{M \times M_{<}}$, whic
 ```
 
 where $\hat{\mathcal{G}}^i$ represents the embedding matrix $\mathcal{G}^i$ after additional self-attention mechanism and $\mathcal{R}^i$ is defined by the full case in the [`se_e2_a`](./train-se-e2-a.md).
-Note that we obtain $\mathcal{G}^i$ using the type embedding method by default in this descriptor.
+Note that we obtain $\mathcal{G}^i$ using the type embedding method by default in this descriptor. By default, we concatenate $s(r_{ij})$ with the type embeddings of the central and neighboring atoms, $\mathcal{A}^i$ and $\mathcal{A}^j$, and use the result as the input of the embedding network $\mathcal{N}_{e,2}$:
+
+```math
+   (\mathcal{G}^i)_j = \mathcal{N}_{e,2}(\{s(r_{ij}), \mathcal{A}^i, \mathcal{A}^j\})  \quad \mathrm{or}\quad(\mathcal{G}^i)_j = \mathcal{N}_{e,2}(\{s(r_{ij}), \mathcal{A}^j\})
+```
 
 To perform the self-attention mechanism, the queries $\mathcal{Q}^{i,l} \in \mathbb{R}^{N_c\times d_k}$, keys $\mathcal{K}^{i,l} \in \mathbb{R}^{N_c\times d_k}$, and values $\mathcal{V}^{i,l} \in \mathbb{R}^{N_c\times d_v}$ are first obtained:
 
@@ -122,6 +126,16 @@ We highly recommend using the version 2.0 of the attention-based descriptor `"se
       "set_davg_zero": false
 ```
 
+When using the PyTorch backend, you must continue to use the descriptor `"se_atten"` and set `tebd_input_mode` to `"strip"` and `smooth_type_embedding` to `true`, which achieves the effect of `"se_atten_v2"`. `tebd_input_mode` accepts `"concat"` and `"strip"` as values. When using the TensorFlow backend, use the descriptor `"se_atten_v2"`; you do not need to set `tebd_input_mode` or `smooth_type_embedding`, because in the TensorFlow backend they default to `"strip"` and `true`, respectively. When `tebd_input_mode` is set to `"strip"`, the embedding matrix $\mathcal{G}^i$ is constructed as:
+
+```math
+   (\mathcal{G}^i)_j = \mathcal{N}_{e,2}(s(r_{ij})) + \mathcal{N}_{e,2}(s(r_{ij})) \odot (\mathcal{N}_{e,2}(\{\mathcal{A}^i, \mathcal{A}^j\}) \odot s(r_{ij})) \quad \mathrm{or}
+```
+
+```math
+   (\mathcal{G}^i)_j = \mathcal{N}_{e,2}(s(r_{ij})) + \mathcal{N}_{e,2}(s(r_{ij})) \odot (\mathcal{N}_{e,2}(\{\mathcal{A}^j\}) \odot s(r_{ij}))
+```
+
 Practical evidence demonstrates that `"se_atten_v2"` offers better and more stable performance compared to `"se_atten"`.
 
 Notice: Model compression for the `se_atten_v2` descriptor is exclusively designed for models with the training parameter {ref}`attn_layer <model/descriptor[se_atten_v2]/attn_layer>` set to 0.