From 2b448cbbe0e6785e84c5b4dd610da8bfb5755d7d Mon Sep 17 00:00:00 2001
From: Chenqqian Zhang <100290172+Chengqian-Zhang@users.noreply.github.com>
Date: Tue, 2 Jul 2024 03:59:48 -0400
Subject: [PATCH] docs: add document equations for `se_atten_v2` (#3828)

Solve issue #3139

`"se_atten_v2"` is inherited from `"se_atten"` with the following parameter modifications:

```json
      "tebd_input_mode": "strip",
      "smooth_type_embedding": true,
      "set_davg_zero": false
```

I added the equations for the parameter `"tebd_input_mode"`.

## Summary by CodeRabbit

- **Documentation**
  - Detailed the default value and functionality of the `"tebd_input_mode"` parameter.
  - Highlighted the performance superiority of `"se_atten_v2"` over `"se_atten"`.
  - Specified a model compression requirement for `se_atten_v2`.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Han Wang <92130845+wanghan-iapcm@users.noreply.github.com>
(cherry picked from commit e3acea5e87fab027fbc347beaf7a234b351ee9d3)
Signed-off-by: Jinzhe Zeng
---
 doc/model/train-se-atten.md | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/doc/model/train-se-atten.md b/doc/model/train-se-atten.md
index 0ed73fe203..fef910b0a7 100644
--- a/doc/model/train-se-atten.md
+++ b/doc/model/train-se-atten.md
@@ -17,7 +17,11 @@ Attention-based descriptor $\mathcal{D}^i \in \mathbb{R}^{M \times M_{<}}$, whic
 ```
 
 where $\hat{\mathcal{G}}^i$ represents the embedding matrix $\mathcal{G}^i$ after additional self-attention mechanism and $\mathcal{R}^i$ is defined by the full case in the [`se_e2_a`](./train-se-e2-a.md).
-Note that we obtain $\mathcal{G}^i$ using the type embedding method by default in this descriptor.
+Note that we obtain $\mathcal{G}^i$ using the type embedding method by default in this descriptor. By default, we concatenate $s(r_{ij})$ and the type embeddings of the central and neighboring atoms, $\mathcal{A}^i$ and $\mathcal{A}^j$, as the input of the embedding network $\mathcal{N}_{e,2}$:
+
+```math
+    (\mathcal{G}^i)_j = \mathcal{N}_{e,2}(\{s(r_{ij}), \mathcal{A}^i, \mathcal{A}^j\}) \quad \mathrm{or} \quad (\mathcal{G}^i)_j = \mathcal{N}_{e,2}(\{s(r_{ij}), \mathcal{A}^j\})
+```
 
 To perform the self-attention mechanism, the queries $\mathcal{Q}^{i,l} \in \mathbb{R}^{N_c\times d_k}$, keys $\mathcal{K}^{i,l} \in \mathbb{R}^{N_c\times d_k}$, and values $\mathcal{V}^{i,l} \in \mathbb{R}^{N_c\times d_v}$ are first obtained:
 
@@ -118,6 +122,12 @@ We highly recommend using the version 2.0 of the attention-based descriptor `"se
   "set_davg_zero": false
 ```
 
+When using the descriptor `"se_atten_v2"`, you do not need to set `tebd_input_mode` and `smooth_type_embedding` explicitly, because their default values in `"se_atten_v2"` are `"strip"` and `true`, respectively. When `tebd_input_mode` is set to `"strip"`, the embedding matrix $\mathcal{G}^i$ is constructed as:
+
+```math
+    (\mathcal{G}^i)_j = \mathcal{N}_{e,2}(s(r_{ij})) + \mathcal{N}_{e,2}(s(r_{ij})) \odot (\mathcal{N}_{e,2}(\{\mathcal{A}^i, \mathcal{A}^j\}) \odot s(r_{ij})) \quad \mathrm{or} \quad (\mathcal{G}^i)_j = \mathcal{N}_{e,2}(s(r_{ij})) + \mathcal{N}_{e,2}(s(r_{ij})) \odot (\mathcal{N}_{e,2}(\{\mathcal{A}^j\}) \odot s(r_{ij}))
+```
+
 Practical evidence demonstrates that `"se_atten_v2"` offers better and more stable performance compared to `"se_atten"`.
 
 Notice: Model compression for the `se_atten_v2` descriptor is exclusively designed for models with the training parameter {ref}`attn_layer ` set to 0.
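For reference, a minimal `"descriptor"` section of a DeePMD-kit `input.json` using `"se_atten_v2"` could look like the sketch below. The cutoffs, `sel`, and network sizes here are illustrative placeholder values, not recommendations made by this patch; `attn_layer` is set to 0 so that the model-compression note added above applies.

```json
"descriptor": {
  "type": "se_atten_v2",
  "rcut_smth": 0.5,
  "rcut": 6.0,
  "sel": 120,
  "neuron": [25, 50, 100],
  "axis_neuron": 16,
  "attn": 128,
  "attn_layer": 0,
  "attn_dotr": true,
  "attn_mask": false,
  "seed": 1
}
```

With these settings, `tebd_input_mode` defaults to `"strip"` and `smooth_type_embedding` defaults to `true`, so the strip-mode construction of $\mathcal{G}^i$ documented in the second hunk is the one in effect.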