Add 4 pt descriptor compression #4227

cherryWangY · 2024-10-17T05:09:50Z

Descriptors covered: se_a, se_atten (DPA1), se_t, se_r

Summary by CodeRabbit

Release Notes

New Features
- Introduced a model compression feature across multiple descriptor classes, enhancing performance and efficiency.
- Added enable_compression methods to various classes, allowing users to enable and configure compression settings.
Bug Fixes
- Improved error handling for unsupported compression scenarios and parameter validation.
Tests
- Added comprehensive unit tests for new compression functionalities across multiple descriptor classes to ensure accuracy and reliability.
Documentation
- Enhanced documentation for new methods and classes to clarify usage and parameters related to compression.

coderabbitai

Actionable comments posted: 9

🧹 Outside diff range and nitpick comments (5)

deepmd/tf/utils/tabulate.py (1)
76-82: Document the purpose of the hardcoded False parameter in super().__init__.

The last parameter in the superclass initialization is hardcoded as False without any explanation of its purpose. This reduces code maintainability and makes it harder to understand the initialization logic.

Consider adding a comment explaining what this boolean parameter represents, or better yet, make it an explicit named parameter:
         super().__init__(
             descrpt,
             neuron,
             type_one_side,
             exclude_types,
-            False,
+            is_pt=False,  # TF backend: not the PyTorch implementation
         )
deepmd/pt/utils/tabulate.py (4)

31-31: Correct 'it's' to 'its' in docstring possessives

In the class docstring, "it's" should be "its" when indicating possession. Please change "as it's uniform stride" to "as its uniform stride" in both occurrences.

46-46: Update activation function reference in docstring

In the parameter description for activation_fn, the reference to common.ActivationFn may be incorrect. It should reflect the correct module path, such as deepmd.pt.utils.utils.ActivationFn.

471-471: Typo in docstring: 'Then' should be 'The'

The docstring for _n_all_excluded has a typo. It should read "The number of types excluding all types."

21-21: Unused logger 'log'

The logger log is defined but not used in the file. If logging is not required, consider removing it to clean up the code.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 39d6d17 and cb15335.

📒 Files selected for processing (3)

deepmd/pt/utils/tabulate.py (1 hunks)
deepmd/tf/utils/tabulate.py (3 hunks)
deepmd/utils/tabulate.py (1 hunks)

🧰 Additional context used

🪛 Ruff

deepmd/pt/utils/tabulate.py

54-54: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

55-55: Do not perform function call ActivationFn in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

deepmd/utils/tabulate.py

347-347: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)
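
The two default-argument findings (B006, B008) are commonly resolved with a None sentinel. The sketch below is illustrative only: `Activation` and `Tabulator` are hypothetical stand-ins, not the classes in deepmd/pt/utils/tabulate.py, and the "tanh" default is an assumption.

```python
from typing import Optional


class Activation:
    """Hypothetical stand-in for an activation wrapper such as ActivationFn."""

    def __init__(self, name: str) -> None:
        self.name = name


class Tabulator:
    """Hypothetical class; only the default-argument handling matters here."""

    def __init__(
        self,
        exclude_types: Optional[list[list[int]]] = None,
        activation_fn: Optional[Activation] = None,
    ) -> None:
        # Build fresh defaults on every call, so instances never share one
        # list (B006) and no constructor runs at definition time (B008).
        self.exclude_types = [] if exclude_types is None else exclude_types
        self.activation_fn = (
            Activation("tanh") if activation_fn is None else activation_fn
        )


first = Tabulator()
second = Tabulator()
first.exclude_types.append([0, 1])  # must not affect `second`
```

With a literal `[]` default, the append on `first` would also show up in `second`, because both instances would hold the same list object.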

🔇 Additional comments (5)

deepmd/tf/utils/tabulate.py (1)

84-84: Acknowledge static analysis warnings.

The code scanning alerts about overwriting attributes and mixing implicit/explicit returns have been addressed in the previous review comments through the suggested refactoring of the descrpt_type attribute and the _get_descrpt_type method.

Also applies to: 172-183

deepmd/utils/tabulate.py (4)

347-348: Avoid using @lru_cache on instance methods to prevent memory leaks

As previously noted, using functools.lru_cache on instance methods can lead to memory leaks because the cache holds strong references to self, preventing garbage collection of class instances. Consider refactoring _all_excluded to avoid this issue.
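
One possible refactoring is to keep the cache on the instance. The sketch below contrasts the two patterns under CPython; the class names, the constructor arguments, and the body of `all_excluded` are assumptions made for illustration, since the real `_all_excluded` is not shown in this conversation.

```python
import gc
import weakref
from functools import lru_cache


class CachedOnClass:
    """Pattern flagged by B019: one cache shared by all instances."""

    def __init__(self, exclude_types: set[tuple[int, int]], ntypes: int) -> None:
        self.exclude_types = exclude_types
        self.ntypes = ntypes

    @lru_cache(maxsize=None)  # each cache key holds a strong reference to self
    def all_excluded(self, ii: int) -> bool:
        return all((ii, jj) in self.exclude_types for jj in range(self.ntypes))


class CachedOnInstance:
    """Alternative: the cache is a plain dict that dies with its owner."""

    def __init__(self, exclude_types: set[tuple[int, int]], ntypes: int) -> None:
        self.exclude_types = exclude_types
        self.ntypes = ntypes
        self._all_excluded_cache: dict[int, bool] = {}

    def all_excluded(self, ii: int) -> bool:
        if ii not in self._all_excluded_cache:
            self._all_excluded_cache[ii] = all(
                (ii, jj) in self.exclude_types for jj in range(self.ntypes)
            )
        return self._all_excluded_cache[ii]


def survives_deletion(cls: type) -> bool:
    """Return True if an instance is still alive after its last name is deleted."""
    obj = cls({(0, 0)}, 1)
    obj.all_excluded(0)
    ref = weakref.ref(obj)
    del obj
    gc.collect()
    return ref() is not None


leaks = survives_deletion(CachedOnClass)
freed = not survives_deletion(CachedOnInstance)
```

For a value that takes no arguments, `functools.cached_property` gives the same per-instance lifetime with less code.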

444-451: Check for division by zero when rmax equals rmin in _spline5_switch

As previously mentioned, if rmax equals rmin, the expression (rmax - rmin) becomes zero, resulting in a ZeroDivisionError. Ensure that rmax and rmin are not equal, or handle this case appropriately to prevent a runtime error.

418-422: Verify the impact of converting floats to integers

The _convert_numpy_float_to_int method converts floating-point values of self.lower and self.upper to integers. This might lead to loss of precision or unintended truncation. Ensure that this conversion is intentional and does not adversely affect the accuracy of the tabulation results.

243-265: Ensure array indices are within bounds in _build_lower

In the _build_lower method, array slicing is performed using indices calculated from nspline. If nspline is not computed correctly or is out of expected bounds, this could lead to IndexError. Verify that nspline is correctly calculated and that all arrays (vv, dd, d2) have appropriate lengths to prevent out-of-bounds access.

cherryWangY · 2024-10-31T15:54:17Z

The se_t test for the TF backend does not pass. Can you help me find out what the problem is? I compared the code and it seems to be consistent with the original implementation.
@njzjz

coderabbitai

Actionable comments posted: 8

🧹 Outside diff range and nitpick comments (1)

deepmd/tf/utils/tabulate.py (1)
37-37: Add type hints for class attributes.

The class should define type hints for its attributes to improve code maintainability and IDE support.

Add type hints at the class level:
class DPTabulate(BaseTabulate):
    descrpt_type: str
    graph: tf.Graph
    graph_def: tf.GraphDef
    suffix: str
    functype: int
    activation_fn: Callable[[tf.Tensor], tf.Tensor]

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between cb15335 and 15bf3e2.

📒 Files selected for processing (3)

deepmd/pt/utils/tabulate.py (1 hunks)
deepmd/tf/utils/tabulate.py (3 hunks)
deepmd/utils/tabulate.py (1 hunks)

🧰 Additional context used

🪛 Ruff

deepmd/utils/tabulate.py

353-353: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

🔇 Additional comments (11)

deepmd/utils/tabulate.py (4)

19-65: LGTM! Well-structured class initialization.

The class is well-organized with clear documentation and proper use of abstract base class pattern.

353-368: Remove @lru_cache to prevent memory leaks.

The existing review comment about potential memory leaks with @lru_cache on instance methods is still valid.

449-457: ⚠️ Potential issue

Add validation for rmax and rmin parameters.

The method performs division by (rmax - rmin) without validating that they are different.

Add parameter validation:

 def _spline5_switch(self, xx, rmin, rmax):
+    if rmax <= rmin:
+        raise ValueError("rmax must be greater than rmin")
     if xx < rmin:
         vv = 1
     elif xx < rmax:

Likely invalid or redundant comment.
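
For reference, a minimal sketch of a fifth-order switch with the same branch order as the suggestion above. The polynomial is an assumption about the method body, which this conversation does not show. With this branch order the division is only reached when rmin <= xx < rmax, which already implies rmax > rmin, and that may be why the comment was flagged as likely redundant.

```python
def spline5_switch(xx: float, rmin: float, rmax: float) -> float:
    """Smoothly switch from 1 (xx < rmin) to 0 (xx >= rmax)."""
    if xx < rmin:
        return 1.0
    if xx < rmax:
        # Reaching this branch means rmin <= xx < rmax, so rmax - rmin > 0.
        uu = (xx - rmin) / (rmax - rmin)
        # Assumed polynomial: equals 1 at uu = 0 and 0 at uu = 1.
        return uu * uu * uu * (-6 * uu * uu + 15 * uu - 10) + 1
    return 0.0
```

An explicit `rmax <= rmin` check would still be useful as input validation, since a degenerate range silently turns the switch into a step function.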

429-448: ⚠️ Potential issue

Add validation for min_nbor_dist parameter.

The method uses min_nbor_dist in divisions without validating that it is strictly positive.

Add parameter validation:

 def _get_env_mat_range(self, min_nbor_dist):
     """Change the embedding net range to sw / min_nbor_dist."""
+    if min_nbor_dist <= 0:
+        raise ValueError("min_nbor_dist must be positive")
     sw = self._spline5_switch(min_nbor_dist, self.rcut_smth, self.rcut)

Likely invalid or redundant comment.

deepmd/tf/utils/tabulate.py (5)

76-82: LGTM: Proper initialization of the parent class.

The constructor correctly initializes the parent class with the required parameters.

84-84: Consider using a cached property for descriptor type.

The descrpt_type is computed only once during initialization and doesn't change. Using @cached_property would be more idiomatic and consistent with other properties in the class.

30-32: LGTM: Clean import of base class.

The import of BaseTabulate is clean and follows Python import conventions.

172-172: ⚠️ Potential issue

Fix implicit return in _get_descrpt_type.

The method has an implicit return path when no conditions match, which could lead to runtime errors.
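
A minimal sketch of one way to close the implicit return path. The three descriptor classes are hypothetical placeholders and the function is a free-standing stand-in for `_get_descrpt_type`; only the type strings are taken from this conversation.

```python
class DescrptSeA:
    """Hypothetical placeholder for an se_a descriptor."""


class DescrptSeT:
    """Hypothetical placeholder for an se_t descriptor."""


class DescrptSeR:
    """Hypothetical placeholder for an se_r descriptor."""


def get_descrpt_type(descrpt: object) -> str:
    """Map a descriptor object to its type string, failing loudly otherwise."""
    if isinstance(descrpt, DescrptSeA):
        return "A"
    if isinstance(descrpt, DescrptSeT):
        return "T"
    if isinstance(descrpt, DescrptSeR):
        return "R"
    # Without this line the function would fall through and return None.
    raise RuntimeError(f"Unsupported descriptor type: {type(descrpt).__name__}")
```

Raising at the end means an unsupported descriptor fails at construction time instead of surfacing later as a `None` comparison.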

76-82: Verify descriptor type initialization.

The initialization of descrpt_type happens after the parent class initialization. Let's verify that the parent class doesn't depend on this attribute.

Also applies to: 84-84

✅ Verification successful

The initialization order is correct and safe. The parent class BaseTabulate only initializes shared attributes and sets descrpt_type to a default value "Base". The actual descriptor type is determined by the child class's _get_descrpt_type method, which is called after parent initialization. The parent class doesn't use descrpt_type during initialization, and its methods that use this attribute are only called later during the build method.
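
The initialization order described above can be sketched as follows. `BaseTable` and `TfTable` are hypothetical, heavily simplified stand-ins for the two classes, and the return value "T" is an arbitrary example.

```python
from abc import ABC, abstractmethod


class BaseTable(ABC):
    """Simplified parent: sets a placeholder and never reads it in __init__."""

    def __init__(self) -> None:
        self.descrpt_type = "Base"  # placeholder until the subclass overrides it

    @abstractmethod
    def _get_descrpt_type(self) -> str:
        """Return the descriptor type string."""

    def build(self) -> str:
        # The attribute is only read here, long after construction finished.
        return f"building table for descriptor type {self.descrpt_type}"


class TfTable(BaseTable):
    """Simplified child: overrides the placeholder after the parent is set up."""

    def __init__(self) -> None:
        super().__init__()
        self.descrpt_type = self._get_descrpt_type()

    def _get_descrpt_type(self) -> str:
        return "T"


table = TfTable()
message = table.build()
```

The order would only become a problem if the parent constructor started reading `descrpt_type` itself.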

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any usage of descrpt_type in the parent class
rg -l "descrpt_type" | grep -v "tabulate.py$" | while read -r file; do
    echo "=== $file ==="
    rg "descrpt_type" "$file"
done

Length of output: 972

Script:

#!/bin/bash
# Find the parent class implementation and its initialization
ast-grep --pattern 'class $_ {
  $$$
  __init__($$$) {
    $$$
  }
  $$$
}'  deepmd/tf/utils/tabulate.py

Length of output: 105

Script:

#!/bin/bash
# First, let's see the content of tabulate.py to understand the inheritance
cat deepmd/tf/utils/tabulate.py

# Then search for any usage of descrpt_type in the file
rg -A 5 -B 5 "descrpt_type" deepmd/tf/utils/tabulate.py

Length of output: 23892

Script:

#!/bin/bash
# Check the BaseTabulate class implementation
cat deepmd/utils/tabulate.py

Length of output: 16199

deepmd/pt/utils/tabulate.py (2)

515-516: ⚠️ Potential issue

Improve error handling in grad_grad function

The function returns -1 for unsupported function types, while the grad function raises a ValueError. This inconsistency could lead to silent failures.

Make error handling consistent:

    elif functype == 6:
        return y * (1 - y) * (1 - 2 * y)
-   else:
-       return -1
+   raise ValueError(f"Unsupported function type: {functype}")

Likely invalid or redundant comment.
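
To illustrate why a sentinel return is risky, here is a small sketch of the functype 6 branch with a raising fallback. It assumes functype 6 is the sigmoid and that `y` is the activation output; the two-argument signature and the helper functions are hypothetical simplifications.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def grad_grad(y: float, functype: int) -> float:
    """Second derivative of the activation, written in terms of its output y."""
    if functype == 6:  # the only branch quoted in the suggestion above
        return y * (1 - y) * (1 - 2 * y)
    # Raising keeps an unknown functype from leaking a plausible-looking -1.
    raise ValueError(f"Unsupported function type: {functype}")


def second_difference(x: float, h: float = 1e-4) -> float:
    """Central finite-difference estimate of sigmoid''(x), used as a cross-check."""
    return (sigmoid(x + h) - 2 * sigmoid(x) + sigmoid(x - h)) / (h * h)
```

A returned -1 is a valid-looking second derivative, so it would flow into the tabulated coefficients unnoticed, whereas the exception stops the table build immediately.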

321-321: ⚠️ Potential issue

Add division by zero check in layer size calculation

The calculation basic_size // (self.ntypes - self._n_all_excluded) could raise a ZeroDivisionError if all types are excluded.

Add a check:

if self.type_one_side:
+   denominator = self.ntypes - self._n_all_excluded
+   if denominator == 0:
+       raise ValueError("Cannot calculate layer size: all types are excluded")
-   layer_size = basic_size // (self.ntypes - self._n_all_excluded)
+   layer_size = basic_size // denominator

Likely invalid or redundant comment.

coderabbitai

Actionable comments posted: 7

🧹 Outside diff range and nitpick comments (4)

deepmd/utils/tabulate.py (3)
22-65: Enhance constructor documentation with type hints and attribute descriptions.

The constructor's documentation could be improved by:

- Adding type hints for all parameters
- Documenting the purpose of each parameter
- Documenting the attributes that must be initialized in subclasses

Apply this diff to improve the documentation:
     def __init__(
         self,
-        descrpt,
-        neuron,
-        type_one_side,
-        exclude_types,
-        is_pt,
+        descrpt: Any,  # TODO: Add specific type
+        neuron: list[int],
+        type_one_side: bool,
+        exclude_types: set[tuple[int, int]],
+        is_pt: bool,
     ) -> None:
-        """Constructor."""
+        """Initialize the base tabulate class.
+        
+        Parameters
+        ----------
+        descrpt : Any
+            The descriptor object
+        neuron : list[int]
+            List of neurons in each layer
+        type_one_side : bool
+            Whether to use one-sided type
+        exclude_types : set[tuple[int, int]]
+            Set of type pairs to exclude
+        is_pt : bool
+            Whether this is a PyTorch implementation
+
+        Notes
+        -----
+        The following attributes must be initialized in subclasses:
+        - descrpt_type: str
+        - sel_a: list
+        - rcut: float
+        - rcut_smth: float
+        - davg: np.ndarray
+        - dstd: np.ndarray
+        - ntypes: int
+        """
336-423: Enhance abstract method documentation with complete type hints.

The abstract methods would benefit from more detailed documentation and complete type hints.

Example improvement for _get_descrpt_type:
     @abstractmethod
-    def _get_descrpt_type(self):
-        """Get the descrpt type."""
+    def _get_descrpt_type(self) -> str:
+        """Get the descriptor type.
+
+        Returns
+        -------
+        str
+            The type of descriptor. Must be one of:
+            - "Atten"
+            - "A"
+            - "T"
+            - "R"
+            - "AEbdV2"
+        """
         pass
🧰 Tools

🪛 Ruff

354-354: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

1-458: Add unit tests for mathematical operations.

The file contains complex mathematical operations, particularly in the build and _build_lower methods. Consider adding unit tests to verify:

- Correct calculation of spline coefficients
- Proper handling of boundary conditions
- Accuracy of tabulation results

Would you like me to help generate comprehensive unit tests for these mathematical operations?

deepmd/pt/utils/tabulate.py (1)
81-89: Move activation_map to a module-level constant

The activation_map dictionary is defined inside the __init__ method. Since it does not depend on any instance-specific data, defining it at the module level can improve code clarity and prevent it from being recreated with each instance.

You can move activation_map outside the class definition:
# Module-level constant
ACTIVATION_MAP = {
    "tanh": 1,
    "gelu": 2,
    "gelu_tf": 2,
    "relu": 3,
    "relu6": 4,
    "softplus": 5,
    "sigmoid": 6,
}

class DPTabulate(BaseTabulate):
    def __init__(self, ...):
        # Use ACTIVATION_MAP here
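
A short sketch of how such a module-level map might be consumed. The helper `activation_to_functype` and its error message are hypothetical; only the dictionary contents come from the suggestion above.

```python
# Module-level constant, copied from the suggestion above.
ACTIVATION_MAP = {
    "tanh": 1,
    "gelu": 2,
    "gelu_tf": 2,
    "relu": 3,
    "relu6": 4,
    "softplus": 5,
    "sigmoid": 6,
}


def activation_to_functype(name: str) -> int:
    """Translate an activation name into its integer function type."""
    try:
        return ACTIVATION_MAP[name]
    except KeyError:
        # Replace the bare KeyError with a message that names the bad input.
        raise ValueError(f"Unsupported activation function: {name!r}") from None
```

Because the dictionary is built once at import time, every instance shares the same mapping and tests can import it directly.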

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 15bf3e2 and c06f54a.

📒 Files selected for processing (3)

deepmd/pt/utils/tabulate.py (1 hunks)
deepmd/utils/tabulate.py (1 hunks)
source/tests/pt/test_tabulate.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

source/tests/pt/test_tabulate.py


cherryWangY and others added 30 commits June 9, 2024 12:37

yan devel

6bb8795

tabulate_fusion_se_t

7ecd122

tabulate_fusion_all_op_basic_verion

ab670ed

compile safe version

9fc3fb0

compile safe version

cab50c9

Merge branch 'devel' of https://github.com/cherryWangY/deepmd-kit int…

e9ccb98

…o devel pt_tabulate_fusion recommit by Yan.

se_a & se_atten

87909e2

se_r

2225975

remove print

c09a7a7

move pt op test

ee5b64e

Merge remote-tracking branch 'upstream/devel' into devel

c7efbce

[pre-commit.ci] auto fixes from pre-commit.com hooks

763c7b4

for more information, see https://pre-commit.ci

remove print

3fe4b64

fixed for commit

6f76ccf

[pre-commit.ci] auto fixes from pre-commit.com hooks

b89ceed

for more information, see https://pre-commit.ci

fix pull request warning

5a8b77e

fix pr warning

25ca8c1

[pre-commit.ci] auto fixes from pre-commit.com hooks

f1c43f5

for more information, see https://pre-commit.ci

gpu test debug

179e175

merge

34c664c

merge

4cc1478

[pre-commit.ci] auto fixes from pre-commit.com hooks

5921a60

for more information, see https://pre-commit.ci

table_info set cpu

b63209c

remove print

a824a80

[pre-commit.ci] auto fixes from pre-commit.com hooks

8527819

for more information, see https://pre-commit.ci

add dtype=float64

e47dcba

[pre-commit.ci] auto fixes from pre-commit.com hooks

95a9566

for more information, see https://pre-commit.ci

add dtype=float64

114f7a6

[pre-commit.ci] auto fixes from pre-commit.com hooks

9e677f4

for more information, see https://pre-commit.ci

test both float64 and float32

22ae3b7

Signed-off-by: Jinzhe Zeng <[email protected]>

coderabbitai bot reviewed Oct 31, 2024

njzjz reviewed Oct 31, 2024

fix for loop; fix codeql warnings

15bf3e2

njzjz reviewed Oct 31, 2024

coderabbitai bot reviewed Oct 31, 2024

njzjz requested a review from wanghan-iapcm October 31, 2024 20:36

njzjz added the "Test CUDA" label (Trigger test CUDA workflow) Oct 31, 2024

github-actions bot removed the "Test CUDA" label (Trigger test CUDA workflow) Oct 31, 2024

wanghan-iapcm approved these changes Nov 1, 2024

cherryWangY and others added 4 commits November 1, 2024 11:16

add comment at descrpt SeT build()

689748e

Refactor duplicate code in _get_bias and _get_matrix methods

9c7534e

fix device inconsistency

5794248

[pre-commit.ci] auto fixes from pre-commit.com hooks

c06f54a

for more information, see https://pre-commit.ci

coderabbitai bot reviewed Nov 1, 2024

cherryWangY requested a review from njzjz November 1, 2024 06:36

njzjz approved these changes Nov 1, 2024

iProzd approved these changes Nov 1, 2024

Merge branch 'devel' into devel

006c159

njzjz enabled auto-merge November 1, 2024 09:09

njzjz added this pull request to the merge queue Nov 1, 2024

Merged via the queue into deepmodeling:devel with commit 8355947 Nov 1, 2024
60 checks passed

This was referenced Nov 2, 2024

Add compression API to BaseModel and AtomicModel #4298

Merged

Add pt compress commad line #4300

Merged

feat(pt): DPA-2 repinit compress #4329

Merged

This was referenced Nov 12, 2024

fix(pt): fix type annotations for dummy compress op; improve docs #4342

Merged

fix(pt): fix precision #4344

Merged

This was referenced Nov 21, 2024

fix(pt/dp): make strip more efficient #4400

Merged

pd: support dpa1 #4414

Open
