
pd: support dpa1 #4414

Open · wants to merge 68 commits into base: devel

Conversation

@HydrogenSulfate (Contributor) commented Nov 25, 2024

Summary of this PR:

  1. Upload DPA-1 related code.
  2. Merge a large amount of code from the devel branch.
  3. Add all eager composite operators except softmax_grad, p_norm_grad, split_grad, and concat_grad to the composite operator blacklist (https://github.com/deepmodeling/deepmd-kit/pull/4414/files#diff-e678abb052b278f8a479f8d13b839a9ec0effd9923478a850bc13758f918e1e9R134-R148) to significantly improve model execution speed, reducing the overhead relative to PyTorch from roughly 100% to about 10-15% (see the sketch below).
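
For context, here is a rough sketch of what registering such a blacklist can look like. This is an illustration only: the hook names (set_prim_eager_enabled, _set_prim_backward_blacklist) are assumptions about Paddle's experimental prim API and may vary between Paddle versions; the linked diff is the authoritative source.

import paddle
from paddle.framework import core

# Illustrative subset: per item 3 above, every eager composite grad op
# EXCEPT softmax_grad, p_norm_grad, split_grad and concat_grad goes on the
# blacklist, so blacklisted ops keep their fused kernels and only those
# four are decomposed into primitives.
EAGER_COMP_OP_BLACK_LIST = [
    "tanh_grad",
    "gelu_grad",
    "layer_norm_grad",
    # ...remaining grad ops omitted for brevity
]

core.set_prim_eager_enabled(True)  # assumed hook: run eager ops as composites
core._set_prim_backward_blacklist(*EAGER_COMP_OP_BLACK_LIST)  # assumed hook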

Related PR: lanpa/tensorboardX#728

Training curve:

[image: training_curves_comparison_eager_opt]

Accuracy test (left: Paddle, right: PyTorch):

[image]

Related optimizations of the Paddle framework:

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced several new classes for molecular descriptors, including DescrptDPA1, DescrptBlockSeAtten, and LayerNorm, enhancing the modeling capabilities for molecular simulations.
    • Added a new JSON configuration file for model parameters.
    • Implemented new test classes for validating the functionality of the DPAtomicModel and various descriptor classes.
    • Added new test classes for evaluating denoising models, including TestDenoiseModelDPA1 and TestDenoiseModelDPA2.
    • Enhanced the ModelWrapper class to clarify the handling of model parameters and state management.
  • Bug Fixes

    • Improved internal logic for handling model state saving and loading, ensuring consistency in outputs.
  • Documentation

    • Enhanced type hints and return annotations across various classes and methods for better clarity.
  • Tests

    • Expanded the testing framework with new test cases for denoising models and descriptor functionalities, ensuring robust validation of features.
    • Activated previously skipped tests for energy models, improving test coverage.

HydrogenSulfate and others added 30 commits November 2, 2024 11:14
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (3)
deepmd/pd/utils/utils.py (2)

188-198: Add documentation and parameter validation

The context manager implementation is clean and follows best practices. However, consider these improvements:

  1. Add docstring explaining:

    • Purpose and usage of the context manager
    • Parameter descriptions
    • Example usage
  2. Add parameter validation for the name parameter

Here's a suggested implementation:

 @contextmanager
 def nvprof_context(enable_profiler: bool, name: str):
+    """Context manager for NVIDIA profiling using NVTX markers.
+
+    Args:
+        enable_profiler (bool): Flag to enable/disable profiling
+        name (str): Name of the profiling section, must be non-empty
+
+    Example:
+        >>> with nvprof_context(True, "forward_pass"):
+        ...     model.forward(data)
+    """
+    if not isinstance(name, str) or not name.strip():
+        raise ValueError("name must be a non-empty string")
+
     if enable_profiler:
         core.nvprof_nvtx_push(name)

     try:
         yield

     finally:
         if enable_profiler:
             core.nvprof_nvtx_pop()

188-198: Strategic addition for performance analysis

This profiling context manager is a valuable addition that will help:

  1. Debug and optimize the DPA-1 implementation
  2. Compare performance between Paddle and Torch implementations
  3. Support the training curve comparisons mentioned in the PR description

Consider documenting common profiling points in the codebase where this context manager should be used, particularly around the DPA-1 implementation, to ensure consistent performance monitoring.
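
As a concrete illustration, a minimal usage sketch (the flag and step body are hypothetical stand-ins):

from deepmd.pd.utils.utils import nvprof_context

enable_profiling = True  # would normally come from the training configuration

# The with-block shows up as a named NVTX range in the Nsight Systems /
# nvprof timeline when profiling is enabled, and is a no-op otherwise.
with nvprof_context(enable_profiling, "Training step"):
    run_one_step()  # hypothetical stand-in for the actual step body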

deepmd/pd/train/training.py (1)

1218-1227: LGTM! Consider adding input validation.

The function is well-implemented with proper type hints and clear formatting. Consider these optional improvements:

 def format_training_message(
     batch: int,
     wall_time: float,
     eta: Optional[int] = None,
 ):
     """Format a training message."""
+    if batch < 0:
+        raise ValueError("batch must be non-negative")
+    if wall_time < 0:
+        raise ValueError("wall_time must be non-negative")
+    if eta is not None and eta < 0:
+        raise ValueError("eta must be non-negative")
     msg = f"batch {batch:7d}: " f"total wall time = {wall_time:.2f} s"
     if isinstance(eta, int):
         msg += f", eta = {datetime.timedelta(seconds=int(eta))!s}"
     return msg
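
For reference, the expected output of this formatting (a doctest-style sketch, derived from the f-strings above):

>>> format_training_message(batch=100, wall_time=12.34, eta=3600)
'batch     100: total wall time = 12.34 s, eta = 1:00:00'
>>> format_training_message(batch=100, wall_time=12.34)  # eta omitted
'batch     100: total wall time = 12.34 s'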
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 01d4179 and 48b7146.

📒 Files selected for processing (5)
  • .github/workflows/test_cuda.yml (1 hunks)
  • deepmd/pd/__init__.py (0 hunks)
  • deepmd/pd/train/training.py (3 hunks)
  • deepmd/pd/utils/env.py (1 hunks)
  • deepmd/pd/utils/utils.py (3 hunks)
💤 Files with no reviewable changes (1)
  • deepmd/pd/__init__.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • .github/workflows/test_cuda.yml
  • deepmd/pd/utils/env.py
🔇 Additional comments (1)
deepmd/pd/utils/utils.py (1)

6-8: LGTM: Clean import additions

The new imports are well-organized and necessary for the NVIDIA profiling functionality.

Also applies to: 18-20

@@ -673,6 +660,7 @@ def step(_step_id, task_key="Default"):
            # Paddle Profiler
            if enable_profiling:
                core.nvprof_nvtx_push(f"Training step {_step_id}")


⚠️ Potential issue

Fix indentation and add error handling for profiling.

The profiling code has incorrect indentation, and the NVTX push has no matching pop on error paths. Consider wrapping the step body in a try-finally block so the range is always closed, even when the step raises:

-            if enable_profiling:
-                core.nvprof_nvtx_push(f"Training step {_step_id}")
+            if enable_profiling:
+                core.nvprof_nvtx_push(f"Training step {_step_id}")
+            try:
+                ...  # training step body runs here
+            finally:
+                if enable_profiling:
+                    core.nvprof_nvtx_pop()

Committable suggestion skipped: line range outside the PR's diff.

data.pop("@class")
data.pop("type")
variables = data.pop("@variables")
embeddings = data.pop("embeddings")

Check notice

Code scanning / CodeQL

Unused local variable Note

Variable env_mat is not used.
Parameters
----------
extended_coord
The extended coordinates of atoms. shape: nf x (nallx3)

Check notice

Code scanning / CodeQL

Unused local variable Note

Variable nall is not used.
The extended coordinates of atoms. shape: nf x (nallx3)
extended_atype
The extended aotm types. shape: nf x nall x nt
extended_atype_embd

Check notice

Code scanning / CodeQL

Unused local variable Note

Variable nall is not used.
)
prec = PRECISION_DICT[obj.precision]

def check_load_param(ss):

Check notice

Code scanning / CodeQL

Unused local variable Note

Variable prec is not used.
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (7)
deepmd/pd/model/descriptor/env_mat.py (1)

30-30: Consider an alternative norm formulation

The switch to paddle.linalg.norm is correct. If the norm computation turns out to be a hotspot, an equivalent explicit formulation is also possible:

-    length = paddle.linalg.norm(diff, axis=-1, keepdim=True)
+    # Compute norms for all frames in a single operation
+    length = paddle.sqrt(paddle.sum(paddle.square(diff), axis=-1, keepdim=True))
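
Both forms compute the same L2 norm along the last axis; a quick sanity check (the tensor shape is illustrative):

import paddle

diff = paddle.randn([4, 128, 3])  # e.g. nf x (nloc * nnei) x 3
a = paddle.linalg.norm(diff, axis=-1, keepdim=True)
b = paddle.sqrt(paddle.sum(paddle.square(diff), axis=-1, keepdim=True))
assert paddle.allclose(a, b).item()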
deepmd/pd/utils/region.py (1)

80-86: Add docstring to b_to_face_distance function

Consider adding a docstring to explain the function's purpose, parameters, and return value, similar to other functions in this file. This would improve code maintainability and help users understand the mathematical operations being performed.

Example docstring:

def b_to_face_distance(cell):
    """Calculate the perpendicular distances from origin to the faces of simulation cell.

    Parameters
    ----------
    cell : paddle.Tensor
        Simulation cell tensor of shape [batch_size, 3, 3]

    Returns
    -------
    paddle.Tensor
        Distances to faces of shape [batch_size, 3]
    """
deepmd/pd/utils/exclude_mask.py (1)

145-149: Consider adding index validation

The take_along_axis operation assumes valid indices. Consider adding validation to ensure index values are within bounds of ae's dimension 1.

         index = paddle.where(nlist == -1, nall, nlist).reshape([nf, nloc * nnei])
+        # Validate indices are within bounds
+        if paddle.any(index < 0) or paddle.any(index >= ae.shape[1]):
+            raise ValueError("Invalid indices detected in neighbor list")
         type_j = paddle.take_along_axis(ae, axis=1, indices=index).reshape(
             [nf, nloc, nnei]
         )
deepmd/pd/utils/nlist.py (2)

Line range hint 300-317: Consider adding input validation and error handling.

While the implementation is correct, consider adding validation for:

  • Empty input tensors
  • Invalid type indices
  • Mismatched dimensions between nlist and atype

This would make the function more robust against edge cases.

 def nlist_distinguish_types(
     nlist: paddle.Tensor,
     atype: paddle.Tensor,
     sel: list[int],
 ):
+    if paddle.in_dynamic_mode():
+        if nlist.numel() == 0 or atype.numel() == 0:
+            raise ValueError("Empty input tensors are not allowed")
+        if atype.max() >= len(sel):
+            raise ValueError("Invalid type index found in atype")
+        if nlist.shape[0] != atype.shape[0]:
+            raise ValueError("Batch dimensions must match")
     nf, nloc, nnei = nlist.shape

398-404: Add input validation for rcuts and nsels.

Consider validating that:

  • rcuts is in ascending order
  • nsels is in ascending order
  • Both lists have matching lengths
 def build_multiple_neighbor_list(
     coord: paddle.Tensor,
     nlist: paddle.Tensor,
     rcuts: list[float],
     nsels: list[int],
 ) -> dict[str, paddle.Tensor]:
+    if not all(rcuts[i] <= rcuts[i+1] for i in range(len(rcuts)-1)):
+        raise ValueError("rcuts must be in ascending order")
+    if not all(nsels[i] <= nsels[i+1] for i in range(len(nsels)-1)):
+        raise ValueError("nsels must be in ascending order")
+    if len(rcuts) != len(nsels):
+        raise ValueError("rcuts and nsels must have the same length")
deepmd/pd/loss/ener.py (1)

226-226: Consider adding epsilon to prevent division by zero.

The force normalization implementation is correct, but consider adding a small epsilon value to prevent potential division by zero when force magnitudes exactly cancel out the relative_f shift.

-                norm_f = force_label_3.norm(axis=1, keepdim=True) + self.relative_f
+                eps = 1e-7
+                norm_f = force_label_3.norm(axis=1, keepdim=True) + self.relative_f + eps
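
To see the failure mode the epsilon guards against, a minimal sketch (the all-zero force label with relative_f = 0.0 is a hypothetical worst case):

import paddle

force_label_3 = paddle.zeros([4, 3, 1])  # all-zero force labels
relative_f = 0.0
norm_f = force_label_3.norm(axis=1, keepdim=True) + relative_f  # exactly zero
# a subsequent diff_f / norm_f would produce inf/nan; adding eps keeps it finite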
deepmd/pd/model/descriptor/se_atten.py (1)

826-1017: Consider adding additional input validation.

While the implementation is solid, consider adding validation for:

  • Input tensor shapes in the forward method
  • Non-negative scaling_factor
  • Valid range for attnw_shift parameter

Example validation to add at the start of forward method:

def forward(self, query, nei_mask, input_r: Optional[paddle.Tensor] = None,
           sw: Optional[paddle.Tensor] = None, attnw_shift: float = 20.0):
    # Validate input shapes
    batch_size, seq_len, embed_dim = query.shape
    assert embed_dim == self.embed_dim, f"Expected embed_dim {self.embed_dim}, got {embed_dim}"
    assert seq_len == self.nnei, f"Expected sequence length {self.nnei}, got {seq_len}"
    assert attnw_shift > 0, f"attnw_shift must be positive, got {attnw_shift}"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 48b7146 and 4c925f9.

📒 Files selected for processing (9)
  • deepmd/pd/loss/ener.py (1 hunks)
  • deepmd/pd/model/descriptor/env_mat.py (1 hunks)
  • deepmd/pd/model/descriptor/se_atten.py (1 hunks)
  • deepmd/pd/model/model/make_model.py (1 hunks)
  • deepmd/pd/train/training.py (3 hunks)
  • deepmd/pd/utils/decomp.py (2 hunks)
  • deepmd/pd/utils/exclude_mask.py (1 hunks)
  • deepmd/pd/utils/nlist.py (6 hunks)
  • deepmd/pd/utils/region.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • deepmd/pd/train/training.py
🧰 Additional context used
📓 Learnings (1)
deepmd/pd/model/descriptor/se_atten.py (1)
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/model/make_model.py:370-373
Timestamp: 2024-11-12T05:47:21.643Z
Learning: In `deepmd/dpmodel/model/make_model.py`, the variable `nall` assigned but not used is intentional and should not be flagged in future reviews.
🪛 Ruff (0.8.0)
deepmd/pd/model/descriptor/se_atten.py

61-61: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


78-78: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


371-375: Use ternary operator sampled = merged() if callable(merged) else merged instead of if-else-block

(SIM108)


395-395: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


457-457: Local variable nall is assigned to but never used

Remove assignment to unused variable nall

(F841)

🔇 Additional comments (15)
deepmd/pd/model/descriptor/env_mat.py (2)

Line range hint 1-89: Implementation looks robust and well-documented

The environment matrix generation is implemented correctly with proper:

  • Protection against numerical instabilities
  • Clear documentation and type hints
  • Proper tensor shape handling
  • Edge case management

The migration to Paddle backend maintains the scientific accuracy of the calculations.


27-28: Verify tensor gathering operation

The switch to paddle.take_along_axis looks correct. However, let's verify that the tensor shapes and axis alignment are preserved correctly.

✅ Verification successful

Tensor gathering operation is correctly implemented

The verification shows that:

  • paddle.take_along_axis is used consistently across the codebase with similar patterns in multiple files (nlist.py, exclude_mask.py, se_atten.py, make_model.py)
  • All instances follow the same pattern of using axis=1 followed by a reshape operation to maintain tensor dimensions
  • The remaining decomp.take_along_axis occurrences are only in test files, which is expected
  • The implementation in env_mat.py aligns with the established pattern in the codebase
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if paddle.take_along_axis is used consistently across the codebase
# and verify there are no remaining decomp.take_along_axis calls

# Search for any remaining decomp.take_along_axis usage
rg "decomp\.take_along_axis" -l

# Search for paddle.take_along_axis usage patterns
rg "paddle\.take_along_axis.*axis\s*=\s*1" -A 2

Length of output: 1484

deepmd/pd/utils/region.py (2)

81-81: LGTM! Consistent use of paddle.linalg.norm

The replacement of decomp.norm with paddle.linalg.norm is correct and aligns with the standardization efforts to use Paddle's native operations.

Also applies to: 83-83, 85-85


Line range hint 88-89: Clarify status of commented vmap implementation

The commented-out vectorized implementation using paddle.vmap could potentially offer performance benefits. Is this a planned optimization? If so, consider creating an issue to track this enhancement.

deepmd/pd/utils/decomp.py (2)

20-20: LGTM: Export list updated correctly

The addition of "numel" to __all__ is consistent with the new function implementation and removal of deprecated decomposition functions.


Line range hint 1-8: Verify migration strategy for removed decomposition functions

The file header indicates this is a temporary solution until functions are decomposed into primitive functions in the Paddle framework. With the removal of several decomposition functions (softmax_decomp, norm_decomp, take_along_axis_decomp), we should verify that:

  1. The removed functions have equivalent native Paddle implementations
  2. All call sites have been updated to use the native implementations

Let's verify the migration:

✅ Verification successful

Migration to native Paddle functions appears complete and consistent

The verification shows:

  1. The only remaining references to the removed decomposition functions (softmax_decomp, norm_decomp, take_along_axis_decomp) are in test files, which is expected for legacy test coverage.
  2. Native Paddle functions are being used throughout the codebase:
    • paddle.linalg.norm is used in region.py, nlist.py, and descriptor modules
    • paddle.take_along_axis is used in nlist.py, exclude_mask.py, and descriptor modules

The migration appears to be complete with all production code using native Paddle implementations instead of the custom decomposition functions.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining usage of removed decomposition functions
echo "Searching for potential missed migrations..."
rg -g '*.py' -e 'softmax_decomp|norm_decomp|take_along_axis_decomp'

# Search for new paddle native function usage
echo "Verifying migration to native paddle functions..."
rg -g '*.py' -e 'paddle\.(softmax|linalg\.norm|take_along_axis)' -A 2

Length of output: 4654

deepmd/pd/utils/exclude_mask.py (2)

Line range hint 63-91: LGTM! Constructor and reinit method improvements

The changes improve the API and implementation by:

  1. Using tuples to explicitly represent type pairs
  2. Ensuring symmetric exclusion by adding both (i,j) and (j,i)
  3. Adding the no_exclusion optimization flag

146-148: LGTM! Paddle native function migration

The change from decomp.take_along_axis to paddle.take_along_axis aligns with the PR objectives and maintains the same functionality.

deepmd/pd/utils/nlist.py (3)

268-268: LGTM! Consistent with previous changes.

The change to paddle.linalg.norm maintains consistency with other similar changes in the codebase.


506-506: LGTM! Consistent with norm calculation changes.

The change to paddle.norm is consistent with the overall migration from decomp to native Paddle operations.


120-120: LGTM! Consider verifying performance impact.

The change from decomp.norm to paddle.linalg.norm is correct and aligns with Paddle's standard API. Since this operation is in a performance-critical path for neighbor list construction, it would be good to verify that there's no performance regression.

✅ Verification successful

Based on the search results, I can see that decomp.norm appears only in test files (source/tests/pd/test_decomp.py) and not in the actual implementation code. This confirms that the change from decomp.norm to paddle.linalg.norm in deepmd/pd/utils/nlist.py is consistent across the codebase.

Let me verify the imports to ensure the change is complete:


Change from decomp.norm to paddle.linalg.norm is correctly implemented

The verification confirms that:

  1. There are no remaining imports of decomp in the file
  2. paddle.linalg.norm is consistently used in all required locations
  3. No other instances of decomp.norm exist in the codebase except in test files

The change is complete and consistent with the codebase.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for any remaining usage of decomp.norm in the codebase
# to ensure consistent usage of paddle.linalg.norm

rg "decomp\.norm" --type python

Length of output: 64


Script:

#!/bin/bash
# Search for any remaining usage of decomp.norm in the codebase
# using ripgrep without file type filter since python extension wasn't recognized
rg "decomp\.norm"

Length of output: 159


Script:

#!/bin/bash
# Also check for imports of decomp module to ensure consistency
rg "import.*decomp|from.*decomp"

Length of output: 113


Script:

#!/bin/bash
# List Python files to understand the file extension pattern
fd "\.py$"

Length of output: 33145


Script:

#!/bin/bash
# Check imports in the file under review
rg "import.*decomp|from.*decomp" deepmd/pd/utils/nlist.py

Length of output: 59


Script:

#!/bin/bash
# Check paddle.linalg.norm usage in the file
rg "paddle\.linalg\.norm" deepmd/pd/utils/nlist.py

Length of output: 181

deepmd/pd/model/model/make_model.py (1)

459-459: LGTM! Verify numerical equivalence with previous implementation.

The replacement of decomp functions with native paddle operations looks correct and aligns with the PR objectives. The changes maintain the same functionality while potentially improving performance by using Paddle's optimized implementations.

Let's verify the numerical equivalence between the old and new implementations:

Also applies to: 463-463, 469-469

✅ Verification successful

Let me analyze the test results to verify the numerical equivalence of the neighbor list operations:

  1. The codebase has extensive test coverage for neighbor list operations across different backends (paddle, pytorch, tensorflow, jax)
  2. Specifically for paddle backend, there are tests in source/tests/pd/model/test_nlist.py that verify:
    • Numerical correctness of neighbor list building
    • Comparison between different implementations
    • Edge cases like padding and truncation

Let me check the specific test cases for take_along_axis and linalg.norm operations:


Let me try a different search strategy to find relevant test cases:


Based on the search results, I can now make a definitive assessment:

Verified: Changes maintain numerical equivalence with previous implementation

The changes are safe because:

  1. The test suite includes explicit numerical equivalence tests for both operations:

    • test_norm_decomp verifies paddle.linalg.norm produces identical results to decomp.norm
    • test_take_along_axis_decomp verifies paddle.take_along_axis produces identical results to decomp.take_along_axis
  2. The neighbor list functionality is thoroughly tested with numerical assertions:

    • Tests verify exact numerical equality between different implementations using np.testing.assert_allclose
    • Both sorted and unsorted neighbor lists are compared for correctness
    • Edge cases with different neighbor list sizes are covered

The native Paddle operations are direct replacements that maintain the same numerical behavior as their decomp counterparts.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for test files that verify the numerical correctness of neighbor list operations
# Expected: Find test files comparing paddle and torch implementations

# Look for test files related to neighbor list
fd -e py test_nlist

# Search for specific test patterns in these files
rg -A 5 "def test.*nlist" 

# Look for existing numerical comparison tests
rg -i "numerical.*comparison|assert.*allclose"

Length of output: 57175


Script:

#!/bin/bash
# Search for tests verifying take_along_axis and linalg.norm operations
rg -B 2 -A 5 "take_along_axis|linalg\.norm" source/tests/pd/model/test_nlist.py source/tests/pd/model/test_dp_model.py

Length of output: 120


Script:

#!/bin/bash
# Search for test files and functions that verify format_nlist functionality
# This should cover the operations we're interested in since they're used in format_nlist

# Look for format_nlist test implementations
rg -B 2 -A 10 "def test.*format_nlist" source/tests/pd/

# Look for assertions comparing nlist implementations
rg -B 2 -A 5 "assert.*nlist" source/tests/pd/model/test_nlist.py

# Search for actual usage of these paddle operations in tests
rg -B 2 -A 5 "paddle\.(take_along_axis|linalg\.norm)" source/tests/pd/

Length of output: 2045

deepmd/pd/model/descriptor/se_atten.py (3)

1-50: LGTM! Well-organized imports.

The imports are properly organized and all are utilized in the implementation.


576-717: LGTM! Well-implemented attention mechanism.

The NeighborGatedAttention class provides a robust implementation of multi-layer attention with proper error handling, type checking, and serialization support.


719-824: LGTM! Well-structured attention layer.

The NeighborGatedAttentionLayer class follows best practices with proper residual connections and layer normalization.

Comment on lines +130 to +134

def numel(x: paddle.Tensor) -> int:
    if paddle.in_dynamic_mode():
        return np.prod(x.shape)

    return paddle.numel(x)

💡 Codebase verification

🛠️ Refactor suggestion

Consider using math.prod instead of np.prod for better performance

Based on the codebase analysis, I noticed that other parts of the codebase use math.prod instead of numpy.prod for shape calculations. This is a more lightweight solution that doesn't require numpy overhead. Here's the suggested improvement:

  1. Replace np.prod with math.prod as it's already being used elsewhere in the codebase (e.g., in deepmd/dpmodel/descriptor/dpa1.py and deepmd/dpmodel/atomic_model/base_atomic_model.py)
  2. Add docstring for better code clarity
+import math
 def numel(x: paddle.Tensor) -> int:
+    """Calculate the total number of elements in a tensor.
+
+    Parameters
+    ----------
+    x : paddle.Tensor
+        Input tensor
+
+    Returns
+    -------
+    int
+        Total number of elements in the tensor
+    """
     if paddle.in_dynamic_mode():
-        return np.prod(x.shape)
+        return math.prod(x.shape)
 
     return paddle.numel(x)
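
A quick check of the suggested replacement (math.prod is available in Python >= 3.8):

import math

import numpy as np

shape = [4, 128, 3]
# math.prod returns a plain int; np.prod returns a numpy scalar
assert math.prod(shape) == int(np.prod(shape)) == 1536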
🔗 Analysis chain

Add documentation and consider performance optimization

The numel function implementation could benefit from several improvements:

  1. Missing docstring explaining the different behavior between dynamic and static modes
  2. Using numpy for shape calculation in dynamic mode might be less efficient than using paddle's native functionality
  3. Missing type hints for the return value

Consider applying these improvements:

-def numel(x: paddle.Tensor) -> int:
+def numel(x: paddle.Tensor) -> int:
+    """Calculate the total number of elements in a tensor.
+
+    Parameters
+    ----------
+    x : paddle.Tensor
+        Input tensor
+
+    Returns
+    -------
+    int
+        Total number of elements in the tensor
+
+    Note
+    ----
+    Behavior differs between dynamic and static modes:
+    - In dynamic mode: Uses numpy to calculate product of shape
+    - In static mode: Uses paddle.numel directly
+    """
     if paddle.in_dynamic_mode():
-        return np.prod(x.shape)
+        # TODO: Investigate if there's a more efficient paddle-native solution
+        return int(np.prod(x.shape))
 
     return paddle.numel(x)

Let's verify if paddle provides a native solution for dynamic mode:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any existing paddle implementations of element counting in dynamic mode
rg -g '*.py' -l 'numel.*dynamic' 

Length of output: 34


Script:

#!/bin/bash
# Let's search for numel implementations and usages in paddle-related code
rg -g '*.py' -A 5 'numel.*tensor|tensor.*numel'

Length of output: 49


Script:

#!/bin/bash
# Search for shape-related operations in dynamic mode
rg -g '*.py' -A 5 'in_dynamic_mode.*shape|shape.*dynamic_mode'

Length of output: 64


Script:

#!/bin/bash
# Look for similar tensor size calculation patterns
rg -g '*.py' 'prod.*shape|size.*tensor'

Length of output: 1153


Script:

#!/bin/bash
# Check if there are any other implementations of numel in the codebase
fd -e py -x ast-grep --pattern 'def numel($$$)'

Length of output: 49

rcut_smth: float,
sel: Union[list[int], int],
ntypes: int,
neuron: list = [25, 50, 100],

⚠️ Potential issue

Fix mutable default arguments in constructor.

Using mutable objects as default arguments can lead to unexpected behavior in Python. The lists used as default values for neuron and exclude_types parameters are created once when the function is defined and shared between all calls.

Apply this diff to fix the issue:

-    def __init__(
-        self,
-        rcut: float,
-        rcut_smth: float,
-        sel: Union[list[int], int],
-        ntypes: int,
-        neuron: list = [25, 50, 100],
-        axis_neuron: int = 16,
-        exclude_types: list[tuple[int, int]] = [],
+    def __init__(
+        self,
+        rcut: float,
+        rcut_smth: float,
+        sel: Union[list[int], int],
+        ntypes: int,
+        neuron: Optional[list[int]] = None,
+        axis_neuron: int = 16,
+        exclude_types: Optional[list[tuple[int, int]]] = None,

Then initialize the defaults at the beginning of the method:

if neuron is None:
    neuron = [25, 50, 100]
if exclude_types is None:
    exclude_types = []

Also applies to: 78-78
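
For readers unfamiliar with this pitfall, a minimal demonstration of the shared-default behavior:

def append_one(xs: list = []):  # the default list is created once, at def time
    xs.append(1)
    return xs

print(append_one())  # [1]
print(append_one())  # [1, 1] -- state leaks between calls via the shared default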

🧰 Tools
🪛 Ruff (0.8.0)

61-61: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (21)
deepmd/pd/model/model/ener_model.py (1)

Line range hint 1-144: Consider documenting the performance implications

As part of the DPA-1 support implementation and the optimization efforts in the Paddle backend, it would be valuable to:

  1. Document the performance improvements from removing deepcopy operations
  2. Add comments explaining why direct assignments are now safe
  3. Consider adding this pattern to the project's best practices guide if it proves beneficial

This will help other contributors understand the reasoning behind these changes and apply similar optimizations consistently.

deepmd/pd/utils/exclude_mask.py (1)

71-71: Approve type change and fix mutable default

The change to list[tuple[int, int]] improves type safety by explicitly representing excluded pairs. However, the mutable default argument should be addressed.

Apply this change to both methods:

    def __init__(
        self,
        ntypes: int,
-       exclude_types: list[tuple[int, int]] = [],
+       exclude_types: list[tuple[int, int]] | None = None,
    ) -> None:
        super().__init__()
        self.reinit(ntypes, exclude_types)

    def reinit(
        self,
        ntypes: int,
-       exclude_types: list[tuple[int, int]] = [],
+       exclude_types: list[tuple[int, int]] | None = None,
    ) -> None:
+       if exclude_types is None:
+           exclude_types = []

Also applies to: 79-79

source/tests/pd/model/test_env_mat.py (1)

25-25: Consider making the test case mixin structure more explicit.

While the type hint addition is correct, this class appears to be a test case mixin but doesn't follow typical Python test patterns. Consider:

  1. Adding a descriptive docstring explaining its mixin nature
  2. Renaming to TestCaseSingleFrameWithNlistMixin to make its purpose clear
  3. Using abc.ABC to make it an explicit abstract base class
source/tests/pd/model/test_forward_lower.py (1)

Line range hint 165-171: Consider documenting the test implementation roadmap

The skip decorators effectively mark unimplemented tests, which is good practice. However, since this is part of a larger implementation effort:

  1. Consider adding TODO comments with references to related issues/PRs for each skipped test class
  2. TestEnergyModelDPA1 is not skipped, which aligns with this PR's focus on DPA-1 support

Example enhancement:

 @unittest.skip("Skip for not implemented yet")
 class TestEnergyModelDPA2(unittest.TestCase, ForwardLowerTest):
+    # TODO: Implement after PR #4302 is merged
+    # Related issue: <issue_number>
     def setUp(self):

Also applies to: 173-184, 186-197, 199-211, 213-226, 228-241

source/tests/pd/model/test_smooth.py (2)

157-166: Consider adding docstring to explain the test class purpose and parameters.

The class implementation looks good, but would benefit from documentation explaining:

  • The purpose of DPA1 model testing
  • Why type_split is set to True
  • The reasoning behind the specific epsilon and aprec values
  • What "less degree of smoothness" means in this context

168-178: Consider reducing code duplication with TestEnergyModelDPA1.

The setup is largely identical to TestEnergyModelDPA1 except for pair_exclude_types. Consider:

  1. Moving common setup to a base class
  2. Using parameterized tests

Example refactor:

class TestEnergyModelDPA1Base(unittest.TestCase, SmoothTest):
    def setUp(self, pair_exclude_types=None):
        model_params = copy.deepcopy(model_dpa1)
        if pair_exclude_types:
            model_params["pair_exclude_types"] = pair_exclude_types
        self.type_split = True
        self.model = get_model(model_params).to(env.DEVICE)
        self.epsilon = 1e-5
        self.aprec = 1e-5

class TestEnergyModelDPA1Excl1(TestEnergyModelDPA1Base):
    def setUp(self):
        super().setUp(pair_exclude_types=[[0, 1]])
source/tests/pd/common.py (1)

23-24: Add type hints and docstring for better maintainability.

The function works correctly but could benefit from better documentation and type safety.

Consider applying these improvements:

-def j_loader(filename):
+def j_loader(filename: Union[str, pathlib.Path]) -> dict:
+    """Load JSON test data from the tests directory.
+    
+    Parameters
+    ----------
+    filename : Union[str, pathlib.Path]
+        Name of the JSON file relative to the tests directory
+        
+    Returns
+    -------
+    dict
+        Loaded JSON data
+    """
     return dp_j_loader(tests_path / filename)
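
A hypothetical call for illustration (the filename is made up; paths are resolved against tests_path):

jdata = j_loader("model/water/se_atten.json")  # hypothetical fixture path
assert isinstance(jdata, dict)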
deepmd/pd/model/network/network.py (2)

83-98: Fix docstring to reference Paddle types instead of PyTorch

The docstring contains references to PyTorch types which should be updated to their Paddle equivalents:

  • torch.device should be paddle.base.libpaddle.Place
  • torch.Tensor should be paddle.Tensor

Apply this diff to fix the documentation:

        Parameters
        ----------
-       device : torch.device
+       device : paddle.base.libpaddle.Place
            The device on which to perform the computation.

        Returns
        -------
-       type_embedding : torch.Tensor
+       type_embedding : paddle.Tensor
            The full type embeddings of all types. The last index corresponds to the zero padding.

100-104: Improve docstring while keeping the correct return type annotation

The -> None return type annotation is correct. However, the docstring could be enhanced to better document the parameters.

Consider adding parameter documentation:

    def share_params(self, base_class, shared_level, resume=False) -> None:
        """
        Share the parameters of self to the base_class with shared_level during multitask training.
        If not start from checkpoint (resume is False),
        some separated parameters (e.g. mean and stddev) will be re-calculated across different classes.
+
+       Parameters
+       ----------
+       base_class : TypeEmbedNet
+           The base class instance to share parameters with
+       shared_level : int
+           Level of parameter sharing (currently only 0 is supported)
+       resume : bool, optional
+           Whether to resume from a checkpoint, by default False
        """
deepmd/pd/model/atomic_model/dp_atomic_model.py (1)

214-244: Consider adding parameter validation

While the implementation and documentation are good, consider adding validation for the input parameters to ensure they are within reasonable ranges:

  • min_nbor_dist should be positive
  • table_stride_1 and table_stride_2 should be positive
  • table_extrapolate should be within a reasonable range
 def enable_compression(
     self,
     min_nbor_dist: float,
     table_extrapolate: float = 5,
     table_stride_1: float = 0.01,
     table_stride_2: float = 0.1,
     check_frequency: int = -1,
 ) -> None:
+    if min_nbor_dist <= 0:
+        raise ValueError("min_nbor_dist must be positive")
+    if table_stride_1 <= 0 or table_stride_2 <= 0:
+        raise ValueError("table_stride values must be positive")
+    if table_extrapolate <= 0:
+        raise ValueError("table_extrapolate must be positive")
     self.descriptor.enable_compression(
         min_nbor_dist,
         table_extrapolate,
         table_stride_1,
         table_stride_2,
         check_frequency,
     )
deepmd/pd/model/descriptor/dpa1.py (6)

531-535: Remove unused variable env_mat

The env_mat variable is assigned but never used in the deserialization process.

Apply this diff to remove the unused variable:

     embeddings = data.pop("embeddings")
     type_embedding = data.pop("type_embedding")
     attention_layers = data.pop("attention_layers")
-    env_mat = data.pop("env_mat")
+    data.pop("env_mat")  # Not used in deserialization
     tebd_input_mode = data["tebd_input_mode"]
🧰 Tools
🪛 Ruff (0.8.0)

534-534: Local variable env_mat is assigned to but never used

Remove assignment to unused variable env_mat

(F841)


633-634: Remove unused variable nall

The nall variable is assigned but never used in the forward method.

Apply this diff to remove the unused variable:

     nframes, nloc, nnei = nlist.shape
-    nall = extended_coord.reshape([nframes, -1]).shape[1] // 3
     g1_ext = self.type_embedding(extended_atype)
🧰 Tools
🪛 Ruff (0.8.0)

633-633: Local variable nall is assigned to but never used

Remove assignment to unused variable nall

(F841)


586-587: Consider implementing model compression

The enable_compression method is currently not implemented in the Paddle backend. This could impact performance optimization opportunities.

Would you like me to help implement the model compression functionality based on other backend implementations?


629-657: Consider optimizing tensor operations in forward method

The forward method performs multiple type conversions and tensor operations. Consider optimizing by:

  1. Batching the type conversions
  2. Minimizing memory allocations
  3. Reusing tensors where possible

Example optimization:

     # cast the input to internal precision
     extended_coord = extended_coord.to(dtype=self.prec)
+    # Pre-allocate output tensors with correct precision
+    output_precision = env.GLOBAL_PD_FLOAT_PRECISION
+    outputs = []
     del mapping
     nframes, nloc, nnei = nlist.shape
     g1_ext = self.type_embedding(extended_atype)
     g1_inp = g1_ext[:, :nloc, :]
     if self.tebd_input_mode in ["strip"]:
         type_embedding = self.type_embedding.get_full_embedding(g1_ext.place)
     else:
         type_embedding = None
     g1, g2, h2, rot_mat, sw = self.se_atten(
         nlist,
         extended_coord,
         extended_atype,
         g1_ext,
         mapping=None,
         type_embedding=type_embedding,
     )
     if self.concat_output_tebd:
         g1 = paddle.concat([g1, g1_inp], axis=-1)
 
-    return (
-        g1.to(dtype=env.GLOBAL_PD_FLOAT_PRECISION),
-        rot_mat.to(dtype=env.GLOBAL_PD_FLOAT_PRECISION),
-        g2.to(dtype=env.GLOBAL_PD_FLOAT_PRECISION) if g2 is not None else None,
-        h2.to(dtype=env.GLOBAL_PD_FLOAT_PRECISION),
-        sw.to(dtype=env.GLOBAL_PD_FLOAT_PRECISION),
-    )
+    # Batch convert tensors to output precision
+    outputs = [tensor.to(dtype=output_precision) if tensor is not None else None 
+              for tensor in [g1, rot_mat, g2, h2, sw]]
+    return tuple(outputs)
🧰 Tools
🪛 Ruff (0.8.0)

633-633: Local variable nall is assigned to but never used

Remove assignment to unused variable nall

(F841)


4-4: Add unit tests as suggested by TODO comment

The TODO comment at line 4 indicates missing tests. The compute_input_stats method and other critical functionality should be thoroughly tested.

Would you like me to help generate comprehensive unit tests for this class? The tests would cover:

  1. Input validation
  2. Tensor shape verification
  3. Edge cases
  4. Numerical accuracy

Also applies to: 410-431


375-401: Consider expanding shared_level options

The share_params method only supports two levels (0 and 1) of parameter sharing. Consider adding more granular sharing options for better flexibility in multitask training.

Example additional levels:

  • Level 2: Share only attention parameters
  • Level 3: Share only embedding parameters
  • Level 4: Share only normalization parameters
deepmd/pd/model/task/fitting.py (3)

302-302: Handle case_embd being None during serialization

When serializing, if self.case_embd is None, using to_numpy_array(self.case_embd) may cause errors. It's important to handle None values appropriately.

Proposed change:

"@variables": {
    "bias_atom_e": to_numpy_array(self.bias_atom_e),
-   "case_embd": to_numpy_array(self.case_embd),
+   "case_embd": to_numpy_array(self.case_embd) if self.case_embd is not None else None,
    ...
},

Line range hint 324-330: Handle None values during deserialization

In the deserialize method, converting None with to_paddle_tensor can result in errors. It's advisable to check for None before conversion.

Proposed change:

for kk in variables.keys():
-   obj[kk] = to_paddle_tensor(variables[kk])
+   obj[kk] = to_paddle_tensor(variables[kk]) if variables[kk] is not None else None

522-524: Simplify redundant addition operation

Since outs is initialized as zeros, including it in the addition is unnecessary. Simplifying the expression improves efficiency.

Proposed change:

- outs = outs + atom_property + self.bias_atom_e[atype].to(self.prec)
+ outs = atom_property + self.bias_atom_e[atype].to(self.prec)
deepmd/pd/model/descriptor/se_atten.py (2)

367-371: Simplify conditional assignment using a ternary operator

You can simplify the if statement by using a ternary operator, making the code more concise:

sampled = merged() if callable(merged) else merged
🧰 Tools
🪛 Ruff (0.8.0)

367-371: Use ternary operator sampled = merged() if callable(merged) else merged instead of if-else-block

(SIM108)


457-457: Remove unused variable nall

The variable nall is assigned but not used in the forward method. Consider removing it to clean up the code.

🧰 Tools
🪛 Ruff (0.8.0)

457-457: Local variable nall is assigned to but never used

Remove assignment to unused variable nall

(F841)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 4c925f9 and dd3191a.

📒 Files selected for processing (19)
  • deepmd/pd/model/atomic_model/dp_atomic_model.py (4 hunks)
  • deepmd/pd/model/descriptor/dpa1.py (1 hunks)
  • deepmd/pd/model/descriptor/se_atten.py (1 hunks)
  • deepmd/pd/model/model/ener_model.py (1 hunks)
  • deepmd/pd/model/network/network.py (3 hunks)
  • deepmd/pd/model/task/fitting.py (12 hunks)
  • deepmd/pd/utils/exclude_mask.py (3 hunks)
  • source/tests/pd/common.py (3 hunks)
  • source/tests/pd/model/test_descriptor.py (1 hunks)
  • source/tests/pd/model/test_env_mat.py (2 hunks)
  • source/tests/pd/model/test_forward_lower.py (1 hunks)
  • source/tests/pd/model/test_null_input.py (2 hunks)
  • source/tests/pd/model/test_permutation.py (5 hunks)
  • source/tests/pd/model/test_rot.py (0 hunks)
  • source/tests/pd/model/test_smooth.py (2 hunks)
  • source/tests/pd/model/test_trans.py (0 hunks)
  • source/tests/pd/test_decomp.py (0 hunks)
  • source/tests/pd/test_finetune.py (0 hunks)
  • source/tests/pd/test_training.py (6 hunks)
💤 Files with no reviewable changes (4)
  • source/tests/pd/model/test_rot.py
  • source/tests/pd/model/test_trans.py
  • source/tests/pd/test_decomp.py
  • source/tests/pd/test_finetune.py
🧰 Additional context used
📓 Learnings (2)
deepmd/pd/model/descriptor/dpa1.py (1)
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/model/make_model.py:370-373
Timestamp: 2024-11-12T05:47:21.643Z
Learning: In `deepmd/dpmodel/model/make_model.py`, the variable `nall` assigned but not used is intentional and should not be flagged in future reviews.
deepmd/pd/model/descriptor/se_atten.py (1)
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/model/make_model.py:370-373
Timestamp: 2024-11-12T05:47:21.643Z
Learning: In `deepmd/dpmodel/model/make_model.py`, the variable `nall` assigned but not used is intentional and should not be flagged in future reviews.
🪛 Ruff (0.8.0)
deepmd/pd/model/descriptor/dpa1.py

218-218: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


230-230: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


534-534: Local variable env_mat is assigned to but never used

Remove assignment to unused variable env_mat

(F841)


633-633: Local variable nall is assigned to but never used

Remove assignment to unused variable nall

(F841)

deepmd/pd/model/descriptor/se_atten.py

61-61: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


78-78: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


367-371: Use ternary operator sampled = merged() if callable(merged) else merged instead of if-else-block

(SIM108)


391-391: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


457-457: Local variable nall is assigned to but never used

Remove assignment to unused variable nall

(F841)

deepmd/pd/utils/exclude_mask.py

25-25: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


78-78: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

🔇 Additional comments (45)
source/tests/pd/model/test_null_input.py (2)

25-25: LGTM: Import addition aligns with DPA-1 support

The addition of model_dpa1 import is consistent with the PR's objective to implement DPA-1 support.


98-102: LGTM: Well-structured test class implementation

The new test class follows good testing practices:

  • Proper inheritance from both unittest.TestCase and NullTest
  • Consistent setup pattern with existing test classes
  • Inherits comprehensive null input test cases

Could you verify why type_split is set to True for DPA-1 while it's False for SeA? Let's check the model configurations:

✅ Verification successful

Let me search for more context about type_split in the model configurations to understand if this difference is by design.


Let me try to find model configuration files in a different location.


Let me check the model implementation to understand the type_split parameter's purpose.


Let me try a different search pattern to find the model implementation details.


Based on the consistent pattern observed across multiple test files (test_permutation.py, test_trans.py, test_rot.py, test_autodiff.py, test_smooth.py), I can now provide a verified response:

Setting type_split=True for DPA-1 model is correct and consistent with the codebase

The codebase shows a clear pattern where:

  • All DPA (attention-based Deep Potential) model variants (DPA-1, DPA-2, Hybrid) consistently use type_split=True
  • SeA (SE-A) model variants consistently use type_split=False

This is a model-specific configuration that's consistently applied across all test suites (null, permutation, rotation, transformation, autodiff, smooth), indicating it's an intentional architectural difference between the model families.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for type_split configuration in model definitions
rg -A 5 "type_split" source/tests/pd/model/test_permutation.py

Length of output: 1926


Script:

#!/bin/bash
# Search for type_split in model configuration files
rg "type_split" source/tests/pd/model/model_param.py

Length of output: 129


Script:

#!/bin/bash
# First, find potential model configuration files
fd "model.*\.py" source/tests/pd/

# Then search for type_split in model implementations
rg -A 5 "type_split" source/tests/pd/ --type py

Length of output: 19627


Script:

#!/bin/bash
# Search for type_split usage in model implementation
ast-grep --pattern 'def $_($$$ type_split $$$) {$$$}' 

Length of output: 55


Script:

#!/bin/bash
# Search for model implementation files and type_split usage
fd "dp.*model.*\.py" source/pd/
rg "type_split" source/pd/ --type py

Length of output: 219

deepmd/pd/model/model/ener_model.py (3)

33-33: LGTM: Type annotation addition improves code clarity

The addition of the return type annotation -> None for the __init__ method follows Python type hinting best practices.


Line range hint 54-63: LGTM: Improved type annotations in method signatures

The forward methods now have more precise type hints, which improves code documentation and enables better static type checking. The implementation changes are consistent with the output definition modifications.

Also applies to: 93-103


40-52: Verify the safety of removing deepcopy

The direct assignment of dictionary values instead of using deepcopy could improve performance, but we should verify that this change doesn't introduce any unintended side effects where shared references could be problematic.

✅ Verification successful

Based on the search results, I can make a determination about the safety of removing deepcopy. Let me generate the final response:

Removal of deepcopy is safe in this context

The direct assignment of dictionary values from out_def_data is safe because:

  1. The output_def dictionary is constructed locally within the translated_output_def method and not shared elsewhere
  2. The source values from out_def_data are already immutable or have their own independent copies through the model_output_def() call
  3. The squeeze operations are performed on the values after assignment, modifying only the local copies
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any tests or documentation that might indicate the necessity of deep copying these values
rg -A 5 "translated_output_def|output_def" --type py

# Look for any similar patterns in the codebase where deepcopy is still being used
ast-grep --pattern 'from copy import deepcopy'

Length of output: 101347

deepmd/pd/utils/exclude_mask.py (3)

Line range hint 80-95: LGTM! Good improvements to exclusion handling

The changes improve the implementation by:

  1. Adding bidirectional exclusion handling (both (i,j) and (j,i))
  2. Adding validation for tuple length
  3. Introducing no_exclusion flag for optimization
🧰 Tools
🪛 Ruff (0.8.0)

70-70: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


78-78: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


137-137: LGTM! Good optimizations and Paddle integration

The changes improve the implementation by:

  1. Using native paddle.take_along_axis instead of decomp.take_along_axis
  2. Adding early return optimization for the common case of no exclusions
  3. Ensuring consistent mask type through boolean casting

Also applies to: 144-146


144-146: Verify consistent replacement of decomp.take_along_axis

Let's ensure all usages of decomp.take_along_axis have been consistently replaced with paddle.take_along_axis across the codebase.

✅ Verification successful

All instances of take_along_axis are consistently using paddle implementation

The verification shows that all instances of take_along_axis in the codebase are using the paddle implementation. There are no remaining instances of decomp.take_along_axis, confirming that the replacement has been done consistently across the entire codebase.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining usage of decomp.take_along_axis
rg "decomp\.take_along_axis" 

# Search for new usage of paddle.take_along_axis to verify consistency
rg "paddle\.take_along_axis"

Length of output: 1407

source/tests/pd/model/test_env_mat.py (2)

158-159: LGTM! Proper type hint and inheritance handling.

The type hint addition is correct, and the explicit call to the parent class's setUp method follows good practices for multiple inheritance.


Line range hint 162-184: Verify test coverage for edge cases.

While the test implementation is thorough for the happy path, consider adding test cases for:

  1. Edge cases with empty neighbor lists
  2. Boundary conditions near rcut and rcut_smth
  3. Different precision modes (float32/float64)
source/tests/pd/model/test_descriptor.py (1)

181-183: LGTM! Appropriate replacement of tensor indexing operation.

The change from decomp.take_along_axis to paddle.take_along_axis is correct and aligns with the PR's objective of implementing DPA-1 support in the Paddle backend. The subsequent assertions properly verify the consistency between TensorFlow and Paddle implementations.

Let's verify the test coverage for this critical component:

✅ Verification successful

Let me analyze the test coverage further by checking the implementation details in the related files.


Let me try one more verification to check the decomp module usage:


The replacement of decomp.take_along_axis with paddle.take_along_axis appears to be an isolated change

Based on the verification results:

  • This is the only instance of take_along_axis usage in Paddle tests that follows this pattern
  • No other occurrences of decomp.take_along_axis were found in the codebase
  • The change is contained within the test file and properly validates the tensor indexing operation
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check test coverage for descriptor-related tests
# Look for other descriptor tests that might need similar updates

# Search for other descriptor test files
fd -e py "test.*descriptor" source/tests/

# Search for other uses of take_along_axis that might need updating
rg "take_along_axis" source/tests/

Length of output: 1460


Script:

#!/bin/bash
# Check implementation details in related files
# 1. Look for similar patterns in other paddle descriptor tests
rg -A 3 "take_along_axis.*mapping.*indices.*nlist" source/tests/pd/

# 2. Check the implementation in the main descriptor code
ast-grep --pattern 'def take_along_axis($$$)'

# 3. Look for decomp.take_along_axis usage that might need updates
rg "decomp.take_along_axis" source/

Length of output: 508


Script:

#!/bin/bash
# Search for decomp module imports and its usage patterns
rg -B 2 -A 2 "import.*decomp" source/

# Look for any other similar tensor indexing patterns in paddle tests
rg -A 2 "paddle.*indices.*=.*mapping" source/tests/pd/

Length of output: 96

source/tests/pd/model/test_forward_lower.py (1)

99-101: Verify the performance impact of switching to paddle.take_along_axis

The change from decomp.take_along_axis to paddle.take_along_axis aligns with the optimization efforts. However, we should verify that this change maintains the same performance characteristics.

✅ Verification successful

Migration from decomp.take_along_axis is safe and correct

The verification shows that take_along_axis is not implemented in the decomp module, which means the original code was directly using paddle's native implementation through the decomp module. The change to use paddle.take_along_axis directly:

  • Removes an unnecessary indirection
  • Doesn't affect functionality as it's the same underlying implementation
  • Aligns with the optimization goals without performance impact
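For reference, paddle.take_along_axis mirrors numpy.take_along_axis: it gathers elements along one axis using an index tensor (a standalone illustration, unrelated to the test data above):

import paddle

x = paddle.to_tensor([[1.0, 2.0], [3.0, 4.0]])
idx = paddle.to_tensor([[1], [0]])
out = paddle.take_along_axis(x, idx, axis=1)  # picks x[0, 1] and x[1, 0] -> [[2.0], [3.0]]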
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other instances of decomp.take_along_axis to ensure consistent migration
rg "decomp\.take_along_axis" --type py

# Search for performance-related tests
rg -l "performance|benchmark.*take_along_axis" --type py

Length of output: 435


Script:

#!/bin/bash
# Let's check the actual usage of take_along_axis in the codebase
ast-grep --pattern 'take_along_axis($$$)'

# Also check if there are any remaining decomp.take_along_axis usages specifically
ast-grep --pattern 'decomp.take_along_axis($$$)'

# Let's also check the decomp module to understand the original implementation
fd decomp.py --exec cat {}

Length of output: 5562

source/tests/pd/model/test_smooth.py (3)

22-22: LGTM!

The import of model_dpa1 is correctly placed and aligns with the PR's objective to implement DPA-1 support.


180-190: Verify test coverage of pair exclusions and reduce code duplication.

  1. The code duplication issue extends to this class as well - consider the refactoring suggestion from the previous comment.
  2. We should verify that the test suite adequately covers all important pair exclusion combinations.

Let's verify the test coverage of pair exclusions:

✅ Verification successful

Let me gather more information about the test coverage by examining the test files and model implementation:


Let me check the test coverage more specifically by examining the test files for different combinations and the spin-related functionality:


Based on the gathered information, I can now provide the final response:

Test coverage for pair exclusions is adequate; code duplication is acceptable

The current test coverage for pair exclusions is comprehensive and well-structured:

  • Basic single pair exclusion is tested via TestEnergyModelDPA1Excl1 ([[0,1]])
  • Multiple pair exclusions are tested via TestEnergyModelDPA1Excl12 ([[0,1], [0,2]])
  • Additional extensive pair exclusion combinations are tested in test_spin.py covering various scenarios
  • The implementation is also verified through consistency tests in test_ener.py

The code duplication between the test classes is minimal and justified as it follows the standard unit test pattern where each test case focuses on a specific scenario. The current structure makes the tests clear, maintainable, and easy to debug.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check test coverage for pair exclusions across the codebase

# Search for other pair exclusion tests
rg -l "pair_exclude_types" --type py

# Search for pair exclusion configurations in test files
rg "pair_exclude_types.*=.*\[\[.*\]\]" --type py

Length of output: 1665


Script:

#!/bin/bash
# Check the model implementation for pair exclusion handling
ast-grep --pattern 'def get_model($$$)' deepmd/pd/model/model/__init__.py | head -n 20

# Check test files for pair exclusion test cases
rg -A 10 "class.*Excl" source/tests/pd/model/test_smooth.py
rg -A 10 "class.*Excl" source/tests/pt/model/test_smooth.py

# Check if there are any other pair exclusion configurations in the codebase
rg "pair_exclude_types.*=" --type py

Length of output: 6667


Script:

#!/bin/bash
# Check spin-related test cases as they seem to handle pair exclusions
rg -B 5 -A 10 "expect_pair_exclude_types" source/tests/common/test_spin.py

# Check consistent test cases
rg -B 5 -A 10 "pair_exclude_types" source/tests/consistent/model/test_ener.py

# Look for any documentation about supported pair exclusion combinations
rg -l "pair_exclude_types.*=.*\[\[" docs/ || true

Length of output: 3666


163-164: Verify numerical stability claims across test classes.

The comments mention that "error can be systematically removed by reducing epsilon". We should verify this claim:

  1. Document the relationship between epsilon and error
  2. Consider adding tests with different epsilon values to demonstrate error reduction

Let's search for related tests:

Also applies to: 175-176, 187-188

✅ Verification successful

The comment about error reduction with epsilon is accurate and verified by the implementation.

The test implementation in test_smooth.py confirms the claim about error reduction through epsilon in several ways:

  1. The test systematically verifies smoothness by comparing model outputs (energy, force, virial) with small coordinate perturbations of size epsilon (1e-5).
  2. The test framework uses absolute precision (aprec) checks that match the epsilon scale, showing the error bounds are directly tied to epsilon.
  3. Multiple test classes (DPA1, DPA1Excl1, DPA1Excl12) consistently use the same epsilon value and demonstrate the relationship between precision and perturbation size.

The implementation validates that numerical errors scale with epsilon, confirming that reducing epsilon would indeed systematically reduce the error.
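To make the scaling concrete, the idea can be sketched as follows (model_energy, coord, delta, and the constant C are hypothetical placeholders standing in for the test's machinery):

# for a smooth model, |f(x + eps * d) - f(x)| = O(eps), so the assertion
# tolerance can shrink in proportion to the perturbation size
for eps in (1e-4, 1e-5, 1e-6):
    e0 = model_energy(coord)
    e1 = model_energy(coord + eps * delta)
    assert abs(e1 - e0) < C * eps  # C: assumed smoothness (Lipschitz-like) constant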

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Look for epsilon-related tests and numerical stability verification

# Search for epsilon-related test patterns
rg "epsilon.*=.*1e-" --type py

# Search for numerical stability tests
rg -l "numerical.*stability|convergence.*test" --type py

Length of output: 2795


Script:

#!/bin/bash
# Let's examine the test files more closely to understand the relationship between epsilon and error

# Look at the test implementation in test_smooth.py
rg -B 5 -A 10 "error can be systematically removed" source/tests/pd/model/test_smooth.py

# Check for any error calculations or assertions related to epsilon
rg -B 2 -A 2 "assert.*epsilon|error.*epsilon" source/tests/pd/model/test_smooth.py

# Look for test methods that might demonstrate error reduction
ast-grep --pattern 'def test_$_($$$) {
  $$$
  epsilon = $_
  $$$
}'

Length of output: 2284


Script:

#!/bin/bash
# Let's examine the SmoothTest class to understand how epsilon is used in testing

# Look for the SmoothTest class implementation
rg -B 2 -A 20 "class SmoothTest" source/tests/pd/model/test_smooth.py

# Check for any test methods in the file that use epsilon
rg -B 2 -A 10 "def test_.*smooth" source/tests/pd/model/test_smooth.py

Length of output: 884


Script:

#!/bin/bash
# Let's check the entire test_smooth.py file to understand the testing methodology

# Get the full content of test_smooth.py
cat source/tests/pd/model/test_smooth.py

# Also check for any related smooth test implementations in other files
rg -l "SmoothTest" --type py

Length of output: 6767

source/tests/pd/test_training.py (4)

23-23: LGTM: Import addition aligns with DPA-1 implementation.

The addition of model_dpa1 import is consistent with the PR's objective to implement DPA-1 support.


29-29: LGTM: Consistent type hint additions improve code clarity.

The addition of return type hints (-> None) to test methods follows good practices for Python type annotations and improves code maintainability.

Also applies to: 99-99, 128-128, 139-139, 157-157


176-187: LGTM: Well-structured test class implementation.

The new TestEnergyModelDPA1 class follows the established pattern and correctly implements the necessary setup for testing the DPA-1 model implementation.


181-183: Verify test data compatibility with DPA-1 model.

Please ensure that the test data in "water/data/data_0" is appropriate for testing the DPA-1 model implementation.

✅ Verification successful

Let me gather more information about the test data structure and DPA-1 related configurations.


Let me check the test data format and configuration files specific to this test case.


Test data is compatible with DPA-1 model requirements

The test data in "water/data/data_0" is appropriate for testing the DPA-1 model implementation. The data follows the correct format with:

  • Proper atom type mapping (O: 0, H: 1) in type.raw
  • Correct type_map.raw containing O and H atoms
  • Complete set of required files (coord.npy, box.npy, energy.npy, force.npy)
  • Structure matching DPA-1 examples found in the codebase
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check the structure and content of test data

# Check if test data exists and examine its structure
ls -la "$(dirname "$0")/water/data/data_0"
cat "$(dirname "$0")/water/data/data_0/type.raw" 2>/dev/null || echo "type.raw not found"

# Look for any DPA-1 specific test data patterns or configurations
find "$(dirname "$0")" -type f -name "*.json" -exec grep -l "dpa1" {} \;

Length of output: 576


Script:

#!/bin/bash
# Check the actual path structure in the repository
fd "data_0" --type d

# Look for test data patterns in water directory
fd . --type f "source/tests/pd/water"

# Check content of input configuration files found earlier
cat examples/property/train/input_torch.json
cat examples/water/se_atten/input_torch.json

# Search for DPA-1 related configurations or tests
rg -i "dpa.?1|deep.?potential" source/tests/pd/

Length of output: 7323


Script:

#!/bin/bash
# Check the content of test data files
cat source/tests/pd/water/data/data_0/type.raw
cat source/tests/pd/water/data/data_0/type_map.raw

# Check configuration files in the test directory
cat source/tests/pd/water/se_atten.json

# Look for model configuration in the test file
rg -A 10 "model_dpa1\s*=" source/tests/pd/model/test_permutation.py

Length of output: 2479

source/tests/pd/common.py (3)

2-2: LGTM! Good use of pathlib for cross-platform compatibility.

The additions follow Python best practices:

  • Using pathlib for modern path handling
  • Properly importing dp_j_loader from deepmd.common
  • Correctly defining tests_path relative to the current file

Also applies to: 11-11, 20-20


21-22: LGTM! Good code organization.

The placement of the new function follows PEP 8 guidelines with proper spacing between imports and function definitions, and maintains a logical code organization.


23-24: Verify usage patterns of the new j_loader function.

Let's check if this utility function is being used consistently across test files and if there are any similar functions that could be consolidated.

✅ Verification successful

Let me gather more information about the dp_j_loader function to understand its relationship with the current JSON loading patterns.


No consolidation needed - j_loader is consistently used across test frameworks

The j_loader function in source/tests/pd/common.py is part of a consistent pattern across the test frameworks (pd, pt, tf). Each framework has its own wrapper that uses the common dp_j_loader from the main package. The function is actively used in numerous test files, particularly in the TensorFlow (tf) test suite, and serves as a standardized way to load JSON configuration files for tests.

  • Found in three test framework directories with identical implementation:
    • source/tests/pd/common.py
    • source/tests/pt/common.py
    • source/tests/tf/common.py
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check usage patterns of j_loader and similar functions

# Find test files that might use j_loader
echo "=== Files potentially using j_loader ==="
rg -l "j_loader" "source/tests/pd"

# Look for similar JSON loading patterns
echo -e "\n=== Similar JSON loading patterns ==="
rg -B 2 -A 2 "load.*json|json.*load" "source/tests/pd"

Length of output: 9510


Script:

#!/bin/bash
# Check the implementation of dp_j_loader and j_loader
echo "=== dp_j_loader implementation ==="
rg -B 2 -A 2 "dp_j_loader" --type py

# Check if j_loader is actually being used
echo -e "\n=== j_loader usage ==="
rg "j_loader\(" --type py

Length of output: 9842

deepmd/pd/model/network/network.py (3)

48-48: LGTM: Constructor return type annotation added

The addition of the -> None return type annotation to the constructor follows Python type hinting best practices.


168-168: LGTM: Constructor return type annotation added

The addition of the -> None return type annotation to the constructor is correct and consistent with the parent class.


Line range hint 1-324: Verify PyTorch to Paddle API compatibility

Since this appears to be part of a PyTorch to Paddle migration, let's verify that all PyTorch APIs have been correctly mapped to their Paddle equivalents.

✅ Verification successful

Based on the search results, I can see that this file is part of a PyTorch to Paddle migration effort. The code shows proper usage of Paddle APIs:

  1. Imports are correctly using paddle and paddle.nn
  2. Tensor creation uses paddle.empty() instead of torch.empty()
  3. Layer inheritance uses nn.Layer instead of nn.Module
  4. Device handling uses Paddle's device model
  5. Tensor operations use Paddle's API patterns

Let me verify the specific tensor operations to ensure complete compatibility:


Migration to Paddle APIs looks correct and complete

The PyTorch to Paddle migration in this file has been properly implemented:

  1. All tensor operations use Paddle's API conventions:
    • paddle.eye() for identity matrices
    • paddle.concat() with correct axis parameter instead of PyTorch's dim
    • paddle.zeros() for zero tensors
  2. Device handling uses Paddle's .to(device=device) pattern
  3. Layer inheritance correctly uses nn.Layer instead of nn.Module
  4. No remaining PyTorch-specific APIs or patterns are present
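As a compact illustration of this mapping (a standalone sketch, not code from the PR):

import paddle
import paddle.nn as nn


class TinyLayer(nn.Layer):  # torch equivalent: nn.Module
    def forward(self, x: paddle.Tensor, y: paddle.Tensor) -> paddle.Tensor:
        eye = paddle.eye(x.shape[-1])               # torch.eye(n)
        cat = paddle.concat([x @ eye, y], axis=-1)  # torch.cat([...], dim=-1)
        return cat.to(device=paddle.get_device())   # torch: tensor.to(device=...)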
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining PyTorch references and verify Paddle API usage

# Check for any remaining torch references
echo "Checking for torch references..."
rg -i "torch" --type py

# Check Paddle API usage patterns
echo "Checking Paddle API patterns..."
rg "paddle\.(tensor|nn|device)" --type py -A 2

Length of output: 104728


Script:

#!/bin/bash
# Check tensor operations in the file
rg "paddle\.(concat|matmul|zeros|ones|eye|tile|gather|where|scatter)" deepmd/pd/model/network/network.py

Length of output: 656

deepmd/pd/model/atomic_model/dp_atomic_model.py (3)

54-54: LGTM: Good type annotation practice

Adding the return type annotation for the constructor improves code clarity and static type checking capabilities.


Line range hint 203-213: LGTM: Efficient optimization of deserialization

The change from deep copy to shallow copy is a valid optimization since the data dictionary is only used for parameter extraction and the popped values are not modified.
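A quick generic demonstration of why the shallow copy suffices when keys are only popped (plain Python, not the model code):

data = {"descriptor": {"rcut": 6.0}, "type": "se_atten"}
shallow = data.copy()
shallow.pop("type")                                  # mutates only the copy's top level
assert "type" in data                                # the original dict is unaffected
assert shallow["descriptor"] is data["descriptor"]   # nested values remain shared, no copy cost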


311-311: LGTM: Good type annotation practice

Adding the return type annotation improves code clarity and static type checking capabilities.

source/tests/pd/model/test_permutation.py (4)

6-6: LGTM: Import organization

Moving the numpy import earlier in the file follows a common convention of placing fundamental dependencies first.


346-347: LGTM: Type annotation addition

Adding return type annotation to the test method improves code clarity and type safety.


412-416: Implementation ready for DPA-1 testing

The TestEnergyModelDPA1 class is properly implemented and not skipped, which aligns with the PR objective of implementing DPA-1 (se_atten) support. The configuration uses the appropriate model parameters and enables type splitting.


398-399: LGTM: Consistent type annotations

Return type annotations have been consistently added to all setUp methods across test classes, improving code clarity and type safety.

Also applies to: 406-407, 413-414, 421-422, 429-430, 439-440, 447-448, 457-458, 465-466

deepmd/pd/model/task/fitting.py (12)

57-57: Use of return type annotation enhances type clarity

Adding -> None to the share_params method improves type safety and code readability.


67-67: Clarification on parameter sharing in multitask training

The comment correctly explains that bias_atom_e and case_embd are not shared, improving code understandability.


149-149: Consistent use of return type annotations

Including -> None in the __init__ method enhances consistency in type annotations throughout the codebase.


180-182: Correct conversion of bias_atom_e to a Paddle tensor

Converting bias_atom_e with specified precision and device ensures accurate computations.


211-219: Proper initialization of case_embd when dim_case_embd > 0

Initializing case_embd as a buffer with zeros aligns with the expected setup for case embeddings.


224-224: Updating input dimension to include dim_case_embd

Adjusting in_dim to account for dim_case_embd ensures compatibility with the concatenated inputs.


252-252: Addition of return type annotation in reinit_exclude

Specifying -> None clarifies that the method does not return a value, improving code clarity.


371-371: Consistent use of return type annotations in __setitem__

Adding -> None to the __setitem__ method enhances type consistency and clarity.


430-433: Casting inputs to internal precision ensures consistency

Casting descriptor, fparam, and aparam to self.prec maintains precision consistency across computations.


517-517: Initialization of outs with correct precision and device

Allocating outs with the specified precision and device ensures readiness for subsequent computations.


540-540: Efficient masking of atom_property

Using paddle.where to mask atom_property efficiently zeroes out undesired elements.
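For context, the pattern reads roughly like this (tensor names and shapes are illustrative, not the exact ones in fitting.py):

import paddle

atom_property = paddle.rand([1, 4, 3])                # (nframes, nloc, output dim)
mask = paddle.to_tensor([[True, True, False, True]])  # per-atom keep mask
masked = paddle.where(
    mask.unsqueeze(-1),                  # broadcast the mask over the output dim
    atom_property,
    paddle.zeros_like(atom_property),    # masked atoms contribute exact zeros
)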


545-547: Appropriate application of exclusion masks to outputs

Applying the exclusion mask to outs ensures that atom types meant to be excluded have their contributions zeroed out.

deepmd/pd/model/descriptor/se_atten.py (1)

61-61: Fix mutable default arguments in constructor

Using mutable objects as default arguments can lead to unexpected behavior in Python. The lists used as default values for the neuron and exclude_types parameters are created once when the function is defined and shared between all calls.

Apply this diff to fix the issue:

--- a/deepmd/pd/model/descriptor/se_atten.py
+++ b/deepmd/pd/model/descriptor/se_atten.py
@@ -61,7 +61,7 @@
             ntypes: int,
-            neuron: list = [25, 50, 100],
+            neuron: Optional[list[int]] = None,
             axis_neuron: int = 16,
             tebd_dim: int = 8,
@@ -78,7 +78,7 @@
             type_one_side: bool = False,
-            exclude_types: list[tuple[int, int]] = [],
+            exclude_types: Optional[list[tuple[int, int]]] = None,
             env_protection: float = 0.0,
@@ -391,7 +391,7 @@
     def reinit_exclude(
         self,
-        exclude_types: list[tuple[int, int]] = [],
+        exclude_types: Optional[list[tuple[int, int]]] = None,
     ) -> None:

Then initialize the defaults at the beginning of the methods:

if neuron is None:
    neuron = [25, 50, 100]
if exclude_types is None:
    exclude_types = []

Also applies to: 78-78, 391-391

🧰 Tools
🪛 Ruff (0.8.0)

61-61: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

@@ -18,15 +15,15 @@ def __init__(
         self,
         ntypes: int,
         exclude_types: list[int] = [],
-    ):
+    ) -> None:
Contributor

🛠️ Refactor suggestion

Replace mutable default argument with None

Using mutable default arguments (empty list []) can lead to unexpected behavior as the same list object is shared across all instances. Instead, use None as the default and initialize the list within the method.

Apply this change to both the __init__ and reinit methods:

    def __init__(
        self,
        ntypes: int,
-       exclude_types: list[int] = [],
+       exclude_types: list[int] | None = None,
    ) -> None:
        super().__init__()
        self.reinit(ntypes, exclude_types)

    def reinit(
        self,
        ntypes: int,
-       exclude_types: list[int] = [],
+       exclude_types: list[int] | None = None,
    ) -> None:
+       if exclude_types is None:
+           exclude_types = []
        self.ntypes = ntypes
        self.exclude_types = exclude_types

Also applies to: 26-26

        rcut_smth: float,
        sel: Union[list[int], int],
        ntypes: int,
        neuron: list = [25, 50, 100],
Contributor

⚠️ Potential issue

Fix mutable default arguments

Using mutable objects (lists) as default arguments can lead to unexpected behavior since they are created only once when the function is defined.

Apply this diff to fix the mutable default arguments:

-    neuron: list = [25, 50, 100],
+    neuron: Optional[list] = None,
     axis_neuron: int = 16,
     tebd_dim: int = 8,
     tebd_input_mode: str = "concat",
     set_davg_zero: bool = True,
     attn: int = 128,
     attn_layer: int = 2,
     attn_dotr: bool = True,
     attn_mask: bool = False,
     activation_function: str = "tanh",
     precision: str = "float64",
     resnet_dt: bool = False,
-    exclude_types: list[tuple[int, int]] = [],
+    exclude_types: Optional[list[tuple[int, int]]] = None,

# Then initialize in the constructor:
def __init__(self, ...):
    if neuron is None:
        neuron = [25, 50, 100]
    if exclude_types is None:
        exclude_types = []

Also applies to: 230-230

🧰 Tools
🪛 Ruff (0.8.0)

218-218: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

Comment on lines +501 to +513
        if self.dim_case_embd > 0:
            assert self.case_embd is not None
            case_embd = paddle.tile(self.case_embd.reshape([1, 1, -1]), [nf, nloc, 1])
            xx = paddle.concat(
                [xx, case_embd],
                axis=-1,
            )
            if xx_zeros is not None:
                xx_zeros = paddle.concat(
                    [xx_zeros, case_embd],
                    axis=-1,
                )

Contributor

⚠️ Potential issue

Ensure case_embd is initialized before use in _forward_common

In the _forward_common method, self.case_embd is used when self.dim_case_embd > 0, but if set_case_embd hasn't been called beforehand, self.case_embd might be None, leading to errors.

Consider initializing self.case_embd with a default value or requiring set_case_embd to be called before _forward_common when dim_case_embd > 0.
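One possible guard, sketched (the message text and placement are suggestions, not existing code):

if self.dim_case_embd > 0 and self.case_embd is None:
    raise RuntimeError(
        "dim_case_embd > 0 but case_embd is unset; "
        "call set_case_embd() before the forward pass"
    )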

Comment on lines +367 to +369
        self.case_embd = paddle.eye(self.dim_case_embd, dtype=self.prec).to(device)[
            case_idx
        ]
Contributor

⚠️ Potential issue

Validate case_idx in set_case_embd to prevent out-of-bounds errors

In the set_case_embd method, case_idx should be validated to ensure it is within the valid range [0, self.dim_case_embd), preventing potential IndexError.

Proposed fix:

def set_case_embd(self, case_idx: int):
+   if not (0 <= case_idx < self.dim_case_embd):
+       raise ValueError(f"'case_idx' must be in the range [0, {self.dim_case_embd}), got {case_idx}")
    self.case_embd = paddle.eye(self.dim_case_embd, dtype=self.prec).to(device)[case_idx]

Committable suggestion skipped: line range outside the PR's diff.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

🧹 Outside diff range and nitpick comments (29)
source/tests/pd/model/test_atomic_model_atomic_stat.py (2)

170-170: Remove assignment to unused variable f

The variable f is assigned but never used at line 170. You can omit as f in the with statement to clean up the code.

Apply this diff to fix the issue:

-with h5py.File(h5file, "w") as f:
+with h5py.File(h5file, "w"):
🧰 Tools
🪛 Ruff (0.8.0)

170-170: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)


340-340: Remove assignment to unused variable f

The variable f is assigned but never used at line 340. Omitting as f in the with statement helps maintain code clarity.

Apply this diff to fix the issue:

-with h5py.File(h5file, "w") as f:
+with h5py.File(h5file, "w"):
🧰 Tools
🪛 Ruff (0.8.0)

340-340: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)

source/tests/pd/model/test_atomic_model_global_stat.py (4)

172-172: Remove unused variable f when creating the HDF5 file

The variable f is assigned but never used. Since you're only creating an empty HDF5 file, you can omit the variable assignment.

Apply this diff to fix the issue:

-with h5py.File(h5file, "w") as f:
+with h5py.File(h5file, "w"):
     pass
🧰 Tools
🪛 Ruff (0.8.0)

172-172: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)


489-489: Remove unused variable at in test_serialize method

The variable at is assigned but never used in this method. You can remove it to clean up the code.

Apply this diff to fix the issue:

-        at = self.atype_ext[:, :nloc]
🧰 Tools
🪛 Ruff (0.8.0)

489-489: Local variable at is assigned to but never used

Remove assignment to unused variable at

(F841)


197-199: Refactor: Avoid code duplication by defining cvt_ret as a class method

The helper function cvt_ret is defined multiple times in different test methods. Consider defining it once as a class method to eliminate redundancy.

You can refactor the code as follows:

+    def cvt_ret(self, x):
+        return {kk: to_numpy_array(vv) for kk, vv in x.items()}

     def test_output_stat(self):
-        def cvt_ret(x):
-            return {kk: to_numpy_array(vv) for kk, vv in x.items()}
         ...
-        ret0 = cvt_ret(ret0)
+        ret0 = self.cvt_ret(ret0)

Update the other test methods (test_preset_bias, test_preset_bias_all_none) to use self.cvt_ret(x) as well.


293-296: Remove or fix commented-out code in preset_out_bias

There's commented-out code for the "foo" key that contains a syntax error. If it's no longer needed, consider removing it to keep the code clean.

Apply this diff:

             preset_out_bias = {
-                # "foo": np.array(3.0, 2.0]).reshape(2, 1),
                 "foo": [None, 2],
                 "bar": np.array([7.0, 5.0, 13.0, 11.0]).reshape(2, 1, 2),
             }
source/tests/pd/model/test_trans_denoise.py (4)

8-25: Consider collapsing single-name imports

Python cannot merge imports from different submodules into one statement, but each parenthesized single-name import from deepmd.pd can be written on one line for better readability:

-from deepmd.pd.model.model import (
-    get_model,
-)
-from deepmd.pd.utils import (
-    env,
-)
+from deepmd.pd.model.model import get_model
+from deepmd.pd.utils import env

35-35: Remove unused variable assignment

The generator variable is assigned but never used in the code.

-        generator = paddle.seed(GLOBAL_SEED)
+        paddle.seed(GLOBAL_SEED)
🧰 Tools
🪛 Ruff (0.8.0)

35-35: Local variable generator is assigned to but never used

Remove assignment to unused variable generator

(F841)


58-58: Consider making precision a class constant

The precision value used for assertions could be defined as a class constant for better maintainability.

 class TransDenoiseTest:
+    PRECISION = 1e-10
+
     def test(
         self,
     ):

Then update the assertions to use self.PRECISION.


72-91: Consider refactoring common setup logic

The setUp method follows the same pattern across all test classes. Consider moving common logic to the base class.

 class TransDenoiseTest:
+    def setUp(self):
+        self.type_split = True
+        self.model = get_model(self._get_model_params()).to(env.DEVICE)
+
+    def _get_model_params(self):
+        raise NotImplementedError("Subclasses must implement _get_model_params")

 class TestDenoiseModelDPA1(unittest.TestCase, TransDenoiseTest):
-    def setUp(self):
-        model_params = copy.deepcopy(model_dpa1)
-        self.type_split = True
-        self.model = get_model(model_params).to(env.DEVICE)
+    def _get_model_params(self):
+        return copy.deepcopy(model_dpa1)
source/tests/pd/model/test_permutation_denoise.py (2)

32-40: Add documentation for model modifications.

The model modifications (deep copying, type map updates, and fitting net removal) would benefit from comments explaining why these changes are necessary for denoising tests.
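For instance, the setup could be annotated along these lines (updated_type_map is a placeholder; the real value lives in the test file):

model_params = copy.deepcopy(model_dpa1)      # deep copy: keep the shared template intact
model_params["type_map"] = updated_type_map   # denoise-specific type map
model_params.pop("fitting_net", None)         # denoising predicts coordinates/logits, not energies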


43-80: Add docstring and extract magic numbers as constants.

Consider the following improvements:

  1. Add a docstring explaining the test's purpose and methodology
  2. Extract magic numbers (e.g., natoms=5, prec=1e-10) as class constants

Example implementation:

class PermutationDenoiseTest:
    """Base test class for verifying permutation invariance in denoising models.
    
    Tests that model outputs (updated coordinates and logits) remain consistent
    when input atomic coordinates are permuted.
    """
    NATOMS = 5
    PRECISION = 1e-10
    
    def test(self) -> None:
        """Test permutation invariance of the denoising model."""
        natoms = self.NATOMS
        # ... rest of the method ...
        prec = self.PRECISION
🧰 Tools
🪛 Ruff (0.8.0)

47-47: Local variable generator is assigned to but never used

Remove assignment to unused variable generator

(F841)

source/tests/pd/model/test_rot_denoise.py (4)

32-32: Remove unused variable assignment.

The generator variable is assigned but never used.

-        generator = paddle.seed(GLOBAL_SEED)
+        paddle.seed(GLOBAL_SEED)
🧰 Tools
🪛 Ruff (0.8.0)

32-32: Local variable generator is assigned to but never used

Remove assignment to unused variable generator

(F841)


33-33: Document the precision threshold.

Consider adding a comment explaining why this specific precision value (1e-10) was chosen for the tests.

-        prec = 1e-10
+        # Set strict precision threshold for numerical stability in rotation tests
+        prec = 1e-10

28-74: Add docstring to explain test purpose and methodology.

The test class and method lack documentation explaining their purpose and methodology. This is important for maintainability.

 class RotDenoiseTest:
+    """Test class to verify rotation invariance of the denoising model.
+    
+    This class contains tests to ensure that the model's predictions remain
+    consistent under coordinate rotations and combined coordinate-cell rotations.
+    """
     def test(
         self,
     ):
+        """Test rotation invariance of the denoising model.
+        
+        Verifies that:
+        1. Model predictions are invariant when coordinates are rotated
+        2. Model predictions are invariant when both coordinates and cell are rotated
+        """
🧰 Tools
🪛 Ruff (0.8.0)

32-32: Local variable generator is assigned to but never used

Remove assignment to unused variable generator

(F841)


28-104: Consider adding edge case tests.

The current tests cover basic rotation invariance well, but consider adding tests for edge cases:

  1. Zero coordinates/cell values
  2. Very large coordinate values
  3. Nearly singular cell matrices
  4. Mixed positive/negative coordinates

Would you like me to provide example implementations for these test cases?
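As a starting point, a near-singular-cell case could be sketched like this (atype handling and the rotation/comparison logic are omitted and should mirror the main test() body above):

    def test_rot_nearly_singular_cell(self) -> None:
        # cell with two almost-parallel lattice vectors (close to singular)
        cell = paddle.to_tensor(
            [
                [1.0, 0.0, 0.0],
                [1.0, 1e-6, 0.0],
                [0.0, 0.0, 1.0],
            ],
            dtype=env.GLOBAL_PD_FLOAT_PRECISION,
        ).to(device=env.DEVICE)
        coord = paddle.rand([5, 3], dtype=env.GLOBAL_PD_FLOAT_PRECISION).to(
            device=env.DEVICE
        ) @ cell
        # ... rotate coord/cell and assert output invariance as in test()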

🧰 Tools
🪛 Ruff (0.8.0)

32-32: Local variable generator is assigned to but never used

Remove assignment to unused variable generator

(F841)

deepmd/pd/utils/env.py (1)

80-139: Document the rationale for blacklisted operations and clarify commented entries.

The blacklist is comprehensive but would benefit from:

  1. Documentation explaining why these specific operations are blacklisted
  2. Clarification on why 'concat_grad' and 'split_grad' are commented out
  3. Consider adding a comment explaining the performance implications

Consider adding documentation above the list:

+    # These operations will use kernel implementations instead of composite operators
+    # to optimize performance and avoid potential issues with primitive mode.
+    # Operations are blacklisted because:
+    # 1. <reason>
+    # 2. <reason>
     EAGER_COMP_OP_BLACK_LIST = [
source/tests/pd/model/test_saveload_dpa1.py (2)

109-113: Improve stat file path handling.

The hardcoded paths for stat files could cause issues in different environments or parallel test execution.

Consider using temporary directories or test-specific paths:

-        model_config["stat_file_dir"] = "stat_files"
-        model_config["stat_file"] = "stat.hdf5"
-        model_config["stat_file_path"] = os.path.join(
-            model_config["stat_file_dir"], model_config["stat_file"]
-        )
+        import shutil
+        import tempfile
+        # mkdtemp rather than a TemporaryDirectory context manager, so the
+        # directory outlives this setup code; clean it up explicitly
+        tmp_dir = tempfile.mkdtemp()
+        self.addCleanup(shutil.rmtree, tmp_dir, ignore_errors=True)
+        model_config["stat_file_dir"] = tmp_dir
+        model_config["stat_file"] = "stat.hdf5"
+        model_config["stat_file_path"] = os.path.join(tmp_dir, "stat.hdf5")

136-141: Enhance test coverage with additional test cases.

Consider adding:

  1. Negative test cases (e.g., corrupted model file)
  2. Explicit tolerance values for assert_allclose

Example enhancement:

     def test_saveload(self):
         result1 = self.get_model_result()
         result2 = self.get_model_result(read=True)
         for item in result1:
-            np.testing.assert_allclose(result1[item].numpy(), result2[item].numpy())
+            np.testing.assert_allclose(
+                result1[item].numpy(),
+                result2[item].numpy(),
+                rtol=1e-5,
+                atol=1e-8,
+                err_msg=f"Mismatch in {item}"
+            )

+    def test_load_corrupted(self):
+        """Test handling of corrupted model file"""
+        with self.assertRaises(Exception):
+            with open("corrupted.pd", "w") as f:
+                f.write("corrupted data")
+            self.get_model_result(read=True, model_file="corrupted.pd")
source/tests/pd/model/test_dpa1.py (2)

32-32: Consider renaming the test class to match the implementation

The class name TestDescrptSeAtten doesn't match the implementation being tested (DescrptDPA1). Consider renaming it to TestDescrptDPA1 for better clarity and consistency.


36-114: Add docstring and consider breaking down the test

The test method would benefit from:

  1. A docstring explaining its purpose and the significance of each parameter combination
  2. Breaking down into smaller, focused test cases for better maintainability
  3. Enhanced error messages including all relevant parameters (sm, to, tm, ect)

Example docstring:

def test_consistency(self):
    """Tests consistency between different DPA1 implementations.
    
    Verifies that:
    1. Serialization/deserialization preserves behavior
    2. New Paddle implementation matches the original DP implementation
    
    Parameters tested:
    - resnet_dt: Use ResNet delta training
    - smooth_type_embedding: Enable smooth type embedding
    - type_one_side: Use one-sided typing
    - tebd_input_mode: Input mode for TEBD ('concat' or 'strip')
    - precision: Numerical precision
    - use_econf_tebd: Use electronic configuration in TEBD
    """
source/tests/pd/model/test_descriptor_dpa1.py (6)

77-241: Consider moving reference data to a separate test data file.

The hardcoded reference descriptor data makes the test file harder to maintain. Consider moving this data to a separate JSON or NumPy file in a test data directory.
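For example, the expected values could be committed once as a binary artifact and loaded in the test (the file name here is hypothetical):

import numpy as np
from pathlib import Path

REF_DIR = Path(CUR_DIR) / "models"

def load_ref_descriptor(name: str = "dpa1_ref_d.npy") -> np.ndarray:
    # reference descriptor generated once and stored next to dpa1.json / dpa1.pd
    return np.load(REF_DIR / name)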


242-245: Consider using a more maintainable path structure for test model files.

The model file paths could be made more maintainable by:

  1. Using a constant for the models directory
  2. Creating a helper function to construct model file paths
+MODEL_DIR = Path(CUR_DIR) / "models"
+
+def get_model_path(filename: str) -> Path:
+    return MODEL_DIR / filename
+
-        with open(Path(CUR_DIR) / "models" / "dpa1.json") as fp:
+        with open(get_model_path("dpa1.json")) as fp:
             self.model_json = json.load(fp)
-        self.file_model_param = Path(CUR_DIR) / "models" / "dpa1.pd"
-        self.file_type_embed = Path(CUR_DIR) / "models" / "dpa2_tebd.pd"
+        self.file_model_param = get_model_path("dpa1.pd")
+        self.file_type_embed = get_model_path("dpa2_tebd.pd")

248-248: Remove commented out code.

The commented out code for setting random seed and saving model parameters should be removed to maintain code cleanliness.

Also applies to: 270-271


252-252: Fix Yoda condition in assert statement.

The assert statement uses a Yoda condition which is not the preferred style.

-        assert "se_atten" == dparams.pop("type")
+        assert dparams.pop("type") == "se_atten"
🧰 Tools
🪛 Ruff (0.8.0)

252-252: Yoda condition detected

Rewrite as dparams.pop("type") == "se_atten"

(SIM300)


367-371: Add type hints to function parameters and return value.

The function lacks type hints which would improve code maintainability and IDE support.

 def translate_se_atten_and_type_embd_dicts_to_dpa1(
-    target_dict,
-    source_dict,
-    type_embd_dict,
+    target_dict: dict,
+    source_dict: dict,
+    type_embd_dict: dict,
+) -> dict:

378-378: Replace hardcoded assertion with a more flexible check.

The hardcoded assertion for exactly 2 type embedding keys makes the function less flexible. Consider making this configurable or deriving it from the model parameters.

-    assert len(type_embd_dict.keys()) == 2
+    expected_keys = {"weight", "bias"}  # or pass as parameter
+    assert set(type_embd_dict.keys()) == expected_keys, f"Expected keys {expected_keys}, got {set(type_embd_dict.keys())}"
deepmd/pd/model/descriptor/se_a.py (2)

717-719: Unused loop variables affecting readability

The loop unpacks variables compress_data_ii and compress_info_ii that are currently unused, making the code less maintainable.

Apply this diff to improve code clarity:

-        for embedding_idx, (ll, compress_data_ii, compress_info_ii) in enumerate(
+        for embedding_idx, (_ll, _compress_data_ii, _compress_info_ii) in enumerate(
             zip(self.filter_layers.networks, self.compress_data, self.compress_info)
         ):
🧰 Tools
🪛 Ruff (0.8.0)

717-717: Loop control variable compress_data_ii not used within loop body

Rename unused compress_data_ii to _compress_data_ii

(B007)


717-717: Loop control variable compress_info_ii not used within loop body

Rename unused compress_info_ii to _compress_info_ii

(B007)


489-502: Consider using list comprehension for parameter initialization

The initialization of compress_info and compress_data could be more concise using list comprehension.

Consider this more pythonic approach:

-        self.compress_info = nn.ParameterList(
-            [
-                self.create_parameter([], dtype=self.prec).to(device="cpu")
-                for _ in range(len(self.filter_layers.networks))
-            ]
-        )
-        self.compress_data = nn.ParameterList(
-            [
-                self.create_parameter([], dtype=self.prec).to(device=env.DEVICE)
-                for _ in range(len(self.filter_layers.networks))
-            ]
-        )
+        n_networks = len(self.filter_layers.networks)
+        self.compress_info = nn.ParameterList([
+            self.create_parameter([], dtype=self.prec).to(device="cpu")
+            for _ in range(n_networks)
+        ])
+        self.compress_data = nn.ParameterList([
+            self.create_parameter([], dtype=self.prec).to(device=env.DEVICE)
+            for _ in range(n_networks)
+        ])
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 7df0e2f and ac479ed.

📒 Files selected for processing (15)
  • .pre-commit-config.yaml (2 hunks)
  • deepmd/pd/model/descriptor/se_a.py (15 hunks)
  • deepmd/pd/model/descriptor/se_atten.py (1 hunks)
  • deepmd/pd/model/model/make_model.py (5 hunks)
  • deepmd/pd/utils/env.py (1 hunks)
  • source/tests/pd/model/models/dpa1.json (1 hunks)
  • source/tests/pd/model/test_atomic_model_atomic_stat.py (1 hunks)
  • source/tests/pd/model/test_atomic_model_global_stat.py (1 hunks)
  • source/tests/pd/model/test_descriptor_dpa1.py (1 hunks)
  • source/tests/pd/model/test_dpa1.py (1 hunks)
  • source/tests/pd/model/test_permutation_denoise.py (1 hunks)
  • source/tests/pd/model/test_rot_denoise.py (1 hunks)
  • source/tests/pd/model/test_saveload_dpa1.py (1 hunks)
  • source/tests/pd/model/test_trans_denoise.py (1 hunks)
  • source/tests/pd/test_training.py (7 hunks)
✅ Files skipped from review due to trivial changes (2)
  • .pre-commit-config.yaml
  • source/tests/pd/model/models/dpa1.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • deepmd/pd/model/model/make_model.py
  • source/tests/pd/test_training.py
🧰 Additional context used
📓 Learnings (1)
deepmd/pd/model/descriptor/se_atten.py (1)
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/model/make_model.py:370-373
Timestamp: 2024-11-12T05:47:21.643Z
Learning: In `deepmd/dpmodel/model/make_model.py`, the variable `nall` assigned but not used is intentional and should not be flagged in future reviews.
🪛 Ruff (0.8.0)
deepmd/pd/model/descriptor/se_a.py

643-643: Loop control variable ll not used within loop body

Rename unused ll to _ll

(B007)


717-717: Loop control variable compress_data_ii not used within loop body

Rename unused compress_data_ii to _compress_data_ii

(B007)


717-717: Loop control variable compress_info_ii not used within loop body

Rename unused compress_info_ii to _compress_info_ii

(B007)

deepmd/pd/model/descriptor/se_atten.py

61-61: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


78-78: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


375-379: Use ternary operator sampled = merged() if callable(merged) else merged instead of if-else-block

(SIM108)


399-399: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


477-477: Local variable nall is assigned to but never used

Remove assignment to unused variable nall

(F841)

source/tests/pd/model/test_atomic_model_atomic_stat.py

170-170: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)


340-340: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)

source/tests/pd/model/test_atomic_model_global_stat.py

172-172: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)


489-489: Local variable at is assigned to but never used

Remove assignment to unused variable at

(F841)

source/tests/pd/model/test_descriptor_dpa1.py

252-252: Yoda condition detected

Rewrite as dparams.pop("type") == "se_atten"

(SIM300)

source/tests/pd/model/test_dpa1.py

140-140: Local variable err_msg is assigned to but never used

Remove assignment to unused variable err_msg

(F841)


163-163: Local variable model is assigned to but never used

Remove assignment to unused variable model

(F841)

source/tests/pd/model/test_permutation_denoise.py

47-47: Local variable generator is assigned to but never used

Remove assignment to unused variable generator

(F841)

source/tests/pd/model/test_rot_denoise.py

32-32: Local variable generator is assigned to but never used

Remove assignment to unused variable generator

(F841)

source/tests/pd/model/test_saveload_dpa1.py

42-42: Local variable rcut is assigned to but never used

Remove assignment to unused variable rcut

(F841)


43-43: Local variable sel is assigned to but never used

Remove assignment to unused variable sel

(F841)

source/tests/pd/model/test_trans_denoise.py

35-35: Local variable generator is assigned to but never used

Remove assignment to unused variable generator

(F841)

🔇 Additional comments (12)
source/tests/pd/model/test_atomic_model_global_stat.py (2)

310-312: Refactor: Avoid code duplication of cvt_ret


420-422: Refactor: Avoid code duplication of cvt_ret

deepmd/pd/model/descriptor/se_atten.py (3)

61-61: Avoid using mutable default arguments in function parameters

Using mutable default arguments like lists can lead to unexpected behavior because the default list is shared across all calls to the function. For the neuron parameter, consider setting the default value to None and initializing it inside the method.

Apply this diff to fix the issue:

-            neuron: list = [25, 50, 100],
+            neuron: Optional[list[int]] = None,

Then initialize neuron inside the __init__ method:

if neuron is None:
    neuron = [25, 50, 100]
🧰 Tools
🪛 Ruff (0.8.0)

61-61: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


78-78: Avoid using mutable default arguments in function parameters

The exclude_types parameter has a mutable default argument [], which can lead to unexpected behavior. Set the default value to None and initialize it inside the method.

Apply this diff to fix the issue:

-            exclude_types: list[tuple[int, int]] = [],
+            exclude_types: Optional[list[tuple[int, int]]] = None,

Then initialize exclude_types inside the __init__ method:

if exclude_types is None:
    exclude_types = []
🧰 Tools
🪛 Ruff (0.8.0)

78-78: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


477-477: Unused variable 'nall' assigned but not used

The variable nall is assigned but never used. If this assignment is intentional for future use or for clarity, you can disregard this message. Otherwise, consider removing it to clean up the code.

🧰 Tools
🪛 Ruff (0.8.0)

477-477: Local variable nall is assigned to but never used

Remove assignment to unused variable nall

(F841)

source/tests/pd/model/test_trans_denoise.py (1)

70-91: Verify if tests should remain skipped

All test classes are currently skipped with different reasons:

  • DPA1 and DPA2: "support of the denoise is temporally disabled"
  • Hybrid: "hybrid not supported at the moment"

Given that this PR is implementing DPA-1 support, at least TestDenoiseModelDPA1 should be enabled if the implementation is ready.

✅ Verification successful

Let me analyze the implementation status of denoise functionality in DPA1 before making a final decision about the skipped tests.


Keep tests skipped until denoise support is fully implemented

The denoise functionality appears to be consistently disabled across multiple test files in both pd and pt implementations. The fact that:

  1. No denoise-related code was found in the DPA1 descriptor implementations
  2. All denoise-related tests are uniformly skipped with the same message
  3. This pattern is consistent across different model types (translation, rotation, permutation)

suggests that denoise support is indeed not yet implemented, and these tests should remain skipped for now.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if DPA1 implementation is complete in the codebase
rg -l "class DescrptDPA1" 

Length of output: 243


Script:

#!/bin/bash
# Check the implementation status and recent changes in DPA1 related files
rg -l "denoise" deepmd/pd/model/descriptor/dpa1.py
rg -l "denoise" deepmd/pt/model/descriptor/dpa1.py

# Check if there are any other skipped denoise tests
rg "skip.*denoise.*disabled" source/tests/

Length of output: 1570

source/tests/pd/model/test_permutation_denoise.py (1)

82-98: Document test classes and verify skip status.

  1. Add docstrings to test classes explaining their specific purposes.
  2. The skip decorators indicate temporary disabling - we should verify if this is still necessary.

Example docstring:

@unittest.skip("support of the denoise is temporally disabled")
class TestDenoiseModelDPA1(unittest.TestCase, PermutationDenoiseTest):
    """Test permutation invariance for DPA1 denoising model.
    
    Currently skipped as denoise support is temporarily disabled.
    """
source/tests/pd/model/test_rot_denoise.py (1)

107-113: Verify if skip decorator should be removed.

The class is marked to skip with the message "support of the denoise is temporally disabled", but according to the PR objectives, this PR implements DPA-1 support. Should this skip decorator be removed now?

✅ Verification successful

Based on the search results, I can now provide a final response:

Keep the skip decorator as denoise support is currently disabled

The skip decorator should remain in place because:

  • Multiple test files across the codebase consistently show that denoise support is temporarily disabled, as evidenced by similar skip decorators with the same message in various test files:
    • source/tests/pd/model/test_trans_denoise.py
    • source/tests/pd/model/test_permutation_denoise.py
    • source/tests/pt/model/test_rot_denoise.py
    • source/tests/pt/model/test_smooth_denoise.py
    • and others

While DPA-1 implementation exists (as seen in the descriptor class), the denoise functionality appears to be consistently disabled across all test suites, indicating this is an intentional temporary state of the codebase.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if denoise support is implemented in related files
rg -l "denoise.*support" --type py

Length of output: 36


Script:

#!/bin/bash
# Let's check for any denoise-related implementations and tests
rg -l "denoise" --type py

# Also check for DPA-1 related implementations
rg -l "dpa1|DPA1" --type py

# Check the specific test file content for more context
cat source/tests/pd/model/test_rot_denoise.py

Length of output: 8319


Script:

#!/bin/bash
# Let's check the implementation of denoise functionality in DPA1
rg "denoise.*=.*True" -B 2 -A 2 deepmd/pd/model/descriptor/dpa1.py deepmd/pt/model/descriptor/dpa1.py

# Check if there are any recent changes or implementations related to denoise in DPA1
rg "def.*forward.*denoise" -A 5 deepmd/pd/model/descriptor/dpa1.py deepmd/pt/model/descriptor/dpa1.py

Length of output: 207


Script:

#!/bin/bash
# Let's check the implementation status of DPA1 model and its denoise functionality
rg "class.*DPA1" -A 10 deepmd/pd/model/descriptor/dpa1.py

# Check if there are any other skipped tests related to denoise functionality
rg "@unittest.skip.*denoise" --type py

# Check the model parameters to understand the denoise configuration
rg "model_dpa1.*=.*{" -A 20 source/tests/pd/model/test_permutation_denoise.py

Length of output: 2391

deepmd/pd/utils/env.py (1)

147-148: 🛠️ Refactor suggestion

Consider error handling and cleanup for blacklist management.

The current implementation has potential robustness issues:

  1. Uses internal API _set_prim_backward_blacklist which might change
  2. No error handling for API failures
  3. Blacklist isn't cleared when primitive mode is disabled

Let's verify the API stability:

Consider adding error handling and cleanup:

     if enable:
-        paddle.framework.core._set_prim_backward_blacklist(*EAGER_COMP_OP_BLACK_LIST)
+        try:
+            paddle.framework.core._set_prim_backward_blacklist(*EAGER_COMP_OP_BLACK_LIST)
+        except Exception as e:
+            log.error(f"Failed to set primitive blacklist: {e}")
+            raise
+    else:
+        try:
+            # Clear the blacklist when disabled
+            paddle.framework.core._set_prim_backward_blacklist()
+        except Exception as e:
+            log.warning(f"Failed to clear primitive blacklist: {e}")
source/tests/pd/model/test_saveload_dpa1.py (2)

117-134: LGTM! Well-implemented data handling.

The implementation includes proper error handling for iterator exhaustion and device placement.


42-43: Verify the necessity of unused configuration parameters.

The rcut and sel parameters are extracted from the configuration but never used in the function. If these parameters are not needed, consider removing them to improve code clarity.

🧰 Tools
🪛 Ruff (0.8.0)

42-42: Local variable rcut is assigned to but never used

Remove assignment to unused variable rcut

(F841)


43-43: Local variable sel is assigned to but never used

Remove assignment to unused variable sel

(F841)

deepmd/pd/model/descriptor/se_a.py (1)

296-307: 🛠️ Refactor suggestion

Potential precision loss in tensor type casting

The method casts input to internal precision but returns tensors in global precision, which could lead to unnecessary precision loss if internal precision is higher than global precision.

Consider preserving precision by returning in the internal precision when it is wider than the global one. Paddle dtypes cannot be ordered with max(), so the wider dtype has to be chosen explicitly, e.g.:

-            g1.to(dtype=env.GLOBAL_PD_FLOAT_PRECISION),
-            rot_mat.to(dtype=env.GLOBAL_PD_FLOAT_PRECISION),
+            g1.to(dtype=self.prec if self.prec == paddle.float64 else env.GLOBAL_PD_FLOAT_PRECISION),
+            rot_mat.to(dtype=self.prec if self.prec == paddle.float64 else env.GLOBAL_PD_FLOAT_PRECISION),

def test(
self,
) -> None:
generator = get_generator(GLOBAL_SEED)
Contributor

🛠️ Refactor suggestion

Remove unused generator variable.

The generator variable is assigned but never used in the test method.

-        generator = get_generator(GLOBAL_SEED)
🧰 Tools
🪛 Ruff (0.8.0)

47-47: Local variable generator is assigned to but never used

Remove assignment to unused variable generator

(F841)

                sampler=paddle.io.RandomSampler(self.dataset),
                drop_last=False,
            ),
            num_workers=0,  # setting to 0 diverges the behavior of its iterator; should be >=1
Contributor

⚠️ Potential issue

Address the num_workers configuration issue.

The comment indicates that setting num_workers=0 causes divergent iterator behavior. This should be set to >=1 for consistent behavior.

-            num_workers=0,  # setting to 0 diverges the behavior of its iterator; should be >=1
+            num_workers=1,  # Minimum recommended value for consistent behavior
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
num_workers=0, # setting to 0 diverges the behavior of its iterator; should be >=1
num_workers=1, # Minimum recommended value for consistent behavior

Comment on lines +87 to +105
    def get_model_result(self, read=False, model_file="tmp_model.pd"):
        wrapper = self.create_wrapper(read)
        optimizer = paddle.optimizer.Adam(
            learning_rate=self.start_lr, parameters=wrapper.parameters()
        )
        optimizer.clear_grad()
        if read:
            wrapper.set_state_dict(paddle.load(model_file))
            os.remove(model_file)
        else:
            paddle.save(wrapper.state_dict(), model_file)
        result = wrapper(
            **self.input_dict,
            cur_lr=self.cur_lr,
            label=self.label_dict,
            task_key=self.task_key,
        )[0]
        return result

Contributor

🛠️ Refactor suggestion

Improve temporary file handling.

The current implementation has several potential issues:

  1. No cleanup if an exception occurs before os.remove
  2. Fixed filename could cause conflicts in parallel test execution

Consider using the tempfile module, keeping one shared path per test instance so the save and load calls stay paired:

-    def get_model_result(self, read=False, model_file="tmp_model.pd"):
+    def get_model_result(self, read=False, model_file=None):
+        import tempfile
+        if model_file is None:
+            # reuse one per-instance temporary path so the save (read=False)
+            # call and the later load (read=True) call see the same file
+            if not hasattr(self, "_tmp_model_file"):
+                fd, self._tmp_model_file = tempfile.mkstemp(suffix=".pd")
+                os.close(fd)
+            model_file = self._tmp_model_file
         wrapper = self.create_wrapper(read)
         optimizer = paddle.optimizer.Adam(
             learning_rate=self.start_lr, parameters=wrapper.parameters()
         )
         optimizer.clear_grad()
         try:
             if read:
                 wrapper.set_state_dict(paddle.load(model_file))
             else:
                 paddle.save(wrapper.state_dict(), model_file)
             result = wrapper(
                 **self.input_dict,
                 cur_lr=self.cur_lr,
                 label=self.label_dict,
                 task_key=self.task_key,
             )[0]
         finally:
-            if os.path.exists(model_file):
-                os.remove(model_file)
+            # clean up only after the load; the save call must leave the
+            # file in place for the subsequent read
+            if read and os.path.exists(model_file):
+                os.remove(model_file)
         return result

Comment on lines +115 to +164
def test_jit(
self,
):
rng = np.random.default_rng(GLOBAL_SEED)
nf, nloc, nnei = self.nlist.shape
davg = rng.normal(size=(self.nt, nnei, 4))
dstd = rng.normal(size=(self.nt, nnei, 4))
dstd = 0.1 + np.abs(dstd)

for idt, prec, sm, to, tm, ect in itertools.product(
[
False,
], # resnet_dt
[
"float64",
], # precision
[False, True], # smooth_type_embedding
[
False,
], # type_one_side
["concat", "strip"], # tebd_input_mode
[False, True], # use_econf_tebd
):
dtype = PRECISION_DICT[prec]
rtol, atol = get_tols(prec)
err_msg = f"idt={idt} prec={prec}"
# dpa1 new impl
dd0 = DescrptDPA1(
self.rcut,
self.rcut_smth,
self.sel,
self.nt,
precision=prec,
resnet_dt=idt,
smooth_type_embedding=sm,
type_one_side=to,
tebd_input_mode=tm,
use_econf_tebd=ect,
type_map=["O", "H"] if ect else None,
seed=GLOBAL_SEED,
)
dd0.se_atten.mean = paddle.to_tensor(davg, dtype=dtype).to(
device=env.DEVICE
)
dd0.se_atten.dstd = paddle.to_tensor(dstd, dtype=dtype).to(
device=env.DEVICE
)
# dd1 = DescrptDPA1.deserialize(dd0.serialize())
model = paddle.jit.to_static(dd0)
# model = paddle.jit.to_static(dd1)

⚠️ Potential issue

Complete the JIT test implementation

The test_jit method has several issues that need to be addressed:

  1. The test is incomplete - no assertions are made to verify JIT compilation works correctly
  2. Contains commented out code that should be removed (lines 162, 164)
  3. Variables err_msg and model are assigned but never used
  4. Missing docstring explaining the test purpose

Consider completing the implementation like this:

 def test_jit(self):
+    """Tests that the model can be successfully compiled using Paddle's JIT.
+    
+    Verifies that:
+    1. Model can be converted to static graph
+    2. JIT-compiled model produces same results as dynamic model
+    """
     # ... existing setup code ...
     
     model = paddle.jit.to_static(dd0)
+    
+    # Verify JIT model produces same results as dynamic model
+    rd0, _, _, _, _ = dd0(
+        paddle.to_tensor(self.coord_ext, dtype=dtype).to(device=env.DEVICE),
+        paddle.to_tensor(self.atype_ext, dtype="int64").to(device=env.DEVICE),
+        paddle.to_tensor(self.nlist, dtype="int64").to(device=env.DEVICE),
+    )
+    
+    rd1, _, _, _, _ = model(
+        paddle.to_tensor(self.coord_ext, dtype=dtype).to(device=env.DEVICE),
+        paddle.to_tensor(self.atype_ext, dtype="int64").to(device=env.DEVICE),
+        paddle.to_tensor(self.nlist, dtype="int64").to(device=env.DEVICE),
+    )
+    
+    np.testing.assert_allclose(
+        rd0.detach().cpu().numpy(),
+        rd1.detach().cpu().numpy(),
+        rtol=rtol,
+        atol=atol,
+        err_msg=f"JIT compilation failed: idt={idt} prec={prec} sm={sm} to={to} tm={tm} ect={ect}"
+    )

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Ruff (0.8.0)

140-140: Local variable err_msg is assigned to but never used

Remove assignment to unused variable err_msg

(F841)


163-163: Local variable model is assigned to but never used

Remove assignment to unused variable model

(F841)

Comment on lines +302 to +303
with open(Path(CUR_DIR) / "models" / "dpa1.json") as fp:
self.model_json = json.load(fp)

🛠️ Refactor suggestion

Remove duplicate model JSON loading.

The model JSON is already loaded in setUp method. Remove this duplicate loading to improve maintainability.

-        with open(Path(CUR_DIR) / "models" / "dpa1.json") as fp:
-            self.model_json = json.load(fp)

Comment on lines +231 to +254
def enable_compression(
self,
min_nbor_dist: float,
table_extrapolate: float = 5,
table_stride_1: float = 0.01,
table_stride_2: float = 0.1,
check_frequency: int = -1,
) -> None:
"""Receive the statisitcs (distance, max_nbor_size and env_mat_range) of the training data.

Parameters
----------
min_nbor_dist
The nearest distance between atoms
table_extrapolate
The scale of model extrapolation
table_stride_1
The uniform stride of the first table
table_stride_2
The uniform stride of the second table
check_frequency
The overflow check frequency
"""
raise ValueError("Enable compression is not supported.")

⚠️ Potential issue

Implementation mismatch in compression support

The enable_compression method in DescrptSeA raises a "not supported" error, but the class has compression-related attributes and logic. This inconsistency could lead to confusion.

Consider either:

  1. Implementing compression support in DescrptSeA, or
  2. Removing compression-related attributes if compression is truly not supported.

Would you like me to help implement the compression support to align with DescrptBlockSeA?
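For option 1, a minimal delegation sketch, assuming `DescrptSeA` keeps its descriptor block in `self.sea` and that the block exposes a compatible `enable_compression` (both names are assumptions here):

    def enable_compression(
        self,
        min_nbor_dist: float,
        table_extrapolate: float = 5,
        table_stride_1: float = 0.01,
        table_stride_2: float = 0.1,
        check_frequency: int = -1,
    ) -> None:
        # forward the call to the inner block, which owns the
        # compression tables and buffers
        self.sea.enable_compression(
            min_nbor_dist,
            table_extrapolate,
            table_stride_1,
            table_stride_2,
            check_frequency,
        )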

Comment on lines +750 to +759
if self.compress:
raise NotImplementedError(
"Compressed environment is not implemented yet."
)
else:
# nfnl x nt x ng
gg = ll.forward(ss)
# nfnl x 4 x ng
gr = paddle.matmul(rr.transpose([0, 2, 1]), gg)


🛠️ Refactor suggestion

Redundant compression check

The code block contains a redundant compression check that was already performed earlier in the method.

Remove the redundant check:

-                    if self.compress:
-                        raise NotImplementedError(
-                            "Compressed environment is not implemented yet."
-                        )
-                    else:
-                        # nfnl x nt x ng
-                        gg = ll.forward(ss)
-                        # nfnl x 4 x ng
-                        gr = paddle.matmul(rr.transpose([0, 2, 1]), gg)
+                    # nfnl x nt x ng
+                    gg = ll.forward(ss)
+                    # nfnl x 4 x ng
+                    gr = paddle.matmul(rr.transpose([0, 2, 1]), gg)


def setUp(self):
    TestCaseSingleFrameWithNlist.setUp(self)
    nf, nloc, nnei = self.nlist.shape

Check notice

Code scanning / CodeQL

Unused local variable Note test

Variables nf, nloc, and nnei are not used.
to_paddle_tensor(ii) for ii in [self.coord_ext, self.atype_ext, self.nlist]
]
# nf x nloc
at = self.atype_ext[:, :nloc]

Check notice

Code scanning / CodeQL

Unused local variable Note test

Variable at is not used.
source/tests/pd/model/test_descriptor_dpa1.py: Fixed
Comment on lines +98 to +102


# @unittest.skip("hybrid not supported at the moment")
# class TestDenoiseModelHybrid(unittest.TestCase, TestPermutationDenoise):
# def setUp(self):

Check notice

Code scanning / CodeQL

Commented-out code Note test

This comment appears to contain commented-out code.
self,
):
generator = paddle.seed(GLOBAL_SEED)
prec = 1e-10

Check notice

Code scanning / CodeQL

Unused local variable Note test

Variable generator is not used.

def get_dataset(config):
    model_config = config["model"]
    rcut = model_config["descriptor"]["rcut"]
    sel = model_config["descriptor"]["sel"]

Check notice

Code scanning / CodeQL

Unused local variable Note test

Variables rcut and sel are not used.
self,
):
natoms = 5
generator = paddle.seed(GLOBAL_SEED)

Check notice

Code scanning / CodeQL

Unused local variable Note test

Variable generator is not used.
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (3)
source/tests/pd/model/test_descriptor_dpa1.py (3)

30-243: Add docstring to explain test data structure.

The setUp method would benefit from a docstring explaining the structure and purpose of the test data (cell dimensions, coordinates, atom types, and reference descriptors).

Add this docstring at the beginning of the method:

 def setUp(self):
+    """Initialize test data for DPA1 descriptor tests.
+    
+    Sets up:
+    - cell: 3x3 periodic cell dimensions
+    - coord: atomic coordinates for 5 atoms
+    - atype: atom types (2 types: [0,0,0,1,1])
+    - ref_d: reference descriptor values for validation
+    - model configuration from JSON and parameter files
+    """

267-269: Remove commented-out code for model parameter saving.

Remove these commented-out lines as they're not providing value to the test implementation.

-        ## to save model parameters
-        # paddle.save(des.state_dict(), 'model_weights.pd')
-        # paddle.save(type_embedding.state_dict(), 'model_weights.pd')

366-387: Add type hints and docstring to helper function.

The helper function would benefit from type hints and a comprehensive docstring explaining its purpose and parameters.

Apply these changes:

 def translate_se_atten_and_type_embd_dicts_to_dpa1(
-    target_dict,
-    source_dict,
-    type_embd_dict,
+    target_dict: dict,
+    source_dict: dict,
+    type_embd_dict: dict,
+) -> dict:
+    """Translate state dictionaries from se_atten and type embedding to DPA1 format.
+    
+    Args:
+        target_dict: Target state dictionary to be updated
+        source_dict: Source state dictionary containing se_atten parameters
+        type_embd_dict: Dictionary containing type embedding parameters
+    
+    Returns:
+        Updated target dictionary with translated parameters
+    
+    Raises:
+        AssertionError: If type_embd_dict doesn't contain exactly 2 keys or if any
+                       required keys are missing in the target dictionary
+    """
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 3e64196 and 56e079c.

📒 Files selected for processing (4)
  • deepmd/pd/model/model/make_model.py (4 hunks)
  • deepmd/pd/model/network/network.py (3 hunks)
  • deepmd/pd/model/task/fitting.py (12 hunks)
  • source/tests/pd/model/test_descriptor_dpa1.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • deepmd/pd/model/model/make_model.py
  • deepmd/pd/model/network/network.py
  • deepmd/pd/model/task/fitting.py
🧰 Additional context used
🪛 Ruff (0.8.0)
source/tests/pd/model/test_descriptor_dpa1.py

249-249: Yoda condition detected

Rewrite as dparams["type"] == "se_atten"

(SIM300)

🔇 Additional comments (1)
source/tests/pd/model/test_descriptor_dpa1.py (1)

300-301: Remove duplicate model JSON loading.

The model JSON is already loaded in setUp method. Remove this duplicate loading to improve maintainability.

-        with open(Path(CUR_DIR) / "models" / "dpa1.json") as fp:
-            self.model_json = json.load(fp)

Comment on lines +244 to +298
def test_descriptor_block(self) -> None:
# paddle.seed(0)
model_dpa1 = self.model_json
dparams = model_dpa1["descriptor"]
ntypes = len(model_dpa1["type_map"])
assert "se_atten" == dparams["type"]
dparams.pop("type")
dparams["ntypes"] = ntypes
des = DescrptBlockSeAtten(
**dparams,
).to(env.DEVICE)
state_dict = paddle.load(str(self.file_model_param))
# this is an old state dict, modify manually
state_dict["compress_info.0"] = des.compress_info[0]
state_dict["compress_data.0"] = des.compress_data[0]
des.set_state_dict(state_dict)
coord = self.coord
atype = self.atype
box = self.cell
# handle type_embedding
type_embedding = TypeEmbedNet(ntypes, 8, use_tebd_bias=True).to(env.DEVICE)
type_embedding.set_state_dict(paddle.load(str(self.file_type_embed)))

## to save model parameters
# paddle.save(des.state_dict(), 'model_weights.pd')
# paddle.save(type_embedding.state_dict(), 'model_weights.pd')
(
extended_coord,
extended_atype,
mapping,
nlist,
) = extend_input_and_build_neighbor_list(
coord,
atype,
des.get_rcut(),
des.get_sel(),
mixed_types=des.mixed_types(),
box=box,
)
descriptor, env_mat, diff, rot_mat, sw = des(
nlist,
extended_coord,
extended_atype,
type_embedding(extended_atype),
mapping=None,
)
# np.savetxt('tmp.out', descriptor.detach().numpy().reshape(1,-1), delimiter=",")
self.assertEqual(descriptor.shape[-1], des.get_dim_out())
self.assertAlmostEqual(6.0, des.get_rcut())
self.assertEqual(30, des.get_nsel())
self.assertEqual(2, des.get_ntypes())
np.testing.assert_allclose(
descriptor.reshape([-1]).numpy(), self.ref_d.numpy(), atol=1e-10, rtol=1e-10
)


🛠️ Refactor suggestion

Add docstring and fix assert statement.

The test method needs a docstring, and the assert statement should be rewritten to avoid side effects.

Apply these changes:

 def test_descriptor_block(self) -> None:
+    """Test the DescrptBlockSeAtten class functionality.
+    
+    Verifies:
+    - Correct descriptor output dimensions
+    - Proper rcut, nsel, and ntypes values
+    - Descriptor values match reference data
+    """
-    assert "se_atten" == dparams["type"]
+    descriptor_type = dparams["type"]
+    if descriptor_type != "se_atten":
+        raise AssertionError(f"Expected descriptor type 'se_atten', got '{descriptor_type}'")
🧰 Tools
🪛 Ruff (0.8.0)

249-249: Yoda condition detected

Rewrite as dparams["type"] == "se_atten"

(SIM300)

Comment on lines +299 to +364
def test_descriptor(self) -> None:
with open(Path(CUR_DIR) / "models" / "dpa1.json") as fp:
self.model_json = json.load(fp)
model_dpa2 = self.model_json
ntypes = len(model_dpa2["type_map"])
dparams = model_dpa2["descriptor"]
dparams["ntypes"] = ntypes
assert dparams["type"] == "se_atten"
dparams.pop("type")
dparams["concat_output_tebd"] = False
dparams["use_tebd_bias"] = True
des = DescrptDPA1(
**dparams,
).to(env.DEVICE)
target_dict = des.state_dict()
source_dict = paddle.load(str(self.file_model_param))
type_embd_dict = paddle.load(str(self.file_type_embed))
target_dict = translate_se_atten_and_type_embd_dicts_to_dpa1(
target_dict,
source_dict,
type_embd_dict,
)
des.set_state_dict(target_dict)

coord = self.coord
atype = self.atype
box = self.cell
(
extended_coord,
extended_atype,
mapping,
nlist,
) = extend_input_and_build_neighbor_list(
coord,
atype,
des.get_rcut(),
des.get_sel(),
mixed_types=des.mixed_types(),
box=box,
)
descriptor, env_mat, diff, rot_mat, sw = des(
extended_coord,
extended_atype,
nlist,
mapping=mapping,
)
self.assertEqual(descriptor.shape[-1], des.get_dim_out())
self.assertAlmostEqual(6.0, des.get_rcut())
self.assertEqual(30, des.get_nsel())
self.assertEqual(2, des.get_ntypes())
np.testing.assert_allclose(
descriptor.reshape([-1]).numpy(), self.ref_d.numpy(), atol=1e-10, rtol=1e-10
)

dparams["concat_output_tebd"] = True
des = DescrptDPA1(
**dparams,
).to(env.DEVICE)
descriptor, env_mat, diff, rot_mat, sw = des(
extended_coord,
extended_atype,
nlist,
mapping=mapping,
)
self.assertEqual(descriptor.shape[-1], des.get_dim_out())


🛠️ Refactor suggestion

Add docstring and fix assert statement.

The test method needs a docstring, and the assert statement should be rewritten to avoid side effects.

Apply these changes:

 def test_descriptor(self) -> None:
+    """Test the DescrptDPA1 class functionality.
+    
+    Verifies:
+    - Correct descriptor output dimensions with different concat_output_tebd settings
+    - Proper rcut, nsel, and ntypes values
+    - Descriptor values match reference data
+    """
-    assert dparams["type"] == "se_atten"
+    descriptor_type = dparams["type"]
+    if descriptor_type != "se_atten":
+        raise AssertionError(f"Expected descriptor type 'se_atten', got '{descriptor_type}'")

@HydrogenSulfate HydrogenSulfate changed the title from "[WIP] pd: support dpa1" to "pd: support dpa1" Nov 29, 2024
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
deepmd/pd/utils/dataloader.py (1)

345-358: Add documentation for the new sampler utility

The implementation looks good, but would benefit from additional documentation:

  1. Add a docstring describing:
    • Purpose of the function
    • Parameter descriptions
    • Return value description
  2. Consider adding type hints for parameters

Example docstring:

def get_sampler_from_params(_data: DpLoaderSet, _params: dict) -> WeightedRandomSampler:
    """Create a sampler based on the provided parameters.
    
    Args:
        _data: Dataset to sample from
        _params: Dictionary containing sampling parameters:
                - sys_probs: Explicit system probabilities
                - auto_prob: Automatic probability calculation method
    
    Returns:
        WeightedRandomSampler configured according to the parameters
    """
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 56e079c and 3d70e7c.

📒 Files selected for processing (2)
  • deepmd/pd/train/training.py (11 hunks)
  • deepmd/pd/utils/dataloader.py (6 hunks)
🔇 Additional comments (5)
deepmd/pd/utils/dataloader.py (2)

200-200: LGTM: Type annotations enhance code clarity

The added return type annotations improve code maintainability and IDE support without affecting functionality.

Also applies to: 245-245, 251-251, 260-260, 269-269, 277-277


223-237: LGTM: Proper distributed training log handling

The added rank check ensures that summary information is printed only once in distributed training scenarios, preventing duplicate logs. The implementation correctly handles both distributed and non-distributed cases.
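The guard in question boils down to this sketch (`log` is assumed to be the module logger; `paddle.distributed.get_rank()` returns 0 in single-process runs, so the summary is printed exactly once either way):

    import paddle.distributed as dist

    def print_summary_once(summary: str) -> None:
        # rank 0 of a distributed job, or the only process otherwise
        if dist.get_rank() == 0:
            log.info(summary)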

deepmd/pd/train/training.py (3)

675-675: Add error handling for profiling calls.

The profiling calls should be wrapped in a try-finally block to ensure proper cleanup, even if an error occurs.

Also applies to: 895-897
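A minimal sketch of the suggested try/finally wrapping; `core.nvprof_start`/`core.nvprof_stop` mirror the Paddle profiler entry points, and `run_step` is a placeholder for the actual training step:

    from paddle.framework import core

    def profiled_step(enable_profiler: bool) -> None:
        if enable_profiler:
            core.nvprof_start()
        try:
            run_step()  # the real training step goes here
        finally:
            if enable_profiler:
                # stop the profiler session even if the step raises
                core.nvprof_stop()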


994-994: LGTM! Good type annotation.

Adding return type annotation improves code readability and type safety.


1121-1123: LGTM! Good type annotation.

Adding return type annotation improves code readability and type safety.

@@ -66,6 +63,7 @@
make_stat_input,
)
from deepmd.pd.utils.utils import (
nvprof_context,
@coderabbitai coderabbitai bot Nov 29, 2024

⚠️ Potential issue

Remove unused import.

The nvprof_context function has been removed from deepmd/pd/utils/utils.py, but it's still being imported here. This will cause runtime import errors.

-    nvprof_context,

HydrogenSulfate (Contributor Author):

nvprof_context is still in deepmd/pd/utils/utils.py


@HydrogenSulfate, thank you for pointing that out. Since nvprof_context is still present in deepmd/pd/utils/utils.py, the import statement in deepmd/pd/train/training.py is valid. I apologize for the confusion.


✏️ Learnings added
Learnt from: HydrogenSulfate
PR: deepmodeling/deepmd-kit#4414
File: deepmd/pd/train/training.py:66-66
Timestamp: 2024-11-29T12:15:22.120Z
Learning: The function `nvprof_context` is defined in `deepmd/pd/utils/utils.py`, so importing it in `deepmd/pd/train/training.py` is correct.



codecov bot commented Nov 29, 2024

Codecov Report

Attention: Patch coverage is 85.08065% with 111 lines in your changes missing coverage. Please review.

Project coverage is 84.09%. Comparing base (a11f264) to head (e74d272).
Report is 1 commit behind head on devel.

Files with missing lines Patch % Lines
deepmd/pd/model/descriptor/se_atten.py 92.81% 27 Missing ⚠️
deepmd/pd/model/descriptor/dpa1.py 83.64% 26 Missing ⚠️
deepmd/pd/train/training.py 42.85% 16 Missing ⚠️
deepmd/pd/model/descriptor/se_a.py 53.12% 15 Missing ⚠️
deepmd/pd/model/network/layernorm.py 88.52% 7 Missing ⚠️
deepmd/pd/model/task/fitting.py 66.66% 7 Missing ⚠️
deepmd/pd/model/model/ener_model.py 0.00% 4 Missing ⚠️
deepmd/pd/entrypoints/main.py 0.00% 2 Missing ⚠️
deepmd/pd/utils/dataloader.py 87.50% 2 Missing ⚠️
deepmd/pd/utils/utils.py 80.00% 2 Missing ⚠️
... and 3 more
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #4414      +/-   ##
==========================================
+ Coverage   83.74%   84.09%   +0.34%     
==========================================
  Files         667      670       +3     
  Lines       61534    62111     +577     
  Branches     3486     3487       +1     
==========================================
+ Hits        51534    52230     +696     
+ Misses       8874     8754     -120     
- Partials     1126     1127       +1     


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
deepmd/pd/train/training.py (2)

879-883: Add error handling for loss values in tensorboard logging.

Consider adding error handling for potential NaN/Inf values before logging to tensorboard. Note that `.item()` returns a Python float, so the check should use the standard library's `math.isfinite` (with `import math` at the top of the module) rather than `paddle.isfinite`, which expects a tensor.

-                writer.add_scalar(f"{task_key}/loss", loss.item(), display_step_id)
+                loss_value = loss.item()
+                if math.isfinite(loss_value):
+                    writer.add_scalar(f"{task_key}/loss", loss_value, display_step_id)
+                else:
+                    log.warning(f"Skipping tensorboard logging for {task_key}/loss due to non-finite value")

-                for item in more_loss:
-                    writer.add_scalar(
-                        f"{task_key}/{item}", more_loss[item].item(), display_step_id
-                    )
+                for item in more_loss:
+                    item_value = more_loss[item].item()
+                    if math.isfinite(item_value):
+                        writer.add_scalar(f"{task_key}/{item}", item_value, display_step_id)
+                    else:
+                        log.warning(f"Skipping tensorboard logging for {task_key}/{item} due to non-finite value")

1207-1229: Improve error message clarity in case embedding validation.

The case embedding configuration is well-implemented, but the error message could be more descriptive.

-            f"All models must have the same dimension of case embedding, while the settings are: {numb_case_embd_list}"
+            f"Inconsistent case embedding dimensions across models. Found dimensions {dict(zip(sorted_model_keys, numb_case_embd_list))}. All models must have the same dimension."
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 3d70e7c and 8b5b4a8.

📒 Files selected for processing (4)
  • deepmd/pd/entrypoints/main.py (3 hunks)
  • deepmd/pd/train/training.py (17 hunks)
  • deepmd/pd/train/wrapper.py (2 hunks)
  • source/tests/pd/model/test_autodiff.py (0 hunks)
💤 Files with no reviewable changes (1)
  • source/tests/pd/model/test_autodiff.py
🧰 Additional context used
📓 Learnings (1)
deepmd/pd/train/training.py (1)
Learnt from: HydrogenSulfate
PR: deepmodeling/deepmd-kit#4414
File: deepmd/pd/train/training.py:66-66
Timestamp: 2024-11-29T12:15:22.226Z
Learning: The function `nvprof_context` is defined in `deepmd/pd/utils/utils.py`, so importing it in `deepmd/pd/train/training.py` is correct.
🔇 Additional comments (7)
deepmd/pd/train/wrapper.py (2)

29-29: LGTM: Return type annotation added

Adding explicit return type annotation improves type safety and code clarity.


67-73: ⚠️ Potential issue

Critical: Parameter sharing functionality removed without replacement

The removal of the share_params implementation and replacement with NotImplementedError is concerning because:

  1. The docstring indicates this method is crucial for parameter sharing during multitask training
  2. The class still supports multitask training (via self.multi_task flag and task-specific routing in forward)
  3. The PR objectives don't mention the intention to remove this functionality

This could break multitask training scenarios that depend on parameter sharing.

Let's verify the impact:

Please clarify:

  1. Is this temporary removal part of the planned changes?
  2. Are there alternative mechanisms for parameter sharing?
  3. What's the timeline for implementing this functionality?
deepmd/pd/entrypoints/main.py (2)

233-233: LGTM: Type annotations improve code clarity

The addition of return type annotations to public functions enhances code maintainability and type safety. This follows Python best practices and helps with static type checking.

Also applies to: 343-343, 394-394


324-328: LGTM: Proper tensor conversion with device placement

The conversion of min_nbor_dist to Paddle tensors with explicit dtype and device placement is implemented correctly for both single and multi-task models. This ensures proper handling of neighbor distances, which is crucial for DPA-1 support.
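The pattern the review refers to is roughly this sketch, assuming `min_nbor_dist` arrives as a Python float from the data statistics and that `float64` matches the model precision:

    import paddle
    from deepmd.pd.utils.env import DEVICE

    min_nbor_dist = paddle.to_tensor(
        min_nbor_dist, dtype=paddle.float64, place=DEVICE
    )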

Let's verify the tensor conversion works correctly across different devices:

Also applies to: 331-335

✅ Verification successful

Based on the shell script results, I can now generate the final response:

Tensor conversion with device placement is implemented consistently

The tensor conversion of min_nbor_dist is implemented correctly with proper device placement using DEVICE from deepmd.pd.utils.env. The codebase shows consistent usage of device placement across tensor operations, with DEVICE being properly imported and used for tensor allocations.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if DEVICE is properly imported and used across the codebase
# Test: Search for DEVICE usage to ensure consistent device placement
rg -A 2 "from deepmd\.pd\.utils\.env import DEVICE"

# Test: Check if there are any direct device assignments that might conflict
rg "place=[^D]|device=[^D]"

# Test: Verify tensor conversion patterns for consistency
ast-grep --pattern 'paddle.to_tensor($_, dtype=$_, place=$_)'

Length of output: 98480

deepmd/pd/train/training.py (3)

88-91: LGTM! Documentation improvements.

The added docstring and return type annotations improve code clarity and type safety.

Also applies to: 109-109


700-703: LGTM! Robust gradient clipping implementation.

The gradient clipping implementation with error_if_nonfinite=True helps prevent training instability from exploding gradients.
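For reference, the clipping call looks like this sketch (`model` and `max_grad_norm` are assumptions):

    import paddle

    grad_norm = paddle.nn.utils.clip_grad_norm_(
        model.parameters(),
        max_norm=max_grad_norm,
        error_if_nonfinite=True,  # raise instead of silently propagating NaN/Inf gradients
    )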


1188-1203: LGTM! Robust model wrapper implementation.

The model wrapper implementation properly handles both single and multi-task models, with correct case embedding initialization and state management.

@njzjz njzjz (Member) left a comment

Please add consistent tests. (please note my changes in #4438)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
source/tests/consistent/model/test_dpa1.py (1)

248-255: Consider using ravel() for consistency

The Paddle backend uses flatten() while the other backends use ravel(). Both produce the same flattened values (for NumPy arrays, ravel() additionally avoids a copy when possible), so switching to ravel() keeps the return-value extraction consistent across all backends.

-                ret["energy"].flatten(),
-                ret["atom_energy"].flatten(),
-                ret["force"].flatten(),
-                ret["virial"].flatten(),
-                ret["atom_virial"].flatten(),
+                ret["energy"].ravel(),
+                ret["atom_energy"].ravel(),
+                ret["force"].ravel(),
+                ret["virial"].ravel(),
+                ret["atom_virial"].ravel(),
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 8b5b4a8 and d98c644.

📒 Files selected for processing (2)
  • source/tests/consistent/descriptor/test_dpa1.py (5 hunks)
  • source/tests/consistent/model/test_dpa1.py (7 hunks)
🔇 Additional comments (11)
source/tests/consistent/model/test_dpa1.py (6)

17-17: LGTM: Import follows consistent pattern

The addition of INSTALLED_PD import follows the established pattern for backend availability flags.


41-45: LGTM: Paddle imports follow established pattern

The conditional import block for Paddle models follows the same structure as other backends, with proper fallback handling.


99-99: LGTM: Class attribute follows convention

The addition of pd_class follows the established pattern for backend class assignments.


112-113: LGTM: Reference backend selection properly extended

The addition of Paddle backend to reference selection follows the established priority order and error handling pattern.


131-132: LGTM: Model initialization properly handled

The Paddle model initialization in pass_data_to_cls follows the established pattern.


204-211: LGTM: Evaluation method follows convention

The eval_pd method correctly implements the standard evaluation interface used by other backends.

source/tests/consistent/descriptor/test_dpa1.py (5)

21-21: LGTM: Import changes follow established patterns

The import changes correctly follow the same pattern used for other backends, with proper conditional handling when Paddle is not installed.

Also applies to: 43-46


195-221: LGTM: Skip condition matches other backends

The skip_pd property correctly implements the same skip logic used by other backends, properly handling both common skip conditions and specific attention layer tests.


323-323: LGTM: Class assignment follows convention

The pd_class assignment follows the established pattern used for other backend implementations.


424-432: LGTM: Evaluation method matches other backends

The eval_pd method correctly implements the evaluation logic consistent with other backend implementations, properly passing all required parameters to the descriptor evaluation method.


43-46: Verify test coverage for Paddle backend

While the implementation looks correct, let's verify that all test cases will be properly executed for the Paddle backend.

Also applies to: 195-221, 323-323, 424-432

✅ Verification successful

Let me gather more information to verify the test coverage specifically for the Paddle backend.


Let me try one more verification to understand the test execution flow and Paddle backend handling.


Test coverage for Paddle backend is properly implemented

The verification shows that the Paddle backend test coverage is comprehensive and well-implemented:

  1. The test class TestDPA1 includes proper handling of the Paddle backend through:
    • Conditional import of DescrptDPA1PD
    • A dedicated skip_pd property that controls test execution
    • Implementation of eval_pd method that properly evaluates Paddle descriptors
  2. The test infrastructure in common.py includes complete Paddle-specific utilities:
    • Proper device handling with PD_DEVICE
    • Paddle-specific neighbor list building
    • Tensor conversion and evaluation methods
  3. The test parameterization covers all relevant scenarios, and Paddle tests are skipped only when appropriate (e.g., when specific features are incompatible)

The implementation follows the same pattern as the other backends (TF, PyTorch, JAX), ensuring consistent test coverage across all supported frameworks.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check test coverage for Paddle backend implementation

# Check if there are any test cases that might be inadvertently skipped
echo "Checking for potential gaps in test coverage..."

# Look for test methods that might not handle pd backend
rg -l "def test_.*" source/tests/consistent/descriptor/
rg "(?<!skip_)pd" source/tests/consistent/descriptor/

# Check for any TODO or FIXME comments related to pd implementation
rg -i "todo.*pd|fixme.*pd" source/tests/consistent/descriptor/

# Look for any inconsistencies in backend handling
rg "(tf|pt|jax|pd)_(class|obj)" source/tests/consistent/descriptor/

Length of output: 7067


Script:

#!/bin/bash
# Check test files for Paddle backend implementation and test methods
rg -A 5 "class.*Test" source/tests/consistent/descriptor/test_dpa1.py
rg "def test_" source/tests/consistent/descriptor/test_dpa1.py

# Check if there are any skip decorators or conditional test executions
rg "@.*skip" source/tests/consistent/descriptor/test_dpa1.py
rg "skipif|pytest.mark.skip" source/tests/consistent/descriptor/test_dpa1.py

# Check the test base class to understand test execution flow
rg -A 10 "class.*Base" source/tests/consistent/descriptor/common.py

Length of output: 514


Script:

#!/bin/bash
# Check the base test class implementation and test execution flow
cat source/tests/consistent/descriptor/common.py

# Check the complete test file for DPA1
cat source/tests/consistent/descriptor/test_dpa1.py

# Look for any conditional test execution patterns
rg -l "INSTALLED_PD" source/tests/consistent/descriptor/

Length of output: 20080

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
source/tests/consistent/model/test_dpa1.py (2)

112-113: Document the backend priority order

While the PD backend is correctly added to the reference backend selection, the priority order of backends (PT > TF > PD > JAX > DP) is not documented. Consider adding a comment explaining why this specific order was chosen.
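A sketch of how the chain could be annotated in place; the property and attribute names mirror the test base class but are assumptions here:

    @property
    def reference_backend(self):
        """Reference backend priority: PT > TF > PD > JAX > DP."""
        if not self.skip_pt:
            return self.RefBackend.PT
        if not self.skip_tf:
            return self.RefBackend.TF
        if not self.skip_pd:
            return self.RefBackend.PD
        if not self.skip_jax:
            return self.RefBackend.JAX
        return self.RefBackend.DP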


248-255: Consider using ravel() for consistency

While the return value extraction is functionally correct, it uses flatten() while other backends use ravel(). Although both methods achieve similar results, consider using ravel() for consistency with other backend implementations.

Apply this diff to maintain consistency:

        elif backend is self.RefBackend.PD:
            return (
-                ret["energy"].flatten(),
-                ret["atom_energy"].flatten(),
-                ret["force"].flatten(),
-                ret["virial"].flatten(),
-                ret["atom_virial"].flatten(),
+                ret["energy"].ravel(),
+                ret["atom_energy"].ravel(),
+                ret["force"].ravel(),
+                ret["virial"].ravel(),
+                ret["atom_virial"].ravel(),
            )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between d98c644 and e74d272.

📒 Files selected for processing (2)
  • source/tests/consistent/descriptor/test_dpa1.py (5 hunks)
  • source/tests/consistent/model/test_dpa1.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • source/tests/consistent/descriptor/test_dpa1.py
🔇 Additional comments (4)
source/tests/consistent/model/test_dpa1.py (4)

17-17: LGTM: Backend initialization follows established patterns

The PD backend initialization follows the same pattern as other backends, with proper conditional imports and fallback handling.

Also applies to: 41-45


99-99: LGTM: Class attribute follows convention

The pd_class attribute is correctly added, maintaining consistency with other backend class attributes.


131-132: LGTM: Model instantiation is consistent

The PD model instantiation follows the established pattern using the appropriate get_model_pd function.


204-211: LGTM: Evaluation method is properly implemented

The eval_pd method correctly follows the established pattern for backend evaluation methods, passing all required parameters.

@HydrogenSulfate
Contributor Author

HydrogenSulfate commented Nov 30, 2024

Please add consistent tests. (please note my changes in #4438)

Thanks for the comments. I updated the two consistency UTs for DPA-1:
=========== 346 passed, 446 skipped, 1 warning in 48.38s =============

@njzjz njzjz requested a review from iProzd November 30, 2024 20:31