chore: align dataset summary output #4541
📝 Walkthrough
The changes in the
Changes
Sequence Diagram
sequenceDiagram
    participant DS as DeepmdDataSystem
    participant Constructor as __init__
    participant AutoTS as _make_auto_ts
    Constructor->>DS: Initialize with test_size
    alt Percentage test_size
        Constructor->>AutoTS: Compute test size
        AutoTS-->>Constructor: Return calculated test size
    else Fixed test size
        Constructor->>Constructor: Use provided test size
    end
The sequence diagram illustrates how the new
Actionable comments posted: 0
🔭 Outside diff range comments (2)
deepmd/utils/data_system.py (2)
Line range hint 78-106: Add validation for test size percentage range.
The code accepts a percentage-based test size but doesn't validate that the percentage is within the valid range (0-100).
Add validation before the percentage calculation:
         elif isinstance(self.test_size, str):
             words = self.test_size.split("%")
             try:
                 percent = int(words[0])
+                if not 0 <= percent <= 100:
+                    raise ValueError(f"test_size percentage must be between 0 and 100, got {percent}")
             except ValueError as e:
                 raise RuntimeError("unknown test_size rule " + words[0]) from e
             self.test_size = self._make_auto_ts(percent)
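One detail of the suggestion above is worth checking before applying it: the new `ValueError` is raised inside the `try` block, so the `except ValueError` clause would catch it and re-raise it as the generic "unknown test_size rule" `RuntimeError`. A minimal sketch of one way to keep the two failure modes distinct follows. The standalone helper name `parse_test_size_percent` is hypothetical and not part of `deepmd`; it only mirrors the parsing logic shown in the diff.

```python
def parse_test_size_percent(test_size: str) -> int:
    """Parse a string such as "20%" into an integer percentage in [0, 100].

    Hypothetical helper for illustration; not part of deepmd.
    """
    words = test_size.split("%")
    try:
        percent = int(words[0])
    except ValueError as e:
        # Non-numeric input keeps the original error type and message.
        raise RuntimeError("unknown test_size rule " + words[0]) from e
    # The range check sits outside the try block, so its ValueError is not
    # caught and re-raised as the "unknown test_size rule" RuntimeError.
    if not 0 <= percent <= 100:
        raise ValueError(
            f"test_size percentage must be between 0 and 100, got {percent}"
        )
    return percent
```

With this layout, `"20%"` parses to `20`, `"150%"` raises `ValueError`, and `"abc%"` still raises `RuntimeError` as before.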
Line range hint 582-599: Improve deprecation handling of get_test method.
The method is marked as deprecated using a comment, but it should use Python's built-in deprecation warning mechanism.
Apply this improvement:
-    def get_test(self, sys_idx: Optional[int] = None, n_test: int = -1):  # depreciated
+    def get_test(self, sys_idx: Optional[int] = None, n_test: int = -1):
+        """Get test data from the data systems.
+
+        .. deprecated:: 2.0.0
+            Use get_test_data() instead. Will be removed in version 3.0.0.
+        """
+        warnings.warn(
+            "get_test() is deprecated since version 2.0.0 and will be removed in 3.0.0. "
+            "Use get_test_data() instead.",
+            DeprecationWarning,
+            stacklevel=2,
+        )
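To see how the suggested pattern behaves at runtime, here is a small self-contained sketch. The class `DataSystemSketch` is a hypothetical stand-in, and `get_test_data()` as well as the version numbers are taken from the bot's suggestion rather than confirmed against the `deepmd` code base.

```python
import warnings


class DataSystemSketch:
    """Hypothetical stand-in for a data system class; not the deepmd class."""

    def get_test_data(self):
        # Placeholder replacement method named in the review suggestion.
        return {"n_test": 0}

    def get_test(self):
        """Deprecated alias that forwards to get_test_data()."""
        warnings.warn(
            "get_test() is deprecated. Use get_test_data() instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return self.get_test_data()


def call_and_collect():
    """Call the deprecated method and return (result, recorded warnings)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # DeprecationWarning is often hidden by default
        result = DataSystemSketch().get_test()
    return result, caught
```

`stacklevel=2` makes the warning point at the code that called `get_test()`, which is usually the line a user needs to change.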
🧹 Nitpick comments (2)
deepmd/utils/data_system.py (2)
678-678: Fix string formatting in print_summary.
The string formatting has been adjusted for better readability, but the line contains trailing whitespace.
Remove the trailing whitespace:
-                "%s %6d %6d %6d %9.3e %3s",
+                "%s %6d %6d %6d %9.3e %3s".rstrip(),
Line range hint 551-558: Improve _make_auto_ts implementation.
The method needs better documentation and edge case handling.
Apply these improvements:
     def _make_auto_ts(self, percent):
+        """Calculate test size for each system based on percentage.
+
+        Parameters
+        ----------
+        percent : int
+            Percentage of total samples to use for testing (0-100)
+
+        Returns
+        -------
+        list
+            List of test sizes for each system
+        """
         ts = []
         for ii in range(self.nsystems):
             ni = self.batch_size[ii] * self.nbatches[ii]
             tsi = int(ni * percent / 100)
+            if tsi == 0 and percent > 0:
+                warnings.warn(f"System {ii} has too few samples, setting minimum test size to 1")
+                tsi = 1
             ts.append(tsi)
         return ts
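The arithmetic in the suggestion can be tried outside the class. The sketch below assumes a free function `make_auto_ts` (a hypothetical name) that takes the per-system `batch_size` and `nbatches` lists as plain arguments instead of reading them from `self`.

```python
import warnings


def make_auto_ts(batch_size, nbatches, percent):
    """Return one test size per system, given a percentage of its samples.

    Hypothetical free-function version of the method in the diff above.
    """
    ts = []
    for ii, (bs, nb) in enumerate(zip(batch_size, nbatches)):
        ni = bs * nb  # total samples in system ii
        tsi = int(ni * percent / 100)  # truncates toward zero
        if tsi == 0 and percent > 0:
            # Small systems would otherwise get no test samples at all.
            warnings.warn(
                f"System {ii} has too few samples, setting minimum test size to 1"
            )
            tsi = 1
        ts.append(tsi)
    return ts


# Two systems: 4 * 25 = 100 samples and 1 * 3 = 3 samples, 10% each.
sizes = make_auto_ts([4, 1], [25, 3], 10)
```

Here the first system yields 10 test samples, while the second truncates to 0 and is raised to 1 with a warning, so `sizes` is `[10, 1]`.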
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
deepmd/utils/data_system.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Test C++ (false)
- GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Test C++ (true)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #4541      +/-   ##
==========================================
- Coverage   84.58%   84.55%   -0.04%
==========================================
  Files         675      677       +2
  Lines       63580    63902     +322
  Branches     3488     3487       -1
==========================================
+ Hits        53782    54034     +252
- Misses       8673     8742      +69
- Partials     1125     1126       +1
☔ View full report in Codecov by Sentry.
Summary by CodeRabbit
New Features
Deprecation
get_test method as deprecated
Improvements