[SPARK] Improvement for DecimalType serialization in toTColumn for column-based TRowSet generation #5810
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #5810 +/- ##
============================================
- Coverage 61.35% 61.34% -0.01%
Complexity 23 23
============================================
Files 608 608
Lines 35931 35945 +14
Branches 4937 4942 +5
============================================
+ Hits 22045 22050 +5
- Misses 11494 11495 +1
- Partials 2392 2400 +8
Nice catch!
Is this still necessary after #5811 is fixed?
I think yes. This is skipping falling through to
I think this might not be necessary if the RowSet.toHiveString performance issue is fixed, since they all just convert the decimal to a string.
The implementation in this PR hints Java's BigDecimal as the target class, so the value is fetched as a BigDecimal directly. The default value is also applied directly to nulls.
Seems that the answer should be NO.
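The difference discussed in the comments above can be illustrated with a small, language-agnostic sketch. It is written in Python rather than the project's own language and is not Kyuubi's code: `to_display_string`, `decimal_column_generic`, `decimal_column_direct`, the empty-string default, and the bitmask layout are all hypothetical names and choices made for illustration. The generic path inspects the runtime type of every value, while the direct path assumes the column is already known to hold decimals, so it reads each value as a decimal and writes the default for nulls without any per-value dispatch.

```python
from decimal import Decimal
from typing import Any, List, Optional, Sequence, Tuple

Row = Sequence[Optional[Decimal]]


def to_display_string(value: Any) -> str:
    """Hypothetical generic formatter: dispatches on the runtime type of each value."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, Decimal):
        return format(value, "f")  # plain notation, no exponent
    return str(value)


def decimal_column_generic(rows: Sequence[Row], ordinal: int, default: str = "") -> List[str]:
    """Generic path: every non-null value goes through the type dispatch above."""
    return [default if r[ordinal] is None else to_display_string(r[ordinal]) for r in rows]


def decimal_column_direct(
    rows: Sequence[Row], ordinal: int, default: str = ""
) -> Tuple[List[str], bytes]:
    """Direct path: the column type is known up front, so no per-value dispatch.

    Returns the string values plus an illustrative null bitmask
    (bit i of the mask is set when row i is null).
    """
    nulls = bytearray((len(rows) + 7) // 8)
    values: List[str] = []
    for i, row in enumerate(rows):
        v = row[ordinal]
        if v is None:
            nulls[i // 8] |= 1 << (i % 8)
            values.append(default)  # default applied directly to nulls
        else:
            values.append(format(v, "f"))
    return values, bytes(nulls)


rows = [(Decimal("12.340"),), (None,), (Decimal("1E+3"),)]
values, nulls = decimal_column_direct(rows, 0)
print(values, nulls)
```

Both paths produce the same strings; the direct one simply avoids the per-value type checks, which is the kind of saving the PR description attributes to skipping the wrapping.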
🔍 Description
Issue References 🔗
Subtask of #5808.
Describe Your Solution 🔧
Improvement for DecimalType serialization in toTColumn for column-based TRowSet generation, by skipping type wrapping inside HiveResult.toHiveString.
Types of changes 🔖
Test Plan 🧪
Behavior Without This Pull Request ⚰️
Column-based TRowSet generation with 3000 rows:
decimalVal 8.935 ms 335.758 rows/ms
Behavior With This Pull Request 🎉
decimalVal 1.135 ms 2643.172 rows/ms
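As a quick sanity check on the two measurements, the rows/ms figures follow from dividing the 3000 rows by the elapsed time, and the ratio of the two timings gives the speedup. The snippet below only redoes that arithmetic from the numbers reported here; the `throughput` helper is a hypothetical name, not part of the benchmark.

```python
ROWS = 3000  # row count stated in the test plan


def throughput(rows: int, elapsed_ms: float) -> float:
    """Rows processed per millisecond."""
    return rows / elapsed_ms


before_ms, after_ms = 8.935, 1.135  # reported timings for decimalVal
before = throughput(ROWS, before_ms)
after = throughput(ROWS, after_ms)
speedup = before_ms / after_ms

print(f"{before:.3f} rows/ms -> {after:.3f} rows/ms ({speedup:.2f}x)")
# prints: 335.758 rows/ms -> 2643.172 rows/ms (7.87x)
```

So the reported figures are internally consistent and correspond to roughly a 7.9x improvement for this column type.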
Related Unit Tests
Spark Engine's RowSetSuite
Checklists
📝 Author Self Checklist
📝 Committer Pre-Merge Checklist