GH-37728: [Java] Add methods to get an Iterable for a ValueVector #41895
I suppose eventually we can add typed/generic versions? (I suppose we'd have to write out overloads per class so that `IntVector` returns `Iterator<Integer>`, etc. Or else accept `Class<T>` and assert at runtime that the vector is of the right type.)
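The `Class<T>` option mentioned above could look roughly like the sketch below. It is only an illustration: `SimpleVector`, `IntColumn`, and `iterable` are hypothetical stand-ins rather than Arrow classes or methods, and the sketch checks each element as it is read instead of checking the vector's type up front.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Sketch only: the nested types are hypothetical stand-ins, not Arrow classes. */
final class TypedIterableSketch {

  /** Hypothetical minimal vector: a value count plus boxed element access. */
  interface SimpleVector {
    int getValueCount();

    Object getObject(int index);
  }

  /** Hypothetical int-backed vector whose elements box to Integer. */
  static final class IntColumn implements SimpleVector {
    private final int[] values;

    IntColumn(int... values) {
      this.values = values.clone();
    }

    @Override
    public int getValueCount() {
      return values.length;
    }

    @Override
    public Object getObject(int index) {
      return values[index]; // autoboxed to Integer
    }
  }

  /** Returns a typed view of the vector; every element is checked against {@code type}. */
  static <T> Iterable<T> iterable(SimpleVector vector, Class<T> type) {
    return () -> new Iterator<T>() {
      private int index = 0;

      @Override
      public boolean hasNext() {
        return index < vector.getValueCount();
      }

      @Override
      public T next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        // Class.cast throws ClassCastException when the element has the wrong type.
        return type.cast(vector.getObject(index++));
      }
    };
  }
}
```

With this shape, `TypedIterableSketch.iterable(new TypedIterableSketch.IntColumn(1, 2, 3), Integer.class)` can be used directly in a for-each loop, while asking for `String.class` on the same column fails with a `ClassCastException` on the first element read. The trade-off against per-class overloads is that the mismatch is only detected at runtime.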
@lidavidm We can. I view this as functionality that is meant to only be used by the unit tests. It could be enhanced to support the Iterator type matching the ValueVector type. For that I would favour adding the type information to the ValueVector classes and using that to determine the type of the Iterator.
The `Iterator` and `Iterable` are publicly available. Is there anything more that can be done to discourage their use in Arrow applications?
Are you looking for properly typed `Iterator`s in this PR or should that wait for a future PR?
If you mean for this to only be available in unit tests then it should be in a distinct module.
@lidavidm I have updated this. There is a new interface that a `ValueVector` can implement. The new interface provides methods for getting an `Iterator` or `Iterable`. The new interface also allows the value type to be specified so that the `Iterator` will be properly typed.
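As a rough sketch of the design described above, an opt-in generic interface can carry the value type and supply the iterator through default methods. All names here (`SimpleVector`, `IterableVector`, `valueIterator`, `valueIterable`, `TextColumn`) are hypothetical stand-ins and are not taken from the PR's code.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Sketch only: these types are hypothetical stand-ins, not the interface added in the PR. */
final class IterableVectorSketch {

  /** Hypothetical base vector contract. */
  interface SimpleVector {
    int getValueCount();

    Object getObject(int index);
  }

  /** Opt-in interface; it extends the base contract so the accessors are not redeclared. */
  interface IterableVector<T> extends SimpleVector {

    /** Walks the vector from index 0 up to the value count. */
    default Iterator<T> valueIterator() {
      return new Iterator<T>() {
        private int index = 0;

        @Override
        public boolean hasNext() {
          return index < getValueCount();
        }

        @Override
        @SuppressWarnings("unchecked")
        public T next() {
          if (!hasNext()) {
            throw new NoSuchElementException();
          }
          // Unchecked cast: correctness relies on the implementer declaring the right T.
          return (T) getObject(index++);
        }
      };
    }

    /** Adapts the iterator for use in for-each loops. */
    default Iterable<T> valueIterable() {
      return this::valueIterator;
    }
  }

  /** Hypothetical string-backed vector that declares its value type once. */
  static final class TextColumn implements IterableVector<String> {
    private final String[] values;

    TextColumn(String... values) {
      this.values = values.clone();
    }

    @Override
    public int getValueCount() {
      return values.length;
    }

    @Override
    public Object getObject(int index) {
      return values[index];
    }
  }
}
```

A vector class then only declares `implements IterableVector<String>` and inherits both methods, so a test can write `for (String s : column.valueIterable())` without an index loop. In this sketch, null slots are returned as `null` elements.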
java/dataset/pom.xml (outdated)
<dependency>
  <groupId>org.hamcrest</groupId>
  <artifactId>hamcrest</artifactId>
  <version>2.2</version>
Can we set the version in dependencyManagement at the root level? I believe a few other modules use hamcrest too.
Updated.
@lidavidm should we upgrade to JUnit 5?
We can do that separately.
 *
 * @return number of values in the vector
 */
int getValueCount();
Can't this extend ValueVector to avoid the repeated declarations?
@lidavidm Updated.
Force-pushed from 9da8f79 to e6c5efa.
Thanks. I think the last thing is just, can we add a test suite that exercises these new methods for each vector?
@lidavidm I have added unit tests for each vector that implements ValueIterableVector.
* The new interface indicates that a ValueVector is iterable
* Contains default methods for getting an Iterator and Iterable
Force-pushed from bff79a2 to 804c0b4.
After merging your PR, Conbench analyzed the 8 benchmarking runs that have been run so far on merge-commit 4413110. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 119 possible false positives for unstable benchmarks that are known to sometimes produce them.
Rationale for this change
Simplify validating the values in a `ValueVector` in unit tests.
What changes are included in this PR?
Methods for creating an `Iterable` and `Iterator` for a `ValueVector`. Also updated some unit tests to use the new methods.
Are these changes tested?
Some unit tests were updated.
Are there any user-facing changes?
The new methods are publicly available in the `ValueVectorUtility` class.