ESQL: Fixed length vector builder #99970

Merged
merged 3 commits into elastic:main from esql_fixed_builder on Sep 27, 2023

Conversation

Member

@nik9000 nik9000 commented Sep 27, 2023

This adds things like `IntVector.FixedBuilder`, which is slightly simpler to use than constructing the arrays by hand. It also accounts for the bytes used up front in the circuit breaker, and it will be easier to integrate into the framework being built in #99931 to handle errors in topn.

This also uses it in the `mv_` functions.
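
To illustrate the idea of a builder that never grows and charges the circuit breaker before allocating, here is a minimal sketch. `SketchBreaker` and `FixedIntBuilderSketch` are hypothetical names invented for this example, not the classes added by this PR, and the byte estimate is deliberately simplified (it ignores object and array headers).

```java
// Hypothetical stand-in for a circuit breaker: tracks reserved bytes against a limit.
final class SketchBreaker {
    private final long limit;
    private long used;

    SketchBreaker(long limit) {
        this.limit = limit;
    }

    // Reserve bytes, or fail before any memory is allocated.
    void reserve(long bytes) {
        if (used + bytes > limit) {
            throw new IllegalStateException("would use " + (used + bytes) + " bytes, limit is " + limit);
        }
        used += bytes;
    }

    long used() {
        return used;
    }
}

// Hypothetical fixed-size builder: the backing array is sized once and never grows.
final class FixedIntBuilderSketch {
    private final int[] values;
    // The next position to write into; -1 means the vector has already been built.
    private int nextIndex;

    FixedIntBuilderSketch(int size, SketchBreaker breaker) {
        // Simplified estimate (an assumption of this sketch): payload bytes only.
        breaker.reserve((long) size * Integer.BYTES);
        this.values = new int[size];
    }

    FixedIntBuilderSketch appendInt(int value) {
        if (nextIndex < 0) {
            throw new IllegalStateException("already built");
        }
        values[nextIndex++] = value;
        return this;
    }

    int[] build() {
        if (nextIndex < 0) {
            throw new IllegalStateException("already built");
        }
        if (nextIndex != values.length) {
            throw new IllegalStateException("expected " + values.length + " values but got " + nextIndex);
        }
        nextIndex = -1;
        return values;
    }
}
```

Because the reservation happens in the constructor, a request that would exceed the limit fails before the array is allocated rather than partway through filling it.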

@elasticsearchmachine
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine
Collaborator

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

Contributor

@ChrisHegarty ChrisHegarty left a comment

LGTM

    return size == 1
        ? ConstantIntVector.RAM_BYTES_USED
        : IntArrayVector.BASE_RAM_BYTES_USED + RamUsageEstimator.alignObjectSize(
            (long) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER + size * Integer.BYTES
Contributor

This is fine. Separately, we should update `IntArrayVector::ramBytesEstimated` so that it is based on an array length rather than an array. Then it can be used here as well as for the built vector. (I think you made a similar comment on the original PR.)

Member Author

I did! I see what you mean though, we can make these static methods on the classes. Will check.
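
A rough sketch of what a length-based estimate could look like follows. `RamEstimateSketch` and its constants are assumptions made for illustration; the real values come from `RamUsageEstimator` and depend on the JVM. The sketch widens `length` to `long` before multiplying so that the product cannot overflow `int`.

```java
final class RamEstimateSketch {
    // Assumed layout constants, for illustration only.
    static final long ARRAY_HEADER_BYTES = 16;
    static final long OBJECT_ALIGNMENT = 8;
    // Hypothetical shallow size of the vector object itself.
    static final long BASE_VECTOR_BYTES = 24;

    // Round a size up to the next multiple of the object alignment.
    static long alignObjectSize(long size) {
        return (size + OBJECT_ALIGNMENT - 1) / OBJECT_ALIGNMENT * OBJECT_ALIGNMENT;
    }

    // Estimate from a length alone, so a builder can call it before allocating.
    static long ramBytesEstimated(int length) {
        return BASE_VECTOR_BYTES + alignObjectSize(ARRAY_HEADER_BYTES + (long) length * Integer.BYTES);
    }

    // The array-based variant delegates, so both callers share one formula.
    static long ramBytesEstimated(int[] values) {
        return ramBytesEstimated(values.length);
    }
}
```

With one static method taking a length, the builder and the built vector would report the same number by construction.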

@@ -38,7 +38,7 @@ public String name() {
     public Block evalNullable(Block fieldVal) {
         DoubleBlock v = (DoubleBlock) fieldVal;
         int positionCount = v.getPositionCount();
-        DoubleBlock.Builder builder = DoubleBlock.newBlockBuilder(positionCount);
+        DoubleBlock.Builder builder = DoubleBlock.newBlockBuilder(positionCount, driverContext.blockFactory());
Contributor

Noice!!!

Contributor

@alex-spies alex-spies left a comment

LGTM, only optional comments; and cool to see more circuit breakery in action!

/**
* A builder that never grows.
*/
sealed interface FixedBuilder extends Vector.Builder permits BooleanVectorFixedBuilder {
Contributor

Sealed interfaces, nice!

return this;
}

private static long size(int size) {
Contributor

nit: I got confused for a moment here because of the names. Maybe something like this would be clearer?

Suggested change
- private static long size(int size) {
+ private static long size(int length) {

(This could also be applied to the constructor.) Alternatively:

    private static long bytesUsed(int size)

Member Author

Ah! I see. I think `size` is quite reasonable for the number of elements. But `ramBytesUsed` is actually the name we use for this in other places, so I'll steal that.

* The next byte to write into. {@code -1} means the vector has already
* been built.
*/
private int i;
Contributor

nit:

Suggested change
- private int i;
+ private int nextIndex;

private final BlockFactory blockFactory;
private final double[] values;
/**
* The next byte to write into. {@code -1} means the vector has already
Contributor

Nit: this is not always a byte for all data types.

Suggested change
- * The next byte to write into. {@code -1} means the vector has already
+ * The next position to write into. {@code -1} means the vector has already

Comment on lines 50 to 51
// vectorBuilder(10, blockFactory).close();
// assertThat(blockFactory.breaker().getUsed(), equalTo(0L));
Contributor

leftovers? Test doesn't seem to do anything.

Member Author

Yup, leftover. When I port this over to my tracking branch, that test will need something. I'll update.
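
The commented-out assertion suggests the intended check: closing a builder that was never built should return the breaker to zero. A minimal sketch of that behaviour follows, assuming a builder that releases its reservation on `close()` unless `build()` has handed the memory off. `ReleasableBreakerSketch` and `ClosableFixedBuilderSketch` are hypothetical names, not the PR's classes, and the byte estimate counts payload only.

```java
// Hypothetical breaker that supports both reserving and releasing bytes.
final class ReleasableBreakerSketch {
    private long used;

    void reserve(long bytes) {
        used += bytes;
    }

    void release(long bytes) {
        used -= bytes;
    }

    long used() {
        return used;
    }
}

// Hypothetical builder that gives its reservation back if closed before build().
final class ClosableFixedBuilderSketch implements AutoCloseable {
    private final ReleasableBreakerSketch breaker;
    private final long reservedBytes;
    private final double[] values;
    private boolean built;
    private boolean closed;

    ClosableFixedBuilderSketch(int size, ReleasableBreakerSketch breaker) {
        this.breaker = breaker;
        // Simplified estimate (an assumption of this sketch): payload bytes only.
        this.reservedBytes = (long) size * Double.BYTES;
        breaker.reserve(reservedBytes);
        this.values = new double[size];
    }

    double[] build() {
        // After build(), the caller owns the array and its reserved bytes.
        built = true;
        return values;
    }

    @Override
    public void close() {
        if (closed) {
            return;
        }
        closed = true;
        if (built == false) {
            breaker.release(reservedBytes);
        }
    }
}
```

A test in the spirit of the leftover lines would then construct a builder of size 10, close it without building, and assert that the breaker reports zero bytes used.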

@nik9000 nik9000 added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Sep 27, 2023
@elasticsearchmachine elasticsearchmachine merged commit dd1cb82 into elastic:main Sep 27, 2023
@nik9000 nik9000 deleted the esql_fixed_builder branch September 27, 2023 19:18
Member Author

nik9000 commented Sep 27, 2023

@luigidellaquila would you be willing to drag this thing into more places? Like, maybe the conversion functions and evals?

piergm pushed a commit to piergm/elasticsearch that referenced this pull request Oct 2, 2023
Labels
:Analytics/ES|QL (AKA ESQL)
auto-merge-without-approval (Automatically merge pull request when CI checks pass; NB doesn't wait for reviews!)
>non-issue
Team:QL ((Deprecated) Meta label for query languages team)
v8.11.0
4 participants