Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not allocate task memory for VALUES/metadata-only queries #19027

Merged
merged 2 commits into from
Sep 14, 2023

Conversation

losipiuk
Copy link
Member

@losipiuk losipiuk commented Sep 13, 2023

If query does not scan any tables or only accesses information_schema do assume its tasks do not requrie any memory when allocating node for task execution.

That way metadata only queries can execute even if cluster is fully pre allocated.

Description

Additional context and related issues

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

}

// If fragment source is not reading any external tables or only accesses information_schema assume it does not need significant amount of memory.
// Allow scheduling even if whole server memory is pre allocated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<3

@findepi
Copy link
Member

findepi commented Sep 13, 2023

information_schema do assume its tasks do not requrie any memory when allocating node for task execution.

to be precise, some memory is allocated (list of tables, columns of view definitions)
but it's not accounted for, afaict

Comment on lines +1200 to +1206
if (!fragment.getRemoteSourceNodes().stream().flatMap(node -> node.getSourceFragmentIds().stream())
.allMatch(sourceFragmentId -> stageExecutions.get(getStageId(sourceFragmentId)).isNoMemoryFragment())) {
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why remote sources matter?
shouldn't isNoMemoryFragment(PlanFragment) be a local assessment about the fragment?

is this is about whole execution subtree, let's rename the method to indicate that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is about local fragment. It just uses this bit of information for source stages as a shortcut for "source will produce negligible amount of data". Which is a simplification - but not that much given logic we use to set this bit. I will add a comment.

}

// If fragment source is not reading any external tables or only accesses information_schema assume it does not need significant amount of memory.
// Allow scheduling even if whole server memory is pre allocated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will we schedule unbounded number of information_schema queries? there must be some sort of a limit for them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do not have resource queues in place yes. But one can argue this is a mis-setup anyway - and very much possible in streaming mode right now.

StageExecution execution = new StageExecution(
queryStateMachine,
taskDescriptorStorage,
stage,
taskSource,
sinkPartitioningScheme,
exchange,
memoryEstimatorFactory.createPartitionMemoryEstimator(),
noMemoryFragment,
noMemoryFragment ? new NoMemoryPartitionMemoryEstimator() : memoryEstimatorFactory.createPartitionMemoryEstimator(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the standard estimator be modified to more adequatly estimate memory usage of information_schema queries?
like

  • if source stage and this is information_schema assume 0 memory need and ~0 (negligible) number of rows produced
  • for intermediate stage, and sources produce ~0 rows, assume 0 memory need

(actually we should get rid of intermediate stages for DESCRIBE, it's nonsense)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having smarter memory estimation in general is a planned item - but much more complex than this fix. And will require some of a restructuring of interfaces. I think this is fine to have this as a stopgap for sake of metadata queries which are a very common outlier even for clusters which tend to run massing ETL workloads.

Or maybe I did not understand your suggestion here. Let's chat tomorrow.

Future<?> nonMetadataQueryFuture = backgroundExecutor.submit(() -> {
query("select count(*) from nation");
});
Thread.sleep(2000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we do without sleep (7s total)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to test if query already started introspecting query runner?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed most (just 1s left)

@losipiuk losipiuk marked this pull request as ready for review September 13, 2023 21:36
If query does not scan any tables or only accesses information_schema do
assume its tasks do not requrie any memory when allocating node for task
execution.

That way metadata only queries can execute even if cluster is fully
pre allocated.
@losipiuk losipiuk merged commit 216c25f into trinodb:master Sep 14, 2023
86 checks passed
@losipiuk losipiuk deleted the lo/fte-metadata-queries branch September 14, 2023 10:21
@github-actions github-actions bot added this to the 427 milestone Sep 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants