docs: msq autocompaction #16681

Merged
merged 28 commits into from
Oct 17, 2024

@317brian 317brian commented Jul 2, 2024

Working with @gargvishesh to create the docs for #16291

These docs cover setting the compaction engine at the datasource level. They'll be updated (in this PR or a follow-up, depending on timing) when the cluster-wide setting is available.

Release note

n/a. Will be in the dev PR.


Key changed/added classes in this PR
  • MyFoo
  • OurBar
  • TheirBaz

This PR has:

  • been self-reviewed.

@317brian 317brian changed the title from "docs: msq autocompaction docs" to "docs: msq autocompaction" Jul 2, 2024

@kfaraz kfaraz left a comment

Thanks for the changes, @317brian!
I have left some comments. Let me know if they make sense.

* Have the [MSQ task engine extension loaded](../multi-stage-query/index.md#load-the-extension).
* In your Overlord runtime properties, set the following properties:
  * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks can be run as a supervisor task
  * `druid.supervisor.compaction.engine` to `msq` to specify the MSQ task engine as the compaction engine

`druid.supervisor.compaction.engine` only sets the default engine, which applies when no engine is explicitly specified in the spec. If the property isn't set, the default is native. This point can be mentioned separately.

So it could be put as:
Either set `spec.engine` to `msq` in the supervisor spec, or omit `spec.engine` in the supervisor spec and set the `druid.supervisor.compaction.engine` runtime property on the Overlord to `msq`.
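
The fallback order described in this comment can be sketched as a small function. This is a hypothetical illustration in Python, not Druid's actual implementation: `resolve_engine` and its argument names are made up for the example, and only the precedence (the spec value first, then the Overlord runtime property, then `native`) comes from the comment.

```python
from typing import Optional


def resolve_engine(spec_engine: Optional[str],
                   overlord_default: Optional[str]) -> str:
    """Pick the compaction engine for one supervisor spec (hypothetical helper).

    spec_engine stands for spec.engine in the supervisor spec (None if omitted).
    overlord_default stands for the druid.supervisor.compaction.engine
    runtime property on the Overlord (None if unset).
    """
    if spec_engine is not None:
        # An explicit engine in the spec always wins.
        return spec_engine
    if overlord_default is not None:
        # Otherwise fall back to the default configured on the Overlord.
        return overlord_default
    # With neither set, compaction uses the native engine.
    return "native"
```

For example, `resolve_engine(None, "msq")` picks `msq` from the runtime property, while `resolve_engine("native", "msq")` keeps the engine named in the spec.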

"dataSource": "wikipedia", // required
"tuningConfig": {...}, // optional
"granularitySpec": {...}, // optional
...

We should also add the `engine` parameter here:

"granularitySpec": {...},
"engine": <native|msq>,            // optional
...
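
For illustration, a payload carrying the proposed field might be assembled as in the Python sketch below. The `dataSource` and `engine` names and the `native`/`msq` values come from the snippets in this thread. The outer `"type": "autocompact"` and `"spec"` wrapper are assumptions about the compaction supervisor spec format, which this thread does not show, and `build_compaction_supervisor_spec` is a hypothetical helper name.

```python
import json
from typing import Any, Dict, Optional

VALID_ENGINES = ("native", "msq")


def build_compaction_supervisor_spec(datasource: str,
                                     engine: Optional[str] = None) -> Dict[str, Any]:
    """Build a compaction supervisor payload as a plain dict (hypothetical helper)."""
    inner: Dict[str, Any] = {"dataSource": datasource}  # required
    if engine is not None:
        if engine not in VALID_ENGINES:
            raise ValueError(f"engine must be one of {VALID_ENGINES}, got {engine!r}")
        inner["engine"] = engine  # optional; omit it to use the default engine
    # Assumption: supervisor type and wrapper key are not confirmed by this thread.
    return {"type": "autocompact", "spec": inner}


if __name__ == "__main__":
    print(json.dumps(build_compaction_supervisor_spec("wikipedia", engine="msq"), indent=2))
```

Leaving `engine` out of the call omits the key entirely, which matches the "optional" annotation in the snippet.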

I understand that skipping it here and specifying it later in supervisor-based spec may be confusing. If we keep it here, just want to make sure that users realize that it's only supported with supervisors.

Also, we need to add this field to the Automatic compaction dynamic configuration page. Maybe this info can reside there, similar to the row below:

engine | Engine for compaction. Can be either native or msq. MSQ is only supported with compaction supervisors | no (default = native)

Shall we also include the above on the Automatic compaction dynamic configuration page?

@317brian 317brian marked this pull request as ready for review October 9, 2024 19:19
* Have the [MSQ task engine extension loaded](../multi-stage-query/index.md#load-the-extension).
* In your Overlord runtime properties, set the following properties:
  * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks can be run as a supervisor task
  * Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify the MSQ task engine as the default compaction engine. If you don't do this, you'll need to set `spec.engine` to `msq` for each compaction supervisor spec where you want to use the MSQ task engine.

Setting this also in Overlord?

Yes. To nudge people toward compaction supervisors, the MSQ engine is now supported only with supervisor-based compaction on the Overlord, not on the Coordinator. So the properties above are set on the Overlord.

@gargvishesh gargvishesh left a comment

A final set of suggestions. Rest looks good.

- Only dynamic and range-based partitioning are supported
- Set `rollup` to `true` if and only if `metricSpec` is not empty or null.
- You can only partition on string dimensions. However, multi-valued string dimensions are not supported.
- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use `maxRowsPerSegment` instead.
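
The limitations listed above lend themselves to a pre-flight check. The following Python sketch is hypothetical and not part of Druid: `find_msq_compaction_issues` is a made-up name, and the field locations (`tuningConfig.partitionsSpec.type`, `granularitySpec.rollup`, and a top-level `metricsSpec`, spelled `metricSpec` in the quoted line) are assumptions about the compaction config layout. The string-dimension rule is not checked, because that would need the datasource schema.

```python
from typing import Any, Dict, List

# "Only dynamic and range-based partitioning are supported"
SUPPORTED_PARTITION_TYPES = {"dynamic", "range"}


def find_msq_compaction_issues(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems for running this config with the MSQ engine."""
    issues: List[str] = []

    tuning = config.get("tuningConfig") or {}
    partitions = tuning.get("partitionsSpec") or {}
    ptype = partitions.get("type")

    if ptype is not None and ptype not in SUPPORTED_PARTITION_TYPES:
        issues.append(f"partitionsSpec type {ptype!r} is not supported")

    # maxTotalRows is not supported for dynamic partitioning.
    if ptype == "dynamic" and partitions.get("maxTotalRows") is not None:
        issues.append("maxTotalRows is not supported; use maxRowsPerSegment instead")

    # rollup must be true if and only if the metrics spec is non-empty.
    # A missing rollup value is treated as "not true" in this sketch.
    granularity = config.get("granularitySpec") or {}
    rollup_is_true = granularity.get("rollup") is True
    has_metrics = bool(config.get("metricsSpec"))
    if rollup_is_true != has_metrics:
        issues.append("rollup must be true if and only if metricsSpec is non-empty")

    return issues
```

An empty list means none of the checked limitations apply. Each returned string names one rule the config breaks.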

@gargvishesh gargvishesh left a comment

Thank you @317brian for persevering through this :)

@gargvishesh gargvishesh left a comment

lgtm

@vtlim vtlim left a comment

Just a few remaining nits but other than that LGTM! 🦖

@vtlim vtlim merged commit d1b81f3 into apache:master Oct 17, 2024
12 checks passed
@vtlim vtlim deleted the msq-autocompact-docs branch October 17, 2024 17:40
317brian added a commit to 317brian/druid that referenced this pull request Oct 17, 2024
Co-authored-by: Kashif Faraz
Co-authored-by: Vishesh Garg
Co-authored-by: Victoria Lim
(cherry picked from commit d1b81f3)
vtlim added a commit that referenced this pull request Oct 18, 2024
Co-authored-by: Kashif Faraz
Co-authored-by: Vishesh Garg
Co-authored-by: Victoria Lim
@kfaraz kfaraz added this to the 31.0.0 milestone Oct 18, 2024
5 participants