Dlm add auto rollover condition max age #94950

Merged

gmarouli
Contributor

@gmarouli gmarouli commented Mar 31, 2023

What are we trying to do:
In DLM, rollover will be configured cluster-wide via a setting. For most conditions we are confident that this is enough, but we have concerns about the value of max_age. We believe it may be better to choose a max_age that depends on the data retention the user has requested for a specific data stream: the shorter the retention, the more fine-grained the indices.

How are we planning to do it
For this reason, we are introducing the option to set max_age to auto.

The default rollover configuration will be:

cluster.dlm.default.rollover: max_primary_shard_size=50gb,max_age=auto,max_docs=200000000,min_docs=1

When max_age is auto, we’ll use the following retention-dependent heuristics to compute the value used for the rollover operation:

  • If retention is infinite (the default), max_age will be 30 days
  • If retention is configured to anything lower than 14 days, max_age will be 1 day
  • If retention is configured to anything lower than 3 months, max_age will be 7 days
  • If retention is configured to anything greater than 3 months, max_age will be 30 days
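
As an illustration, the heuristics above can be sketched in Java roughly as follows. This is a minimal sketch, not the code in this PR: the class and method names are hypothetical, java.time.Duration stands in for Elasticsearch's own time type, "3 months" is assumed to mean 90 days, and a retention of exactly 14 days or exactly 3 months (which the list does not cover) is assumed to fall into the next larger bucket.

```java
import java.time.Duration;

/** Hypothetical sketch of the retention-dependent max_age heuristic. */
final class AutoMaxAgeSketch {

    // Assumption: "3 months" is approximated as 90 days.
    static final Duration THREE_MONTHS = Duration.ofDays(90);
    static final Duration TWO_WEEKS = Duration.ofDays(14);

    /**
     * Resolves max_age=auto to a concrete value.
     *
     * @param retention the requested data retention, or null for infinite retention
     */
    static Duration resolveMaxAge(Duration retention) {
        if (retention == null) {
            // Infinite retention (the default).
            return Duration.ofDays(30);
        }
        if (retention.compareTo(TWO_WEEKS) < 0) {
            return Duration.ofDays(1);
        }
        if (retention.compareTo(THREE_MONTHS) < 0) {
            return Duration.ofDays(7);
        }
        // Assumption: exactly 3 months is treated like "greater than 3 months".
        return Duration.ofDays(30);
    }

    public static void main(String[] args) {
        // Print the resolved max_age (in days) for a few sample retentions.
        System.out.println(resolveMaxAge(null).toDays());
        System.out.println(resolveMaxAge(Duration.ofDays(7)).toDays());
        System.out.println(resolveMaxAge(Duration.ofDays(30)).toDays());
        System.out.println(resolveMaxAge(Duration.ofDays(365)).toDays());
    }
}
```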

Implementation Details
In order to design the proposed solution we made some assumptions:

  • The RolloverRequest should have its conditions fully resolved. This way the "automatic" logic can be determined by DLM, and possibly by ILM in the future.
  • For now, automatic logic will only be supported by DLM. When it is more mature, we will see whether ILM could benefit from it too. ILM is more complex because the data retention is not as easy to retrieve.
  • A completely automatic rollover is out of scope for now, but it is something we might consider later.

To facilitate the above, we introduce a wrapper around RolloverConditions: the RolloverConfiguration. This class has a method that accepts the data retention as an argument and resolves the configuration to RolloverConditions.
When there is no auto condition, the RolloverConfiguration contains only an instance of RolloverConditions, so it just returns that. If max_age is configured as auto, it is kept as an automatic property; when we request the RolloverConditions, the already calculated conditions are enriched with the computed max_age value and returned to the caller. Furthermore:

  • We move the parsing of the cluster setting to RolloverConfiguration.
  • We change the XContent serialization of the conditions to max_age: auto when the data retention is not available, and to the resolved value marked as automatic (for example, max_age: 7d [automatic]) when it is available.
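
To make the wrapper idea more concrete, here is a rough Java sketch of how such a class could behave. It is an illustration under stated assumptions, not the RolloverConfiguration from this PR: the class name, the parse and resolveRolloverConditions methods, and the use of a plain string map in place of RolloverConditions are all hypothetical, and the auto heuristic assumes "3 months" means 90 days with boundary values falling into the larger bucket.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a configuration wrapper that resolves max_age=auto on demand. */
final class RolloverConfigurationSketch {

    // Conditions with explicit values; a string map stands in for RolloverConditions.
    private final Map<String, String> concreteConditions;
    // True when the setting contained max_age=auto.
    private final boolean automaticMaxAge;

    private RolloverConfigurationSketch(Map<String, String> concreteConditions, boolean automaticMaxAge) {
        this.concreteConditions = Collections.unmodifiableMap(concreteConditions);
        this.automaticMaxAge = automaticMaxAge;
    }

    /** Parses a comma-separated "key=value" setting string. */
    static RolloverConfigurationSketch parse(String setting) {
        Map<String, String> concrete = new LinkedHashMap<>();
        boolean auto = false;
        for (String part : setting.split(",")) {
            String[] keyValue = part.trim().split("=", 2);
            if (keyValue.length != 2) {
                throw new IllegalArgumentException("Invalid condition: " + part);
            }
            if (keyValue[0].equals("max_age") && keyValue[1].equals("auto")) {
                auto = true; // keep as an automatic property, resolved later
            } else {
                concrete.put(keyValue[0], keyValue[1]);
            }
        }
        return new RolloverConfigurationSketch(concrete, auto);
    }

    /** Returns fully resolved conditions; retention is null for infinite retention. */
    Map<String, String> resolveRolloverConditions(Duration retention) {
        if (!automaticMaxAge) {
            return concreteConditions; // nothing to resolve
        }
        Map<String, String> resolved = new LinkedHashMap<>(concreteConditions);
        resolved.put("max_age", autoMaxAge(retention).toDays() + "d");
        return Collections.unmodifiableMap(resolved);
    }

    // Assumption: 90 days approximates 3 months; boundaries go to the larger bucket.
    private static Duration autoMaxAge(Duration retention) {
        if (retention == null) {
            return Duration.ofDays(30);
        }
        if (retention.compareTo(Duration.ofDays(14)) < 0) {
            return Duration.ofDays(1);
        }
        if (retention.compareTo(Duration.ofDays(90)) < 0) {
            return Duration.ofDays(7);
        }
        return Duration.ofDays(30);
    }

    public static void main(String[] args) {
        RolloverConfigurationSketch config =
            parse("max_primary_shard_size=50gb,max_age=auto,max_docs=200000000,min_docs=1");
        System.out.println(config.resolveRolloverConditions(Duration.ofDays(30)));
    }
}
```

Keeping the automatic flag separate from the concrete conditions means the same parsed configuration can be resolved for data streams with different retentions, while callers such as the rollover request only ever see concrete values.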

Part of: #93596

@gmarouli
Contributor Author

@elasticmachine run elasticsearch-ci/part-1

@gmarouli gmarouli mentioned this pull request Mar 31, 2023
19 tasks
@gmarouli
Contributor Author

gmarouli commented Apr 3, 2023

@elasticmachine update branch

@elasticsearchmachine
Collaborator

Hi @gmarouli, I've created a changelog YAML for you.

@gmarouli gmarouli marked this pull request as ready for review April 3, 2023 08:10
@elasticsearchmachine elasticsearchmachine added the Team:Data Management label Apr 3, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@gmarouli gmarouli requested review from andreidan and dakrone April 3, 2023 08:22
@gmarouli
Contributor Author

gmarouli commented Apr 3, 2023

@andreidan Apologies, I broke something, I will fix it.

Update: Fixed

@gmarouli
Contributor Author

gmarouli commented Apr 5, 2023

Merging with main caused some issues. I am resolving them locally and will update this branch ASAP.

@gmarouli
Contributor Author

gmarouli commented Apr 5, 2023

Updated new endpoints to use RolloverConfiguration instead of RolloverConditions directly.

Contributor

@andreidan andreidan left a comment

Thanks for working on this Mary. This looks great.

Left some minor comments

@gmarouli gmarouli requested a review from andreidan April 5, 2023 16:53
@gmarouli
Contributor Author

gmarouli commented Apr 5, 2023

@andreidan thank you for the great feedback! You caught some things I missed and also gave some tips that made things fall into place.

Contributor

@andreidan andreidan left a comment

Thanks for iterating on this Mary

I think this is nearly ready, left one last round of suggestions

@gmarouli gmarouli requested a review from andreidan April 6, 2023 12:53
Contributor

@andreidan andreidan left a comment

Thanks for iterating on this Mary

🚀

@gmarouli
Contributor Author

gmarouli commented Apr 6, 2023

Thanks for the review @andreidan !! 🚀 shipping!

@gmarouli
Contributor Author

gmarouli commented Apr 6, 2023

@elasticmachine update branch

@gmarouli
Contributor Author

gmarouli commented Apr 6, 2023

@elasticmachine update branch

@gmarouli
Contributor Author

gmarouli commented Apr 7, 2023

@elasticmachine update branch

@gmarouli
Contributor Author

@elasticmachine update branch

@gmarouli gmarouli merged commit e8d4211 into elastic:main Apr 10, 2023
@gmarouli gmarouli deleted the dlm-add-auto-rollover-condition-max-age branch December 10, 2024 07:25
Labels
>feature, Team:Data Management, v8.8.0
4 participants