Add a configurable bufferPeriod between when a segment is marked unused and deleted by KillUnusedSegments duty #12599
Merged
Changes from 4 commits
50 commits
c836aa8 Add new configurable buffer period to create gap between mark unused … (capistrant)
e18c6f1 Changes after testing (capistrant)
f974016 fixes and improvements (capistrant)
7b69b68 changes after initial self review (capistrant)
c78b282 Merge branch 'master' into implements-12526 (capistrant)
d8b4cc7 self review changes (capistrant)
c571162 update sql statement that was lacking last_used (capistrant)
95ccd4e shore up some code in SqlMetadataConnector after self review (capistrant)
5d38b4c Merge branch 'master' into implements-12526 (capistrant)
8a8e518 fix derby compatibility and improve testing/docs (capistrant)
7b52874 fix checkstyle violations (capistrant)
2fb019e Merge branch 'master' into implements-12526 (capistrant)
4e0efb0 Fixes post merge with master (capistrant)
0621295 add some unit tests to improve coverage (capistrant)
a1e9735 ignore test coverage on new UpdateTools cli tool (capistrant)
d8cb285 another attempt to ignore UpdateTables in coverage check (capistrant)
2fc7f19 Merge branch 'master' into implements-12526 (capistrant)
bab850e change column name to used_flag_last_updated (capistrant)
91828c3 fix a method signature after column name switch (capistrant)
f5b934f update docs spelling (capistrant)
539cc35 Merge branch 'master' into implements-12526 (capistrant)
acfbe06 Update spelling dictionary (capistrant)
51f4632 Merge branch 'master' into implements-12526 (capistrant)
df53363 Merge branch 'master' into implements-12526 (capistrant)
788dafc Merge branch 'master' into implements-12526 (capistrant)
d2f16a2 Fixing up docs/spelling and integrating altering tasks table with my … (capistrant)
de2922e Update NULL values for used_flag_last_updated in the background (capistrant)
537c598 Remove logic to allow segs with null used_flag_last_updated to be kil… (capistrant)
3d642f0 remove unneeded things now that the new column is automatically updated (capistrant)
cb9ef38 Test new background row updater method (capistrant)
60fcee5 fix broken tests (capistrant)
6ae30bd Merge branch 'master' into implements-12526 (capistrant)
82454d8 fix create table statement (capistrant)
720fa0d cleanup DDL formatting (capistrant)
8d36eb3 Merge branch 'master' into implements-12526 (capistrant)
ec59ed6 Merge branch 'master' into implements-12526 (capistrant)
edd9db8 Merge branch 'master' into implements-12526 (capistrant)
ddc5e8f Merge branch 'master' into implements-12526 (capistrant)
9fa11d3 Revert adding columns to entry table by default (capistrant)
58e0942 fix compilation issues after merge with master (capistrant)
666837e Merge branch 'master' into implements-12526 (capistrant)
4647bef Merge branch 'master' into implements-12526 (capistrant)
2e4c6ec discovered and fixed metastore inserts that were breaking integration… (capistrant)
225b8be fixup forgotten insert by using pattern of sharing now timestamp acro… (capistrant)
2899c11 Merge branch 'master' into implements-12526 (capistrant)
cd43ba5 Merge branch 'master' into implements-12526 (capistrant)
862f6e9 fix issue introduced by merge (capistrant)
3e0f21f Merge branch 'master' into implements-12526 (capistrant)
5940760 fixup after merge with master (capistrant)
c30817c add some directions to docs in the case of segment table validation i… (capistrant)
@@ -0,0 +1,114 @@
---
id: upgrade-prep
title: "Upgrade Prep"
---

<!--
  ~ Licensed to the Apache Software Foundation (ASF) under one
  ~ or more contributor license agreements.  See the NOTICE file
  ~ distributed with this work for additional information
  ~ regarding copyright ownership.  The ASF licenses this file
  ~ to you under the Apache License, Version 2.0 (the
  ~ "License"); you may not use this file except in compliance
  ~ with the License.  You may obtain a copy of the License at
  ~
  ~   http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing,
  ~ software distributed under the License is distributed on an
  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  ~ KIND, either express or implied.  See the License for the
  ~ specific language governing permissions and limitations
  ~ under the License.
  -->

## Upgrade to `0.24+` from `0.23` and earlier

### Altering segments table

**If you have set `druid.metadata.storage.connector.createTables` to `true` (which is the default), and your metadata connect user has DDL privileges, you can disregard this section. You are still urged to evaluate the optional section below.**

**The Coordinator and Overlord services will fail if you do not execute this change prior to the upgrade.**

A new column, `last_used`, is needed in the segments table to support new
segment killing functionality. You can manually alter the table, or you can use
a pre-written tool to perform the update.

#### Pre-written tool

Druid provides a `metadata-update` tool for updating Druid's metadata tables.

In the example commands below:

- `lib` is the Druid lib directory
- `extensions` is the Druid extensions directory
- `base` corresponds to the value of `druid.metadata.storage.tables.base` in the configuration, `druid` by default.
- The `--connectURI` parameter corresponds to the value of `druid.metadata.storage.connector.connectURI`.
- The `--user` parameter corresponds to the value of `druid.metadata.storage.connector.user`.
- The `--password` parameter corresponds to the value of `druid.metadata.storage.connector.password`.
- The `--action` parameter corresponds to the update action you are executing. In this case it is: `add-last-used-to-segments`

##### MySQL

```bash
cd ${DRUID_ROOT}
java -classpath "lib/*" -Dlog4j.configurationFile=conf/druid/cluster/_common/log4j2.xml -Ddruid.extensions.directory="extensions" -Ddruid.extensions.loadList=[\"mysql-metadata-storage\"] -Ddruid.metadata.storage.type=mysql org.apache.druid.cli.Main tools metadata-update --connectURI="<mysql-uri>" --user <user> --password <pass> --base druid --action add-last-used-to-segments
```

##### PostgreSQL

```bash
cd ${DRUID_ROOT}
java -classpath "lib/*" -Dlog4j.configurationFile=conf/druid/cluster/_common/log4j2.xml -Ddruid.extensions.directory="extensions" -Ddruid.extensions.loadList=[\"postgresql-metadata-storage\"] -Ddruid.metadata.storage.type=postgresql org.apache.druid.cli.Main tools metadata-update --connectURI="<postgresql-uri>" --user <user> --password <pass> --base druid --action add-last-used-to-segments
```

#### Manual ALTER TABLE

```SQL
ALTER TABLE druid_segments
ADD last_used varchar(255);
```

### Populating `last_used` column of the segments table after upgrade (Optional)

This is an optional step to take **after** you upgrade the Overlord and Coordinator to `0.24+` (from `0.23` and earlier). If you skip this step and are also using `druid.coordinator.kill.on=true`, the logic that identifies segments eligible to be killed will not honor `druid.coordinator.kill.bufferPeriod` for rows in the segments table where `last_used` is `NULL`.

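The effect of a `NULL` `last_used` value on the buffer check can be sketched as follows. This is a minimal illustration under stated assumptions, not the Coordinator's actual implementation: the class `KillEligibilitySketch` and the method `isKillable` are hypothetical names, the buffer period is assumed to be expressed as an ISO 8601 duration, and treating a `NULL` timestamp as immediately eligible is inferred from the paragraph above.

```java
import java.time.Duration;
import java.time.Instant;

public class KillEligibilitySketch {

    // Hypothetical helper: decides whether an unused segment row may be killed.
    // lastUsed is the ISO 8601 string stored in the last_used column, or null
    // for rows written before the upgrade.
    static boolean isKillable(boolean used, String lastUsed, Duration bufferPeriod, Instant now) {
        if (used) {
            return false; // segments still in use are never killed
        }
        if (lastUsed == null) {
            return true; // assumption: no timestamp means the buffer cannot be enforced
        }
        Instant markedUnusedAt = Instant.parse(lastUsed);
        // Eligible only once the full buffer period has elapsed.
        return !markedUnusedAt.plus(bufferPeriod).isAfter(now);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2022-01-02T12:00:00.000Z");
        Duration buffer = Duration.parse("PT24H");

        // Row without a timestamp: the buffer is not honored.
        System.out.println(isKillable(false, null, buffer, now));
        // Marked unused 12 hours ago: still inside the buffer.
        System.out.println(isKillable(false, "2022-01-02T00:00:00.000Z", buffer, now));
        // Marked unused 36 hours ago: the buffer has elapsed.
        System.out.println(isKillable(false, "2022-01-01T00:00:00.000Z", buffer, now));
    }
}
```

Populating `last_used` for existing unused rows moves them from the second branch to the third, so the buffer period applies to them as well.
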
#### Pre-written tool

Druid provides a `metadata-update` tool for updating Druid's metadata tables. Note that this tool will update `last_used` for all rows that match `used = false` in one transaction.

In the example commands below:

- `lib` is the Druid lib directory
- `extensions` is the Druid extensions directory
- `base` corresponds to the value of `druid.metadata.storage.tables.base` in the configuration, `druid` by default.
- The `--connectURI` parameter corresponds to the value of `druid.metadata.storage.connector.connectURI`.
- The `--user` parameter corresponds to the value of `druid.metadata.storage.connector.user`.
- The `--password` parameter corresponds to the value of `druid.metadata.storage.connector.password`.
- The `--action` parameter corresponds to the update action you are executing. In this case it is: `populate-last-used-column-in-segments`

##### MySQL

```bash
cd ${DRUID_ROOT}
java -classpath "lib/*" -Dlog4j.configurationFile=conf/druid/cluster/_common/log4j2.xml -Ddruid.extensions.directory="extensions" -Ddruid.extensions.loadList=[\"mysql-metadata-storage\"] -Ddruid.metadata.storage.type=mysql org.apache.druid.cli.Main tools metadata-update --connectURI="<mysql-uri>" --user <user> --password <pass> --base druid --action populate-last-used-column-in-segments
```

##### PostgreSQL

```bash
cd ${DRUID_ROOT}
java -classpath "lib/*" -Dlog4j.configurationFile=conf/druid/cluster/_common/log4j2.xml -Ddruid.extensions.directory="extensions" -Ddruid.extensions.loadList=[\"postgresql-metadata-storage\"] -Ddruid.metadata.storage.type=postgresql org.apache.druid.cli.Main tools metadata-update --connectURI="<postgresql-uri>" --user <user> --password <pass> --base druid --action populate-last-used-column-in-segments
```

#### Manual UPDATE

The date string in this example is arbitrary. We recommend using the current UTC time when you invoke the command. If many rows match the condition `used = false`, you may want to update the table incrementally using a limit clause.

```SQL
UPDATE druid_segments
SET last_used = '2022-01-01T00:00:00.000Z'
WHERE used = false;
```
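
One way to produce the current UTC time in the same string format as the example value is sketched below. The class `LastUsedUpdateSketch` and its methods are hypothetical and not part of Druid; the sketch only builds the statement text and assumes the table name comes from trusted configuration rather than user input.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class LastUsedUpdateSketch {

    // Matches the shape of the example value '2022-01-01T00:00:00.000Z'.
    private static final DateTimeFormatter ISO_MILLIS_UTC =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'").withZone(ZoneOffset.UTC);

    // Hypothetical helper: formats an instant as a UTC timestamp string.
    static String formatUtc(Instant instant) {
        return ISO_MILLIS_UTC.format(instant);
    }

    // Hypothetical helper: builds the manual UPDATE for a given segments table.
    static String buildUpdate(String segmentsTable, Instant now) {
        return "UPDATE " + segmentsTable
            + " SET last_used = '" + formatUtc(now) + "'"
            + " WHERE used = false;";
    }

    public static void main(String[] args) {
        // Prints the statement with the current UTC time filled in.
        System.out.println(buildUpdate("druid_segments", Instant.now()));
    }
}
```

The printed statement can then be run with the metadata store's own SQL client.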
@@ -55,8 +55,8 @@ public SQLMetadataSegmentPublisher(
     this.config = config;
     this.connector = connector;
     this.statement = StringUtils.format(
-        "INSERT INTO %1$s (id, dataSource, created_date, start, %2$send%2$s, partitioned, version, used, payload) "
-        + "VALUES (:id, :dataSource, :created_date, :start, :end, :partitioned, :version, :used, :payload)",
+        "INSERT INTO %1$s (id, dataSource, created_date, start, %2$send%2$s, partitioned, version, used, payload, last_used) "
+        + "VALUES (:id, :dataSource, :created_date, :start, :end, :partitioned, :version, :used, :payload, :last_used)",
         config.getSegmentsTable(), connector.getQuoteString()
     );
   }

@@ -73,7 +73,8 @@ public void publishSegment(final DataSegment segment) throws IOException
         (segment.getShardSpec() instanceof NoneShardSpec) ? false : true,
         segment.getVersion(),
         true,
-        jsonMapper.writeValueAsBytes(segment)
+        jsonMapper.writeValueAsBytes(segment),
+        DateTimes.nowUtc().toString()
Review comment: Would be tidier to call
     );
   }

@@ -87,7 +88,8 @@ void publishSegment(
       final boolean partitioned,
       final String version,
       final boolean used,
-      final byte[] payload
+      final byte[] payload,
+      final String lastUsed
   )
   {
     try {

@@ -128,6 +130,7 @@ public Void withHandle(Handle handle)
                     .bind("version", version)
                     .bind("used", used)
                     .bind("payload", payload)
+                    .bind("last_used", lastUsed)
                     .execute();

               return null;

24h seems aggressive as a default. Personally I'd be more comfortable with 30 days, if it's a cluster I'm operating.
We should describe that if you updated from an earlier version, this won't apply to all segments. We can also link from this to the doc about how to do the manual update. Maybe like: