Commit: Address more comments

leventov committed Jan 23, 2020
1 parent 10587c1 commit 2ad6dd5
Showing 3 changed files with 37 additions and 38 deletions.
20 changes: 10 additions & 10 deletions docs/operations/rule-configuration.md
@@ -26,9 +26,9 @@ title: "Retaining or automatically dropping data"
In Apache Druid, Coordinator processes use rules to determine what data should be loaded to or dropped from the cluster. Rules are used for data retention and query execution, and are set on the Coordinator console (http://coordinator_ip:port).

There are three types of rules: load rules, drop rules, and broadcast rules. Load rules indicate how segments should be assigned to different historical process tiers and how many replicas of a segment should exist in each tier.
Drop rules indicate when segments should be dropped entirely from the cluster. Finally, broadcast rules indicate how segments of different data sources should be co-located in Historical processes.
Drop rules indicate when segments should be dropped entirely from the cluster. Finally, broadcast rules indicate how segments of different datasources should be co-located in Historical processes.

The Coordinator loads a set of rules from the metadata storage. Rules may be specific to a certain data source and/or a
The Coordinator loads a set of rules from the metadata storage. Rules may be specific to a certain datasource and/or a
default set of rules can be configured. Rules are read in order and hence the ordering of rules is important. The
Coordinator will cycle through all used segments and match each segment with the first rule that applies. Each segment
may only match a single rule.
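
For example, a common retention configuration pairs a load rule with a catch-all drop rule. A minimal sketch of such an ordered rule set is below; the tier name, replica count, and period are placeholders, and `loadByPeriod`/`dropForever` are the standard load and drop rule types described elsewhere on this page:

```json
[
  {
    "type": "loadByPeriod",
    "period": "P1M",
    "includeFuture": true,
    "tieredReplicants": { "_default_tier": 2 }
  },
  { "type": "dropForever" }
]
```

Because rules are evaluated in order and each segment matches only the first applicable rule, a segment from the last month is loaded with two replicas in `_default_tier`, while older segments fall through to `dropForever` and are dropped.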
@@ -170,8 +170,8 @@ The interval of a segment will be compared against the specified period. The per

## Broadcast Rules

Broadcast rules indicate how segments of different data sources should be co-located in Historical processes.
Once a broadcast rule is configured for a data source, all segments of the data source are broadcasted to the servers holding _any segments_ of the co-located data sources.
Broadcast rules indicate how segments of different datasources should be co-located in Historical processes.
Once a broadcast rule is configured for a datasource, all segments of the datasource are broadcasted to the servers holding _any segments_ of the co-located datasources.

### Forever Broadcast Rule

Expand All @@ -185,7 +185,7 @@ Forever broadcast rules are of the form:
```

* `type` - this should always be "broadcastForever"
* `colocatedDataSources` - A JSON List containing data source names to be co-located. `null` and empty list means broadcasting to every process in the cluster.
* `colocatedDataSources` - A JSON List containing datasource names to be co-located. `null` and empty list means broadcasting to every process in the cluster.
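
Putting the fields above together, a minimal `broadcastForever` rule might look like the following sketch; the datasource names are placeholders:

```json
{
  "type": "broadcastForever",
  "colocatedDataSources": ["target_source1", "target_source2"]
}
```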

### Interval Broadcast Rule

Expand All @@ -200,7 +200,7 @@ Interval broadcast rules are of the form:
```

* `type` - this should always be "broadcastByInterval"
* `colocatedDataSources` - A JSON List containing data source names to be co-located. `null` and empty list means broadcasting to every process in the cluster.
* `colocatedDataSources` - A JSON List containing datasource names to be co-located. `null` and empty list means broadcasting to every process in the cluster.
* `interval` - A JSON Object representing ISO-8601 Periods. Only the segments of the interval will be broadcasted.
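
For illustration, a `broadcastByInterval` rule combining these fields could look like this sketch; the datasource names and the interval value are placeholders:

```json
{
  "type": "broadcastByInterval",
  "colocatedDataSources": ["target_source1", "target_source2"],
  "interval": "2012-01-01/2013-01-01"
}
```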

### Period Broadcast Rule
Expand All @@ -217,14 +217,14 @@ Period broadcast rules are of the form:
```

* `type` - this should always be "broadcastByPeriod"
* `colocatedDataSources` - A JSON List containing data source names to be co-located. `null` and empty list means broadcasting to every process in the cluster.
* `colocatedDataSources` - A JSON List containing datasource names to be co-located. `null` and empty list means broadcasting to every process in the cluster.
* `period` - A JSON Object representing ISO-8601 Periods
* `includeFuture` - A JSON Boolean indicating whether the load period should include the future. This property is optional; the default is true.

The interval of a segment will be compared against the specified period. The period extends from some time in the past either to the future or to the current time, depending on whether `includeFuture` is true or false. The rule matches if the period *overlaps* the interval.
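
As a sketch, a `broadcastByPeriod` rule with these fields might look like the following; the values are placeholders:

```json
{
  "type": "broadcastByPeriod",
  "colocatedDataSources": ["target_source1", "target_source2"],
  "period": "P1M",
  "includeFuture": true
}
```

With this rule, segments whose intervals overlap the window from one month ago through the future are broadcast to the servers holding any segments of the listed datasources.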

> broadcast rules don't guarantee that segments of the data sources are always co-located because segments for the colocated data sources are not loaded together atomically.
> If you want to always co-locate the segments of some data sources together, it is recommended to leave colocatedDataSources empty.
> broadcast rules don't guarantee that segments of the datasources are always co-located because segments for the colocated datasources are not loaded together atomically.
> If you want to always co-locate the segments of some datasources together, it is recommended to leave colocatedDataSources empty.
## Permanently deleting data

Expand All @@ -236,5 +236,5 @@ submit a [kill task](../ingestion/tasks.md) to the [Overlord](../design/overlord

Data that has been dropped from a Druid cluster cannot be reloaded using only rules. To reload dropped data in Druid,
you must first set your retention period (e.g. changing the retention period from 1 month to 2 months), and then mark as
used all segments belonging to the data source in the Druid Coordinator console, or through the Druid Coordinator
used all segments belonging to the datasource in the Druid Coordinator console, or through the Druid Coordinator
endpoints.
@@ -193,7 +193,6 @@ SegmentIdWithShardSpec allocatePendingSegment(

/**
* Delete all pending segments belonging to the given data source from the pending segments table.
* The {@code created_date} field of the pending segments table is checked to find segments to be deleted.
*
* @return number of deleted pending segments
* @see #deletePendingSegmentsCreatedInInterval(String, Interval) similar to this method but also accepts interval for
54 changes: 27 additions & 27 deletions web-console/src/views/datasource-view/datasource-view.tsx
@@ -152,10 +152,10 @@ export interface DatasourcesViewState {
showUnused: boolean;
retentionDialogOpenOn?: RetentionDialogOpenOn;
compactionDialogOpenOn?: CompactionDialogOpenOn;
dataSourceToMarkAsUnusedAllSegmentsIn?: string;
dataSourceToMarkAllNonOvershadowedSegmentsAsUsedIn?: string;
datasourceToMarkAsUnusedAllSegmentsIn?: string;
datasourceToMarkAllNonOvershadowedSegmentsAsUsedIn?: string;
killDatasource?: string;
dataSourceToMarkSegmentsByIntervalIn?: string;
datasourceToMarkSegmentsByIntervalIn?: string;
useUnuseAction: 'use' | 'unuse';
useUnuseInterval: string;
hiddenColumns: LocalStorageBackedArray<string>;
@@ -349,68 +349,68 @@ GROUP BY 1`;
}

renderUnuseAction() {
const { dataSourceToMarkAsUnusedAllSegmentsIn } = this.state;
if (!dataSourceToMarkAsUnusedAllSegmentsIn) return;
const { datasourceToMarkAsUnusedAllSegmentsIn } = this.state;
if (!datasourceToMarkAsUnusedAllSegmentsIn) return;

return (
<AsyncActionDialog
action={async () => {
const resp = await axios.delete(
`/druid/coordinator/v1/datasources/${dataSourceToMarkAsUnusedAllSegmentsIn}`,
`/druid/coordinator/v1/datasources/${datasourceToMarkAsUnusedAllSegmentsIn}`,
{},
);
return resp.data;
}}
confirmButtonText="Mark as unused all segments in data source"
confirmButtonText="Mark as unused all segments"
successText="All segments in data source have been marked as unused"
failText="Failed to mark as unused all segments in data source"
intent={Intent.DANGER}
onClose={() => {
this.setState({ dataSourceToMarkAsUnusedAllSegmentsIn: undefined });
this.setState({ datasourceToMarkAsUnusedAllSegmentsIn: undefined });
}}
onSuccess={() => {
this.datasourceQueryManager.rerunLastQuery();
}}
>
<p>
{`Are you sure you want to mark as unused all segments in '${dataSourceToMarkAsUnusedAllSegmentsIn}'?`}
{`Are you sure you want to mark as unused all segments in '${datasourceToMarkAsUnusedAllSegmentsIn}'?`}
</p>
</AsyncActionDialog>
);
}

renderUseAction() {
const { dataSourceToMarkAllNonOvershadowedSegmentsAsUsedIn } = this.state;
if (!dataSourceToMarkAllNonOvershadowedSegmentsAsUsedIn) return;
const { datasourceToMarkAllNonOvershadowedSegmentsAsUsedIn } = this.state;
if (!datasourceToMarkAllNonOvershadowedSegmentsAsUsedIn) return;

return (
<AsyncActionDialog
action={async () => {
const resp = await axios.post(
`/druid/coordinator/v1/datasources/${dataSourceToMarkAllNonOvershadowedSegmentsAsUsedIn}`,
`/druid/coordinator/v1/datasources/${datasourceToMarkAllNonOvershadowedSegmentsAsUsedIn}`,
{},
);
return resp.data;
}}
confirmButtonText="Mark as used all non-overshadowed segments in data source"
confirmButtonText="Mark as used all segments"
successText="All non-overshadowed segments in data source have been marked as used"
failText="Failed to mark as used all non-overshadowed segments in data source"
intent={Intent.PRIMARY}
onClose={() => {
this.setState({ dataSourceToMarkAllNonOvershadowedSegmentsAsUsedIn: undefined });
this.setState({ datasourceToMarkAllNonOvershadowedSegmentsAsUsedIn: undefined });
}}
onSuccess={() => {
this.datasourceQueryManager.rerunLastQuery();
}}
>
<p>{`Are you sure you want to mark as used all non-overshadowed segments in '${dataSourceToMarkAllNonOvershadowedSegmentsAsUsedIn}'?`}</p>
<p>{`Are you sure you want to mark as used all non-overshadowed segments in '${datasourceToMarkAllNonOvershadowedSegmentsAsUsedIn}'?`}</p>
</AsyncActionDialog>
);
}

renderUseUnuseActionByInterval() {
const { dataSourceToMarkSegmentsByIntervalIn, useUnuseAction, useUnuseInterval } = this.state;
if (!dataSourceToMarkSegmentsByIntervalIn) return;
const { datasourceToMarkSegmentsByIntervalIn, useUnuseAction, useUnuseInterval } = this.state;
if (!datasourceToMarkSegmentsByIntervalIn) return;
const isUse = useUnuseAction === 'use';
const usedWord = isUse ? 'used' : 'unused';
return (
Expand All @@ -419,26 +419,26 @@ GROUP BY 1`;
if (!useUnuseInterval) return;
const param = isUse ? 'markUsed' : 'markUnused';
const resp = await axios.post(
`/druid/coordinator/v1/datasources/${dataSourceToMarkSegmentsByIntervalIn}/${param}`,
`/druid/coordinator/v1/datasources/${datasourceToMarkSegmentsByIntervalIn}/${param}`,
{
interval: useUnuseInterval,
},
);
return resp.data;
}}
confirmButtonText={`Mark as ${usedWord} segments in the interval in data source`}
confirmButtonText={`Mark as ${usedWord} segments in the interval`}
confirmButtonDisabled={!/.\/./.test(useUnuseInterval)}
successText={`Segments in the interval in data source have been marked as ${usedWord}`}
failText={`Failed to mark as ${usedWord} segments in the interval in data source`}
intent={Intent.PRIMARY}
onClose={() => {
this.setState({ dataSourceToMarkSegmentsByIntervalIn: undefined });
this.setState({ datasourceToMarkSegmentsByIntervalIn: undefined });
}}
onSuccess={() => {
this.datasourceQueryManager.rerunLastQuery();
}}
>
<p>{`Please select the interval in which you want to mark segments as ${usedWord}?`}</p>
<p>{`Please select the interval in which you want to mark segments as ${usedWord} in '${datasourceToMarkSegmentsByIntervalIn}'?`}</p>
<FormGroup>
<InputGroup
value={useUnuseInterval}
@@ -466,7 +466,7 @@ GROUP BY 1`;
);
return resp.data;
}}
confirmButtonText="Permanently delete unused segments in data source"
confirmButtonText="Permanently delete unused segments"
successText="Kill task was issued. Unused segments in data source will be deleted"
failText="Failed to submit kill task"
intent={Intent.DANGER}
@@ -619,7 +619,7 @@ GROUP BY 1`;

onAction: () =>
this.setState({
dataSourceToMarkAllNonOvershadowedSegmentsAsUsedIn: datasource,
datasourceToMarkAllNonOvershadowedSegmentsAsUsedIn: datasource,
}),
},
{
@@ -648,7 +648,7 @@ GROUP BY 1`;
title: 'Mark as used all segments (will lead to reapplying retention rules)',
onAction: () =>
this.setState({
dataSourceToMarkAllNonOvershadowedSegmentsAsUsedIn: datasource,
datasourceToMarkAllNonOvershadowedSegmentsAsUsedIn: datasource,
}),
},
{
Expand All @@ -669,7 +669,7 @@ GROUP BY 1`;

onAction: () =>
this.setState({
dataSourceToMarkSegmentsByIntervalIn: datasource,
datasourceToMarkSegmentsByIntervalIn: datasource,
useUnuseAction: 'use',
}),
},
Expand All @@ -679,15 +679,15 @@ GROUP BY 1`;

onAction: () =>
this.setState({
dataSourceToMarkSegmentsByIntervalIn: datasource,
datasourceToMarkSegmentsByIntervalIn: datasource,
useUnuseAction: 'unuse',
}),
},
{
icon: IconNames.IMPORT,
title: 'Mark as unused all segments',
intent: Intent.DANGER,
onAction: () => this.setState({ dataSourceToMarkAsUnusedAllSegmentsIn: datasource }),
onAction: () => this.setState({ datasourceToMarkAsUnusedAllSegmentsIn: datasource }),
},
{
icon: IconNames.TRASH,

0 comments on commit 2ad6dd5
