Background
Observations might be gathered from multiple sources, each with its own duration and its own start and stop times.
In order to perform aggregation up a tree to the grouping nodes, all the observations need to cover the same time buckets and durations, so that aggregation can happen across synchronous slices of time.
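To make the alignment requirement concrete, here is a minimal sketch of aggregating child series slice by slice. The `Point` type and the `aggregateChildren` function are hypothetical names chosen for illustration; this is not IF's actual aggregation code.

```typescript
// Hypothetical shape for one entry in a time series (not IF's real type).
type Point = { timestamp: string; duration: number; value: number };

// Sum child series slice by slice. This only works when every child
// covers exactly the same time buckets, which is what time syncing provides.
function aggregateChildren(children: Point[][]): Point[] {
  if (children.length === 0) return [];
  const [first, ...rest] = children;

  // Series of different lengths cannot share the same grid.
  if (rest.some(series => series.length !== first.length)) {
    throw new Error('Series have different lengths and cannot be aggregated');
  }

  return first.map((point, i) => {
    let total = point.value;
    for (const series of rest) {
      const other = series[i];
      // Each slice must start at the same time and last equally long.
      if (other.timestamp !== point.timestamp || other.duration !== point.duration) {
        throw new Error(`Series are not time-aligned at index ${i}`);
      }
      total += other.value;
    }
    return { timestamp: point.timestamp, duration: point.duration, value: total };
  });
}
```

With unsynchronized inputs the function throws rather than silently adding values from different slices of time.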
To support that, we created a builtin plugin called TimeSync, which snaps observations onto a global grid, e.g. every 1hr during a day.
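As a rough illustration of what snapping onto a global grid means, the sketch below spreads each observation evenly over fixed-width buckets. It is a simplified stand-in, not the TimeSync implementation: the `Observation` shape, the `snapToGrid` function and its parameters are all assumptions, and it assumes the value is a quantity that can be split proportionally over time and that the grid length is a whole multiple of the interval.

```typescript
// Hypothetical observation shape (illustrative only, not IF's real type).
interface Observation {
  timestamp: string; // ISO 8601 start time
  duration: number;  // seconds
  value: number;     // a quantity that can be split evenly over time
}

// Spread each observation over the buckets of a global grid, in proportion
// to how much of the observation overlaps each bucket.
function snapToGrid(
  observations: Observation[],
  gridStart: string,
  gridEnd: string,
  interval: number // bucket width in seconds
): Observation[] {
  const startMs = Date.parse(gridStart);
  const endMs = Date.parse(gridEnd);
  const stepMs = interval * 1000;
  const bucketCount = Math.ceil((endMs - startMs) / stepMs);
  const totals = new Array<number>(bucketCount).fill(0);

  for (const obs of observations) {
    if (obs.duration <= 0) continue; // nothing to distribute
    const obsStart = Date.parse(obs.timestamp);
    const obsEnd = obsStart + obs.duration * 1000;

    for (let i = 0; i < bucketCount; i++) {
      const bucketStart = startMs + i * stepMs;
      const bucketEnd = Math.min(bucketStart + stepMs, endMs);
      const overlapMs = Math.min(obsEnd, bucketEnd) - Math.max(obsStart, bucketStart);
      if (overlapMs > 0) {
        totals[i] += obs.value * (overlapMs / (obs.duration * 1000));
      }
    }
  }

  // Emit one observation per grid bucket, including empty buckets.
  return totals.map((value, i) => ({
    timestamp: new Date(startMs + i * stepMs).toISOString(),
    duration: interval,
    value,
  }));
}
```

For example, an hour-long observation starting at 00:30 on an hourly grid ends up split equally between the 00:00 and 01:00 buckets.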
It was built as a plugin rather than a framework feature because we were not sure where in the computation we could always enforce time syncing. Some plugins import data from other places and generate observations, so time syncing has to run after them. Other plugins, such as the WattTime plugin, depend on the time, so time syncing has to run before them. This means the user has to know in advance where in the pipeline to execute TimeSync. The correct position is not always obvious, and mistakes in positioning can lead to IF failures. This adds fragility and makes for a less than optimal developer experience.
Problem statement
Figuring out exactly where to insert the time-sync plugin in your pipeline is a friction point in IF development. It is not always obvious where TimeSync should be positioned, but the position has to be correct so that observations are synced before any aggregation and before any plugin that relies on regular, corrected timing.
Aggregation is a builtin feature configured at the top of the manifest file, but to use it you might need to ensure each of your pipelines has a time-sync plugin at the right step. This creates a very strong dependency between a framework feature (aggregation) and a pipeline plugin, which can be awkward to reason about.
Proposed solution
First, we need to have shipped the tasks in the idempotence epic. This breaks IF execution into three distinct phases: observe, group and compute.
With phased execution, TimeSync has a clear, fixed position in the execution flow. It should happen immediately after group and immediately before compute.
Once IF has phased execution, this will always be the right moment to synchronize time. This is because group should always yield individual time series with unique, non-repeated timestamps that can be handled by TimeSync, and compute will always operate over synchronized time series.
This means TimeSync can be an IF feature rather than a plugin. We still need some config from the manifest, which can be provided at the top level in the manifest's context.
If this config is present, TimeSync should be executed automatically between the group and compute stages of execution.
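The proposed ordering could be sketched roughly as follows. Every name here (`Manifest`, `timeSync`, `Phases`, `execute`) is hypothetical and only illustrates the idea of running time syncing automatically between group and compute when config is present. The phased execution itself depends on the idempotence epic and its real API may look quite different.

```typescript
// All types below are illustrative assumptions, not IF's real interfaces.
type Series = { timestamp: string; duration: number; value: number }[];
type Tree = Record<string, Series>;

interface TimeSyncConfig {
  startTime: string;
  endTime: string;
  interval: number; // seconds
}

interface Manifest {
  timeSync?: TimeSyncConfig; // assumed name for the top-level config
}

interface Phases {
  observe: () => Series;
  group: (observations: Series) => Tree;
  timeSync: (series: Series, config: TimeSyncConfig) => Series;
  compute: (tree: Tree) => Tree;
}

function execute(manifest: Manifest, phases: Phases): Tree {
  const observations = phases.observe();
  let tree = phases.group(observations);

  // Time syncing runs only when config is present, and always at the same
  // point: after group has produced per-node series and before compute.
  const config = manifest.timeSync;
  if (config !== undefined) {
    const synced: Tree = {};
    for (const name of Object.keys(tree)) {
      synced[name] = phases.timeSync(tree[name], config);
    }
    tree = synced;
  }

  return phases.compute(tree);
}
```

Because the position is fixed by the framework, the user only supplies config and never has to decide where in a pipeline to place a time-sync step.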
Related discussion
#771
Tasks
Note: all the tasks in the idempotence epic are prerequisites for the following tasks:
TimeSync a builtin feature #823
TimeSync after group #824
TimeSync #825