New epic: Global Time-sync #771
jmcook1186 announced in Epics
Here's another upcoming change - I'm signalling it in advance here so you all have time to comment and give feedback before it gets worked up as a ticket in one of our development sprints. Take a read through and let us know your thoughts!
Background
Observations might be gathered from multiple sources, each with different durations and different start and stop times.
To aggregate up a tree to the grouping nodes, all the observations need to cover the same time buckets and durations, so that aggregation happens across synchronous slices of time.
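To make "synchronous slices of time" concrete, here is a minimal sketch of snapping irregular observations onto a fixed grid. It is not IF's implementation: the function name `snap_to_grid`, the `energy` field, and the assumption that each value is spread evenly over its observation's duration are all illustrative choices.

```python
from datetime import datetime, timedelta, timezone


def snap_to_grid(observations, grid_start, grid_end, interval_s):
    """Redistribute observation values onto fixed-size time buckets.

    Assumption (hypothetical, for illustration): each observation's
    `energy` is spread evenly across its `duration` in seconds.
    """
    n_buckets = int((grid_end - grid_start).total_seconds() // interval_s)
    buckets = [0.0] * n_buckets
    for obs in observations:
        obs_start = (obs["timestamp"] - grid_start).total_seconds()
        obs_end = obs_start + obs["duration"]
        for i in range(n_buckets):
            b_start = i * interval_s
            b_end = b_start + interval_s
            # Seconds of this observation that fall inside bucket i.
            overlap = min(obs_end, b_end) - max(obs_start, b_start)
            if overlap > 0:
                buckets[i] += obs["energy"] * overlap / obs["duration"]
    return [
        {
            "timestamp": grid_start + timedelta(seconds=i * interval_s),
            "duration": interval_s,
            "energy": buckets[i],
        }
        for i in range(n_buckets)
    ]


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
observations = [
    # Straddles the 01:00 boundary: half lands in each hourly bucket.
    {"timestamp": start + timedelta(minutes=30), "duration": 3600, "energy": 10.0},
    # Sits entirely inside the first hourly bucket.
    {"timestamp": start, "duration": 1800, "energy": 4.0},
]
synced = snap_to_grid(observations, start, start + timedelta(hours=2), 3600)
# Energy per hourly bucket: [9.0, 5.0]
```

After this step every series shares the same timestamps and durations, which is the property aggregation relies on.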
To support that, we created a builtin plugin called TimeSync. TimeSync snaps observations onto a global grid, e.g. every 1 hr during a day.
The reason it was built as a plugin rather than a framework feature was that we were not sure where in the computation we could always enforce time syncing. Some plugins import data from other places and generate observations, so the time sync needs to run after those plugins. Other plugins, such as the WattTime plugin, depend on the time, so the time sync needs to run before them. This means the user has to know in advance where in the pipeline to execute TimeSync. The correct position is not always obvious, and mistakes in positioning can lead to IF failures. This adds fragility and makes for a less than optimal developer experience.
Problem statement
Figuring out exactly where to insert the time-sync plugin in your pipeline is a friction point in IF development. It is not always obvious where TimeSync should be positioned in the pipeline, but it has to be correct to ensure the observations are synced in advance of any aggregation or execution of plugins that rely on regular, corrected timing.
To perform aggregation, which is a builtin feature configured at the top of the manifest file, you might need to ensure each of your pipelines has a time-sync plugin at the right step. This creates a very strong dependency between a framework feature (“aggregation”) and a pipeline plugin, which can be awkward to reason about.
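The dependency between aggregation and time syncing can be sketched as follows. This is a hypothetical illustration rather than IF's aggregation code: the `aggregate` function, the `carbon` metric, and the error message are assumptions. The point is that a slice-by-slice sum is only meaningful when every child series sits on the same time grid.

```python
def aggregate(children, metric):
    """Sum `metric` across child time series, slice by slice.

    Hypothetical sketch: refuses to aggregate unless every child
    series has identical timestamps and durations.
    """
    if not children:
        return []
    grids = [
        [(obs["timestamp"], obs["duration"]) for obs in series]
        for series in children
    ]
    if any(grid != grids[0] for grid in grids[1:]):
        raise ValueError("child time series are not on the same time grid")
    return [
        {
            "timestamp": ts,
            "duration": duration,
            metric: sum(series[i][metric] for series in children),
        }
        for i, (ts, duration) in enumerate(grids[0])
    ]


child_a = [
    {"timestamp": "2024-01-01T00:00:00Z", "duration": 3600, "carbon": 1.0},
    {"timestamp": "2024-01-01T01:00:00Z", "duration": 3600, "carbon": 2.0},
]
child_b = [
    {"timestamp": "2024-01-01T00:00:00Z", "duration": 3600, "carbon": 3.0},
    {"timestamp": "2024-01-01T01:00:00Z", "duration": 3600, "carbon": 4.0},
]
totals = aggregate([child_a, child_b], "carbon")
```

If one pipeline in the tree skips the time-sync step, or runs it at the wrong point, its series no longer matches its siblings and a check like this one fails.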
Proposed solution
First, we need to have shipped the tasks in the idempotence epic. This breaks IF execution into three distinct phases: observe, group and compute.
In this case, TimeSync has a clear, fixed position in the execution flow. It should happen immediately after group and immediately before compute.
Once IF has phased execution, this will always be the right moment to synchronize time. This is because group should always yield individual time series with unique, non-repeated timestamps that can be handled by TimeSync, and compute will always operate over synchronized time series.
This means TimeSync can be an IF feature rather than a plugin. We still need some config from the manifest, which can be provided at the top level in the manifest's context.
If this config is present, TimeSync should be executed automatically between the group and compute stages of execution.
Tasks
Note that all the tasks in the idempotence epic are prerequisites for the following tasks:
- TimeSync: accept config from the top-level manifest context
- TimeSync: become a builtin feature like aggregate rather than a plugin in builtins
- if: always execute TimeSync after group if its config is available in the manifest
How you can help
You can read through this post and give feedback in the comments, especially if you are a plugin developer who currently relies on node-level config. Later, when the specific tasks are available as tickets on our issue board, you can let us know if you want to work on one. There may be some that are reserved for core developers, but in general we are keen to open up IF development to the community.
@jawache @zanete @narekhovhannisyan @MariamKhalatova @manushak