add /timeseries endpoints #33
titiler/cmr/timeseries.py
        Optional[str],
        Query(
            description="Start datetime for timeseries request",
        ),
    ] = None
    end_datetime: Annotated[
        Optional[str],
        Query(
            description="End datetime for timeseries request",
        ),
    ] = None
    step: Annotated[
        Optional[str],
        Query(
            description="Time step between timeseries intervals, expressed as [ISO 8601 duration](https://en.wikipedia.org/wiki/ISO_8601#Durations)"
        ),
    ] = None
    step_idx: Annotated[
        Optional[int],
        Query(description="Optional (zero-indexed) index of the desired time step"),
    ] = None
    exact: Annotated[
        Optional[bool],
        Query(
            description="If true, queries will be made for a point-in-time at each step. If false, queries will be made for the entire interval between steps"
        ),
    ] = None
    datetimes: Annotated[
        Optional[str],
        Query(
            description="Optional list of comma-separated specific time points or time intervals to summarize over"
        ),
    ] = None

def parse_duration(duration: str) -> relativedelta:
    """Parse ISO 8601 duration string to relativedelta."""
    # fullmatch rejects trailing garbage such as "P1Dxyz", which re.match would accept
    match = re.fullmatch(
        r"P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)W)?(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?",
        duration,
    )
    if not match or not any(m for m in match.groups()):
        raise ValueError(f"{duration} is an invalid duration format")

    years, months, weeks, days, hours, minutes, seconds = [
        float(g) if g else 0 for g in match.groups()
    ]
    return relativedelta(
        years=int(years),
        months=int(months),
        weeks=int(weeks),
        days=int(days),
        hours=int(hours),
        minutes=int(minutes),
        seconds=int(seconds),
        # round() avoids float truncation (e.g. 2.3 s would otherwise give 299999 us)
        microseconds=round((seconds % 1) * 1e6),
    )

def generate_datetime_ranges(
    start_datetime: datetime, end_datetime: datetime, step: str, exact: bool = False
) -> List[Union[Tuple[datetime], Tuple[datetime, datetime]]]:
    """Generate datetime ranges"""
    start = start_datetime
    end = end_datetime
    step_delta = parse_duration(step)

    ranges: List[Union[Tuple[datetime], Tuple[datetime, datetime]]] = []
    current = start

    step_timedelta = (current + step_delta) - current
    is_small_timestep = step_timedelta <= timedelta(seconds=1)

    while current < end:
        if exact:
            # For exact case, return a tuple with just one exact datetime
            next_step = current + step_delta
            ranges.append((current,))
        else:
            next_step = min(current + step_delta, end)
            if next_step == end:
                ranges.append((current, next_step))
                break

            if is_small_timestep:
                # Subtract 1 microsecond for small timesteps
                ranges.append((current, next_step - timedelta(microseconds=1)))
            else:
                # Subtract 1 second for larger timesteps
                ranges.append((current, next_step - timedelta(seconds=1)))

        current = next_step

    if current == end:
        ranges.append((end,))

    if not ranges:
        return [(start, end)]

    return ranges
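To make the difference between `exact=True` and `exact=False` concrete, here is a minimal, standard-library-only sketch of the same loop. It is an illustration rather than the code in this PR: `sketch_ranges` is a hypothetical name, and it assumes a fixed `timedelta` step instead of the `relativedelta` returned by `parse_duration`.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def sketch_ranges(
    start: datetime, end: datetime, step: timedelta, exact: bool = False
) -> List[Tuple[datetime, ...]]:
    """Simplified stand-in for generate_datetime_ranges (fixed-length step only)."""
    # Interior intervals are trimmed so they do not overlap the next one:
    # by 1 microsecond for steps of one second or less, otherwise by 1 second.
    small_step = step <= timedelta(seconds=1)
    trim = timedelta(microseconds=1) if small_step else timedelta(seconds=1)

    ranges: List[Tuple[datetime, ...]] = []
    current = start
    while current < end:
        if exact:
            # Point-in-time query at each step
            ranges.append((current,))
            next_step = current + step
        else:
            next_step = min(current + step, end)
            if next_step == end:
                # The last interval keeps the untrimmed end datetime
                ranges.append((current, next_step))
                break
            ranges.append((current, next_step - trim))
        current = next_step

    if current == end:
        # Only reachable in the exact case: include the end point itself
        ranges.append((end,))

    return ranges or [(start, end)]
```

With a start of 2024-01-01, an end of 2024-01-04, and a one-day step, the exact case yields four single-datetime tuples (one per day, including the end date), while the interval case yields three tuples, the first two ending one second before the next day starts.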
All of this would eventually move to titiler.extensions.timeseries but I wanted to do the development all in one place to facilitate easier iteration during early stages of development.
A few of the classes and models defined at the top of this module will eventually go to titiler.extensions.timeseries so they can be re-used by other applications.
This all looks good to me, I just made comments about clarifying some of the function and dependency descriptions. I focused on the notebook and titiler/cmr/timeseries.py. If time allows, I would like to understand how the VCR and data/podaac... files are being used (presumably in the tests).
        base_url=str(factory.url_for(request, "geojson_statistics")),
        request=request,
        param_list=query,
    )
Just to make sure I understand how this works: the timeseries params are generated as part of the dependency timeseries_query_no_bbox, which generates a query object that is a list of objects, each containing a concept id and datetime. Build request urls returns a list of urls, each of which is a statistics endpoint, and then in the line below a request is made to each of those statistics endpoints. Is that understanding correct?
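For readers following along, the fan-out described above can be sketched roughly as follows. This is a hedged illustration, not the PR's implementation: `build_urls_sketch` is a hypothetical helper, and the `concept_id` and `datetime` query parameter names and the example URL are assumptions made for the sketch.

```python
from typing import Dict, List
from urllib.parse import urlencode


def build_urls_sketch(
    base_url: str, shared_params: Dict[str, str], datetimes: List[str]
) -> List[str]:
    """Return one lower-level request URL per datetime value (hypothetical helper)."""
    urls = []
    for dt in datetimes:
        # Every request shares the same parameters except for `datetime`
        params = {**shared_params, "datetime": dt}
        urls.append(f"{base_url}?{urlencode(params)}")
    return urls


# Example with made-up values: two time steps become two statistics requests
example_urls = build_urls_sketch(
    "https://example.test/statistics",
    {"concept_id": "C123-EXAMPLE"},
    ["2024-01-01", "2024-01-02"],
)
```

Each URL in the resulting list would then be requested individually, which matches the "one request per statistics endpoint" reading in the question.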
@abarciauskas-bgse yes that is correct! The code in the /timeseries endpoints just converts the timeseries query parameters into a list of datetime parameters, then fires off a lower-level request for each unique datetime parameter. Then there is a little bit of code to organize the results (e.g. list of PNGs to a GIF, list of statistics geojson output into a single geojson).
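As a rough illustration of the "organize the results" step for statistics, the sketch below combines per-datetime GeoJSON Features into a single FeatureCollection. The function name `merge_statistics_sketch`, the `datetime` property it adds, and the shape of the input features are all assumptions for the example; the PR may structure its merged output differently.

```python
from typing import Any, Dict, List


def merge_statistics_sketch(
    datetimes: List[str], responses: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """Combine per-datetime GeoJSON Features into one FeatureCollection."""
    features = []
    for dt, feature in zip(datetimes, responses):
        # Copy the feature and tag it with the datetime it was computed for,
        # leaving the original response dict untouched
        properties = {**feature.get("properties", {}), "datetime": dt}
        features.append({**feature, "properties": properties})
    return {"type": "FeatureCollection", "features": features}
```

Tagging each feature with its datetime keeps the merged output self-describing, so a client can rebuild the timeseries without relying on list order.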
I'm totally blown away by the GIF possibilities 🎉 🌍 I unfortunately wasn't able to fully dive into a review today, so I will try it out more on Monday but wanted to share a couple questions/comments so far.
My main question is about the pros/cons of having a new /timeseries/bbox url rather than adding support for returning image/gif to the images/bbox url. They share a ton of parameters and to me seem conceptually very similar, with the return type just dependent on whether you have discrete mosaicing in time. If it remains under /timeseries and will be specific to gif return types, would it help for clarity to just name it timeseries/gif rather than timeseries/bbox?
I think we could just extend the original
This would definitely make sense if we are just going to stick with the I think the
@@ -26,5 +26,5 @@ RUN if [ -z "$EARTHDATA_USERNAME" ] || [ -z "$EARTHDATA_PASSWORD" ]; then \
 # http://www.uvicorn.org/settings/
 ENV HOST 0.0.0.0
 ENV PORT 80
-CMD uv run uvicorn titiler.cmr.main:app --host ${HOST} --port ${PORT} --log-level debug
+CMD uv run uvicorn titiler.cmr.main:app --host ${HOST} --port ${PORT} --log-level debug --reload
you don't really need the reload here because the code is copied, not mounted
I added it because I mount the titiler/ directory in docker-compose.yml. It is redundant to COPY and mount the volume, but it is convenient for local development to have the reload functionality.
The |
I went with |
sounds good, I think this could be considered a documentation issue. the self-describing APIs are nice for maintenance, but a downside is new people may not find it sufficient for finding features.
Thank you all for your thoughtful reviews of these changes!! After thinking it over some more I want to stick with
The line count looks scary, but the big changes here include ~600 lines of code in the new titiler/cmr/timeseries.py file. The rest is in a notebook, a VCR cassette, and tests.

The goal is to start refining the timeseries API spec by testing out a bunch of CMR collections and seeing what kind of parameters make sense for most cases.

To see some examples of the /timeseries endpoints in action and some descriptions of the API for specifying a timeseries, check out the new timeseries notebook.

I am really interested in getting some feedback on the specification of the timeseries parameters. It feels like there are a lot of knobs, so maybe there are ways to simplify it.