Will a query outside the time window use the cache partially?
Here it says "This configuration will only accelerate data from the federated source that ... is less than 1 day old".
Below it says "By default, accelerated datasets only return locally materialized data." What if use_source: true is set and the query spans 2 days: will the federated source be queried for 1 day or 2 days?
The limitation that fallback is only available with full makes me wonder how to append but limit that by a time window. Is refresh_data_window: true already changing the default behavior?
In general, a more formal specification of the caching behavior would be good. Such a specification could start with the combinations that are valid (not ignored and maybe required) and a minimal behavior that is always true when enabled.
I agree that the documentation is confusing; I was confused by the same things. But looking at the original issue, I think there is an answer to your second question:
The solution works as a shorthand of refresh sql with temporal column constraints
The documentation of refresh sql is pretty clear: "Queries for data that have been filtered out will not fallback to querying the federated table." So it will not read 2-week-old data from the federated store and combine it with the fast local data.
Unfortunately.
The exception is when there is no data in the last week, so the result is completely empty, and you have set on_zero_results: use_source.
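To make the behavior described above concrete, here is a minimal Python sketch of how I understand it. It is a toy model, not Spice's implementation: answer_query, the (timestamp, value) row tuples, and the parameter names are hypothetical, and only the data window and on_zero_results: use_source come from the docs discussed in this thread.

```python
from datetime import datetime, timedelta


def answer_query(source_rows, now, window, start, end, on_zero_results=None):
    """Toy model of an accelerated dataset with a refresh data window.

    Rows are (timestamp, value) tuples. Hypothetical helper, not a Spice API.
    """
    def in_range(rows):
        return [r for r in rows if start <= r[0] <= end]

    # Only rows inside the data window are materialized locally.
    local_rows = [r for r in source_rows if r[0] >= now - window]
    result = in_range(local_rows)

    # Assumption from this thread: the source is consulted only when the
    # local result is completely empty, never to fill in a partial result.
    if not result and on_zero_results == "use_source":
        return in_range(source_rows)
    return result


now = datetime(2024, 1, 10)
rows = [(now - timedelta(hours=36), "old"), (now - timedelta(hours=6), "new")]
one_day = timedelta(days=1)

# Query spanning 2 days: the local result is non-empty, so no fallback happens.
spanning = answer_query(
    rows, now, one_day, now - timedelta(days=2), now, "use_source"
)

# Query entirely outside the window: the local result is empty, so fall back.
outside = answer_query(
    rows, now, one_day, now - timedelta(days=2), now - timedelta(hours=25),
    "use_source",
)
```

Under this model, the 2-day query returns only the row from the last day, and the older row comes back only when the local result would otherwise be empty.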
Thank you for the report on these docs. I’ve updated them to clarify which options are supported in which modes and have revised the information around refresh SQL and data windows. In particular, I focused on the behavior of on_zero_results and how it interacts with refresh SQL.
Additionally, I’ve added some scenario-based examples to demonstrate different ways these parameters can be used together.
I’ll be closing this issue now, but please re-open it if these new docs haven't hit the mark! 😄
There are two places that left me guessing:
This example includes the respective option with append:
This explains that append is not a valid mode for refresh cycle:
Also given #3702, it seems supported.