-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
versioning #386
versioning #386
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't have a strong opinion on the naive/aware timezone thing, whether we should block them completely, or try to figure out a workaround. Asking in the internal discord.
It's still good to have the docstring adhere by best practices.
docs/data_engine.md
Outdated
- both "x" and Field("x") can be used | ||
- alias, as_of_time - are optional | ||
- the list should make sense, i.e ```python.select(Field("x", as_of_time=t1), Field("x", as_of_time=t2))``` does not make sense since there is no alias to differentiate, the result will not reflect the intention. also ```python .select("x","x")``` | ||
- when no select list specified all datapoint enrichements are returned, else only those specified. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this answered my question. This needs to be written at the very beginning of the function doc basically.
Also we 100% should have a flag to include the regular fields along with the selected ones.
Especially considering the way we write the docs, people might assume that the selected fields will be added to the original ones. That's how it read to me
docs/data_engine.md
Outdated
- the list should make sense, i.e ```python.select(Field("x", as_of_time=t1), Field("x", as_of_time=t2))``` does not make sense since there is no alias to differentiate, the result will not reflect the intention. also ```python .select("x","x")``` | ||
- when no select list specified all datapoint enrichements are returned, else only those specified. | ||
##### Global as_of behavior: | ||
- it applies to all entities unless otherwise specified, i.e if we use ```python Field("x", as_of_time=t1))``` than this t will take precedence over a t2 specified in .as_of(t2). the sensibility of the results is up to the caller. you could get datapoints that existed in t1 < t2 based on a condition applied on their enrichmnts in t2. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also needs some rewording probably.
Fixing typos:
- it applies to all entities unless otherwise specified, i.e if we use ```python Field("x", as_of_time=t1))``` than this t will take precedence over a t2 specified in .as_of(t2). the sensibility of the results is up to the caller. you could get datapoints that existed in t1 < t2 based on a condition applied on their enrichmnts in t2. | |
- it applies to all entities unless otherwise specified, i.e if we use ```python Field("x", as_of_time=t1))``` then this t will take precedence over a t2 specified in .as_of(t2). the sensibility of the results is up to the caller. you could get datapoints that existed in t1 < t2 based on a condition applied on their enrichments in t2. |
Alternatively trying to reword it myself:
as_of specificity:
as_of
filters defined inside ofselect()
take complete precedence over theas_of
defined on top of the query.
This means that a query:
q = ds.select("size", Field("size", as_of_time=t1, alias="size_t1"), "train")
.as_of(t2)
.all()
Will return fields size
and train
as they were as of t2
, and size_t1
will have size as of t1
.
This behavior is kept regardless of whethert1
came before or after t2
TODO @kbolashev suggestions:
if select +include_all is used, we'll get all columns. its implemented in client by adding all names. would be more efficient to do it in backend
Guy suggests '*' as give me all, otherwise only selected