fix: read_pandas inline respects location #412
options = bigframes.BigQueryOptions(location="europe-west1")
session = bigframes.Session(options)

df = session.read_pandas(pd.DataFrame([[1, 2, 3], [4, 5, 6]]))
Could we add a test for inline data too, and verify that it creates the result tables in the intended location? For example: bpd.DataFrame([[1, 2, 3], [4, 5, 6]])
Added.
@@ -155,6 +155,7 @@ def __hash__(self):
@dataclass(frozen=True)
class ReadLocalNode(BigFrameNode):
    feather_bytes: bytes
    session: bigframes.session.Session
Local data is session-independent, so we don't want to add a session constraint to the node. Don't worry about a dataframe/block not having a session; that just means you can execute it anywhere, as all the data sources are local.
The data itself is independent of the session, yes. But when reading local data, a specific session will be used, and when executing the query, we'd call that particular session. Do we have options other than keeping a record here?
A session will be used to execute the tree, yes, but the choice of session need not be constrained by the tree itself. You can check a tree to see if it has a required session, and otherwise, just use the default session to execute trees that don't depend on a specific session.
Discussed offline. Let's just make the session an optional field that we set when users have a specific session they used to read the local data with session.read_gbq.
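A minimal sketch of the approach discussed above: the local-data node carries an optional session, and execution falls back to a default session when nothing in the tree requires one. This is an illustration only, not the code merged in this PR. `FakeSession`, `Node`, `LocalNode`, `required_session`, and `session_for` are hypothetical stand-ins, and raising on mixed sessions is an assumption rather than something stated in the thread.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FakeSession:
    """Hypothetical stand-in for bigframes.session.Session."""
    location: str


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for BigFrameNode."""
    children: Tuple["Node", ...] = ()


@dataclass(frozen=True)
class LocalNode(Node):
    """Stand-in for ReadLocalNode: the session is optional, not required."""
    feather_bytes: bytes = b""
    session: Optional[FakeSession] = None


def required_session(node: Node) -> Optional[FakeSession]:
    """Return the session pinned anywhere in the tree, or None if there is none."""
    found = getattr(node, "session", None)
    for child in node.children:
        child_session = required_session(child)
        if child_session is None:
            continue
        # Assumption: a tree that pins two different sessions is an error.
        if found is not None and found != child_session:
            raise ValueError("tree mixes nodes from different sessions")
        found = child_session
    return found


def session_for(node: Node, default: FakeSession) -> FakeSession:
    """Use the tree's pinned session if it has one, else the default session."""
    pinned = required_session(node)
    return default if pinned is None else pinned
```

With this shape, a node built from purely local data with no session stays executable anywhere, while a node created through a specific session (for example one configured with location="europe-west1") keeps executing in that session.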
Updated.
Merge-on-green attempted to merge your PR for 6 hours, but it was not mergeable because either one of your required status checks failed, one of your required reviews was not approved, or there is a do not merge label. Learn more about your required status checks here: https://help.github.com/en/github/administering-a-repository/enabling-required-status-checks. You can remove and reapply the label to re-run the bot.
Fixes b/327544164 🦕