Add dataframe.collect #219

Closed

MarcoGorelli
Contributor

this has come up in a few discussions

closes #120

@@ -838,3 +838,16 @@ def to_array_object(self, dtype: Any) -> Any:
        understanding that consuming libraries would then use the
        ``array-api-compat`` package to convert it to a Standard-compliant array.
        """

    def collect(self) -> Any:

Existing APIs of lazy dataframes look quite different and the equivalent of the collect() function accepts various implementation-specific kwargs 1, 2.

Can it be assumed that each library has "reasonable defaults" that will work for most use-cases, and those can be used in their implementations of collect()? (I think it can, but just making sure we consider this).
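To make the "reasonable defaults" question concrete, here is a minimal sketch of what an argument-free `collect()` could look like on top of a native method that takes implementation-specific kwargs. Everything in it is hypothetical: `ToyLazyFrame`, `_native_collect`, and the `optimize` / `streaming` flags are illustrative stand-ins, not the API of any real library or of the Standard.

```python
from typing import Callable, Dict, List, Optional

Columns = Dict[str, List[int]]
Op = Callable[[Columns], Columns]


class ToyLazyFrame:
    """Hypothetical lazy frame: records column operations, runs them on collect."""

    def __init__(self, source: Columns, ops: Optional[List[Op]] = None) -> None:
        self._source = source
        self._ops = ops or []

    def with_column(
        self, name: str, func: Callable[[Columns], List[int]]
    ) -> "ToyLazyFrame":
        # Record the operation; nothing is computed yet.
        def op(cols: Columns) -> Columns:
            return {**cols, name: func(cols)}

        return ToyLazyFrame(self._source, self._ops + [op])

    def _native_collect(
        self, *, optimize: bool = True, streaming: bool = False
    ) -> Columns:
        # Stand-in for a library-specific collect with its own kwargs.
        # The flags are accepted but unused in this toy example.
        cols = dict(self._source)
        for op in self._ops:
            cols = op(cols)
        return cols

    def collect(self) -> Columns:
        # Standard-level entry point: no kwargs, so the native defaults apply.
        return self._native_collect()


lazy = ToyLazyFrame({"a": [1, 2, 3]}).with_column(
    "b", lambda cols: [x * 2 for x in cols["a"]]
)
result = lazy.collect()
print(result)  # {'a': [1, 2, 3], 'b': [2, 4, 6]}
```

Under this assumption, users who need non-default behaviour would have to drop down to the native object, which is the trade-off being asked about.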


Also should this be part of DataFrame, or should a new LazyFrame interface be introduced?
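A rough sketch of the second option, assuming a separate lazy interface whose only job is to produce an eager frame. The names `EagerFrameLike`, `LazyFrameLike`, `ToyEager`, and `ToyLazy` are made up for illustration and are not part of the Standard or of this PR.

```python
from typing import Dict, List, Protocol, runtime_checkable


@runtime_checkable
class EagerFrameLike(Protocol):
    """Hypothetical eager interface: data is already materialized."""

    def column_names(self) -> List[str]: ...


@runtime_checkable
class LazyFrameLike(Protocol):
    """Hypothetical lazy interface: collect() is the only way to get data."""

    def collect(self) -> EagerFrameLike: ...


class ToyEager:
    def __init__(self, data: Dict[str, List[int]]) -> None:
        self._data = data

    def column_names(self) -> List[str]:
        return list(self._data)


class ToyLazy:
    def __init__(self, data: Dict[str, List[int]]) -> None:
        self._data = data

    def collect(self) -> ToyEager:
        # Materialize into the eager type; the lazy type has no data accessors.
        return ToyEager(self._data)


lazy = ToyLazy({"a": [1, 2]})
eager = lazy.collect()
print(isinstance(lazy, LazyFrameLike), isinstance(eager, LazyFrameLike))
```

With this split, `collect()` never appears on the eager type, so eager-only libraries would not need to implement a no-op. The alternative in the diff above keeps a single `DataFrame` class and adds `collect()` to it.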

Member

@rgommers rgommers left a comment


Can we keep the discussion on gh-120? I don't see anything that has changed here; conceptually, this does not seem like a good idea. Will comment more on gh-120.

@MarcoGorelli MarcoGorelli marked this pull request as draft August 2, 2023 12:44
@MarcoGorelli
Contributor Author

closing as this is being addressed by #249


Successfully merging this pull request may close these issues.

DataFrame.collect() for lazy dataframes?
4 participants