Add 'date_dtype', 'datetime_dtype', 'time_dtype' and 'timestamp_dtype' to the 'to_dataframe' APIs #1546

Closed
chelsea-lin opened this issue Apr 7, 2023 · 0 comments · Fixed by #1547
Assignees
Labels
api: bigquery Issues related to the googleapis/python-bigquery API.

Comments

@chelsea-lin
Contributor

Is your feature request related to a problem? Please describe.
We want to add date_dtype, datetime_dtype, time_dtype, and timestamp_dtype to the to_dataframe API, similar to bool_dtype, int_dtype, float_dtype, and string_dtype (#1529).

PyArrow's conversion of date and time values to nanosecond precision can raise an out-of-bounds error for datetime values such as 9999-12-31T23:59:59.999999. Therefore, the default datetime_dtype mapping will be ignored, and all datetime columns will be of object type. Once pandas supports the pyarrow dtype backend (link), we can use more flexible custom time dtypes, such as pandas.ArrowDtype(pyarrow.timestamp("us", tz="UTC")).
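To illustrate why nanosecond precision is the limiting factor, here is a minimal standard-library sketch. It assumes the nanosecond timestamp is stored as a signed 64-bit count of nanoseconds since the Unix epoch; the helper name `fits_in_int64_nanoseconds` is hypothetical and is not part of pandas, pyarrow, or the BigQuery client.

```python
from datetime import datetime

# Assumed storage model: signed 64-bit integer of nanoseconds since 1970-01-01.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1)


def fits_in_int64_nanoseconds(value: datetime) -> bool:
    """Return True if a naive datetime fits in an int64 nanosecond count."""
    delta = value - EPOCH
    # Build the microsecond count with exact integer arithmetic (no floats).
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    nanos = micros * 1_000
    return INT64_MIN <= nanos <= INT64_MAX


print(fits_in_int64_nanoseconds(datetime(2023, 4, 7)))  # True
print(fits_in_int64_nanoseconds(datetime(9999, 12, 31, 23, 59, 59, 999999)))  # False
```

The second value is representable at microsecond precision but overflows once scaled to nanoseconds, which is why a microsecond-based dtype such as the pyarrow timestamp mentioned above avoids the problem.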

Describe the solution you'd like
Add custom dtype mappings for converting these date and time types.

Describe alternatives you've considered
We can always cast the object-dtype columns to these custom dtypes afterward, but that would cause a performance loss.

@chelsea-lin chelsea-lin self-assigned this Apr 7, 2023
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Apr 7, 2023