pandas.NaT returned when converting series to standard python datetime, but does not support datetime functionality #16248

Closed
gbrand-salesforce opened this issue May 5, 2017 · 3 comments
Labels
Datetime Datetime data dtype Usage Question

Comments

@gbrand-salesforce

Code Sample, a copy-pastable example if possible

import pandas as pd

pandas_datetimes = pd.Series(pd.date_range('20130101', periods=3, tz='US/Eastern'))
pandas_datetimes[1] = pd.NaT  # introduce a missing value
python_datetimes = pandas_datetimes.dt.to_pydatetime()
# Expectation: every element of python_datetimes behaves like a datetime.datetime
for value in python_datetimes:
    print(value.utcoffset())  # raises ValueError on the second element (NaT)

Problem description

Converting pandas datetimes to standard datetime objects is done for interoperability, for example when converting a DataFrame into a list of dictionaries to be serialized into MongoDB.
Returning NaT in this case breaks that interoperability: the receiving code treats each value as a datetime.datetime, but NaT does not support the datetime.datetime interface.

I think the best solution is for the conversion of NaT to datetime.datetime to return a different object, e.g. pyNaT, that does provide the datetime interface.

In my example of working with MongoDB, trying to save a DataFrame that contains NaT results in:

File "/usr/local/lib/python2.7/dist-packages/pymongo/collection.py", line 540, in insert
gen(), check_keys, self.uuid_subtype, client)
File "pandas/tslib.pyx", line 870, in pandas.tslib._make_error_func.f (pandas/tslib.c:17775)
ValueError: NaTType does not support utcoffset

This could have been avoided.
Bokeh has the same kind of problem when working with pandas.

This is probably related to #12976

@jreback
Contributor

jreback commented May 5, 2017

and what would .utcoffset() return for a null datetime?

@jreback
Contributor

jreback commented May 6, 2017

> This could have been avoided.

not sure what you mean here.

NaT functions perfectly well as a missing value representation for datetimelikes. Exporting to a non-missing value aware system and using datetimes requires care when operating (and of course would be completely non-performant).
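To illustrate the kind of care meant here, a minimal sketch (not something proposed in this thread) is to swap NaT for None before handing values to code that expects datetime.datetime. The helper name `to_pydatetimes_or_none` is hypothetical, and the consumer still has to handle None itself.

```python
import datetime
from typing import List, Optional

import pandas as pd


def to_pydatetimes_or_none(series: pd.Series) -> List[Optional[datetime.datetime]]:
    """Hypothetical helper: datetime.datetime objects, with None in place of NaT."""
    # Iterating a datetime Series yields pandas Timestamp objects (or NaT).
    return [None if pd.isna(ts) else ts.to_pydatetime() for ts in series]


pandas_datetimes = pd.Series(pd.date_range('20130101', periods=3, tz='US/Eastern'))
pandas_datetimes[1] = pd.NaT
python_datetimes = to_pydatetimes_or_none(pandas_datetimes)

for value in python_datetimes:
    # None marks a missing value; only real datetimes are asked for an offset.
    print(None if value is None else value.utcoffset())
```

This is a per-element Python loop, so it trades away vectorized performance, which matches the caveat above.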

jreback closed this as completed May 6, 2017
jreback added this to the won't fix milestone May 6, 2017
jreback added Datetime Datetime data dtype Usage Question labels May 6, 2017
@gbrand-salesforce
Author

> and what would .utcoffset() return for a null datetime?

It would return None, just as accessing the tzinfo attribute on it does...

> Exporting to a non-missing value aware system and using datetimes requires care when operating (and of course would be completely non-performant).

I did not suggest such a thing. I only suggest having another NaT type that supports the datetime interface, in addition to the current one, which supports the Timestamp interface, and having to_pydatetime() return the datetime NaT instead of the Timestamp NaT...

I don't see why it would impact performance or require specialized code.
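As a rough sketch of what such a type might look like, assuming it only needs to answer the timezone-related calls with None: the names `PyNaT`, `PY_NAT`, and `to_pydatetime_with_sentinel` below are hypothetical and are not part of pandas.

```python
import pandas as pd


class PyNaT:
    """Hypothetical null-datetime sentinel (illustration only, not part of pandas)."""

    tzinfo = None  # same value a naive datetime.datetime reports

    def utcoffset(self):
        return None

    def dst(self):
        return None

    def tzname(self):
        return None

    def __repr__(self):
        return "PyNaT"


PY_NAT = PyNaT()


def to_pydatetime_with_sentinel(series):
    """Hypothetical conversion: real datetimes, with PY_NAT in place of NaT."""
    return [PY_NAT if pd.isna(ts) else ts.to_pydatetime() for ts in series]


stamps = pd.Series(pd.to_datetime(["2013-01-01", None, "2013-01-03"], utc=True))
# Every element now answers utcoffset() without raising.
offsets = [value.utcoffset() for value in to_pydatetime_with_sentinel(stamps)]
```

Note that this sentinel is not an instance of datetime.datetime, so code that checks types, or a serializer that only knows datetime.datetime, would presumably still need to handle it separately.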
