Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

👽 Implement tzinfo.fromutc for TzInfo #864

Merged
merged 7 commits into from
Aug 23, 2023

Conversation

lig
Copy link
Contributor

@lig lig commented Aug 7, 2023

Change Summary

Implement tzinfo.fromutc method in TzInfo as default behavior (see: https://docs.python.org/3/library/datetime.html#datetime.tzinfo.fromutc) isn't compatible with TzInfo implementation.

Related issue number

See pydantic/pydantic#6735

Checklist

  • Unit tests for the changes exist
  • Documentation reflects the changes where applicable
  • Pydantic tests pass with this pydantic-core (except for expected changes)
  • My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

Selected Reviewer: @adriangb

@codecov
Copy link

codecov bot commented Aug 7, 2023

Codecov Report

Merging #864 (4c8c7ba) into main (7f7d03d) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #864      +/-   ##
==========================================
+ Coverage   93.78%   93.79%   +0.01%     
==========================================
  Files         105      105              
  Lines       15331    15362      +31     
  Branches       25       25              
==========================================
+ Hits        14378    14409      +31     
  Misses        947      947              
  Partials        6        6              
Files Changed Coverage Δ
python/pydantic_core/__init__.py 92.59% <ø> (ø)
src/input/datetime.rs 98.77% <100.00%> (+0.10%) ⬆️

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7f7d03d...4c8c7ba. Read the comment docs.

@lig
Copy link
Contributor Author

lig commented Aug 7, 2023

please review

@codspeed-hq
Copy link

codspeed-hq bot commented Aug 7, 2023

CodSpeed Performance Report

Merging #864 will not alter performance

Comparing lig/6735-tzinfo-type-doesnt-provide-proper-dst (4c8c7ba) with main (7f7d03d)

Summary

✅ 138 untouched benchmarks

src/input/datetime.rs Outdated Show resolved Hide resolved
Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please can we test fromutc directly, and also check if there are some nice tests for it in cpython and copy them?

I have no idea if the current implementation is correct.

@@ -496,6 +496,11 @@ impl TzInfo {
None
}

fn fromutc<'p>(&self, py: Python<'p>, dt: &'p PyDateTime) -> PyResult<&'p PyDateTime> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn fromutc<'p>(&self, py: Python<'p>, dt: &'p PyDateTime) -> PyResult<&'p PyDateTime> {
fn fromutc<'py>(&self, py: Python<'py>, dt: &'py PyDateTime) -> PyResult<&'py PyDateTime> {

use use py everywhere.

@lig
Copy link
Contributor Author

lig commented Aug 10, 2023

please can we test fromutc directly, and also check if there are some nice tests for it in cpython and copy them?

I have no idea if the current implementation is correct.

Well, the current implementation of utcoffset is a shortcut as well and doesn't take into account dst or zone info data for the specific point of time of the argument. So, the whole class is pretty naive implementation. I have strong doubts it can pass all CPython tests.

Copy link
Contributor

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quoting the Python docs:

Most implementations of dst() will probably look like one of these two:

def dst(self, dt):
    # a fixed-offset class:  doesn't account for DST
    return timedelta(0)

...

... if our TzInfo type represents a fixed time zone, maybe we want to implement dst like that too?

Comment on lines 499 to 534
fn fromutc<'p>(&self, py: Python<'p>, dt: &'p PyDateTime) -> PyResult<&'p PyDateTime> {
dt.call_method1("__add__", (self.utcoffset(py, py.None().as_ref(py))?,))?
.extract()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being unable to resist a little golfing here:

  • No need to take py separately as you can get it from dt.py()
  • I don't think there's value in casting the return value to datetime, that's just spending CPU cycles
Suggested change
fn fromutc<'p>(&self, py: Python<'p>, dt: &'p PyDateTime) -> PyResult<&'p PyDateTime> {
dt.call_method1("__add__", (self.utcoffset(py, py.None().as_ref(py))?,))?
.extract()
}
fn fromutc<'py>(&self, dt: &'py PyDateTime) -> PyResult<&'py PyAny> {
let py = dt.py();
dt.call_method1("__add__", (self.utcoffset(py, py.None().as_ref(py))?,))
}

(self.utcoffset() could actually get similar treatment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the benefit of taking py from dt? I thought the preferred way is requesting it with the function argument.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just that it's all the same py so no need to have a separate argument for it. It's a bit like the following would be redundant:

fn f(x: i32, (_, y): (i32, i32)) { ... }

f(x, (x, y))  // passed x twice, ignored in second case

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting it with the function argument is better than Python::with_gil if you don't already have it for sure.

@lig lig force-pushed the lig/6735-tzinfo-type-doesnt-provide-proper-dst branch from 19858e9 to 9aa86b3 Compare August 17, 2023 12:34
@lig
Copy link
Contributor Author

lig commented Aug 17, 2023

@davidhewitt

if our TzInfo type represents a fixed time zone, maybe we want to implement dst like that too?

Returning 0 from dst explicitly means that there is no day light saving time in effect. However, there could be but we just do not know it in our implementation. So, having it return None in such a case seems fair.

@lig lig assigned davidhewitt and samuelcolvin and unassigned lig Aug 17, 2023
@davidhewitt
Copy link
Contributor

However, there could be but we just do not know it in our implementation.

Is this the case? If we parse 2032-01-01T01:01:00+02:00 I would be ok with treating that as a fixed offset which we can assume to be +02:00 with dst offset of 0?

Looks like https://datatracker.ietf.org/doc/draft-ietf-sedate-datetime-extended/ is needed for RFC3339 datetimes to support zones with dst, in which case it's passed explicitly.

@lig
Copy link
Contributor Author

lig commented Aug 21, 2023

Is this the case? If we parse 2032-01-01T01:01:00+02:00 I would be ok with treating that as a fixed offset which we can assume to be +02:00 with dst offset of 0?

We cannot assume that. For example, CEST is exactly +02:00 but it has dst offset of +01:00 now included in that +02:00. 2032-01-01T01:01:00+02:00 could be CEST with dst+1h or some +02:00 timezone that doesn't use DST and thus has dst+0h now.

@davidhewitt
Copy link
Contributor

Returning 0 from dst explicitly means that there is no day light saving time in effect. However, there could be but we just do not know it in our implementation. So, having it return None in such a case seems fair.

Discussed offline - datetime.timezone always returns None, so we'll do the same.

@lig
Copy link
Contributor Author

lig commented Aug 21, 2023

@lig
Copy link
Contributor Author

lig commented Aug 21, 2023

please review

@lig lig requested a review from adriangb August 21, 2023 16:11
@lig lig force-pushed the lig/6735-tzinfo-type-doesnt-provide-proper-dst branch from 94b1b36 to 605a8f9 Compare August 21, 2023 16:27
Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise LGTM.

@@ -508,11 +512,18 @@ pub struct TzInfo {
#[pymethods]
impl TzInfo {
#[new]
fn new(seconds: i32) -> Self {
Self { seconds }
fn py_new(seconds: f32) -> PyResult<Self> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to add

impl TryFrom<i32> for TzInfo {
    type Error = PyErr;

    fn try_from(seconds: i32) -> PyResult<Self> {
        if seconds.abs() >= 86400 {
            Err(PyValueError::new_err(format!(
                "TzInfo offset must be strictly between -86400 and 86400 (24 hours) seconds, got {seconds}"
            )))
        } else {
            Ok(Self { seconds })
        }
    }
}

Then use (seconds.trunc() as i32).try_into() here, then you can use let tz_info: TzInfo = offset.try_into()?; above and avoid going i32 -> f32 -> i32.

DSTEND = datetime(1, 10, 25, 1)


class TestTzInfo(unittest.TestCase):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assuming this is copied from cpython somewhere, please include a link to the original in a docstring.

self.EST = TzInfo(-timedelta(hours=5).total_seconds())
self.DT = datetime(2010, 1, 1)

def test_str(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can probably use parameterize for these and thereby make any error easier to read.

Ideally these should be rewritten in pytest style, it shouldn't take very long, but I know you've already spend lots of time on this, so don't worry if you can't do it quickly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda like the idea of keeping them as unittest. @lig can you add a link / note to permalink to the standard library for future selfs?

src/input/datetime.rs Show resolved Hide resolved
Copy link
Member

@adriangb adriangb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, I like that we are trying to match str() and such as closely as possible to the standard library 👍🏻

@lig lig force-pushed the lig/6735-tzinfo-type-doesnt-provide-proper-dst branch from 605a8f9 to 4c8c7ba Compare August 22, 2023 22:20
@lig lig enabled auto-merge (squash) August 22, 2023 22:21
@lig lig requested a review from samuelcolvin August 22, 2023 22:32
@lig lig disabled auto-merge August 22, 2023 22:32
@lig lig assigned samuelcolvin and unassigned lig Aug 22, 2023
@samuelcolvin samuelcolvin merged commit 8477825 into main Aug 23, 2023
30 checks passed
@samuelcolvin samuelcolvin deleted the lig/6735-tzinfo-type-doesnt-provide-proper-dst branch August 23, 2023 12:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants