YYYY week into a date #6796

Closed
2 tasks done
Bidek56 opened this issue Feb 10, 2023 · 8 comments
Labels
bug (Something isn't working), python (Related to Python Polars)

Comments

Contributor

Bidek56 commented Feb 10, 2023

Polars version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Issue description

Attempting to parse a year plus week number (YYYYWW) into a date results in an error: ComputeError: strict conversion to dates failed, maybe set strict=False

import polars as pl

pl.DataFrame(
    {"week": [201901, 201902, 201903, 201942, 201943, 201944]}
).with_columns(
    pl.col("week").cast(pl.Utf8).str.strptime(pl.Date, fmt="%Y%U").alias("date")
)

According to the Rust Chrono specifiers, %U is supported.

Pandas can read it: df['date'] = pd.to_datetime(df.week.astype(str) + '0', format='%Y%W%w')

Thanks to SO, I have found a workaround but I think it's a bug.
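
The pandas call above works because it appends a weekday digit before parsing. Here is a minimal sketch of the same idea using only Python's standard library, assuming Sunday (`0`) as the appended weekday. `year_week_to_date` is a hypothetical helper name, not a Polars or pandas API, and the Stack Overflow workaround mentioned above may differ from this.

```python
from datetime import date, datetime


def year_week_to_date(value: int, weekday_digit: str = "0") -> date:
    """Parse a YYYYWW integer by appending a %w weekday digit (0 = Sunday)."""
    # %W counts weeks starting from the first Monday of the year;
    # %w supplies the weekday that the week number alone does not carry.
    return datetime.strptime(f"{value}{weekday_digit}", "%Y%W%w").date()


weeks = [201901, 201902, 201903, 201942, 201943, 201944]
dates = [year_week_to_date(w) for w in weeks]
for w, d in zip(weeks, dates):
    print(w, d)
```

Passing `"1"` instead of `"0"` would pick the Monday of each week rather than the Sunday.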

Reproducible example

import polars as pl

pl.DataFrame(
    {"week": [201901, 201902, 201903, 201942, 201943, 201944]}
).with_columns(
    pl.col("week").cast(pl.Utf8).str.strptime(pl.Date, fmt="%Y%U").alias("date")
)

Expected behavior

date

2019-01-01
2019-01-08
2019-01-15
2019-10-15
2019-10-22
2019-10-29
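
The dates listed above are consistent with counting seven-day blocks from January 1, which is a different convention from the first-Sunday rule that `%U` follows. A minimal sketch assuming that January 1 convention is what was intended (`week_to_date_from_jan1` is a hypothetical helper, not part of Polars):

```python
from datetime import date, timedelta


def week_to_date_from_jan1(value: int) -> date:
    """Map a YYYYWW integer to January 1 plus (week - 1) * 7 days."""
    year, week = divmod(value, 100)  # 201942 -> (2019, 42)
    return date(year, 1, 1) + timedelta(weeks=week - 1)


weeks = [201901, 201902, 201903, 201942, 201943, 201944]
expected = [week_to_date_from_jan1(w) for w in weeks]
for d in expected:
    print(d)
```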

Installed versions

---Version info---
Polars: 0.16.3
Index type: UInt32
Platform: Windows-10-10.0.22621-SP0
Python: 3.11.2 (tags/v3.11.2:878ead1, Feb  7 2023, 16:38:35) [MSC v.1934 64 bit (AMD64)]
---Optional dependencies---
pyarrow: 11.0.0
pandas: 1.5.3
numpy: 1.24.2
fsspec: 2023.1.0
connectorx: <not installed>
xlsx2csv: 0.8.1
deltalake: <not installed>
matplotlib: 3.6.3
Bidek56 added the bug and python labels Feb 10, 2023
Collaborator

MarcoGorelli commented Feb 10, 2023

Hi @Bidek56

Looks like this doesn't quite work in chrono:

use chrono::{NaiveDate};

fn main() {
    let date_str = "201901";
    let naive_date = NaiveDate::parse_from_str(date_str, "%Y%U").unwrap();
    println!("{:?}", naive_date);
}

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ParseError(NotEnough)', src/main.rs:5:66
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I think you need to also pass the day of the week, e.g.

    let date_str = "201901Mon";
    let naive_date = NaiveDate::parse_from_str(date_str, "%Y%U%a").unwrap();

Contributor Author

Bidek56 commented Feb 11, 2023

Does the code below work for you? I ask because "%Y%U%a" works but "%Y%W%a" does not, even though the docs say it should.

    let date_str = "201901Mon";
    let naive_date = NaiveDate::parse_from_str(date_str, "%Y%W%a").unwrap();


FObersteiner commented Feb 11, 2023

I think YYYY-WW is just not a precise date. For example, take the ISO week date definition:

A precise date is specified by the ISO week-numbering year in the format YYYY, a week number in the format ww prefixed by the letter 'W', and the weekday number, a digit d from 1 through 7, beginning with Monday and ending with Sunday.

That's probably why you need to specify the day of the week - although one might think that with the week number only, it should default to a certain day of week (but which one?!).

So I'd conclude this is not a bug but a feature to avoid ambiguity. Only the chrono docs could be a bit more specific, IMHO.
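
To illustrate the ambiguity, here is a small sketch using Python's standard `date.fromisocalendar` (available since Python 3.8). It lists every date that a single ISO year and week pair can refer to; `iso_week_days` is a hypothetical helper name used only for this illustration.

```python
from datetime import date


def iso_week_days(iso_year: int, iso_week: int) -> list:
    """Return the seven dates (Monday..Sunday) of one ISO week."""
    # ISO weekday numbers run from 1 (Monday) to 7 (Sunday).
    return [date.fromisocalendar(iso_year, iso_week, d) for d in range(1, 8)]


days = iso_week_days(2019, 1)
for d in days:
    print(d.isoformat(), d.strftime("%a"))
```

ISO week 1 of 2019 starts on Monday 2018-12-31 and ends on Sunday 2019-01-06, so without a weekday the parser would have to pick one of seven candidates.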

Contributor Author

Bidek56 commented Feb 11, 2023

Hmmm, I wonder why "%Y%W%a" works for you in Rust but it does not work in Python.


FObersteiner commented Feb 12, 2023

@Bidek56 sorry, that was a bit unclear. %W doesn't work for me either. Here's a playground example.

In Python, both %U and %W work, but they give the same result if you set the weekday to 1?!

from datetime import datetime

print(datetime.strptime("2019-01-0", "%Y-%U-%w"))
# 2019-01-06 00:00:00
print(datetime.strptime("2019-01-0", "%Y-%W-%w"))
# 2019-01-13 00:00:00

print(datetime.strptime("2019-01-1", "%Y-%U-%w"))
# 2019-01-07 00:00:00
print(datetime.strptime("2019-01-1", "%Y-%W-%w"))
# 2019-01-07 00:00:00

It seems something strange is going on here; chrono errors out, and Python gives results that are unexpected to me. ISO week date parsing directives ("%G-%V-%u") give consistent results, though. But I still don't fully understand what's going on here, so I posted a question on Stack Overflow.
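
One reading of the Python results, based on how the Python documentation defines these directives: `%U` week 1 starts on the first Sunday of the year (2019-01-06) and `%W` week 1 starts on the first Monday (2019-01-07), so the two weeks overlap from Monday through Saturday and differ only on Sunday. A minimal sketch that tabulates this for every weekday (`parse` is a hypothetical helper name):

```python
from datetime import date, datetime


def parse(text: str, fmt: str) -> date:
    """Thin wrapper around datetime.strptime that returns only the date."""
    return datetime.strptime(text, fmt).date()


# %w weekday digits: 0 = Sunday, 1 = Monday, ..., 6 = Saturday
for weekday in range(7):
    text = f"2019-01-{weekday}"
    print(weekday, parse(text, "%Y-%U-%w"), parse(text, "%Y-%W-%w"))

# ISO directives: ISO year, ISO week, ISO weekday (1 = Monday)
print(parse("2019-01-1", "%G-%V-%u"))
```

Only the Sunday row shows different dates in the two columns, which matches the outputs quoted above.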

Contributor Author

Bidek56 commented Feb 12, 2023

 let date = NaiveDate::parse_from_str("2019-01-0", "%Y-%W-%w")?;

Returns: Error: ParseError(Impossible)

But

let date = NaiveDate::parse_from_str("2020-01-0", "%Y-%W-%w")?;

Returns: 2020-01-12
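
For comparison, the date that a Monday-based week number plus a `%w` weekday denotes can be worked out by hand. The sketch below assumes Python's definition of `%W` (week 1 starts on the first Monday of the year) and says nothing about chrono's internals; `date_from_monday_week` is a hypothetical helper and does not validate week 0 or out-of-range weeks.

```python
from datetime import date, datetime, timedelta


def date_from_monday_week(year: int, week: int, weekday_w: int) -> date:
    """Resolve (year, %W week, %w weekday) to a calendar date."""
    jan1 = date(year, 1, 1)
    # date.weekday(): Monday == 0, so this is the distance to the first Monday.
    first_monday = jan1 + timedelta(days=(7 - jan1.weekday()) % 7)
    # %w uses 0 = Sunday; in a Monday-based week Sunday is the last day.
    offset = (weekday_w - 1) % 7
    return first_monday + timedelta(weeks=week - 1, days=offset)


for year in (2019, 2020):
    by_hand = date_from_monday_week(year, 1, 0)
    by_strptime = datetime.strptime(f"{year}-01-0", "%Y-%W-%w").date()
    print(year, by_hand, by_strptime)
```

Under this definition both inputs are valid dates: 2019-01-13 and 2020-01-12.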


FObersteiner commented Feb 13, 2023

 let date = NaiveDate::parse_from_str("2019-01-0", "%Y-%W-%w")?;

Returns: Error: ParseError(Impossible)

But

let date = NaiveDate::parse_from_str("2020-01-0", "%Y-%W-%w")?;

Returns: 2020-01-12

Exactly. This seems somehow related to the year 2019. But I think this Polars issue (#6796) is the wrong place to discuss it. We should open an issue at chrono or ask a question on Stack Overflow.

Contributor Author

Bidek56 commented Feb 13, 2023

@FObersteiner I will close this ticket and submit a Chrono issue instead. Thanks for your help.

Bidek56 closed this as completed Feb 13, 2023