Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISO 8601 parsers #1143

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

ISO 8601 parsers #1143

wants to merge 8 commits into from

Conversation

pitdicker
Copy link
Collaborator

@pitdicker pitdicker commented Jun 10, 2023

My main motivation for adding an ISO 8601 parser is that chrono is currently unable to parse is own Debug output if the year is outside the 0..=9999 range supported by RFC 3339. Edit: I forgot the FromString implementation of DateTime is very forgiving and accepts this.

A second motivation is that this crate frequently mentions ISO 8601. Then it is strange it does not even have a parser for that format 😆.

ISO 8601 date/time strings have a pretty complex format. I see it as a combination of 12 different formats for the date, 12 formats for the time, and 4 for the offset.

I have made the parser public as a parse_from_iso8601 method on NaiveDate, NaiveTime, NaiveDateTime and DateTime::<FixedOffset>.

Sorry that this is such a large PR.
The only thing I can say is that about half of it are tests and documentation.

Fixes #587.

@pitdicker
Copy link
Collaborator Author

I wondered if I could use floats as they may not be supported by all targets, and the CI gives the answer: no.
I'll figure something out with just integers.

parse("1985102T235030,5+01"),
Ok(datetime(1985, 4, 12, 23, 50, 30, 500_000_000, 3600))
);
assert_eq!(parse("1985102T2350,5+01"), Ok(datetime(1985, 4, 12, 23, 50, 30, 0, 3600)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test test_parse_from_iso8601 should have asserts for errors on empty strings, unicode oddities, and a large variety of near-misses.

src/format/parse.rs Outdated Show resolved Hide resolved
src/format/parse.rs Outdated Show resolved Hide resolved
src/format/parse.rs Outdated Show resolved Hide resolved
src/format/parse.rs Outdated Show resolved Hide resolved
src/format/parse.rs Outdated Show resolved Hide resolved
src/naive/date.rs Show resolved Hide resolved
src/naive/datetime/mod.rs Show resolved Hide resolved
assert_eq!(parse("1985-W15-5T10:15"), Ok(datetime(1985, 4, 12, 10, 15, 0, 0)));
// Test 24:00:00 wraps to the next day
assert_eq!(parse("2023-06-09T24:00:00"), Ok(datetime(2023, 6, 10, 0, 0, 0, 0)));
}
Copy link
Contributor

@jtmoon79 jtmoon79 Jun 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs error variations

assert_eq!(parse("1985-04-12T10:15:3"), Err(INVALID)); // incomplete seconds
assert_eq!(parse("1985-04-12T10:1:30"), Err(INVALID)); // incomplete minutes
assert_eq!(parse("1985-04-12T0:15:30"), Err(INVALID)); // incomplete hours
assert_eq!(parse("1985-04-1T10:15:30"), Err(INVALID)); // incomplete day
assert_eq!(parse("1985-4-12T10:15:30"), Err(INVALID)); // incomplete month
assert_eq!(parse("1985-0-12T10:15:30"), Err(INVALID)); // bad month
assert_eq!(parse("99999-04-12T10:15:30"), Err(INVALID)); // bad year
assert_eq!(parse("1985-04-12_10:15:30"), Err(INVALID)); // bad separator
assert_eq!(parse("1985-04-12💀10:15:30"), Err(INVALID)); // bad separator
assert_eq!(parse(""), Err(INVALID)); // empty string
assert_eq!(parse("1"), Err(INVALID)); // very short string
assert_eq!(parse("💀"), Err(INVALID)); // unexpected data

I'm not sure which of these are valid INVALID asserts, but you get the idea.

src/naive/time/mod.rs Show resolved Hide resolved
Copy link
Contributor

@jtmoon79 jtmoon79 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstrings should describe all returned items. Many users have never used or looked at chrono so they have no idea what any of this means nor can they quickly figure it out. They shouldn't have to hunt + peck around the documentation to understand the implied meaning of remainder.

src/format/parse.rs Outdated Show resolved Hide resolved
src/format/parse.rs Outdated Show resolved Hide resolved
src/format/parse.rs Outdated Show resolved Hide resolved
src/format/parse.rs Outdated Show resolved Hide resolved
@pitdicker
Copy link
Collaborator Author

@jtmoon79 Thank you for the review. Sorry I am a bit slow slow with addressing your comments. I'll probably fix it up this week.

@codecov
Copy link

codecov bot commented Sep 15, 2023

Codecov Report

Merging #1143 (afa45c6) into 0.4.x (e730c6a) will increase coverage by 0.18%.
The diff coverage is 97.29%.

@@            Coverage Diff             @@
##            0.4.x    #1143      +/-   ##
==========================================
+ Coverage   91.35%   91.53%   +0.18%     
==========================================
  Files          38       39       +1     
  Lines       17034    17550     +516     
==========================================
+ Hits        15562    16065     +503     
- Misses       1472     1485      +13     
Files Changed Coverage Δ
src/format/mod.rs 85.04% <ø> (ø)
src/format/parse_iso8601.rs 95.67% <95.67%> (ø)
src/datetime/mod.rs 86.96% <100.00%> (+0.87%) ⬆️
src/datetime/tests.rs 96.59% <100.00%> (+0.09%) ⬆️
src/format/parse.rs 97.30% <100.00%> (+0.06%) ⬆️
src/format/scan.rs 99.26% <100.00%> (+0.01%) ⬆️
src/naive/date.rs 96.35% <100.00%> (+0.05%) ⬆️
src/naive/datetime/mod.rs 97.46% <100.00%> (+0.07%) ⬆️
src/naive/datetime/tests.rs 98.71% <100.00%> (+0.08%) ⬆️
src/naive/time/mod.rs 96.09% <100.00%> (+0.13%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants