Revamped parsing/formatting #236
Comments
I'm a bit mixed on this. I'm going to write out what I think you're describing against some data I have. Please correct me if I'm wrong in understanding your idea here.
OffsetDateTime::parse(
"{year}-{month}-{day}T{hour}_{minute}_{second}.{nanoseconds}")?;
Have you thought of "simply" using the Unicode date field symbol table (https://www.unicode.org/reports/tr35/tr35-dates.html#Date_Field_Symbol_Table)?
OffsetDateTime::parse(
"yyyy-MM-dd'T'HH_mm_ss.SSSSSSSSS")?;
Tracking back to my issue, here is some more documentation on the |
There's already support for RFC3339, but that's not the format you're seeking. The variable number of decimal points was special-cased for that. Unicode isn't ISO, but I think it's far clearer to specify things using words rather than using a seemingly arbitrary letter in many cases. |
I've just added this issue to TWIR's call for participation. As such, here's an update on the basics:
Lazy formatting is done, and was trivial to complete. The rewrite won't impact the ability to do this.
Having specifier-specific modifiers is still planned, though a list of what's desired would be great. Right now, all I've got for certain is the various padding options (none, space, zero) for a number of specifiers as well as whether or not the UTC offset (currently
I began rewriting the parser for formatting strings yesterday. It uses a superior design, as it is a thin wrapper around a
With regard to the format itself, I'm moving forward with bracketed specifiers. To output a literal
Ignoring the necessary modifiers for padding, the equivalent of
|
While the compiler allows it, lazy formatting turns out to be unidiomatic, as it requires fallibility originating from the time crate.
As such, lazy formatting will be removed in 0.3. I'm investigating ways to handle formatting at all; right now I'm leaning towards an implementation of
Note that it may still be possible to have infallible formatting (and as such would implement |
Revamped formatting has been fully implemented. There is a
The format description can be constructed manually or it can be parsed from a textual representation of it; the latter requires an allocator. |
The full syntax is documented here. Both formatting and parsing are fully implemented and tested. |
While not ideal, I think it would be best to revamp the parsing and formatting of the various structs in the time crate.
According to tokei, the src/format directory has ~1100 lines of code. There is also a small amount of code in the files for each struct. While some of this code will be able to be reused, most of it will likely be replaced.
Major changes that I think would be sensible to make are:
Eliminate single-letter specifiers
It's just plain confusing. Some specifiers (like %Y) are easily remembered, but most are not. Can you tell me the difference between %w and %W without looking at the reference? I certainly can't.
More modifiers, even if specifier-specific
Let's allow for tons of options! Allow colons to be present (or not) for a UTC offset! There are certainly other things that could be allowed in the future.
Ability to lazily format
This is by far the easiest one, as it's just an API addition. It would probably be best to just return impl Display, so as to avoid any doc-hidden structs or other API guarantees.
The combination of the first two leads to an inherent problem: when does the parser (of formatting strings) know when the specifier is over? Luckily, there's a solution that is both simple and keeps the parser simple: use a bracketed/parenthesized grouping delimiter. Due to the necessarily longer names of specifiers, the modifiers can be separated (both visually and logically) by a single space.
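To illustrate why a closing delimiter keeps the parser simple, here is a minimal sketch of a tokenizer for bracketed specifiers. The Token type, the tokenize function, and the example modifier spelling (padding:zero) are hypothetical names chosen for illustration, not the crate's API. Escaping a literal bracket is deliberately left out.

```rust
/// One piece of a format string (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum Token<'a> {
    /// Text copied to the output verbatim.
    Literal(&'a str),
    /// A bracketed specifier: its name plus any space-separated modifiers.
    Specifier { name: &'a str, modifiers: Vec<&'a str> },
}

/// Splits a format string into literals and bracketed specifiers.
/// Returns an error message if a '[' is never closed or a specifier is empty.
fn tokenize(input: &str) -> Result<Vec<Token<'_>>, String> {
    let mut tokens = Vec::new();
    let mut rest = input;
    while !rest.is_empty() {
        match rest.find('[') {
            // No further specifiers: everything left is a literal.
            None => {
                tokens.push(Token::Literal(rest));
                break;
            }
            Some(open) => {
                // Text before the bracket is a literal.
                if open > 0 {
                    tokens.push(Token::Literal(&rest[..open]));
                }
                let after_open = &rest[open + 1..];
                // The closing bracket tells the parser where the specifier ends.
                let close = after_open
                    .find(']')
                    .ok_or_else(|| format!("unclosed '[' in {:?}", input))?;
                // First word is the specifier name, the rest are modifiers.
                let mut words = after_open[..close].split_whitespace();
                let name = words
                    .next()
                    .ok_or_else(|| "empty specifier".to_string())?;
                tokens.push(Token::Specifier { name, modifiers: words.collect() });
                rest = &after_open[close + 1..];
            }
        }
    }
    Ok(tokens)
}
```

With this sketch, tokenize("[year]-[month padding:zero]") yields a specifier named year, the literal "-", and a specifier named month carrying one modifier. No lookahead or specifier table is needed to find where a specifier ends.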
Another change that could prove useful for performance is a public API for the various specifier-modifier combinations. A macro could then be provided that would parse the formatting string at compile-time, such that the formatting string parser could be dropped by rustc as dead code.
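As a rough sketch of what such a public API might look like, the following builds a format description by hand from typed components, so no formatting string has to be parsed at runtime. The names Padding, Component, Date, and format_date are assumptions made for this example, not the crate's actual types. A compile-time macro could in principle expand to an array like the one in the usage note.

```rust
/// The padding options mentioned in this thread: none, space, zero.
#[derive(Clone, Copy)]
enum Padding {
    None,
    Space,
    Zero,
}

/// A specifier-modifier combination (hypothetical, illustration only).
enum Component<'a> {
    Literal(&'a str),
    Year,
    Month(Padding),
    Day(Padding),
}

/// A stand-in date type for the example.
struct Date {
    year: i32,
    month: u8,
    day: u8,
}

/// Appends `value` to `out`, padded to a width of two as requested.
fn write_padded(out: &mut String, value: u8, padding: Padding) {
    let text = match padding {
        Padding::None => format!("{}", value),
        Padding::Space => format!("{:>2}", value),
        Padding::Zero => format!("{:02}", value),
    };
    out.push_str(&text);
}

/// Formats `date` by walking the description; no string parsing is involved.
fn format_date(date: &Date, description: &[Component<'_>]) -> String {
    let mut out = String::new();
    for component in description {
        match component {
            Component::Literal(text) => out.push_str(text),
            Component::Year => out.push_str(&date.year.to_string()),
            Component::Month(padding) => write_padded(&mut out, date.month, *padding),
            Component::Day(padding) => write_padded(&mut out, date.day, *padding),
        }
    }
    out
}
```

For example, the description [Component::Year, Component::Literal("-"), Component::Month(Padding::Zero), Component::Literal("-"), Component::Day(Padding::Zero)] formats 7 March 2020 as 2020-03-07. Because the description is plain data, code that only ever uses hand-built or macro-built descriptions never references the string parser.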
This is certainly quite a bit to put out all at once. If you have any thoughts (for or against), leave a comment!
Edit: Some notes for myself as to intent.
It would be nice to be able to assume some default value for a component, such that it need not necessarily be present in the string being parsed. It'll probably be necessary to expose the raw values that were parsed, which would also allow a third-party to use those values freely.
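One possible shape for this, sketched under assumptions: the parser fills a struct of optional raw values, and a separate step applies defaults for whatever was absent. The Parsed struct, parse_partial, and into_date are hypothetical names for illustration, and the input grammar (YYYY-MM or YYYY-MM-DD, no range validation) is deliberately tiny.

```rust
/// Raw values found in the input; `None` means the component was absent.
/// (Hypothetical struct, for illustration only.)
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct Parsed {
    year: Option<i32>,
    month: Option<u8>,
    day: Option<u8>,
}

impl Parsed {
    /// Combines the raw values into (year, month, day), substituting
    /// `default_day` when no day was parsed. Year and month are required.
    fn into_date(self, default_day: u8) -> Option<(i32, u8, u8)> {
        Some((self.year?, self.month?, self.day.unwrap_or(default_day)))
    }
}

/// Parses "YYYY-MM" or "YYYY-MM-DD" into raw values without range checks.
fn parse_partial(input: &str) -> Option<Parsed> {
    let mut parts = input.split('-');
    let year = parts.next()?.parse().ok()?;
    let month = parts.next()?.parse().ok()?;
    // The day is optional; a present but malformed day is still an error.
    let day = match parts.next() {
        Some(text) => Some(text.parse().ok()?),
        None => None,
    };
    // Reject trailing components rather than silently ignoring them.
    if parts.next().is_some() {
        return None;
    }
    Some(Parsed { year: Some(year), month: Some(month), day })
}
```

Here parse_partial("2020-03") leaves day as None, and into_date(1) then yields (2020, 3, 1). Since the fields are exposed, a third party could instead read the raw values and apply its own defaults.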