What's the purpose of `tendermint::Time`? #865

hdevalence · 2021-04-19T21:56:47Z

The tendermint crate has a Time(DateTime<Utc>) structure that wraps the chrono::DateTime<Utc> type and implements a bunch of custom methods for it.

Currently in #862, I translated the Protobuf Timestamp type directly to a DateTime<Utc>, which I think is a more ergonomic type for downstream users and would prefer to use, unless there is a reason not to use it.

I wasn't sure from reading the documentation what the purpose of the custom time type was, but I didn't see an explanation of how it differs from a DateTime<Utc>. Because Time has impl From<DateTime<Utc>> for Time, the tendermint crate's public API is still locked to chrono 0.4, and because From impls are infallible, there's no way for the Time type to enforce any additional invariants on a DateTime<Utc>. So I'm not sure that there's really a meaningful encapsulation benefit.

Another possible motivation could be for use with JSON serialization -- from looking at the commit history, it seemed like there was a bug involved in the custom serialization (#775). It's easy to add custom serialization to a newtype wrapper compared to a third party type. But if this was the motivation, an alternative that doesn't involve creating and maintaining a new type could be to define JSON serialization on the tendermint_proto structs, rather than on all of the domain types, and use the From/TryFrom impls to apply that serialization to the domain types.

The text was updated successfully, but these errors were encountered:

greg-szabo · 2021-04-22T14:17:57Z

All right, so if I understand correctly, you're advocating for getting rid of tendermint::Time in lieu of tendermint_proto::Timestamp since there's no real added value.

So far, looking at the code I found that

tendermint::Time in Go is also based on google::protobuf::Timestamp (https://github.com/tendermint/spec/blob/d46cd7f573a2c6a2399fcab2cde981330aa63f37/spec/core/data_structures.md#time)
JSON serialization has to be moved into tendermint_proto, but we already control Timestamp there so it's not a big deal to apply it.
Or maybe it is, we need to check: there is a JSON serialization already applied to tendermint_proto::google::protobuf::Timestamp for RFC3339 conversion. I have to check if it's the same thing implemented in tendermint and why was it implemented twice.

Apart from checking out the JSON serialization I didn't find any other reason why this wouldn't be feasible to do. I'm going to play around with serialization and then attempt to implement this and see what happens. We might just end up with simpler code.

greg-szabo · 2021-04-22T14:20:32Z

Just confirmed, tendermint::Time is falling back to tendermint_proto::Timestamp for serialization. No issues there.

greg-szabo · 2021-04-22T14:30:56Z

tendermint::Time has a lot of little convenience functions, like now(), duration_since(), Display, String traits, etc. These are used in all kinds of tests and also possibly in downstream projects. All of them need to be updated or alternatives made available or somehow communicated that these are not available anymore.

We could add these to tendermint_proto so it can be easily reused, but it still requires a new crate to be imported to access these functions.

Or we could keep tendermint::Time around with a deprecation warning for a release cycle to get feedback from the downstream projects. Even if we do this, I'd rewrite these convenience functions to call into tendermint_proto so downstream projects have an example on how to do it after these functions go away.

I'd be happy to hear additional ideas and opinions before I sink my teeth too much into it.

hdevalence · 2021-04-22T16:21:44Z

All right, so if I understand correctly, you're advocating for getting rid of tendermint::Time in lieu of tendermint_proto::Timestamp since there's no real added value.

No, sorry for being unclear, I'd rather use the DateTime<Utc> that tendermint::Time holds internally:

Currently in #862, I translated the Protobuf Timestamp type directly to a DateTime<Utc>, which I think is a more ergonomic type for downstream users and would prefer to use, unless there is a reason not to use it.

I'd prefer not to use the raw proto types except for cases that really need to care about proto serialization specifically.

greg-szabo · 2021-04-24T18:49:57Z

Huh, ok, that's interesting. If you want to use it directly, there's no way to add additional functionality to it, like convenient protobuf-encoding/decoding or JSON serialization/deserialization. As Tony mentioned in the Signal discussion, the newtype tendermint::Time was created exactly to add these additional functions.

I would like to entertain the idea but I can't yet imagine how it would work. We rip out tendermint::Time and anywhere it's used we replace it with DateTime<Utc>, right? We'd lose all the little functions (but possibly gain ones defined in DateTime) and when a type that is protobuf-encoded uses time, it has to have a "RawType" that converts it from/to DateTime so it can be encoded. A first-grade example is CanonicalProposal which works exactly like that. This looks doable. (Other candidates Proposal, Vote.)

Now, let's look at JSON-serialization. (I'm not going to repeat the blog entry I sent on Signal :) ). I went through the structs with a struct-analyzer and saw that most structs either use a RawType already (like CanonicalVote or block::Header) or they are not serialized/deserialized at all. This is important, because serde does implicit serialization if no RawType is defined: you add a field that is a struct with a field of Time and serde will follow the definitions and serialize case-by-case.

There is one place where this happens: the Genesis struct has a Time field and it does not have a RawType. Exchanging this to DateTime<Utc> makes us trust serde that it can deserialize an Rfc3339-compatible string into a DateTime<Utc>. Based on the custom deserializers we've already encountered, I believe we will have to do deserialization manually here. Interestingly, there is already a deserialize_time function right under the Genesis definition, which is currently unused. (I guess when Time got its serialization, this deserializer was removed from the struct attributes.)

Genesis JSON serialization is used in (at least) two places: In the RPC /genesis endpoint and when reading the genesis.json file during the first startup. Adding a custom time deserializer here sounds fine, as long as we have some kind of validation that downstream projects won't have to do the same.

So, yeah, I think I can entertain this idea. :) I'd be happy to hear others' opinions.

hdevalence · 2021-04-26T20:33:04Z

If you want to use it directly, there's no way to add additional functionality to it

I don't think this is quite true, it's possible to monkey-patch additional functionality onto third-party types using extension traits. The downside of this is that the user has to import the extension trait, but the upside is that they maintain access to all of the original API. In contrast, wrapping a struct to add functionality requires either removing access to the original API, making the user jump through a hoop to get it (conversions), or reimplementing it.

That said, in terms of the concrete functionality you mentioned:

like convenient protobuf-encoding/decoding or JSON serialization/deserialization. As Tony mentioned in the Signal discussion, the newtype tendermint::Time was created exactly to add these additional functions.

I would like to entertain the idea but I can't yet imagine how it would work. We rip out tendermint::Time and anywhere it's used we replace it with DateTime<Utc>, right? We'd lose all the little functions (but possibly gain ones defined in DateTime) and when a type that is protobuf-encoded uses time, it has to have a "RawType" that converts it from/to DateTime so it can be encoded. A first-grade example is CanonicalProposal which works exactly like that. This looks doable. (Other candidates Proposal, Vote.)

All the RawType proto types are generated from the proto definitions already though, right? So it's not that you'd need to be defining anything extra, but just defining conversions to the types that already exist.

Now, let's look at JSON-serialization. (I'm not going to repeat the blog entry I sent on Signal :) ). I went through the structs with a struct-analyzer and saw that most structs either use a RawType already (like CanonicalVote or block::Header) or they are not serialized/deserialized at all. This is important, because serde does implicit serialization if no RawType is defined: you add a field that is a struct with a field of Time and serde will follow the definitions and serialize case-by-case.

There is one place where this happens: the Genesis struct has a Time field and it does not have a RawType. Exchanging this to DateTime<Utc> makes us trust serde that it can deserialize an Rfc3339-compatible string into a DateTime<Utc>. Based on the custom deserializers we've already encountered, I believe we will have to do deserialization manually here. Interestingly, there is already a deserialize_time function right under the Genesis definition, which is currently unused. (I guess when Time got its serialization, this deserializer was removed from the struct attributes.)

Genesis JSON serialization is used in (at least) two places: In the RPC /genesis endpoint and when reading the genesis.json file during the first startup. Adding a custom time deserializer here sounds fine, as long as we have some kind of validation that downstream projects won't have to do the same.

So, yeah, I think I can entertain this idea. :) I'd be happy to hear others' opinions.

I guess my feeling here is that, for all of the types that have an associated proto, there's already a generated proto type, so the Serde annotations that define the exact shape of the JSON output would be best defined on the proto type, so that there's only one place in the code that's defining the serialization.

For structures that don't have a corresponding proto, like apparently the Genesis struct (it's never sent over ABCI?), I think it's better to define a serialization for that structure explicitly, rather than implicitly defining that structure's serialization by defining that timestamps in general should be serialized in the way that specific structure requires.

mzabaluev · 2021-11-26T19:30:26Z

In #1030, I'm steering Time to be more explicit in the value range it accepts (since we probably want RFC 3339 fmt to work with any Time value, which means the year must be in 0..=9999), hide the struct's internal representation, and generally provide a more disciplined API than the prost-generated Timestamp struct with public members.

A case can be made for Time becoming more of an operational data type, optimized for arithmetic and comparisons which are performed on this type, for example, in the light client. These operations are slow on parsed date-time, whether it's implemented by chrono or time. With changes made in #1030, it should be easy to later change the internal representation to an i128 Unix timestamp in nanoseconds (or even u128 if we don't need timestamps before 1970 and don't want one extra fallible conversion for arithmetic with core::time::Duration). This will lose some memory efficiency on 32-bit platforms, but in regard to CPU time, the extra overhead on conversion to/from human time data types will likely be negligible in parse/fmt use cases.

For serde, Time should delegate to the proto Timestamp type. The serde impls for that could use a more efficient binary struct serialization on non-human-readable serializers, in case someone decides to use one of those on our library types.

I should write these and a few other ideas up in an ADR, probably some time next week.

hdevalence · 2021-11-27T06:20:16Z

A case can be made for Time becoming more of an operational data type, optimized for arithmetic and comparisons which are performed on this type, for example, in the light client. These operations are slow on parsed date-time, whether it's implemented by chrono or time.

Is there actually a performance issue using chrono::DateTime<Utc>?

EDIT: looking at the chrono source, it stores the time as a pair of u32s, together with an extra i32 encoding the date. It's hard to understand how a different representation would be significantly more efficient.

mzabaluev · 2021-11-27T08:47:58Z

@hdevalence For adding or subtracting a Duration on a human time struct, the arithmetic is either performed member-wise, which is not pretty, or the struct is converted to an integer timestamp, then after adding/subtracting, the result is broken down back into human time. Comparisons are also performed member-to-member. With a linear timestamp value, it's simple integer checked arithmetic.

Memory efficiency is another matter. The parsed time struct can fit into 12 bytes with the alignment of 4. A timestamp can be broken down to an (i64, u32) pair, but on 64-bit platforms the structure size will be 16 bytes due to the alignment.

So the choice of internal representation depends on the optimization priorities for users of the Time struct. If it's going to be used in bulk workloads doing e.g. validation logic, I'd suggest going with the arithmetic-friendly timestamp representation.

mzabaluev · 2021-11-29T08:45:19Z

Filed an ADR: #1035

hdevalence added the enhancement New feature or request label Apr 19, 2021

greg-szabo self-assigned this Apr 22, 2021

greg-szabo removed their assignment Nov 15, 2021

tony-iqlusion mentioned this issue Nov 29, 2021

Serialization of Tendermint RFC3339 Timestamps tendermint/tendermint#6233

Closed

hdevalence mentioned this issue Nov 29, 2021

Replace chrono with time 0.3 #1030

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

What's the purpose of `tendermint::Time`? #865

What's the purpose of `tendermint::Time`? #865

hdevalence commented Apr 19, 2021

greg-szabo commented Apr 22, 2021

greg-szabo commented Apr 22, 2021

greg-szabo commented Apr 22, 2021 •

edited

Loading

hdevalence commented Apr 22, 2021

greg-szabo commented Apr 24, 2021

hdevalence commented Apr 26, 2021

mzabaluev commented Nov 26, 2021 •

edited

Loading

hdevalence commented Nov 27, 2021 •

edited

Loading

mzabaluev commented Nov 27, 2021 •

edited

Loading

mzabaluev commented Nov 29, 2021

What's the purpose of tendermint::Time? #865

What's the purpose of tendermint::Time? #865

Comments

hdevalence commented Apr 19, 2021

greg-szabo commented Apr 22, 2021

greg-szabo commented Apr 22, 2021

greg-szabo commented Apr 22, 2021 • edited Loading

hdevalence commented Apr 22, 2021

greg-szabo commented Apr 24, 2021

hdevalence commented Apr 26, 2021

mzabaluev commented Nov 26, 2021 • edited Loading

hdevalence commented Nov 27, 2021 • edited Loading

mzabaluev commented Nov 27, 2021 • edited Loading

mzabaluev commented Nov 29, 2021

What's the purpose of `tendermint::Time`? #865

What's the purpose of `tendermint::Time`? #865

greg-szabo commented Apr 22, 2021 •

edited

Loading

mzabaluev commented Nov 26, 2021 •

edited

Loading

hdevalence commented Nov 27, 2021 •

edited

Loading

mzabaluev commented Nov 27, 2021 •

edited

Loading