Support for well-known datatypes like date/time #1098
YAML has the concept of scalar nodes, which are defined as "an opaque datum that can be presented as a series of zero or more Unicode characters". Scalar nodes (like other kinds of node) can have a tag. "Scalar tags must also provide a mechanism for converting formatted content to a canonical form for supporting equality testing."
The concept for the language feature is to introduce a new basic type, called string-formatted data, or sdata, for representing data that is conventionally represented in a specialized string format. The sdata basic type is readonly, and is included in anydata but not json. Like the simple data types and string, sdata values do not have storage identity.
The sdata basic type is divided into named subtypes, one for each string format. The semantics of each named subtype is defined in terms of an underlying value type, which is a subtype of anydata, together with conversion operations between that underlying type and its string format. For example, a timestamp data type might be defined with a value type of
A program constructs a literal value of type sdata by using the subtype name followed by the string representation in backticks. The language specification defines
Definitions can be provided either by
There is no mechanism for definitions to be provided from outside the platform. The platform will only define subtypes that are widely interoperable. This preserves the program-independent aspect of anydata.
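The shape of the proposal above can be modelled outside Ballerina. The following is a minimal Python sketch, not the proposed Ballerina design: the names `SDataDefinition`, `SData`, `literal` and `to_literal` are hypothetical, and the choice of integer seconds since the epoch as the underlying value of a timestamp is an assumption made only for illustration. It shows a platform-owned registry of subtypes, each pairing an underlying value with conversions to and from its string format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class SDataDefinition:
    """Hypothetical platform-supplied definition of one sdata subtype."""
    tag: str
    from_string: Callable[[str], Any]  # string format -> underlying value
    to_string: Callable[[Any], str]    # underlying value -> canonical string


@dataclass(frozen=True)
class SData:
    """Immutable value: a subtype tag plus its underlying value."""
    tag: str
    value: Any


def _parse_timestamp(text: str) -> int:
    # Assumption: the underlying value is whole seconds since the Unix epoch.
    # The input must carry a UTC offset, e.g. "+01:00".
    return int(datetime.fromisoformat(text).timestamp())


def _format_timestamp(seconds: int) -> str:
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()


# Only the platform populates this table; user programs cannot add entries.
_DEFINITIONS: Dict[str, SDataDefinition] = {
    "timestamp": SDataDefinition("timestamp", _parse_timestamp, _format_timestamp),
}


def literal(tag: str, text: str) -> SData:
    """Model of writing the subtype name followed by a backtick string."""
    definition = _DEFINITIONS[tag]  # raises KeyError for an unknown subtype
    return SData(tag, definition.from_string(text))


def to_literal(data: SData) -> str:
    """Render the value back as subtype name plus canonical string."""
    canonical = _DEFINITIONS[data.tag].to_string(data.value)
    return f"{data.tag}`{canonical}`"
```

Because equality is defined on the underlying value, two literals written with different offsets but denoting the same instant compare equal in this sketch, and both render to the same canonical string.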
The definition provides the following information:
sdata values support the following operations:
The langlib module for the new sdata basic type is lang.sdata, which provides the following:
User programs would not typically need to import the lang.sdata module (just as they do not need to import the lang.value module). When method call syntax is used on an sdata value, the function is looked up in the following modules, in order (this is similar to what happens for existing basic types):
The standard library provides a
The module for a standard library defined type with tag t is
It then also provides type-specific functions that can be implemented in terms of these primitive functions; each of these will usually take
The mechanism that the ballerina/data and ballerina/data.t modules use to provide these definitions in
Currently we have two kinds of equality:
There are two differences between equality and exact equality:
When the unpacked representation of an sdata value contains a decimal (which it probably will for types involving time), then === for the sdata value needs to consider the precision (because values that are === should be indistinguishable) but not the storage identity. This means we need another kind of equality, which
Let's call this precise equality. Then
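The distinction drawn above, an equality that is sensitive to decimal precision but not to storage identity, can be illustrated with Python's `decimal` module. This is only an analogy under stated assumptions: `precise_equal` is a hypothetical helper, Python's `is` stands in for storage identity, and `Decimal.as_tuple()` stands in for "value including precision".

```python
from decimal import Decimal


def precise_equal(a, b) -> bool:
    """Equal in value and in decimal precision, regardless of identity."""
    if isinstance(a, Decimal) and isinstance(b, Decimal):
        # as_tuple() exposes sign, digits and exponent, so 1.0 and 1.00 differ.
        return a.as_tuple() == b.as_tuple()
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        # Compare structured values member by member.
        return len(a) == len(b) and all(
            precise_equal(x, y) for x, y in zip(a, b)
        )
    return type(a) is type(b) and a == b


# Same numeric value, different precision.
short = Decimal("1.0")
long = Decimal("1.00")

# Two separately constructed structures with the same contents.
first = [2020, Decimal("1.50")]
second = [2020, Decimal("1.50")]
```

Here `short == long` holds under ordinary equality while `precise_equal(short, long)` does not, and `precise_equal(first, second)` holds even though `first` and `second` are distinct objects.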
Most of this has been done as part of adding #1132. We are calling these things tagged data types. Compared to what was described earlier, we haven't yet needed to expose the value data structure.
Ballerina currently does not handle data types like date/time well. These data types have a conventional string syntax, but they also have higher-level semantics that can be represented by a Ballerina value that is not just a string. For example, date-time has a conventional string syntax (e.g. RFC 3339/ISO 8601), but its semantics would be better represented by numbers (e.g. the number of units of time from some epoch for a timestamp, or a triple of year/month/day integers for a date). In Ballerina currently, we have to choose between two alternatives, neither of which is completely satisfactory:
For data types built into Ballerina (e.g. decimal and xml) we do not have this problem: we have the right semantics and we have the string syntax. It should be possible to do this for other data types too. For example, it should be possible to have a Timestamp data type that:
There are quite a number of other common data types that are like this.
None of these are specific to a particular program: they all fit into the concept of anydata.
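To make the gap between string syntax and semantics concrete, here is a small Python sketch (not Ballerina, and with hypothetical helper names) comparing two RFC 3339 style strings that denote the same instant. Compared as strings they differ; compared by what they mean they are equal.

```python
from datetime import datetime


def same_as_strings(a: str, b: str) -> bool:
    """Compare the conventional string syntax only."""
    return a == b


def same_as_instants(a: str, b: str) -> bool:
    """Compare the points in time that the strings denote."""
    return datetime.fromisoformat(a) == datetime.fromisoformat(b)


# Midnight UTC written with two different offsets.
utc_text = "2020-01-01T00:00:00+00:00"
cet_text = "2020-01-01T01:00:00+01:00"
```

With these inputs `same_as_strings(utc_text, cet_text)` is false while `same_as_instants(utc_text, cet_text)` is true, which is the kind of behaviour a plain string representation cannot provide.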
JSON schema handles this by allowing an assertion that a string has a specific format. (In theory, JSON schema allows this for values other than strings, but all the formats it defines are for strings.) Protocol Buffers have the concept of a well-known type.
One solution to this would be to have the language include a separate basic type for each of these. But that wouldn't be a great solution: it should be possible to evolve these data types independently of the language specification. Also it should not be necessary to include these in the language. We can define each of the data types in terms of concepts we already have
The goal then is to devise a language feature that we can use to add a data type that works very similarly to a built-in data type, without needing to add something to the language specification for each such data type. If one of these data types was a basic type, then it would
fromBalString and toBalString functions
So, for example, if data:Timestamp referred to a timestamp type, then