-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
YAML fixes #4838
YAML fixes #4838
Conversation
src/yaml/from_yaml.cr
Outdated
nil | ||
else | ||
raise YAML::ParseException.new("Expected nil, not #{value}", 0, 0) | ||
end | ||
end | ||
|
||
def Bool.new(pull : YAML::PullParser) | ||
pull.read_scalar == "true" | ||
case value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need
case value = pull.read_scalar
here
You must add examples to yaml_spec. |
src/yaml/from_yaml.cr
Outdated
value = pull.read_scalar | ||
if value.empty? | ||
case value = pull.read_scalar | ||
when "", "NULL", "Null", "null" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be necessary because libyaml recognizes null values and it looks wrong, because it would also recognize a quoted string "null"
as nil value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should, but it doesn't I will have to look more in depth to libyaml again tonight. But I was running into issues where when I inspect the return value of read_scalar
I get "null"
.
The parsers for Bool and Nil are wrong because they don't check whether it's a plain value or an escaped string: require "yaml"
class Foo
YAML.mapping(bool: Bool, null: Nil)
end
pp Foo.from_yaml(%(bool: true\nnil: null)) # => #<Foo @bool=true, @null=nil>
pp Foo.from_yaml(%(bool: "true"\nnull: "")) # => #<Foo @bool=true, @null=nil> YAML's boolean type only recognizes |
@akzhan the details of the yaml spec are cited in the commit messages. |
@straight-shoota ruby implements the language independent types.
|
Oh, and pp YAML.parse(%(bool: "true"\nnull: "")) # => {"bool" => "true", "null" => ""}
pp YAML.parse(%(bool: true\nnil: null)) # => {"bool" => "true", "nil" => "null"} |
There should be specs for all these special cases. |
@straight-shoota The following is wrong require "yaml"
class FooBar
YAML.mapping({
foo: Nil,
bar: Bool
})
end
yaml = FooBar.from_yaml <<-yaml
foo: null
bar: true
yaml
puts yaml
|
First and foremost, the current parser relies on libyaml which is a YAML 1.1 parser, not a YAML 1.2 parser. Trying to parse YAML 1.2 on top of a YAML 1.1 may prove problematic, if not impossible. Now, you can't assume a scalar value that looks like It seems libyaml does report the scalar style, but this isn't exposed by the class YAML::PullParser
def style
@event.data.scalar.style
end
end
style, str = parser.style, parser.read_scalar
if style.plain?
case str
when "true", "True", "TRUE"
true
when "false", "False", "FALSE"
false
when "null", "Null", "NULL", "~", ""
nil
when ".inf", ".Inf", ".INF", "+.inf", "+.Inf", "+.INF"
Float64::INFINITY
when "-.inf", "-.Inf", "-.INF"
-Float64::INFINITY
when ".nan", ".NaN", ".NAN"
Float64::NAN
else
str.to_i?(whitespace: false, underscore: false) ||
str.to_f?(whitespace: false, underscore: false) ||
str
end
else
str
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per my previous comment: the scalar style must be plain for any transformation to actually happen, otherwise it was a string.
Furthermore, yes
, on
, no
and off
do not represent boolean values as per the YAML 1.2 specification, and are thus invalid.
src/yaml/from_yaml.cr
Outdated
nil | ||
else | ||
raise YAML::ParseException.new("Expected nil, not #{value}", 0, 0) | ||
end | ||
end | ||
|
||
def Bool.new(pull : YAML::PullParser) | ||
pull.read_scalar == "true" | ||
case value = pull.read_scalar | ||
when "true", "True", "TRUE", "on", "On", "ON", "y", "Y", "yes", "Yes", "YES" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10.3.2 states that only "true", "True" and "TRUE" represent the boolean true value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These values are from the Language-Independent Types for YAML. see my comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@straight-shoota correct, @ysbaddaden also the 1.2 spec is not in libyaml. libyaml is YAML 1.1
src/yaml/from_yaml.cr
Outdated
case value = pull.read_scalar | ||
when "true", "True", "TRUE", "on", "On", "ON", "y", "Y", "yes", "Yes", "YES" | ||
true | ||
when "false", "False", "FALSE", "off", "Off", "OFF", "n", "N", "no", "No", "NO" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10.3.2 states that only "false", "False" and "FALSE" represent the boolean false value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These values are from the Language-Independent Types for YAML. see my comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@straight-shoota correct, @ysbaddaden also the 1.2 spec is not in libyaml. libyaml is YAML 1.1
@ysbaddaden made a ton of changes doing just that, I can remove the YAML 1.1 boolean types, I also need to make the inclusions for the other types you mentioned. What do you think of the changes thus far? |
@ysbaddaden I am actually going to push back on the YAML 1.2 compliance. libyaml is only 1.1 compliant, and unless we roll our own parser in the future, I think we need to stay inline with the libyaml dependency. |
bee14b7
to
60135da
Compare
Any other changes I need to make here? |
@jwaldrip I'm just the stranger but Travis reported:
|
src/yaml.cr
Outdated
when ".nan", ".NaN", ".NAN" | ||
"NaN" | ||
else | ||
if value.to_i64? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure: Are the number formats accepted by to_i64
and to_f64
identical to the YAML number format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@straight-shoota For the basic types yes, it does not accept the base variants. I was looking into adding that, but would make for a much larger refactor. Ruby doesn't even implement them correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about both false positives and false negatives. For starters, whitespace
argument should be false
(this particularly shouldn't really matter, but whitespaces are not allowed in YAML numbers) and prefix
should be true
as well as underscore
and strict
(which is true
by default).
YAML int type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you must specify to_i?(whitespace: false, underscore: false)
and keep strict
and prefix
enabled, otherwise integers and floats with leading/suffix spaces or underscores that are valid Crystal numbers would get parsed, even though they're invalid YAML numbers.
src/yaml.cr
Outdated
@@ -147,4 +148,28 @@ module YAML | |||
def self.dump(object, io : IO) | |||
object.to_yaml(io) | |||
end | |||
|
|||
# Determines if the scaler is reserved | |||
def self.reserved_scalar(value : String) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method shouldn't return strings but symbols or enum values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I like this method much. For example it tries to linearize the scalar value (using a non-standard naming: infinity
, NaN
), but sometimes returns a detected tag (int
, float
) which means the integer/float number will have to be parsed again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ysbaddaden I am not sure I understand. infinity
and NaN
are represented with .nan
and .inf
how do you mean int/float will have to be parsed again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think he's referring to the last branch, where you're calling value.to_i64?
and value.to_f64?
where you're parsing a number just to throw it away and (maybe) parse it again. This is bad.
In most cases, this method is only used to check if the plain scalar has a specific value. This could be implemented directly. It'd also a simplification: In Nil.new
you only need to know if it is a representation of nil
, there is no need to test if it might be true
or a number at this point. String.to_yaml
is the only place where the current behaviour - a full check over all possible plain values - is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly. I would move each simplified case into each Class#from_yaml
, then have String#to_yaml
implement a complete check for reserved scalar values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ysbaddaden something like this. Although I havent tested it yet, it would fail because not all subclasses support the required arguments for new.
def Class.new(pull : YAML::PullParser)
return pull.read_scalar if pull.style != LibYAML::ScalarStyle::PLAIN
location = pull.location
case value = pull.read_plain_scalar
when /^(y|Y|yes|Yes|YES|true|True|TRUE|on|On|ON)$/
true
when /^(n|N|no|No|NO|false|False|FALSE|off|Off|OFF)$/
false
when /^[-+]?0b[0-1_]$/ # base2 int
new(value, base: 2, underscore: true)
when /^[-+]?0[0-7_]+$/ # base8 int
new(value, base: 8, underscore: true)
when /^[-+]?([0-9][0-9_]*)?\.[0-9.]*([eE][-+][0-9]+)?$/ # Base 10 int/float
new(value, underscore: true)
when /^[-+]?0x[0-9a-fA-F_]+$/ # base16 int
new(value, base: 16, underscore: true)
when /^[-+]?[0-9][0-9_]*(:[0-5]?[0-9])+\.[0-9_]*$/ # Base 60 int/float
deg, min, sec = value.split(":")
.map_with_index { |v| new(v, underscore: true, whitepsace: false) / 60 ** i }
deg >= 0 ? deg + (min + sec) : deg - (min + sec)
when /^[-+]?\.(inf|Inf|INF)$/ # (infinity)
new(value.sub(/^([-+])?\.(inf|Inf|INF)$/, "\\1Infinity"))
when /^\.(nan|NaN|NAN)$/ # (not a number)
new("NaN")
when /^()~|null|Null|NULL)$/
nil
else
raise YAML::ParseException.new("Expected #{self.name}, not #{value}", *location)
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this?? You should separate the parsing - not make it even worse by using unnecessary regular expressions everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't introduce new regexes into hot paths.
src/yaml/from_yaml.cr
Outdated
location = pull.location | ||
style = pull.data.scalar.style | ||
value = pull.read_scalar | ||
return value unless style == LibYAML::ScalarStyle::PLAIN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is badly structured. Please refactor this to express the purpose better:
if style == LibYAML::ScalarStyle::PLAIN && YAML.reserved_scalar(value)
YAML::ParseException.new("Expected string, not #{value}", *location)
end
return value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plain style merely says: plain value without formatting, no quotes, no multiline, ... Even when it looks like a value, it can perfectly be a string. A good example is Shards' SPEC where version: 1.0
is valid and 1.0
will be a string!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ysbaddaden incorrect, 1.0 is reserved and is not a string, it can only be a float. I can give you numerous examples from other languages that have properly implemented yaml. Even ruby treats this accordingly.
$ ruby -r yaml -e 'puts YAML.load("1.0").class'
Float
$ ruby -r yaml -e 'puts YAML.load(%("1.0")).class'
String
src/yaml/from_yaml.cr
Outdated
Time::Format::ISO_8601_DATE_TIME.parse(pull.read_scalar) | ||
location = pull.location | ||
begin | ||
Time::Format::ISO_8601_DATE_TIME.parse(pull.read_scalar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YAML timestamp type uses a specific subset of ISO 8601 which requires a custom parser. This doesn't need to be implemented in this PR but I'd recommend putting in a TODO note.
spec/std/yaml/from_yaml_spec.cr
Outdated
require "yaml" | ||
|
||
private class ObjectStub | ||
def self.new(pp : YAML::PullParser) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend to use a more descriptive variable name, even in such a simple case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, parser
is a better name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im going to use pull
as thats how its referenced everywhere else.
spec/std/yaml/from_yaml_spec.cr
Outdated
|
||
private class ObjectStub | ||
def self.new(pp : YAML::PullParser) | ||
pp.read_scalar.should eq "hello" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's bad design to have the actual test in a stub object it should be directly in the it
spec.
spec/std/yaml/from_yaml_spec.cr
Outdated
Nil.from_yaml(nil.to_yaml).should eq nil | ||
end | ||
|
||
it "should return nil if #{value}" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add delimiters around #{value}
to make the empty string recognizable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or just use .inspect
to quote it automatically.
spec/std/yaml/from_yaml_spec.cr
Outdated
Bool.from_yaml(true.to_yaml).should eq true | ||
end | ||
|
||
values = ["true", "True", "TRUE", "on", "On", "ON", "y", "Y", "yes", "Yes", "YES"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like adding so many specs for each single plain value. It would probably be better to put all variations together in one spec to have three specs with inner loops of 11 values instead of 33 individual specs which test exactly the same behaviour. This just adds dozens of mostly meaningless specs. I think the different keywords can be safely combined into a single spec.
This applies to the other ocations as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@straight-shoota Done, see changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only downside to this is that if there is a particular failure on one of the values, but not the other values, I don't think you'll be able to differentiate that in the error message. That is you'll see something like this:
1) Bool.from_yaml true values should return true if a truthy value
Failure/Error: Bool.from_yaml(value).should eq true
expected: true
got: false
# ./spec/std/yaml/from_yaml_spec.cr:69
When I've done this in other projects, I've always been forced to append some custom message to the expectation that says "for value #{foo}", so I can tell them apart.
Additionally, if the first value fails, then it never tests the other values, essentially making it fail-fast.
@straight-shoota Thoughts in general to this? Maybe it doesn't apply in this particular case, but I'm curious what you think overall.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Fryguy I originally had it where each value would generate its own test @straight-shoota recommended I inline it into a single test, I do agree with @Fryguy though... even though each one would generate its own test if multiple tests fail, you are able to fix them completely without having to run the test each time.
spec/std/yaml/from_yaml_spec.cr
Outdated
end | ||
|
||
it "should parse a double quoted string" do | ||
String.from_yaml("\"hello\"").should eq "hello" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better use alternative string delimiters to avoid escaping: %("hello")
spec/std/yaml/from_yaml_spec.cr
Outdated
end | ||
|
||
context "integers" do | ||
it "should raise if an int is passed" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This spec should not wrap other specs.
spec/std/yaml/from_yaml_spec.cr
Outdated
end | ||
|
||
context "floats" do | ||
it "should raise if an int is passed" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dito
src/yaml.cr
Outdated
|
||
# Determines if the scaler is reserved | ||
def self.reserved_scalar(value : String) | ||
case value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would downcase the value to simplify this logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, downcasing would incorrectly recognize nULL
as a null value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, OK. I just checked the spec and you are right. Ruby seems to implement it wrong:
require "yaml"
YAML.load("[nUll, yEs, 1, 2]") # => [nil, true, 1, 2]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is going a good way, though it's becoming a little complex. My biggest issue is disallowing to parse strings that look like another type, but would still be valid (1.0
can be a plain string).
spec/std/yaml/from_yaml_spec.cr
Outdated
require "yaml" | ||
|
||
private class ObjectStub | ||
def self.new(pp : YAML::PullParser) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, parser
is a better name.
spec/std/yaml/from_yaml_spec.cr
Outdated
context "integers" do | ||
it "should raise if an int is passed" do | ||
expect_raises(YAML::ParseException) do | ||
String.from_yaml("1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually String.from_yaml("1")
should pass, since a non-quoted 1
can be a plain string.
A string doesn't depend on the scalar style (plain, quoted, ...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give me an example of another language that allows plain scalar values to be strings? I cannot seem to find one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jwaldrip Other languages are dynamic. If you say "I want this attribute to be a String" (because of YAML.mapping
) and I write 1.2 in it, then it must be considered as the string "1.2". In Ruby this is not possible because there's no such thing as a mapping, 1.2 will always be interpreted as a float.
spec/std/yaml/from_yaml_spec.cr
Outdated
context "floats" do | ||
it "should raise if an int is passed" do | ||
expect_raises(YAML::ParseException) do | ||
String.from_yaml("1.1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: 1.1
can be a plain string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give me an example of another language that allows plain scalar values to be strings? I cannot seem to find one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ysbaddaden I'm not sure this is true. Any evidence for this? I couldn't find any specific rules about that, but such an ambiguity would feel strange. Of course, you could always parse any YAML value as a string because they are encoded as strings.
src/yaml.cr
Outdated
when ".nan", ".NaN", ".NAN" | ||
"NaN" | ||
else | ||
if value.to_i64? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you must specify to_i?(whitespace: false, underscore: false)
and keep strict
and prefix
enabled, otherwise integers and floats with leading/suffix spaces or underscores that are valid Crystal numbers would get parsed, even though they're invalid YAML numbers.
src/yaml.cr
Outdated
@@ -147,4 +148,28 @@ module YAML | |||
def self.dump(object, io : IO) | |||
object.to_yaml(io) | |||
end | |||
|
|||
# Determines if the scaler is reserved | |||
def self.reserved_scalar(value : String) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I like this method much. For example it tries to linearize the scalar value (using a non-standard naming: infinity
, NaN
), but sometimes returns a detected tag (int
, float
) which means the integer/float number will have to be parsed again.
src/yaml/from_yaml.cr
Outdated
location = pull.location | ||
style = pull.data.scalar.style | ||
value = pull.read_scalar | ||
return value unless style == LibYAML::ScalarStyle::PLAIN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plain style merely says: plain value without formatting, no quotes, no multiline, ... Even when it looks like a value, it can perfectly be a string. A good example is Shards' SPEC where version: 1.0
is valid and 1.0
will be a string!
@ysbaddaden incorrect, 1.0 is reserved and is not a valid string, it can only be a float. I can give you numerous examples from other languages that have properly implemented yaml. Even ruby treats this accordingly.
|
src/yaml/from_yaml.cr
Outdated
location = pull.location | ||
begin | ||
value = pull.read_plain_scalar | ||
return NAN if YAML.reserved_scalar(value) == :nan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of three calls to the same method the result should be stored in a variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or better, as an example of removing reserved_scalar
:
case value = pull.read_plain_scalar
when ".nan", ".NaN", ".NAN" then NAN
when ".inf", ".Inf", ".INF", "+.inf", "+.Inf", "+.INF" then INFINITY
when "-.inf", "-.Inf", "-.INF" then -INFINITY
else new(value)
end
Each possible value is detailed here, not hidden elsewhere; no useless checks (for other types); no intermediary values (symbols or strings).
How can we get back on track here. I don't agree with some of @ysbaddaden's suggestions. Can we get someone like @asterite to weigh in here? The method of matching with regex comes closest to matching the YAML spec. |
The regexes are used to describe the value format, but you don't have to implement it that way. It makes particularly no sense to do this for numbers because there are dedicated parse methods in Crystal. No need for regexes. And you don't need them to match full strings either. |
The fact is that regexes are slow for matching short strings, so we can't use them. |
@jwaldrip If you'd like my opinion, the first or second commit you sent were fine: match the value in each And no regexes (like you had initially), I did a small benchmark and regexes are slower here (and consume more memory). I don't know what were the problems that led to these many changes. |
@jwaldrip Maybe you misunderstood @ysbaddaden. He suggested to move the checks into each (individual) |
src/yaml.cr
Outdated
!reserved.nil? | ||
case {value[0]?, value[1]?, value[2]?, value[3]?, value[4]?} | ||
when {.try(&.ascii_number?), .try(&.ascii_number?), .try(&.ascii_number?), .try(&.ascii_number?), '-'} | ||
!!(Time::Format::ISO_8601_DATE_TIME.parse(value) rescue false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parenthesis should not be needed here, because false
is a Bool anyway
src/yaml.cr
Outdated
{'+', '.', .try(&.ascii_number?), _, _} | ||
clean_value = value.gsub('_', "") | ||
!!clean_value.to_f64? || !!clean_value.to_i64?(prefix: true) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method should return false
if none of the cases matches.
@jwaldrip I'm pretty sure this review process has become a PITA for you. I'll probably take this over, if that's OK with you. |
I am happy as long as the changes get in. I don't mind making them... it is a learning process. What is left to get this thing approved and merged? |
@jwaldrip I now think YAML.parse should support the core schema by default. But this only changes how YAML.mapping works. So most changes should be done in YAML::PullParser, I think. In any case, don't worry, we could probably merge this like that and later refactor it. But I don't know yet. |
I've thought "just this minor change then i'll merge it" several times now but I keep thinking something more... |
@straight-shoota I'll tackle this issue myself in small chunks |
Why not let @jwaldrip complete it? As I understand it, he is happy to continue working and he has already put much effort in it. |
@straight-shoota Fine by me, though I can see an infinite number of change requests coming (like in the past history of this PR). |
I think the code is good enough now. One thing I'd like to improve is moving all code related to the YAML core schema to In any case, we can merge this and I can later make those changes. Or you can make them. Or I can build a separate PR based on your code. What do you prefer? I also still have a big doubt about # YAML file
version: 1.2 If I parse it from a dynamic language the value will be the float 1.2. But this is because parsing is always done with the core schema. In such cases one can probably do # YAML file
version: "1.2" and it's uglier to write. And this might actually break existing shards! One can use And we can always update |
@asterite I think we should merge is as is. As we discussed before, other languages just support the core schema, so that seems like a further enhancement to support more than one schema. The advance would make sense pulled out if all the |
@asterite also, just because existing shards were using a method that was not native the core YAML spec, does not mean you should make an exception to implementing it the right way. The language is still in alpha and breaking changes should be expected. Also, shards that have a full semver version should be just fine as |
There's still time until the next release so this doesn't have to be rushed. Let's not merge yet. In the meantime I'll try to implement the same but with a different code organization, and maybe support for both failsafe and core schemas. If I fail (or if my code doesn't end up being liked) we can merge this. In any case we'll thank you in the changelog :) |
@asterite Code organization I can understand. Supporting both 1.2 and 1.1 schemas I can understand. But why the need for the failsafe schema? Who would use it. Are we over optimizing by including more than one schema to begin with? Almost every other language with YAML in the standard lib supports just one schema. I think we should strive for parity before iterating further. It just sounds like "gold plating" and "scope creep" to me. |
We've all suffered trying to keep track of this PR for too long, we should merge this PR using this PR's approach, and then add support for other schemas later. Regarding the arguments in this PR regarding number parsing, my opinion is that making |
I don't want to merge this. Please don't merge it. In the past I've merged PRs just because it seems there was a rush to merge things. This led to code that's hard to understand, refactor and maintain. I'm not saying that this is the case in this PR. I just want to be cautious. There's really no rush. I'm also preparing a different PR where the full core YAML schema is implemented, is faster than this PR and is better organized in terms of type definitions and methods. We can later review both and choose which one we think is more suitable. @RX14 My approach does what you suggest. It greatly simplifies the code. |
@asterite sorry, I misread and thought your idea was simply adding support for other schemas based on this PR. If you have a better idea of how to organise the code then we should wait for that. |
No description provided.