Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

YAML fixes #4838

Closed
wants to merge 31 commits into from
Closed
Show file tree
Hide file tree
Changes from 30 commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
d55225d
Make YAML behave like the YAML 1.1 spec, fixes #3101, fixes #4384, an…
jwaldrip Aug 28, 2017
f3c89b4
make suggested changes
jwaldrip Aug 28, 2017
b607b65
update from_yaml to match the changes in to_yaml
jwaldrip Aug 29, 2017
b6a2e09
reserved values performance
jwaldrip Aug 29, 2017
f14198b
raise with a location
jwaldrip Aug 29, 2017
8346108
add todo
jwaldrip Aug 29, 2017
016dfd4
format
jwaldrip Aug 29, 2017
d6886fc
use case
jwaldrip Aug 29, 2017
d11fd05
Merge branch 'master' of https://github.com/crystal-lang/crystal into…
jwaldrip Aug 29, 2017
1bc96b5
use a char reader to give hints
jwaldrip Aug 30, 2017
ffccba4
use a case statement testing all the reserved cases
jwaldrip Aug 30, 2017
de2a1b7
add date to test case
jwaldrip Aug 30, 2017
d2476e2
use try's
jwaldrip Aug 30, 2017
922fa83
move logic into YAML::PullParser
jwaldrip Aug 31, 2017
47ee91c
use the method check
jwaldrip Sep 1, 2017
86f3f4b
dont raise
jwaldrip Sep 1, 2017
1faeb45
spelling & grammer
jwaldrip Sep 6, 2017
f5fc600
Merge branch 'yaml-spec' of github.com:jwaldrip/crystal into yaml-spec
jwaldrip Sep 6, 2017
0c605a8
Merge branch 'master' of github.com:jwaldrip/crystal into yaml-spec
jwaldrip Sep 6, 2017
cf95b76
re-inline specs
jwaldrip Sep 7, 2017
5a90882
remove checks
jwaldrip Sep 8, 2017
b3f6436
make suggested changes
jwaldrip Sep 8, 2017
3157fab
more spec changes
jwaldrip Sep 8, 2017
ceb9db9
fix quoted spec
jwaldrip Sep 8, 2017
e3df0d6
inline bool checks
jwaldrip Sep 10, 2017
50d5705
return false, remove parens
jwaldrip Sep 10, 2017
e2c8a1a
Merge branch 'master' of https://github.com/crystal-lang/crystal into…
jwaldrip Sep 10, 2017
69ee625
make `YAML.parse` support the core spec
jwaldrip Sep 16, 2017
31ca301
Merge branch 'yaml-spec' of github.com:jwaldrip/crystal into yaml-spec
jwaldrip Sep 16, 2017
6a5b820
Merge branch 'master' of https://github.com/crystal-lang/crystal into…
jwaldrip Sep 16, 2017
7a36cdc
Merge branch 'master' of https://github.com/crystal-lang/crystal into…
jwaldrip Sep 18, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions spec/std/yaml/any_spec.cr
Original file line number Diff line number Diff line change
Expand Up @@ -94,7 +94,7 @@ describe YAML::Any do
end

it "can compare with ===" do
("1" === YAML.parse("1")).should be_truthy
(1_i64 === YAML.parse("1")).should be_truthy
end

it "exposes $~ when doing Regex#===" do
Expand All @@ -106,7 +106,7 @@ describe YAML::Any do
nums = YAML.parse("[1, 2, 3]")
nums.each_with_index do |x, i|
x.should be_a(YAML::Any)
x.raw.should eq((i + 1).to_s)
x.raw.should eq(i + 1)
end
end
end
357 changes: 357 additions & 0 deletions spec/std/yaml/from_yaml_spec.cr
Original file line number Diff line number Diff line change
@@ -0,0 +1,357 @@
require "spec"
require "yaml"

private struct ObjectStub
getter called : Bool

def self.new(pull : YAML::PullParser)
pull.read_scalar
new(true)
end

def initialize(@called)
end
end

describe "Object.from_yaml" do
it "should create call .new on with a YAML::PullParser" do
ObjectStub.from_yaml("hello").called.should be_true
end
end

describe "Nil.from_yaml" do
it "should serialize and deserialize" do
Nil.from_yaml(nil.to_yaml).should eq nil
end

values = YAML::NULL_VALUES

it "should return nil if a YAML null value" do
values.each do |value|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better to write?

  {"", "NULL", "Null", "null", "~"}.each do |value|
    it "should return nil if a YAML \"#{value}\" value" do
      # Test will an array since a standalone empty value shows as STREAM_END
      Array(Nil).from_yaml("- " + value).should eq [nil]
    end

    it "should raise if a YAML \"#{value}\" value is quoted with double quotes" do
      expect_raises(YAML::ParseException) do
        Nil.from_yaml %("#{value}")
      end
    end

    it "should raise if YAML \"#{value}\" value is quoted with single quotes" do
      expect_raises(YAML::ParseException) do
        Nil.from_yaml "'#{value}'"
      end
    end
  end

and so on

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again... I did have this way but @straight-shoota suggested to inline all the tests into a single test. I would much rather have it the way above (the way I had it).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@straight-shoota why? I prefer to know everything about broken spec. just by its description.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've detailed above (comment), having a spec for each distinct value results in dozens of specs for actually the same behaviour. This is not necessary at all and results in slightly reduced test performance. We have fast-failing specs with multiple expectations all over stdlib. This might not always be a good idea but in this case I really see no benefit from having individual specs. If one of them fails, it's likely to assume others fail, too, and the reason is most likely the same. You'll have to run the tests again anyway. So we're not losing anything really but gaining execution speed for thousands of spec runs where there is no error in this part of the YAML parser.

# Test with an array since a standalone empty value shows as STREAM_END
Array(Nil).from_yaml("- " + value).should eq [nil]
end
end

it "should raise if a YAML null value is quoted" do
values.each do |value|
expect_raises(YAML::ParseException) do
Nil.from_yaml %("#{value}")
end
expect_raises(YAML::ParseException) do
Nil.from_yaml "'#{value}'"
end
end
end

it "should raise if not a null value" do
expect_raises(YAML::ParseException) do
Nil.from_yaml "hello"
end
end
end

describe "Bool.from_yaml" do
describe "true values" do
it "should serialize and deserialize" do
Bool.from_yaml(true.to_yaml).should eq true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add the same line for false?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see it's in the next spec group.

end

values = YAML::TRUE_VALUES

it "should return true if a truthy value" do
values.each do |value|
Bool.from_yaml(value).should eq true
end
end

it "should raise if a truthy value is quoted" do
values.each do |value|
expect_raises(YAML::ParseException) do
Bool.from_yaml %("#{value}")
end
expect_raises(YAML::ParseException) do
Bool.from_yaml %('#{value}')
end
end
end
end

describe "false values" do
it "should serialize and deserialize" do
Bool.from_yaml(false.to_yaml).should eq false
end

values = YAML::FALSE_VALUES

it "should return true if a falsey value" do
values.each do |value|
Bool.from_yaml(value).should eq false
end
end

it "should raise if a falsey value is quoted" do
values.each do |value|
expect_raises(YAML::ParseException) do
Bool.from_yaml %("#{value}")
end
end
values.each do |value|
expect_raises(YAML::ParseException) do
Bool.from_yaml %('#{value}')
end
end
end
end

it "should raise if not a bool value" do
expect_raises(YAML::ParseException) do
Bool.from_yaml "hello"
end
end
end

[Int8, Int16, Int32, Int64, UInt8, UInt16, UInt32, UInt64].each do |type|
describe "#{type.name}.from_yaml" do
it "should serialize and deserialize" do
type.from_yaml(type.new("1").to_yaml).should eq type.new("1")
end

it "should parse possible values" do
values = {"1", "-1", "0x_0A_74_AE", "0b1010_0111_0100_1010_1110", "02472256"}
values.each do |value|
type.from_yaml(value).class.should eq type
end
end

it "should parse a number into an #{type.name}" do
type.from_yaml("1").should eq type.new("1")
end

it "should raise if an int is quoted" do
expect_raises(YAML::ParseException) do
type.from_yaml %("1")
end
expect_raises(YAML::ParseException) do
type.from_yaml %('1')
end
end
end
end

[Float32, Float64].each do |type|
describe "#{type.name}.from_yaml" do
it "should serialize and deserialize" do
type.from_yaml(type.new("1.0").to_yaml).should eq type.new("1.0")
end

it "should parse possible values" do
values = {
".inf", ".Inf", ".INF", "+.inf", "+.Inf", "+.INF",
"-.inf", "-.Inf", "-.INF",
".nan", ".NaN", ".NAN",
"+1", "1", "-1",
"1.1", "+1.1", "-1.1",
}
values.each do |value|
type.from_yaml(value).class.should eq type
end
end

it "should parse a number into an #{type.name}" do
type.from_yaml("1").should eq type.new("1")
end

it "should parse a float into an #{type.name}" do
type.from_yaml("1.1").should eq type.new("1.1")
end

it "should raise if a float value is quoted" do
expect_raises(YAML::ParseException) do
type.from_yaml %("1.1")
end
expect_raises(YAML::ParseException) do
type.from_yaml %('1.1')
end
end
end
end

describe "String.from_yaml" do
it "should serialize and deserialize" do
String.from_yaml("hi".to_yaml).should eq "hi"
end

it "should parse a string" do
String.from_yaml("hello").should eq "hello"
end

it "should parse a quoted string" do
String.from_yaml(%("hello")).should eq "hello"
String.from_yaml(%('hello')).should eq "hello"
end

it "should parse a literal string" do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be more detailed tests about the specifics of literal style and folded style values, at least for strings. As far as I can tell, there are none in the stdlib specs as of now. Would be nice to add them here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you like me to add specifics?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example

String.from_yaml("|\n  hello\n  world").should eq "hello\nworld"
# and
String.from_yaml(">\n  hello\  world").should eq "hello world"

String.from_yaml("|\n hello").should eq "hello"
end

it "should parse a folded string" do
String.from_yaml(">\n hello").should eq "hello"
end

context "reserved values" do
Copy link
Member

@straight-shoota straight-shoota Sep 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking about splitting these reserved values up and cover each set in the context of the specific type. It's not an inherent responsibility of the string type to recognize them. Only when a specific type is supported, it's special values
or formats are not considered strings but values of that type. String is only affected because it's the default type. If support for additional types is added, that would not mean to modify existing specs but just add the additional behaviour in a cohesive set of specs which cover that special values are not recognized as strings. In this case it actually makes sense to differentiate between individual specs for each type of reserved values.
The specs for raising if a string is to be read from a reserved value and passing if quoted could easily made reusable to employ them at these individual locations. This way, we don't need duplicate value literals as well, because they already exists at the designated places.
I'd like to hear other opinions though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that "1.0" is a valid yaml string, unless quoted. So why should it not be string's responsibility to detect that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is only an invalid string because it is a float value format. If the parser schema wouldn't include float type (like the Failsafe Schema), 1.0 would be recognized as a string: a sequence of zero or more Unicode characters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anywhere in the spec where it makes clear how the parser should handle situations like this. Maybe it is a string in the failsafe schema but we don't have an option to select schema do we? If we're not in the fail safe scheme then surely reading a string which is a valid float, which is unquoted, is incorrect. The way that yaml is parsed should be entirely independent of the way that its converted into crystal types. A certain value shouldn't be parsed one way of were expecting a string and the other way if we're expecting a float. Doing so is leaking details of the parsing. The fact is that 99% of yaml parsers will parse an unquoted 1.0 as a number. If we don't do the same, then our parser is incorrect.

That leads me to the conclusion that the way we're doing parsing is incorrect. We should be parsing tokens and deciding what type they are inside a single next method. We shouldn't be attempting to read a string and erroring if we get something which could be a float, we should have a single next method which returns the union of all yaml types, and parses the scalar to a concrete type irrespective of what type we've told the pullparser we're expecting. This is the same way it's done in the json pullparser.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RX14 I think these general considerations shouldn't go into a line comment. Will get lost easily this way...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It all depends on the used schema. If a parser supports Core Schema, it will of course recognize a float value. If the schema has no float, 1.0 is recognized as a string.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but we do recognize float/core schema.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a logical perspective, it's not a property of the string parser to recognize 1.0 as a reserved value but it is causatively tied to the float parser. That's why I think it's consequential to align the specs there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way it's easier to track for example further additions to the parser which might come in the future because these, too, would be introduced in one piece of matching values and non-matching string values.

Copy link
Contributor

@RX14 RX14 Sep 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What i'm suggesting is to move all the scalar parsing logic into read_next. read_next will read a scalar or whatever using libyaml, but then it will parse the value (if it's a scalar) to work out what scalar type it is. This makes the rest of the class simpler because the read_string method doesn't need to work out if it's a string: it's just called read_next which has set which scalar type the token is. This reduces duplication and is much more in-line with how the json pullparser works.

We can update the parser to have selectable schemas later. (failsafe schema will just error if you try to read a float for example)

values = {
"true", "True", "TRUE", "on", "On", "ON", "y", "Y", "yes", "Yes", "YES",
"false", "False", "FALSE", "off", "Off", "OFF", "n", "N", "no", "No", "NO",
"", "NULL", "Null", "null", "~",
".inf", ".Inf", ".INF", "+.inf", "+.Inf", "+.INF",
"-.inf", "-.Inf", "-.INF",
".nan", ".NaN", ".NAN",
"+1", "1", "-1",
"1.1", "+1.1", "-1.1",
"0x_0A_74_AE", "0b1010_0111_0100_1010_1110", "02472256",
"2001-12-15T02:59:43.1Z",
}

it "should raise if a reserved value" do
values.each do |value|
expect_raises(YAML::ParseException) do
String.from_yaml(value)
end
end
end

it "should parse if a reserved value is quoted" do
values.each do |value|
String.from_yaml(%("#{value}")).should eq value
end
values.each do |value|
String.from_yaml(%('#{value}')).should eq value
end
end
end
end

describe "Array.from_yaml" do
it "should serialize and deserialize" do
Array(String).from_yaml(["hi"].to_yaml).should eq ["hi"]
end

it "it should parse an array of the correct objects" do
result = Array(String).from_yaml <<-YAML
- one
- two
- three
YAML
result.should eq ["one", "two", "three"]
end

context "with a union type" do
it "it should parse an array of the correct objects" do
result = Array(String | Int32 | Float64).from_yaml <<-YAML
- one
- 1
- 1.0
YAML
result.should eq ["one", 1, 1.0]
end
end
end

describe "Hash.from_yaml" do
it "should serialize and deserialize" do
Hash(String, String).from_yaml({"hello" => "world"}.to_yaml).should eq ({"hello" => "world"})
end

it "it should parse an array of the correct objects" do
result = Hash(String, String).from_yaml <<-YAML
foo: one
bar: two
baz: three
YAML
result.should eq ({"foo" => "one", "bar" => "two", "baz" => "three"})
end

context "with a union type" do
it "it should parse an array of the correct objects" do
result = Hash(String, String | Int32 | Float64).from_yaml <<-YAML
foo: 1
bar: two
baz: 3.0
YAML
result.should eq ({"foo" => 1, "bar" => "two", "baz" => 3.0})
end
end
end

describe "Tuple.from_yaml" do
it "should serialize and deserialize" do
Tuple(String, String).from_yaml({"hello", "world"}.to_yaml).should eq ({"hello", "world"})
end

it "it should parse an array of the correct objects" do
result = Tuple(String, String, String).from_yaml "- one\n- two\n- three"
result.should eq ({"one", "two", "three"})
end

context "with a union type" do
it "it should parse an array of the correct objects" do
result = Tuple(String | Int32 | Float64, String | Int32 | Float64, String | Int32 | Float64).from_yaml "- one\n- 1\n- 1.0"
result.should eq ({"one", 1, 1.0})
end
end
end

describe "NamedTuple.from_yaml" do
it "it should parse an array of the correct objects" do
result = NamedTuple(foo: String, bar: String, baz: String).from_yaml "foo: one\nbar: two\nbaz: three"
result.should eq ({foo: "one", bar: "two", baz: "three"})
end
end

enum TestEnum
Red
Green
Blue
end

describe "Enum.from_yaml" do
it "should serialize and deserialize" do
TestEnum.from_yaml(TestEnum::Red.to_yaml).should eq TestEnum::Red
end

it "should parse from string" do
TestEnum.from_yaml("red").should eq TestEnum::Red
end

it "should parse from int" do
TestEnum.from_yaml("1").should eq TestEnum::Green
end
end

describe "Union.from_yaml" do
it "should compile into any of the types" do
(Bool | Int32).from_yaml("true").should eq true
(Bool | Int32).from_yaml("1").should eq 1
end

it "should raise if not a unions type" do
expect_raises(YAML::ParseException) do
(Bool | Int32).from_yaml("foo")
end
end
end

describe "Time.from_yaml" do
it "should parse time" do
time = Time.now
time_string = Time::Format::ISO_8601_DATE_TIME.format(time)
Time.from_yaml(time_string).epoch.should eq time.epoch
end

it "should not parse an invalid time string" do
expect_raises(YAML::ParseException) do
Time.from_yaml("not a time")
end
end
end
2 changes: 1 addition & 1 deletion spec/std/yaml/serialization_spec.cr
Original file line number Diff line number Diff line change
Expand Up @@ -127,7 +127,7 @@ describe "YAML serialization" do
]
YAML
end
ex.message.should eq("Expected nil, not 1 at line 2, column 3")
ex.message.should eq("Expected null, not '1' at line 2, column 3")
ex.location.should eq({2, 3})
end

Expand Down
Loading