Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

YAML fixes #4838

Closed
wants to merge 31 commits into from
Closed
Show file tree
Hide file tree
Changes from 14 commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
d55225d
Make YAML behave like the YAML 1.1 spec, fixes #3101, fixes #4384, an…
jwaldrip Aug 28, 2017
f3c89b4
make suggested changes
jwaldrip Aug 28, 2017
b607b65
update from_yaml to match the changes in to_yaml
jwaldrip Aug 29, 2017
b6a2e09
reserved values performance
jwaldrip Aug 29, 2017
f14198b
raise with a location
jwaldrip Aug 29, 2017
8346108
add todo
jwaldrip Aug 29, 2017
016dfd4
format
jwaldrip Aug 29, 2017
d6886fc
use case
jwaldrip Aug 29, 2017
d11fd05
Merge branch 'master' of https://github.com/crystal-lang/crystal into…
jwaldrip Aug 29, 2017
1bc96b5
use a char reader to give hints
jwaldrip Aug 30, 2017
ffccba4
use a case statement testing all the reserved cases
jwaldrip Aug 30, 2017
de2a1b7
add date to test case
jwaldrip Aug 30, 2017
d2476e2
use try's
jwaldrip Aug 30, 2017
922fa83
move logic into YAML::PullParser
jwaldrip Aug 31, 2017
47ee91c
use the method check
jwaldrip Sep 1, 2017
86f3f4b
dont raise
jwaldrip Sep 1, 2017
1faeb45
spelling & grammer
jwaldrip Sep 6, 2017
f5fc600
Merge branch 'yaml-spec' of github.com:jwaldrip/crystal into yaml-spec
jwaldrip Sep 6, 2017
0c605a8
Merge branch 'master' of github.com:jwaldrip/crystal into yaml-spec
jwaldrip Sep 6, 2017
cf95b76
re-inline specs
jwaldrip Sep 7, 2017
5a90882
remove checks
jwaldrip Sep 8, 2017
b3f6436
make suggested changes
jwaldrip Sep 8, 2017
3157fab
more spec changes
jwaldrip Sep 8, 2017
ceb9db9
fix quoted spec
jwaldrip Sep 8, 2017
e3df0d6
inline bool checks
jwaldrip Sep 10, 2017
50d5705
return false, remove parens
jwaldrip Sep 10, 2017
e2c8a1a
Merge branch 'master' of https://github.com/crystal-lang/crystal into…
jwaldrip Sep 10, 2017
69ee625
make `YAML.parse` support the core spec
jwaldrip Sep 16, 2017
31ca301
Merge branch 'yaml-spec' of github.com:jwaldrip/crystal into yaml-spec
jwaldrip Sep 16, 2017
6a5b820
Merge branch 'master' of https://github.com/crystal-lang/crystal into…
jwaldrip Sep 16, 2017
7a36cdc
Merge branch 'master' of https://github.com/crystal-lang/crystal into…
jwaldrip Sep 18, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
359 changes: 359 additions & 0 deletions spec/std/yaml/from_yaml_spec.cr
Original file line number Diff line number Diff line change
@@ -0,0 +1,359 @@
require "spec"
require "yaml"

private struct ObjectStub
getter called : Bool

def self.new(pull : YAML::PullParser)
pull.read_scalar
new(true)
end

def initialize(@called)
end
end

describe "Object.from_yaml" do
it "should create call .new on with a YAML::PullParser" do
ObjectStub.from_yaml("hello").called.should be_true
end
end

describe "Nil.from_yaml" do
it "should serialize and deserialize" do
Nil.from_yaml(nil.to_yaml).should eq nil
end

values = {"", "NULL", "Null", "null", "~"}
Copy link
Member

@straight-shoota straight-shoota Sep 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably reuse YAML::NULL_VALUES. Same goes for the other value lists.
Altghough it disguises what is being tested and might lead to undetectable errors because there is only one source of the values. So I'm not 100% sure about that.


values.each do |value|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop should be inlined as well.

it %(zshould return nil if "#{value}") do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo it %(should ...

# Test will an array since a standalone empty value shows as STREAM_END
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo? # Test will -> # Test with

Array(Nil).from_yaml("- " + value).should eq [nil]
end
end

values.each do |value|
it %(should raise if "#{value}" is quoted with double quotes) do
expect_raises(YAML::ParseException) do
Nil.from_yaml %("#{value}")
end
end

it %(should raise if "#{value}" is quoted with single quotes) do
expect_raises(YAML::ParseException) do
Nil.from_yaml "'#{value}'"
end
end

it "should raise if not a null value" do
expect_raises(YAML::ParseException) do
Nil.from_yaml "hello"
end
end
end

describe "Bool.from_yaml" do
describe "true values" do
it "should serialize and deserialize" do
Bool.from_yaml(true.to_yaml).should eq true
end

values = {"true", "True", "TRUE", "on", "On", "ON", "y", "Y", "yes", "Yes", "YES"}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this here again? I thought we agreed on inlining the values instead of having dozens of duplicate specs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably because I brought up a counter-argument here (I meant it more as a question/discussion)

Copy link
Member

@straight-shoota straight-shoota Sep 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A discussion would probably be more appropriate than jumping back and forth... This change makes the previous comments disappear... (hide from plain sight).

I don't think your argument is valid, because specs' goal is primarily to identify if failures exists in the code, not necessarily exactly why and where. This would probably need some further exploration anyway. So testing all in one spec is equally safe as individual specs.
In this particular case, it should be quite sufficient to test if all values work: In the unlikely case of a failure, all values will probably be affected equally.
Having individual specs in this place has as good as no real benefit but reduces performance on each and every spec run. That's not worth it.
It also spams spec reports with dozens of successful results which mean nothing since they are all testing exactly the same behaviour.


values.each do |value|
it %(should return true if "#{value}") do
Bool.from_yaml(value).should eq true
end

it %(should raise if "#{value}" is quoted with double quotes) do
expect_raises(YAML::ParseException) do
Bool.from_yaml %("#{value}")
end
end

it %(should raise if "#{value}" is quoted with single quotes) do
expect_raises(YAML::ParseException) do
Bool.from_yaml "'#{value}'"
end
end
end
end
end

describe "false values" do
it "should serialize and deserialize" do
Bool.from_yaml(false.to_yaml).should eq false
end

values = {"false", "False", "FALSE", "off", "Off", "OFF", "n", "N", "no", "No", "NO"}

values.each do |value|
it %(should return false if "#{value}") do
Bool.from_yaml(value).should eq false
end

it %(should raise if "#{value}" is quoted with double quotes) do
expect_raises(YAML::ParseException) do
Bool.from_yaml %("#{value}")
end
end

it %(should raise if "#{value}" is quoted with single quotes) do
expect_raises(YAML::ParseException) do
Bool.from_yaml "'#{value}'"
end
end
end
end

it "should raise if not a bool value" do
expect_raises(YAML::ParseException) do
Bool.from_yaml "hello"
end
end
end

[Int8, Int16, Int32, Int64, UInt8, UInt16, UInt32, UInt64].each do |type|
describe "#{type.name}.from_yaml" do
it "should serialize and deserialize" do
type.from_yaml(type.new("1").to_yaml).should eq type.new("1")
end

it "should parse a number into an #{type.name}" do
type.from_yaml(type.new("1").to_yaml).should eq type.new("1")
end

it "should raise if quoted with double quotes" do
expect_raises(YAML::ParseException) do
type.from_yaml %("1")
end
end

it "should raise if quoted with single quotes" do
expect_raises(YAML::ParseException) do
type.from_yaml "'1'"
end
end

it "should raise when not a number" do
expect_raises(YAML::ParseException) do
type.from_yaml("hi")
end
end
end
end

[Float32, Float64].each do |type|
describe "#{type.name}.from_yaml" do
it "should serialize and deserialize" do
type.from_yaml(type.new("1.0").to_yaml).should eq type.new("1.0")
end

it "should parse a number into an {{type.id}}" do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be it "should parse a number into an #{type.name}" do , like on line 123?

type.from_yaml("1").should eq type.new("1")
end

it "should parse a float into an {{type.id}}" do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

type.from_yaml("1.1").should eq type.new("1.1")
end

it "should raise if quoted with double quotes" do
expect_raises(YAML::ParseException) do
type.from_yaml %("1.1")
end
end

it "should raise if quoted with single quotes" do
expect_raises(YAML::ParseException) do
type.from_yaml "'1.1'"
end
end

it "should raise when not a float or number" do
expect_raises(YAML::ParseException) do
type.from_yaml("hi")
end
end
end
end

describe "String.from_yaml" do
it "should serialize and deserialize" do
String.from_yaml("hi".to_yaml).should eq "hi"
end

it "should parse a string" do
String.from_yaml("hello").should eq "hello"
end

it "should parse a single quoted string" do
String.from_yaml("'hello'").should eq "hello"
end

it "should parse a double quoted string" do
String.from_yaml(%("hello")).should eq "hello"
end

it "should parse a literal string" do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be more detailed tests about the specifics of literal style and folded style values, at least for strings. As far as I can tell, there are none in the stdlib specs as of now. Would be nice to add them here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you like me to add specifics?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example

String.from_yaml("|\n  hello\n  world").should eq "hello\nworld"
# and
String.from_yaml(">\n  hello\  world").should eq "hello world"

String.from_yaml("|\n hello").should eq "hello"
end

it "should parse a folded string" do
String.from_yaml(">\n hello").should eq "hello"
end

context "reserved values" do
Copy link
Member

@straight-shoota straight-shoota Sep 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking about splitting these reserved values up and cover each set in the context of the specific type. It's not an inherent responsibility of the string type to recognize them. Only when a specific type is supported, it's special values
or formats are not considered strings but values of that type. String is only affected because it's the default type. If support for additional types is added, that would not mean to modify existing specs but just add the additional behaviour in a cohesive set of specs which cover that special values are not recognized as strings. In this case it actually makes sense to differentiate between individual specs for each type of reserved values.
The specs for raising if a string is to be read from a reserved value and passing if quoted could easily made reusable to employ them at these individual locations. This way, we don't need duplicate value literals as well, because they already exists at the designated places.
I'd like to hear other opinions though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that "1.0" is a valid yaml string, unless quoted. So why should it not be string's responsibility to detect that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is only an invalid string because it is a float value format. If the parser schema wouldn't include float type (like the Failsafe Schema), 1.0 would be recognized as a string: a sequence of zero or more Unicode characters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anywhere in the spec where it makes clear how the parser should handle situations like this. Maybe it is a string in the failsafe schema but we don't have an option to select schema do we? If we're not in the fail safe scheme then surely reading a string which is a valid float, which is unquoted, is incorrect. The way that yaml is parsed should be entirely independent of the way that its converted into crystal types. A certain value shouldn't be parsed one way of were expecting a string and the other way if we're expecting a float. Doing so is leaking details of the parsing. The fact is that 99% of yaml parsers will parse an unquoted 1.0 as a number. If we don't do the same, then our parser is incorrect.

That leads me to the conclusion that the way we're doing parsing is incorrect. We should be parsing tokens and deciding what type they are inside a single next method. We shouldn't be attempting to read a string and erroring if we get something which could be a float, we should have a single next method which returns the union of all yaml types, and parses the scalar to a concrete type irrespective of what type we've told the pullparser we're expecting. This is the same way it's done in the json pullparser.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RX14 I think these general considerations shouldn't go into a line comment. Will get lost easily this way...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It all depends on the used schema. If a parser supports Core Schema, it will of course recognize a float value. If the schema has no float, 1.0 is recognized as a string.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but we do recognize float/core schema.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a logical perspective, it's not a property of the string parser to recognize 1.0 as a reserved value but it is causatively tied to the float parser. That's why I think it's consequential to align the specs there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way it's easier to track for example further additions to the parser which might come in the future because these, too, would be introduced in one piece of matching values and non-matching string values.

Copy link
Contributor

@RX14 RX14 Sep 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What i'm suggesting is to move all the scalar parsing logic into read_next. read_next will read a scalar or whatever using libyaml, but then it will parse the value (if it's a scalar) to work out what scalar type it is. This makes the rest of the class simpler because the read_string method doesn't need to work out if it's a string: it's just called read_next which has set which scalar type the token is. This reduces duplication and is much more in-line with how the json pullparser works.

We can update the parser to have selectable schemas later. (failsafe schema will just error if you try to read a float for example)

values = {
"true", "True", "TRUE", "on", "On", "ON", "y", "Y", "yes", "Yes", "YES",
"false", "False", "FALSE", "off", "Off", "OFF", "n", "N", "no", "No", "NO",
"", "NULL", "Null", "null", "~",
".inf", ".Inf", ".INF", "+.inf", "+.Inf", "+.INF",
"-.inf", "-.Inf", "-.INF",
".nan", ".NaN", ".NAN",
"+1", "1", "-1",
"1.1", "+1.1", "-1.1",
"0x_0A_74_AE", "0b1010_0111_0100_1010_1110", "02472256",
"2001-12-15T02:59:43.1Z",
}
values.each do |value|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop should be inlined as well.

it %(should raise if a reserved value "#{value}") do
expect_raises(YAML::ParseException) do
String.from_yaml(value)
end
end

it "should parse if a reserved value is quoted with double quotes" do
String.from_yaml(%("#{value}")).should eq value
end

it "should parse if a reserved value is quoted with single quotes" do
String.from_yaml("'#{value}'").should eq value
end
end
end
end

describe "Array.from_yaml" do
it "should serialize and deserialize" do
Array(String).from_yaml(["hi"].to_yaml).should eq ["hi"]
end

it "it should an array of the correct objects" do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I think the leading it in the String is superflous.
  • A verb is missing here -> should parse an array ... maybe?

result = Array(String).from_yaml <<-YAML
- one
- two
- three
YAML
result.should eq ["one", "two", "three"]
end

context "with a union type" do
it "it should an array of the correct objects" do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

result = Array(String | Int32 | Float64).from_yaml <<-YAML
- one
- 1
- 1.0
YAML
result.should eq ["one", 1, 1.0]
end
end
end

describe "Hash.from_yaml" do
it "should serialize and deserialize" do
Hash(String, String).from_yaml({"hello" => "world"}.to_yaml).should eq ({"hello" => "world"})
end

it "it should an array of the correct objects" do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

result = Hash(String, String).from_yaml <<-YAML
foo: one
bar: two
baz: three
YAML
result.should eq ({"foo" => "one", "bar" => "two", "baz" => "three"})
end

context "with a union type" do
it "it should an array of the correct objects" do
Copy link
Contributor

@Fryguy Fryguy Sep 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto (there's a number of these below - didn't realize there were a bunch of them, so I'll stop now to avoid all the pinging 😉 )

result = Hash(String, String | Int32 | Float64).from_yaml <<-YAML
foo: 1
bar: two
baz: 3.0
YAML
result.should eq ({"foo" => 1, "bar" => "two", "baz" => 3.0})
end
end
end

describe "Tuple.from_yaml" do
it "should serialize and deserialize" do
Tuple(String, String).from_yaml({"hello", "world"}.to_yaml).should eq ({"hello", "world"})
end

it "it should an array of the correct objects" do
result = Tuple(String, String, String).from_yaml "- one\n- two\n- three"
result.should eq ({"one", "two", "three"})
end

context "with a union type" do
it "it should an array of the correct objects" do
result = Tuple(String | Int32 | Float64, String | Int32 | Float64, String | Int32 | Float64).from_yaml "- one\n- 1\n- 1.0"
result.should eq ({"one", 1, 1.0})
end
end
end

describe "NamedTuple.from_yaml" do
it "it should an array of the correct objects" do
result = NamedTuple(foo: String, bar: String, baz: String).from_yaml "foo: one\nbar: two\nbaz: three"
result.should eq ({foo: "one", bar: "two", baz: "three"})
end
end

enum TestEnum
Red
Green
Blue
end

describe "Enum.from_yaml" do
it "should serialize and deserialize" do
TestEnum.from_yaml(TestEnum::Red.to_yaml).should eq TestEnum::Red
end

it "should parse from string" do
TestEnum.from_yaml("red").should eq TestEnum::Red
end

it "should parse from int" do
TestEnum.from_yaml("1").should eq TestEnum::Green
end
end

describe "Union.from_yaml" do
it "should compile into any of the types" do
(Bool | Int32).from_yaml("true").should eq true
(Bool | Int32).from_yaml("1").should eq 1
end

it "should raise if not a unions type" do
expect_raises(YAML::ParseException) do
(Bool | Int32).from_yaml("foo")
end
end
end

describe "Time.from_yaml" do
it "should parse time" do
time = Time.now
time_string = Time::Format::ISO_8601_DATE_TIME.format(time)
Time.from_yaml(time_string).epoch.should eq time.epoch
end

it "should not parse an invalid time string" do
expect_raises(YAML::ParseException) do
Time.from_yaml("not a time")
end
end
end
2 changes: 1 addition & 1 deletion spec/std/yaml/serialization_spec.cr
Original file line number Diff line number Diff line change
Expand Up @@ -127,7 +127,7 @@ describe "YAML serialization" do
]
YAML
end
ex.message.should eq("Expected nil, not 1 at line 2, column 3")
ex.message.should eq("Expected null, not '1' at line 2, column 3")
ex.location.should eq({2, 3})
end

Expand Down
Loading