Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interpolation of nothing should be an error #27352

Closed
staticfloat opened this issue May 31, 2018 · 20 comments · Fixed by #27829
Closed

Interpolation of nothing should be an error #27352

staticfloat opened this issue May 31, 2018 · 20 comments · Fixed by #27829
Labels
breaking This change will break code missing data Base.missing and related functionality
Milestone

Comments

@staticfloat
Copy link
Member

Now that we're firming up our notion of what nothing and missing actually mean, we should probably semantically enforce them a little bit more. Example:

Sys.which(cmd) will return nothing if the given cmd cannot be found on the PATH. This makes it easy to define a hierarchy of commands: coalesce(Sys.which("curl"), Sys.which("wget")), etc... However, if the user then interpolates the result into a Cmd, it makes sense that we should error out.

Jeff suggested disallowing all kinds of print() or string() conversion of nothing, leaving only repr(). I tend to agree, despite that this has the potential to be a quite breaking change.

I would also like to suggest that missing get a similar treatment, as it is a subtly different but largely identical concept.

@ararslan ararslan added missing data Base.missing and related functionality breaking This change will break code labels May 31, 2018
@nalimilan
Copy link
Member

I see the idea, but I can't help feeling that's not the right place to handle this. Forcing users to handle explicitly the possibility that a value is nothing is one of the two effects/goals of Some (the other being to distinguish nothing from Some(nothing)). Yet we have generally taken the stance that we don't wrap return values in Some, and instead expect users to check manually for nothing when appropriate. So I'm not sure why we should throw an error for some functions and not for others.

I guess the motivation in the present case is that Sys.which used to throw an error, so we are not used to it returning nothing, and we want to protect users from that. But isn't that just due to history? Also note that even if we don't throw an error on interpolation, the command will most likely fail trying to run nothing anyway, so it doesn't make a big difference.

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Jun 1, 2018

What is the actual utility of allowing nothing to interpolate as a string into things? Put another way, where is this functionality being used that is not currently a bug?

@nalimilan
Copy link
Member

I agree interpolation doesn't sound very useful nor legitimate. But string is defined as "Create a string from any values using the print function", so wouldn't it be weird that nothing would be the only type not to support it? That's something I've wondered for the case of missing, where we could also imagine returning missing rather than "missing".

@JeffBezanson JeffBezanson added the triage This should be discussed on a triage call label Jun 12, 2018
@StefanKarpinski
Copy link
Member

Argument for the specialness of nothing: there are a lot of situations now where you can do some operation and expect some kind of value like an integer or a string and the only other value you may get is nothing. Therefore we should be extra careful with nothing as a return value since it's special in the sense of potentially appearing when you may not have been expecting it.

@mbauman
Copy link
Member

mbauman commented Jun 14, 2018

Another argument is that printing of nothing is likely to occur upon writing to a CSV:

julia> b = IOBuffer(); writedlm(b, [1 2 nothing 4], ',')
       print(String(take!(b)))
1,2,nothing,4

Now perhaps this could be argued that it's the responsibility of the CSV writer to handle this case, but it is somewhat compelling to me. It's also interesting that the REPL doesn't even attempt to print anything when the result is nothing.

@yurivish
Copy link
Contributor

yurivish commented Jun 14, 2018

This is slightly off-topic, but writedlm also writes missing as "missing", which I found somewhat surprising earlier today when writing a CSV. Not sure what I expected, but the worrisome case is when the string "missing" is written into a String column that may also contain "missing" as a valid string value.

@JeffBezanson
Copy link
Member

Maybe CSV writers should print missing as an empty field instead?

@staticfloat
Copy link
Member Author

I think it makes sense for missing to in general be a non-fatal absence with regards to construction of larger objects, whereas nothing should be the fatal absence. In that sense, I think the following should be true:

@test_throws ArgumentError "$(nothing)"
@test "$(missing)" == ""
@test join([1,2,missing,4], "") == "1,2,,4"

@nalimilan
Copy link
Member

@test "$(missing)" == ""
@test join([1,2,missing,4], "") == "1,2,,4"

This would imply that print(missing) wouldn't print anything, as opposed to show, which needs to print missing. That would mean that the empty string is the "canonical (un-decorated) text representation" of missing, while "missing" is its "informative text representation". Would that be OK? I'm never sure where print and show are used down the line.

@staticfloat
Copy link
Member Author

staticfloat commented Jun 15, 2018 via email

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Jun 15, 2018

Seems reasonable to me and worth trying. I like the "fatal absence" versus "non-fatal absence" distinction.

@ggggggggg
Copy link
Contributor

ggggggggg commented Jun 15, 2018

Consider a new user trying out something like

julia> f(x) = x>0?nothing:-x
f (generic function with 1 method)

julia> a = f(1);

julia> print("I ran f and got $a")
I ran f and got nothing

You really want them to get an ArgumentError here? I imagine that would be very frustrating.

@StefanKarpinski
Copy link
Member

That case can have a very helpful, specific error message, which makes it not so frustrating since it immediately tells you how to fix the problem: do "I ran f and got $(repr(a))" instead, which is good advice in general. The case of getting an unexpected nothing and splicing it into a command or whatever and not being able to figure out what went wrong, on the other hand, seems like it would be truly frustrating and not just for novices.

@JeffBezanson
Copy link
Member

Quite right --- getting an error (hopefully with a useful message) while trying things out is not a problem at all compared to silently getting nothing in your output stream, having it cause a problem much later, and then having to track down where it came from.

@ggggggggg
Copy link
Contributor

I guess with a good error message it wouldn't be so bad.

@StefanKarpinski
Copy link
Member

See #27729 which proposes pwd() == nothing when in a deleted directory but as an alternative returning some kind of NotFound object which could also potentially be returned by Sys.which to give a better error message upon usage. However that object would not be amenable to the coalesce / something code which is annoying.

@nalimilan
Copy link
Member

FWIW, I'm not opposed to print(nothing) throwing an error and print(missing) printing nothing. The latter definitely makes sense for CSV, even if it's generally more complex than that (e.g. whitespace-delimited files are usually parsed by skipping contiguous spaces, so that a special text value like NA is needed to represent a missing value). Anyway I'm not completely clear on where print is/should be used rather than show.

@StefanKarpinski
Copy link
Member

Triage is in favor, noting that if we change our minds, we can but if we don't do this now, we can't change it in the 1.x timeline.

@StefanKarpinski StefanKarpinski removed the triage This should be discussed on a triage call label Jun 28, 2018
@StefanKarpinski StefanKarpinski added this to the 0.7 milestone Jun 28, 2018
@maleadt
Copy link
Member

maleadt commented Jun 28, 2018

Any resolution wrt. #27650? Can logging of nothing still result in no message, or should it error as well? Or could we use missing for that, as @nalimilan proposed above?

@nalimilan
Copy link
Member

I don't think logging should use missing for that. Having a special case for nothing there would probably be fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking This change will break code missing data Base.missing and related functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants