Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rfc] Import Object#yield_self from Ruby 2.5 #5110

Closed
makenowjust opened this issue Oct 12, 2017 · 33 comments
Closed

[rfc] Import Object#yield_self from Ruby 2.5 #5110

makenowjust opened this issue Oct 12, 2017 · 33 comments

Comments

@makenowjust
Copy link
Contributor

Ruby 2.5 decide to introduce Object#yield_self (see here). This method yields self to the block and then returns its result. The implementation is very simple:

class Object
  def yield_self
    yield self
  end
end

Should we import this method into Crystal? What do you think?

@straight-shoota
Copy link
Member

It's basically Object#try which also works for Nil with a descriptive yet still uninviting long name.

@RX14
Copy link
Contributor

RX14 commented Oct 12, 2017

I don't really see the usecase..?

@makenowjust
Copy link
Contributor Author

@RX14 Yes, I'd like to ask Ruby committer about usecase.

@lbguilherme
Copy link
Contributor

Not that I defend it, but could be used to create a variable, without creating a variable. To help keep calculations inline:

result = long.chained.computation(12).foo(1, 2, 3).yield_self {|list| list.sum * list.product }.to_s.size

RethinkDB (a database whose query language is mostly functional) has a similar operation, called do: https://rethinkdb.com/api/ruby/do/.

It is like a map, but for the entire thing.

@ysbaddaden
Copy link
Contributor

It's similar to .tap in behavior and intent, expect it returns the block's value (not the tapped value).

I don't see much usage. Naming is verbose, and assigning to a value feels better:

list = long.chained.computation(12).foo(1, 2, 3)
result = (list.sum * list.product).to_s.size

@RX14 RX14 closed this as completed Oct 13, 2017
@makenowjust
Copy link
Contributor Author

@RX14 Why do you close this issue? (I can guess but...) If you could explain this, you should do it, or if you couldn't explain this, you shouldn't close this.

@RX14
Copy link
Contributor

RX14 commented Oct 13, 2017

The consensus was that it wasn't really very useful. If anyone has a counterexample to discuss we can always reopen.

@asterite
Copy link
Member

yield_self is for those that want a "pipe operator" in Ruby: http://mlomnicki.com/yield-self-in-ruby-25/

Like others here, I don't think it looks nice, nor I think it's readable. And since we have try (and you will probably never want to have a nil in a chain), it can work as good as yield_self.

@makenowjust
Copy link
Contributor Author

I found very useful usecase of Object#yield_self! See this simple fizz buzz example:

def fizz?(n)
  n % 3 == 0
end

def buzz?(n)
  n % 5 == 0
end

(1..100).each do |i|
  if fizz?(i) && buzz?(i)
    puts :FizzBuzz
  elsif fizz?(i)
    puts :Fizz
  elsif buzz?(i)
    puts :Buzz
  else
    puts i
  end
end

When we have Object#yield_self, we can rewrite this:

# `def fizz?` and `def buzz` is skipped

(1..100).each do |i|
  case i
  when .yield_self { |i| fizz?(i) && buzz?(i) }
    puts :FizzBuzz
  when .yield_self { |i| fizz?(i) }
    puts :Fizz
  when .yield_self { |i| buzz?(i) }
    puts :Buzz
  else
    puts i
  end
end

In other words, Object#yield_self provides a potential to call any method in when condition against case value.

However, yield_self is tooooooooo long to type. So, I suggest another name Object#let (derived from Kotlin). What do you think?

@RX14 Please re-open this. I believe this feature makes Crystal more useful and we can discuss about this more.

@RX14
Copy link
Contributor

RX14 commented Nov 9, 2017

But why would you want to do that? That's uglier than the if version by far. If we just want a way to use arbitrary booleans in case then we should discuss that without resorting to hacks.

@makenowjust
Copy link
Contributor Author

makenowjust commented Nov 9, 2017

Real world example, src/compiler/crystal/tools/doc/highlighter.cr:

      case token.type
      when :NEWLINE
        io.puts

      # ...snip...

      when :IDENT
        if last_is_def
          last_is_def = false
          highlight token, "m", io
        else
          case token.value
          when :def, :if, :else, :elsif, :end,
               :class, :module, :include, :extend,
               :while, :until, :do, :yield, :return, :unless, :next, :break, :begin,
               :lib, :fun, :type, :struct, :union, :enum, :macro, :out, :require,
               :case, :when, :then, :of, :abstract, :rescue, :ensure, :is_a?,
               :alias, :pointerof, :sizeof, :instance_sizeof, :as, :typeof, :for, :in,
               :undef, :with, :self, :super, :private, :protected, "new"
            highlight token, "k", io
          when :true, :false, :nil
            highlight token, "n", io
          else
            io << token
          end
        end
      when :"+", :"-", :"*", :"/", :"=", :"==", :"<", :"<=", :">", :">=", :"!", :"!=", :"=~", :"!~", :"&", :"|", :"^", :"~", :"**", :">>", :"<<", :"%", :"[]", :"[]?", :"[]=", :"<=>", :"==="
        highlight token, "o", io
      when :"}"
        if break_on_rcurly
          break
        else
          io << token
        end
      else
        io << token
      end

when condition for keywords and operators are too long, so I want to refactor this with such constants:

KEYWORDS = Set{
  :def, :if, :else, :elsif, :end,
  :class, :module, :include, :extend,
  :while, :until, :do, :yield, :return, :unless, :next, :break, :begin,
  :lib, :fun, :type, :struct, :union, :enum, :macro, :out, :require,
  :case, :when, :then, :of, :abstract, :rescue, :ensure, :is_a?,
  :alias, :pointerof, :sizeof, :instance_sizeof, :as, :typeof, :for, :in,
  :undef, :with, :self, :super, :private, :protected, "new"
}

OPERATORS = Set{
  :"+", :"-", :"*", :"/", :"=", :"==", :"<", :"<=", :">", :">=", :"!",
  :"!=", :"=~", :"!~", :"&", :"|", :"^", :"~", :"**", :">>", :"<<",
  :"%", :"[]", :"[]?", :"[]=", :"<=>", :"==="
}

But I can't use these constants in when condition because Set#=== is not specialized currently.

When there is Object#let, this code can be rewritten:

      case token.type
      when :NEWLINE
        io.puts

      # ...snip...

      when :IDENT
        if last_is_def
          last_is_def = false
          highlight token, "m", io
        else
          case token.value
          when .let { |v| KEYWORDS.includes? v }
            highlight token, "k", io
          when :true, :false, :nil
            highlight token, "n", io
          else
            io << token
          end
        end
      when .let { |t| OPERATORS.includes? t }
        highlight token, "o", io
      when :"}"
        if break_on_rcurly
          break
        else
          io << token
        end
      else
        io << token
      end

Of course Object#let is not needed if #5269 is merged. However it is actual thing that Object#let is the most generic way to call any method against case value.

@RX14 Please imagine. Why do you hate this issue?

@asterite
Copy link
Member

asterite commented Nov 9, 2017

I think it makes sense. It would be nice to see which is faster, but I guess Set#includes? is faster than a huge when (I think this was concluded in a recent issue).

@makenowjust
Copy link
Contributor Author

makenowjust commented Nov 9, 2017

Benchmark is here and result is this:

old 358.88  (  2.79ms) (± 2.53%)       fastest
new 337.75  (  2.96ms) (± 1.87%)  1.06× slower

Object#let version is slow, but I feel it is a bit, also fast. It looks no problem to me.

/cc @asterite


And the most important point I think:

it is actual thing that Object#let is the most generic way to call any method against case value.

@RX14
Copy link
Contributor

RX14 commented Nov 9, 2017

It feels like an ugly hack, like there should be a concrete, real, syntax change for supporting this, not hacking it into the stdlib. Perhaps this could work:

case foo
when { func(foo) }
end

@bcardiff
Copy link
Member

bcardiff commented Nov 9, 2017

@RX14 that would require some lookahead in the parser to disambiguate from tuples I think. Besides the .foo { } is using already existing constructs.

Maybe Object#bind, Object#apply or Object#itself(&block) are short enough ?

The let looks weird to me. expr.let { |var| S } instead of let var = expr in S everything seems twisted.

@asterite
Copy link
Member

asterite commented Nov 9, 2017

Another proposal which doesn't require a change:

if func(foo)
  # do something
end

@RX14
Copy link
Contributor

RX14 commented Nov 10, 2017

Well, {func(foo)} is a valid tuple so clearly my syntax idea isn't sound. However, I maintain that if the only point of adding this method is for case then we'd be better off fixing the root of the problem - case. Can't think up of a good syntax though.

@makenowjust
Copy link
Contributor Author

Object#into is possible candidate. (or maybe Object#in, but it is too short.) I think #apply or #itself is too long.

@Sija
Copy link
Contributor

Sija commented Nov 10, 2017

IMHO Object#yield_self isn't bad, but perhaps #tap! could be used?

@makenowjust
Copy link
Contributor Author

#tap! does not make sense. I think ! means mutable or danger, but this method is not mutable normally and safe. Additionally #tap is used with mutable method sometimes.

@makenowjust
Copy link
Contributor Author

Replacing with if is good sometimes, on the other hand, it is not so good sometimes. I think highlighter.cr is such an example. When normal when value condition and other method call is mixed, this method is really useful.

And another benefit of this method is to call any method in method chaining. Any Crystal users dislike this because they are genius, so they can think the best variable name every time. However I can't do it every time. Naming is important, but writing executable program is more important, so I like this.

@ronen
Copy link

ronen commented Dec 7, 2017

Re Object#let, FYI there's a ruby gem with exactly that name, see rubygems object-let.

Disclaimer: I'm the author. Not that there's much code in there :)

@sudo-nice
Copy link

I abuse #try for that too. Most time it serves well, but sometimes feels like a hack: some_method.not_nil!.try....

@elebow
Copy link
Contributor

elebow commented Feb 19, 2022

It seems that, other than a caveat with Nil#try, Crystal's Object#try already has the same semantics as Ruby's Object#yield_self/Object#then, but the method names and documentation suggest different intended uses.

Therefore, I propose two things:

1. I encourage the team to rethink the decision against Object#yield_self and Object#then in the standard library. The previous decision in this thread was made several years ago, when yield_self and then were still new and strange in Ruby. Now in 2022, they have become well-established parts of the language, and there have been several independent proposals to add these to Crystal referenced in this thread. See the two examples below of the expressiveness that these methods would allow.

2. If the first proposal is not accepted (or even if it is), the Object#try documentation should be amended to reassure users that it is an acceptable substitute for #yield_self/#then, and not hacky. The documentation could explicitly state that this is the Crystal equivalent of those Ruby methods, to improve discoverability of the feature. I can open a PR with suggested wording if this is indeed true.


Here are two real-world use cases for Object#yield_self and Object#then. The code has been genericized from a proprietary repo.

Suppose we have a web request handler that enqueues a background job and notifies the user of the result.

def enqueue_job
  # enqueues a background job
end

def notify_user
  # sends some JSON to the user's browser
end

def handle_request
  # ...

  enqueue_job.then do |result|
    case result
    when :accepted
      notify_user(:accepted)
    when :rejected
      notify_user(:rejected)
    else
      log(result)
      notify_user(:error)
    end
  end

  # ...
end

I find this construction in the functional-programming style nicer than storing result as a local variable. The block parameter is visually associated with enqueue_job, and the method's namespace is not crowded with an unnecessary variable.


A similar example, where yield_self is used to create an anonymous hash. This is neater than having an intermediate variable for the redacted attributes, and this also demonstrates a case where yield_self sounds better than then.

def handle_request(attrs)
  log_request attrs.redact_private_data
                   .yield_self do |redacted_attrs|
                     {
                       name: redacted_attrs["name"],
                       email: redacted_attrs["email"],
                       ip: request.remote_ip,
                       admin: current_user.admin?
                     }
                   end

  save_to_database(attrs)
end

@straight-shoota
Copy link
Member

Disclaimer: The following is just my personal opinion on code style.

Looking at the examples by @elebow, I don't buy that this is a good way to express the code's intention.

The first example looks much better readable to me when you assign the value in the case expression:
This makes it very clear that the value of enque_job is the target of the branching expressions and can furthermore be referred to as a local variable. There's IMO no need for a nested block. It just adds noise in my opinion.

def handle_request
  # ...

    case result = enqueue_job
    when :accepted
      notify_user(:accepted)
    when :rejected
      notify_user(:rejected)
    else
      log(result)
      notify_user(:error)
    end
  end

  # ...
end

For the second example, without looking into the exact semantics of the methods being called, just considering the structural shape, I think it does a bad job explaining what's going on. It's a chain of methods being called, with the last call receiving a block that constructs a hash. It's not very obvious that this hash is actually the return value of the entire call chain.

Consider that alternative implementation instead. It does exactly the same thing, with a lot less visual clutter and much clearer in expressing how the log request data is created.

def handle_request(attrs)
  redacted_attrs = attrs.redact_private_data
  log_request({
    name: redacted_attrs["name"],
    email: redacted_attrs["email"],
    ip: request.remote_ip,
    admin: current_user.admin?
  })

  save_to_database(attrs)
end

I think the major point is that I don't agree with this sentiment:

The block parameter is visually associated with enqueue_job, and the method's namespace is not crowded with an unnecessary variable.

IMO it's not worth going lengths just to avoid introducing additional local variables in the method namespace. Introducing an additional scope just for the sake of that doesn't pay out.

Maybe it could be an idea to think about finding a middle ground for limiting the scope of local variables in some situations. For example, a local variable declared in a conditional expression (such as case result = enqueue_job ...) could be scoped to the conditional clause. If you want the variable to be visible beyond the corresponding end, it can simply be declared outside the conditional expression: result = enqueue_job; case result .... I could see that this would make kind of sense. Not sure it's really a good idea, though. It could be surprising behaviour, definitely a breaking change and has limited benefit (wether the local var is visible beyond the conditional is not really a practical issue). So it's probably not worth the effort.

@elebow
Copy link
Contributor

elebow commented Feb 22, 2022

@straight-shoota, I agree your suggested code is neater than my examples. Still, I think the style has merit in some situations. See these other real-world examples:

As for a language change regarding the scope of local variables, I agree that would be detrimental. I would argue only for adding the methods to Object. Especially since the existing Object#try is so close! And, it is unlikely that the names Object#yield_self/#then would ever be wanted in the future for some other non-Ruby-compatible behavior.


(mentioning #1388 (comment) to link a related discussion that is not yet referenced in this thread).

@straight-shoota
Copy link
Member

straight-shoota commented Feb 22, 2022

I'm pretty sure that most of the examples you posted would actually be easier to read if they were written as independent expressions instead of as a chain with yield_self. IMO it introduces avoidable cognitive overload when the primary expression is transformed to something else in the manner of x = fetch_a.yield_self { its_actually_b }. 🧠 💣

#then would likely collide with promise pattern (e.g. https://github.com/spider-gazelle/promise/blob/0b2e527c0c9c4207456cad64bc374aa2c6a793fa/src/promise/resolved_promise.cr#L31).

Sure, #yield_self would be cheap to add and there's little chance of conflict.
But at least I would like to be convinced that it is actually useful for code improvement. So far I have the expression that it's mostly used for making code worse to read. 🤷‍♂️ Of course, in the end it's still a decision of the developer whether to use it or not. But when it has such a bad impact as I'm observing, then I'd rather prefer to not give the decision (at least not from stdlib; ofc anyone can add that method in user code if they wish).
Sorry for making this hard on you ;) I think this discussion is very good though, and I'm happy to approve this with convincing argument.

@ysbaddaden
Copy link
Contributor

I had to search, but I finally found one case where it may be nice (maybe):

def key
  @key ||= begin
    cipher = OpenSSL::Cipher.new(DEFAULT_CIPHER)
    cipher.random_key.hexstring
  end
end

# self_yield allows to skip the begin/end block:
def key
  @key ||= OpenSSL::Cipher.new(DEFAULT_CIPHER).yield_self do |cipher|
    cipher.random_key.hexstring
  end
end

@asterite
Copy link
Member

I still we should do it like Ruby and add a then method: https://ruby-doc.org/core-3.1.0/Kernel.html#method-i-then

Implementing that method is trivial! And it would solve all these scenarios:

  • People using try instead of then
  • People who want to have something like a pipe operator
  • Discussing these things over and over :-)

@straight-shoota
Copy link
Member

@ysbaddaden I don't quite understand. Why would you use either of those forms instead of just a call chain?

def key
  @key ||= OpenSSL::Cipher.new(DEFAULT_CIPHER).random_key.hexstring
end

@Sija
Copy link
Contributor

Sija commented Feb 22, 2022

Related #7188 #11487

@ysbaddaden
Copy link
Contributor

@straight-shoota 🤦 ...

Allright, let's imagine there is some other cipher.something calls with no chaining involved? I usually solve it by extracting a method, but it may be nice to not have to create yet-another-method?

def key
  @key ||= generate_key
end

private def generate_key
  # ...
end

@asterite As said above #then may collide when designing futures, but maybe that's not a problem, since a hypothetical Future#then would just override Object#then and the original #then becomes kinda moot, so 🤷

@grepsedawk
Copy link

I still we should do it like Ruby and add a then method: https://ruby-doc.org/core-3.1.0/Kernel.html#method-i-then

Implementing that method is trivial! And it would solve all these scenarios:

  • People using try instead of then
  • People who want to have something like a pipe operator
  • Discussing these things over and over :-)

I agree, #then seems super idiomatic and explains the operation (and matches ruby, which is pretty normal for crystal)

In terms of examples, the reasons I love .then/.try is to build out composable conditionals like such: https://thoughtbot.com/blog/using-yieldself-for-composable-activerecord-relations

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests