Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better completion using types #734

Closed
tompng opened this issue Oct 18, 2023 · 22 comments
Closed

Better completion using types #734

tompng opened this issue Oct 18, 2023 · 22 comments
Labels
enhancement New feature or request

Comments

@tompng
Copy link
Member

tompng commented Oct 18, 2023

Background

Current RegexpCompletor has many limitations. It can't complete chained methods. Many wrong completions.
For better completion, we need to parse the incomplete input and perform type analysis.
We need to use RBS and an error tolerant parser like Prism.

PullRequest

WIP pull request is opend in #708.
Fallbacks to RegexpCompletor if Prism is not available.

Performance

RBS loading performance depends on rbs_collection.yml and project's rbs size under sig/ dir. It takes 0.168721s when there is no yml and sig.
Completion performane depends on code size editing in irb. For example below, it only analyze "n.times do\n _1." with binding that knows the actual value of n.

irb(main):123> n.times do
irb(main):124>   _1.

Analyzing 1161 lines takes 0.072726s. For comparision, coloring code takes 0.057202s

code = File.read 'lib/irb/type_completion/type_analyzer.rb'
# Force to analyze every single nodes
class IRB::TypeCompletion::TypeAnalyzer::DigTarget; def dig?(node)=true; end
measure
IRB::TypeCompletion::Completor.new.completion_candidates code, '.', '', bind: binding;
# => processing time: 0.072726s
IRB::Color.colorize_code(code);
# => processing time: 0.057202s

Accuracy

Comparision with current RegexpCompletor

As far as I know, the new completor can complete every case that RegexpCompletor can complete, even if RBS is not available.

# Completion without RBS that RegexpCompletor can't complete

def foo(n = 0)
  a = n; a = 1.0 if b; a. # Integer and Float methods
end

class User < ApplicationRecord
  has_█ # has_many, has_one
  def bar
    sa█ # save, save!
  end
end

module IRB
  A█ # IRB.constants and Object.constants
  @@a█ # IRB's class variable names
end

With RBS

When RBS is available, it can complete more.

# Completion with RBS

'a'.bytes.shuffle.join.encoding. # Encoding methods

10.times.map(&:to_s).map { _1.to_sym }.map(&:█) # Symbol methods

object.instance_eval do
  @a█ # object's instance method names
  self. # object's methods
end

%w[a b c].each_with_index do
  s = [:s] * _1
  a = [:a] * _2
  s[0]. # String methods
  a[0]. # Symbol methods
end

In Rails apps

Adding rbs_collection.yaml and sig/*.rbs to the Rails project will significantly improve accuracy. Article.first.author.email.und can complete String#underscore if types are defined in project's sig directory.
Providing good completion experience in rails console will be a motivation for users to write .rbs in their project.

Other type analysis libraries

This completor is a REPL-first type analyzer. It uses few information about the project codebase and many dynamic information from binding. Other type analyzer uses lots of project's code and no binding information.
I think it's hard to use another type analyzer and integrate with binding information. Maybe this could change in the future.

Where to have this code, IRB or bundled/default gem?

I want to have this code inside IRB's repository.

I want to enable it by default, so this code should be inside IRB or default/bundled gem, not normal gem that needs installation.
Want to completely replace RegexpCompletor in the future, I think default completor can be controlled inside IRB.
I think the possible user of REPL-first completor is only IRB.
It's easy to decide and have it inside IRB. As a bundled or default gem, it should be discussed in bugs.ruby-lang.org.
My opinion is, let's include it, get feedbacks, consider splitting it or not later.

@st0012
Copy link
Member

st0012 commented Oct 27, 2023

To help us better test this feature, what do you think about adding rbs files to IRB and add a CI step to typecheck it? Also, do you know if there are some open source projects that use RBS?

@tompng
Copy link
Member Author

tompng commented Oct 30, 2023

CI step to typecheck

Looks good. I don't know if its easy or hard, but we should try it.

Lrama https://github.com/ruby/lrama started to use RBS and Steep
List of gems using RBS as a library https://rubygems.org/gems/rbs/reverse_dependencies

@st0012
Copy link
Member

st0012 commented Nov 2, 2023

Firstly, thank you for the considerable effort you've put into this feature ❤️
Utilizing the power of typing in IRB is indeed a significant leap forward not just for IRB, but for the language as well.

General Feedback

  1. I propose we merge this feature before Ruby 3.3.
  2. However, I am against enabling it by default in 3.3.

The reason for the latter is due to the critical nature of this feature in the users' workflow. If anything goes wrong, it could heavily impact our users and potentially erode their trust in IRB. The introduction of the current autocompletion with Ruby 3.1, which was enabled by default, led to numerous issues as seen in #445. Although I believe this feature will be better tested and reviewed with our team of IRB maintainers, I'd like to avoid a similar situation as much as possible.

Public Rollout (Enabling by Default)

Personally, I think we can aim for the end of next year. However, before that, we need to:

  • Ensure all maintainers understand how this new feature works from the inside out. Some technical documentation, such as diagrams showing how components interact, would be incredibly helpful.
  • Encourage users to manually enable it and understand their reasons if they choose not to.
  • Address any issues that arise as adoption grows.

Hopefully by then we can drop Ruby 2.7 and add Prism as a dependency too.

Questions about Maintenance and Future Development

With the completion results now depending on a variety of factors, such as users' type signatures, community-contributed signatures, and the type analyzer itself, we need to consider:

  • How can we assist users in reporting issues and effectively investigate them?
  • Given that the type-analyzing mechanism will differ from Steep’s, how do we determine if a result is legitimate or not?
  • How do we evaluate if the feature is worth the extra maintenance effort?
  • Is there a contingency plan to replace this feature if our answer to the above question is negative?
  • Since the debug gem also uses IRB’s completion, do you plan to collaborate with its maintainer Koichi to adopt this feature there?

Additional Thoughts/Concerns

  • If we incorporate this directly into IRB, we essentially become maintainers of the type analyzers too, which is a substantial task. Relying on a single maintainer for such a significant feature is not ideal, and I'd like to avoid it. However, I'm unsure how we can effectively increase our bandwidth.

  • Providing good completion experience in rails console will be a motivation for users to write .rbs in their project.

    In my opinion, the current completion experience is already satisfactory, largely thanks to your improvements. I believe this feature will primarily enhance the experience for users who have already adopted RBS, rather than encouraging new adoption.

@st0012 st0012 added the enhancement New feature or request label Nov 2, 2023
@tompng
Copy link
Member Author

tompng commented Nov 3, 2023

How can we assist users in reporting issues and effectively investigate them?

Add Completion section to irb_info command. This is implemented in #708

Completion: Tab Complete, RegexpCompletor
Completion: Autocomplete, RegexpCompletor
Completion: Autocomplete, TypeCompletion::Completor(Prism: 0.16.0, RBS: loading)
Completion: Autocomplete, TypeCompletion::Completor(Prism: 0.16.0, RBS: 3.3.0.pre.2)
Completion: Autocomplete, TypeCompletion::Completor(Prism: 0.16.0, RBS: #<LoadError: cannot load such file -- rbs>)
Completion: Autocomplete, TypeCompletion::Completor(Prism: 0.16.0, RBS: #<RBS::Collection::Config::CollectionNotAvailable:"rbs collection is not initialized.\nRun `rbs collection install` to install RBSs from collection.\n">)

Given that the type-analyzing mechanism will differ from Steep’s, how do we determine if a result is legitimate or not?

Type analyzing is simple (less functional) than Steep. We should determine fixing the difference is possible in the simplified specification or not. (aims to avoid extra maintenance cost)

How do we evaluate if the feature is worth the extra maintenance effort?

Measure how many user enabling and re-disabling it. Any idea of comparing with extra maintenace effort?

Is there a contingency plan to replace this feature if our answer to the above question is negative?

We can simplify some of the analysis that causing maintenance problem. example:

  • Remove type analyzing, just parse by Prism and autocomplete literals and variable. (slightly better than RegexpCompletor)
  • Remove tracking local variable change like a=1; a+=0.5; a="" if condition; a.string_or_float_or_int_method
  • Remove supporting generic types, Array[Integer] to just Array[untyped]

Since the debug gem also uses IRB’s completion, do you plan to collaborate with its maintainer Koichi to adopt this feature there?

Yes, we should.
Not thinking how to yet. Maybe we can add an API to get appropriate completor

@st0012
Copy link
Member

st0012 commented Nov 3, 2023

Add Completion section to irb_info command.

That's helpful and I did notice that. But what I'm asking is in a broader sense, like:

  • What will a good repro for these issues look like? Is it possible to have a single file repro script template that has both rbs and Ruby code?
  • Should we maybe create a separate issue template so we can ask type-specific questions without bloating other IRB issues' reporting?

Type analyzing is simple (less functional) than Steep.

It will be great to have a few examples demonstrating this in the readme so we set the right expectation to other maintainers and our users.

I think there could also be cases where it's more powerful than Steep given we have runtime info? Examples of them should be mentioned too.

Measure how many user enabling and re-disabling it. Any idea of comparing with extra maintenace effort?

I think we have limited options to measure the benefit it brings to the community. Code searching may not be too helpful because if people enable this at project-level .irbrc, we generally won't see it. Perhaps the most effective way is to host polls on social media platforms, maybe around next RubyKaigi?

Wrt the extra effort, I think the we can get a sense of it by counting PRs associated with these components.

I haven't think of a way to compare the two though. It's likely that we won't be able to make any clear rule on this, but it's important that we keep this in mind.

We can simplify some of the analysis that causing maintenance problem. example:

Thank you for these examples 👍

Not thinking how to yet. Maybe we can add an API to get appropriate completor

That integration could require some extra work because currently its console can't talk to individual threads directly.

@tompng
Copy link
Member Author

tompng commented Nov 3, 2023

For reporting issues, single file repro is hard because of the setup to create binding(self object, module nesting, variables) and environment(sig, rbs_collection.yaml).
I think issue template works. These information will help us.

  • irb_info
  • Version of RBS and Prism
  • Content of sig/*.rbs, rbs_collection.yaml and rbs_collection.lock.yaml
  • IRB::TypeCompletion::Types.rbs_load_error and it's backtrace
  • IRB::TypeCompletion::Completor.last_completion_error and it's backtrace (the method is not implemented yet)
  • What you typed to IRB

have a few examples demonstrating this in the readme

host polls on social media platforms

Looks good 👍

@st0012
Copy link
Member

st0012 commented Nov 4, 2023

@tompng Can you start another PR, maybe on top of #708, to draft the documentation for this feature?

I think we need to cover these:

  • How to activate it and verify the activation
  • Where it collects signatures from
    • This is important because users can define signatures outside of sig/ and configure Steep to locate them, which won't work for this feature atm
  • Potential impact on performance
  • Cases where it's more/less accurate than Steep

@ima1zumi
Copy link
Member

ima1zumi commented Nov 5, 2023

I read part of IRB#708 and this discussion. Thanks for making the Issue.

My thoughts

  • I agree with the inclusion of Type Completor. Regexp Completor has a lot of wrong guesses, runs slow and is not user friendly, I think Type Completor will be a useful feature for users.
  • I have some concerns about whether I can maintain it myself. (I'm reading the PR, but it looks difficult.)
  • I agree with turning it off by default in 3.3. I would like feedback from users.

@hasumikin
Copy link
Collaborator

I vaguely think that it would be helpful to have a debugging feature that displays something like a backtrace of why the current list of autocomplete candidates was generated.

@st0012
Copy link
Member

st0012 commented Nov 6, 2023

In Sorbet, there's a T.reveal_type helper:

Screenshot 2023-11-06 at 14 16 47

I think if we have something like irb_reveal_type(my_var) and simply prints the type of my_var, it will be very helpful too.

@tompng
Copy link
Member Author

tompng commented Nov 6, 2023

#708 can do something like this:

irb(main):001>  a, b = 1, 'a'
=> [1, "a"]

irb(main):002> IRB::TypeCompletion::Completor.new.analyze('[a, b].map(&:a', binding)
=> [:call, Integer | String, "a", false]

irb(main):003> res = IRB::TypeCompletion::Completor.new.analyze('a.times {|aa| a', binding)
=> [:lvar_or_method, "a", #<IRB::TypeCompletion::Scope:0x0000000107740dd0>]

irb(main):004> res[2].local_variables
=> ["aa", "res", "a", "b", "_"]

irb(main):005> res[2].self_type
=> Object

I think adding a method or command that takes (code_to_complete, binding) and shows formatted output of these can help.

@st0012
Copy link
Member

st0012 commented Nov 8, 2023

Now asking to change the current implementation, but I think a benefit of extracting the completor implementation to a gem is that it will give us better dependency control and simpler implementation:

With the current approach, we need to passively check the version of prism installed by users and can't really bind upper bound restriction to rbs and prism to avoid breaking changes. This means that we always need to support the latest rbs and prism change and having ugly version branch in the implementation if we want to maintain backward compatibility.

If type_completion is a separate gem, we can add rbs and prism as dependencies like what we usually do and avoid the above problem. For example:

  • type_completion v1.1.0 - requires prism >= 0.17, <= 0.18
  • type_completion v1.2.0 - requires prism >= 0.18, <= 0.19

And in IRB, we simply ask users to have type_completion installed and don't care about prism or rbs changes.

@tompng
Copy link
Member Author

tompng commented Nov 16, 2023

I see. That's right, it makes sense.
I think we need to decide name and the api.

Naming

As a separate gem, the name type_completion is too generic because it's a completion only for binding-provided environment, not for editors or other.
What do you think of names like these?
repl_completion binding_completion irb_completion
repl_type_completion binding_type_completion irb_type_completion

Completion API

I think completion_candidates and doc_namespace api is enough for the first version.
Although, it's worth providing categorized/detailed completion candidates in the future.

@st0012
Copy link
Member

st0012 commented Nov 21, 2023

I've been thinking the word synthesis, which means

the combination of components or elements to form a connected whole.

I think it matches the fact that we combine typing and binding information together to power the completion.

But I'm not sure if people would think the name synthesis_completor too abstract/difficult to understand 🤔

If the above idea doesn't sound good, I'd go with type_completor.

@hasumikin
Copy link
Collaborator

Naming is always the hardest thing :)

I don't have a strong opinion but I feel it's better to include irb. So how about irb_type_completor?

@tompng
Copy link
Member Author

tompng commented Nov 23, 2023

I slightly prefer using repl_ over irb_ because it can be also used in debugger's REPL-feature, but irb_ is ok for me.
I like irb_type_completor because it looks straighforward and understandable that irb --type-completor needs it.

@st0012
Copy link
Member

st0012 commented Nov 23, 2023

Compared with irb_type_completor, I prefer repl_type_completor too so it's not too restricted to a single tool.

@ima1zumi
Copy link
Member

I agree with Stan. I prefer repl_type_completor too.

@hasumikin
Copy link
Collaborator

repl_type_completor 👍

@tompng
Copy link
Member Author

tompng commented Nov 28, 2023

Created https://github.com/ruby/repl_type_completor/

@tompng
Copy link
Member Author

tompng commented Dec 6, 2023

Finished by #772

@tompng tompng closed this as completed Dec 6, 2023
@st0012
Copy link
Member

st0012 commented Dec 6, 2023

@tompng Thank you for the vision and the tremendous effort you put into it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

No branches or pull requests

4 participants