Add command spoom srb sigs translate to translate RBI signatures into RBS comments #611

Open: wants to merge 3 commits into base: main
2 changes: 1 addition & 1 deletion Gemfile.lock
@@ -40,7 +40,7 @@ GEM
     racc (1.8.0)
     rainbow (3.1.1)
     rake (13.2.1)
-    rbi (0.2.0)
+    rbi (0.2.1)
       prism (~> 1.0)
       sorbet-runtime (>= 0.5.9204)
     rdoc (6.6.3.1)
4 changes: 4 additions & 0 deletions lib/spoom/cli/srb.rb
@@ -4,6 +4,7 @@
 require_relative "srb/bump"
 require_relative "srb/coverage"
 require_relative "srb/lsp"
+require_relative "srb/sigs"
 require_relative "srb/tc"

 module Spoom
@@ -19,6 +20,9 @@ class Main < Thor
         desc "bump", "Change Sorbet sigils from one strictness to another when no errors"
         subcommand "bump", Spoom::Cli::Srb::Bump

+        desc "sigs", "Translate signatures from/to RBI and RBS"
+        subcommand "sigs", Spoom::Cli::Srb::Sigs
+
         desc "tc", "Run typechecking with advanced options"
         subcommand "tc", Spoom::Cli::Srb::Tc
       end
49 changes: 49 additions & 0 deletions lib/spoom/cli/srb/sigs.rb
@@ -0,0 +1,49 @@
# typed: true
# frozen_string_literal: true

require "spoom/sorbet/translate_sigs"

module Spoom
  module Cli
    module Srb
      class Sigs < Thor
        include Helper

        desc "translate", "Translate signatures from/to RBI and RBS"
        option :from, type: :string, aliases: :f, desc: "From format", enum: ["rbi"], default: "rbi"
        option :to, type: :string, aliases: :t, desc: "To format", enum: ["rbs"], default: "rbs"
        def translate(*paths)
          from = options[:from]
          to = options[:to]
          paths << "." if paths.empty?

          files = paths.flat_map do |path|
            if File.file?(path)
              [path]
            else
              Dir.glob("#{path}/**/*.rb")
            end
          end

          if files.empty?
            say_error("No files to translate")
            exit(1)
          end

          say("Translating signatures from `#{from}` to `#{to}` in `#{files.size}` files...\n\n")

          files.each do |file|
            contents = File.read(file)
            contents = Spoom::Sorbet::TranslateSigs.rbi_to_rbs(contents)
            File.write(file, contents)
          rescue RBI::ParseError => error
            say_warning("Can't parse #{file}: #{error.message}")
            next
          end

          say("Translated signatures in `#{files.size}` files.")
        end
      end
    end
  end
end
164 changes: 164 additions & 0 deletions lib/spoom/sorbet/translate_sigs.rb
@@ -0,0 +1,164 @@
# typed: strict
# frozen_string_literal: true

require "rbi"

module Spoom
  module Sorbet
    class TranslateSigs
      class << self
        extend T::Sig

        sig { params(ruby_contents: String).returns(String) }
        def rbi_to_rbs(ruby_contents)
          ruby_contents = ruby_contents.dup

          tree = RBI::Parser.parse_string(ruby_contents)

          translator = RBI2RBS.new
          translator.visit(tree)
          sigs = translator.sigs.sort_by { |sig, _rbs_string| -T.must(sig.loc&.begin_line) }

          sigs.each do |sig, rbs_string|
            scanner = Scanner.new(ruby_contents, Encoding::UTF_8)
How do you know that the source is UTF-8?

Shouldn't this be:

Suggested change
scanner = Scanner.new(ruby_contents, Encoding::UTF_8)
scanner = Scanner.new(ruby_contents, ruby_contents.encoding)
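
To illustrate the point behind this suggestion: a Ruby string carries its own encoding tag, so the scanner can be handed that tag instead of a hard-coded constant. The snippet below is only a sketch with made-up variable names; it does not use the PR's `Scanner` class.

```ruby
# Illustrative only: the same text tagged with two different encodings.
utf8_contents = "sig { void }"
latin1_contents = utf8_contents.encode(Encoding::ISO_8859_1)

# Each string reports its own encoding, so nothing has to be assumed.
puts utf8_contents.encoding
puts latin1_contents.encoding

# Hard-coding UTF-8 would mislabel the second string.
puts latin1_contents.encoding == Encoding::UTF_8
```

Strings returned by `File.read` are tagged with the default external encoding unless one is requested explicitly, which is why passing `ruby_contents.encoding` is the safer default.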

            start_index = scanner.find_char_position(
              line: T.must(sig.loc&.begin_line) - 1,
              character: T.must(sig.loc).begin_column,
            )
            end_index = scanner.find_char_position(
              line: sig.loc&.end_line&.-(1),
              character: T.must(sig.loc).end_column,
            )
            ruby_contents[start_index...end_index] = rbs_string
          end

          ruby_contents
        end
      end

      class RBI2RBS < RBI::Visitor
        extend T::Sig

        sig { returns(T::Array[[RBI::Sig, String]]) }
        attr_reader :sigs

        sig { void }
        def initialize
          super
          @sigs = T.let([], T::Array[[RBI::Sig, String]])
        end

        sig { override.params(node: T.nilable(RBI::Node)).void }
        def visit(node)
          return unless node

          case node
          when RBI::Method
            translate_method_sigs(node)
          when RBI::Attr
            translate_attr_sigs(node)
          when RBI::Tree
            visit_all(node.nodes)
          end

          super
        end

        private

        sig { params(node: RBI::Method).void }
        def translate_method_sigs(node)
          node.sigs.each do |sig|
            out = StringIO.new
            p = RBI::RBSPrinter.new(out: out, indent: sig.loc&.begin_column)

            if node.sigs.any?(&:is_abstract)
              p.printn("# @abstract")
              p.printt
            end

            if node.sigs.any?(&:is_override)
              p.printn("# @override")
              p.printt
            end

            if node.sigs.any?(&:is_overridable)
              p.printn("# @overridable")
              p.printt
            end

            p.print("#: ")
            p.send(:print_method_sig, node, sig)

            @sigs << [sig, out.string]
          end
        end

        sig { params(node: RBI::Attr).void }
        def translate_attr_sigs(node)
          node.sigs.each do |sig|
            out = StringIO.new
            p = RBI::RBSPrinter.new(out: out)
            p.print_attr_sig(node, sig)
            @sigs << [sig, "#: #{out.string}"]
          end
        end
      end

      # From https://github.com/Shopify/ruby-lsp/blob/9154bfc6ef/lib/ruby_lsp/document.rb#L127
      class Scanner
        extend T::Sig

        LINE_BREAK = T.let(0x0A, Integer)

        # After character 0xFFFF, UTF-16 considers characters to have length 2 and we have to account for that

I think this complexity with respect to UTF-16 codepoints was only there because VS Code only considers UTF-16 while our code is in UTF-8.

I am not sure if we need the same complexity here. We should be able to treat it as a string of codepoints (or even bytes).
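
As a rough sketch of what that simplification could look like: the helper below maps a 0-based line and column to a character index by walking the lines, with no UTF-16 correction. `char_index_for` is a hypothetical name, not part of this PR, and the sketch assumes the columns are counted in characters; if they turn out to be byte offsets, the same loop would need `bytesize` and `byteslice` instead.

```ruby
# Hypothetical helper, not part of the PR: maps a 0-based line and column
# to a character index in `source`, treating the string as plain characters.
def char_index_for(source, line:, column:)
  offset = 0
  source.each_line.with_index do |text, index|
    return offset + column if index == line

    # `length` counts characters, so multibyte characters count once.
    offset += text.length
  end
  raise ArgumentError, "line #{line} is out of range"
end

source = "# héllo\nsig { void }\ndef foo; end\n".dup
start_index = char_index_for(source, line: 1, column: 0)
end_index = char_index_for(source, line: 1, column: 12)

# Splice the replacement over the signature, as `rbi_to_rbs` does.
source[start_index...end_index] = "#: -> void"
puts source
```

Unlike the scanner's `until` loop, this version raises when the requested line does not exist instead of scanning past the end of the source.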

        SURROGATE_PAIR_START = T.let(0xFFFF, Integer)

        sig { params(source: String, encoding: Encoding).void }
        def initialize(source, encoding)
          @current_line = T.let(0, Integer)
          @pos = T.let(0, Integer)
          @source = T.let(source.codepoints, T::Array[Integer])
          @encoding = encoding
        end

        # Finds the character index inside the source string for a given line and column
        sig { params(position: T::Hash[Symbol, T.untyped]).returns(Integer) }
        def find_char_position(position)
          # Find the character index for the beginning of the requested line
          until @current_line == position[:line]
            @pos += 1 until LINE_BREAK == @source[@pos]
            @pos += 1
            @current_line += 1
          end

          # The final position is the beginning of the line plus the requested column. If the encoding is UTF-16,
          # we also need to adjust for surrogate pairs
          requested_position = @pos + position[:character]

          if @encoding == Encoding::UTF_16LE
            requested_position -= utf_16_character_position_correction(@pos, requested_position)
          end

          requested_position
        end

        # Subtract 1 for each character after 0xFFFF in the current line from the column position, so that we hit the
        # right character in the UTF-8 representation
        sig { params(current_position: Integer, requested_position: Integer).returns(Integer) }
        def utf_16_character_position_correction(current_position, requested_position)
          utf16_unicode_correction = 0

          until current_position == requested_position
            codepoint = @source[current_position]
            utf16_unicode_correction += 1 if codepoint && codepoint > SURROGATE_PAIR_START

            current_position += 1
          end

          utf16_unicode_correction
        end
      end
    end
  end
end
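
A note on the replacement loop in `rbi_to_rbs`: signatures are sorted by descending `begin_line` before being spliced, so each replacement only shifts text that has already been handled. The sketch below shows why that ordering matters, using hypothetical character offsets in place of RBI locations.

```ruby
# Hypothetical data: [start_index, end_index, replacement] triples computed
# against the ORIGINAL string, in the order they were found.
source = "aaa bbb ccc".dup
edits = [
  [0, 3, "x"],
  [8, 11, "longer"],
]

# Apply the edits from the end of the string toward the beginning. A
# replacement that changes the length only moves text located after it,
# so the offsets of the edits still to be applied remain valid.
edits.sort_by { |start_index, _end_index, _text| -start_index }.each do |start_index, end_index, text|
  source[start_index...end_index] = text
end

puts source
```

Applying the same edits front to back would shorten the string first and leave the second pair of offsets pointing at the wrong characters.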