Make the scala_protobuf aspect based #700

Merged
34 commits merged into bazelbuild:master on Mar 6, 2019


ianoc-stripe
Contributor

Working on aspect-based scala_protobuf rules.

These use the toolchain to configure the options instead of setting them on the rule, which allows us to build out the shadow graph and share it.

The only part I haven't implemented, and am not entirely sure how I should, is the java conversions. This would also be a pretty large change for someone to implement. (Though I don't know when you'd usually want to mix and match these options within a repo, so it might not be much of a deal other than deleting some lines.)

@johnynek
Member

@ianoc-stripe this is great! thanks.

Did you see if this addresses some concerns from internal uses? Can you comment on them here so people on google can see some of our motivations?

@johnynek
Member

Thanks for taking this on.

I took a look, generally it looks like the right direction to me.

The challenge here is that the options are set once for all targets, I think. That may be okay, or may not. Options to aspects didn't have a lot of features when I looked at this with thrift.

@ittaiz can you and your team take a look at this?

load("//scala_proto:scala_proto_toolchain.bzl", "scala_proto_toolchain")

toolchain_type(
name = "toolchain_type",
Member

I freely admit to not understanding this stuff. It seems really weird to me the whole toolchain_type, toolchain, impl of the toolchain.... but I guess maybe they are mapping a typed system onto dynamic values in skylark.

Contributor Author

I don't really know how much of this is required, tbh. I cargo-culted this from the scala toolchain.

name = "default_toolchain_impl",
with_grpc = False,
with_flat_package=False,
# with_java=False,
Member

why the comment? to show the setting?

Contributor Author

Right now this is commented out because I haven't figured out how it should be implemented dynamically. We need to depend on the java aspect only when this is true, and since the java aspect is itself defined in Java, I haven't yet found the right magic invocation for depending on it.

# Pass inputs separately because they don't always match the imports (i.e. blacklisted protos are excluded)
inputs = _colon_paths(compile_proto)
)
print(worker_content)
Member

can we remove this?

Contributor Author

Yep

compile_protos = sorted(target_ti.direct_sources)
transitive_protos = sorted(target_ti.transitive_sources)

print("transitive_descriptor_sets")
Member

can we remove or make a flag for debugging?

label.name + "_scalac.statsfile",
sibling = scalapb_jar,
)
print("Compiling %s -> %s" %(label, scalapb_jar))
Member

remove or make optional.

implementation = _scalapb_aspect_impl,
attr_aspects = ["deps"],
attrs = {
"_pluck_scalapb_scala": attr.label(
Member

I don't think this pluck name is relevant here.

require(op.toFile.exists, s"Path $fullPath does not exist, which it should as a dependency of this rule")
val relativePath = rp.relativize(op)

println(s"Root: rp , op: $op and relativePath: $relativePath")
Member

comment out?


println(s"Root: rp , op: $op and relativePath: $relativePath")
relativePath.toFile.getParentFile.mkdirs
java.nio.file.Files.copy(op, relativePath)
Member

want to import Files?

test/proto/BUILD Outdated
deps = [
":test2",
":test_proto_java_lib",
# ":test_proto_java_lib",
Member

can we remove the comment.

test/proto/BUILD Outdated
name = "test1_proto_scala",
deps = ["//test/proto2:test"],
generator = "@io_bazel_rules_scala//src/scala/scripts:scalapb_generator")
# scala_proto_srcjar(
Member

I guess we want to enable these before merging? somehow? I think passing the generator should not be an option, but building the test1 target.

@ianoc-stripe
Contributor Author

What this is:
Replacing the existing rules with a scala proto library rule similar to the java one. With this we will effectively create synthetic scala protobuf libraries to match every proto_library instance. This means instead of one large target we will have a tree of smaller targets matching the proto_libraries.

Why/motivation:
Currently, if you build a proto file, the transitive set of protos is all handed to scalac and winds up in the final jar. This has three pretty large downsides:

  1. Compiling is super super slow in some of our targets when there is a large/wide dependency tree
  2. This means cache hits tend to be pretty poor
  3. We have duplicate classes on the classpath from every scala proto library, which makes the jars bigger and is poor hygiene.

Challenges/differences:

Since the scala_protos are now going to be generated via the aspect, we need to share a bunch of the generation options. I've moved these to a toolchain so they are set once for a repo. This is similar to how scalapb is mostly used from sbt, so I hope it's not a negative impact for folks.

One thing I haven't quite figured out yet is the with_java option, since for each of our synthetically generated nodes we will effectively need to depend on the corresponding java synthetic node at the same level. If I can get the right provider from the java_proto rule I think I can get it working, though I'm not sure about dynamically enabling aspects like this based on the toolchain configuration.

Also still to be implemented:
proto blacklist, this will move to the toolchain also.
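
To make the motivation above concrete, here is a small Python model (not Bazel code) comparing the two schemes: handing every Scala proto target its whole transitive closure versus compiling one unit per proto_library and sharing it. The dependency graph and all labels are made up for illustration, and the functions are only a sketch of the idea, not how the rules are implemented.

```python
# Toy model, not Bazel code: compare how many proto_library nodes get compiled
# under the old (transitive) and the aspect-based (per-node) approach.
# The dependency graph and all labels below are hypothetical.

DEPS = {
    "//a:a_proto": ["//common:base_proto"],
    "//b:b_proto": ["//common:base_proto"],
    "//common:base_proto": [],
}


def transitive_closure(label, deps):
    """Return the label plus everything reachable from it, without repeats."""
    seen = []
    stack = [label]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        stack.extend(deps[current])
    return seen


def old_style_compiles(roots, deps):
    """Old approach: every Scala proto target compiles its whole closure."""
    return [node for root in roots for node in transitive_closure(root, deps)]


def aspect_style_compiles(roots, deps):
    """Aspect approach: each proto_library is compiled once and shared."""
    compiled = []
    for root in roots:
        for node in transitive_closure(root, deps):
            if node not in compiled:
                compiled.append(node)
    return compiled


roots = ["//a:a_proto", "//b:b_proto"]
old = old_style_compiles(roots, DEPS)
new = aspect_style_compiles(roots, DEPS)
print(len(old), len(new))
```

In this model the shared `//common:base_proto` node is compiled twice under the old scheme and once under the aspect scheme, which is the duplicate-class and cache-hit problem described in the list above.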

@ittaiz
Member

ittaiz commented Feb 26, 2019

Indeed thanks a lot!
I’ll take a look tomorrow but I just now skimmed it and I’m concerned about three things:

  1. Current rule is decoupled from scalapb and only the macro gives the scalapb wrapping. We’d like to preserve that if possible.
  2. Somewhat related to the above is that we need to pass a custom generator. We need to pass it to all targets but one so not sure how we can tackle it.
  3. Related to 1, we used to pass the free form string flags which now doesn’t seem like it’s possible

@ianoc-stripe
Contributor Author

@ittaiz on (1), do you mean you like having the scala_library just refer to the src jar of the scalapb code? Or what do you mean by decoupled?

@ianoc
Contributor

ianoc commented Mar 1, 2019

@ittaiz did you have a chance to take another look here? If you have a custom generator that needs to get passed to all targets we could do it via a toolchain config setting?

@johnynek
Member

johnynek commented Mar 1, 2019

@ianoc-stripe is this still a WIP? Can we merge master?

@ianoc
Contributor

ianoc commented Mar 1, 2019

@johnynek it's still WIP, just because there are two features not implemented yet: blacklists and java converters. Blacklists seem trivial; I'm not entirely sure yet how to correctly handle the java converters. It's also WIP so we can understand whether, for folks like @ittaiz, this is directionally usable or whether we need to consider putting this in as a new set of rules (given that aspect-based stuff implies more global configuration from the toolchain and less local per-target config).

@ittaiz
Member

ittaiz commented Mar 1, 2019

@simuons is the one leading our internal proto scala work and I asked him to take a look since I think this will severely break us :(
He took a look but hasn't gotten a chance to respond yet...
I'll try to take a look now as well

@johnynek
Member

johnynek commented Mar 1, 2019

what do you think will break you @ittaiz ? Maybe we can focus on that.

Ideally, the majority of old targets can be made to work with no change, and otherwise have a canonical simple change for the rest.

We are really concerned with performance and this was MUCH faster for us (according to what @ianoc-stripe told me offline).

@ittaiz
Member

ittaiz commented Mar 1, 2019

In general I'm very much in favor of this approach and want to merge it in even if it breaks some APIs and we need to do some work for it but it has to be compatible.
Re generator via toolchain I think that we have one target that is the inception point which needs a vanilla generator but maybe we can solve it via a nested workspace (a bit iffy but might stick)

@ianoc-stripe
Contributor Author

It seems a little odd, but it might be easy enough to maintain if we aimed for a global generator plus a blacklist for said generator. You could blacklist the inception point then?

@ittaiz
Member

ittaiz commented Mar 1, 2019

I don't remember the blacklist 100% unfortunately.
Let me take a look at the generator first.
Re coupling I mean that the toolchain has predefined flags (with_grpc for example) while we used scala_proto_srcjar which allowed free form flags and a custom generator.
We then had all targets but one use a macro which defined our custom generator and had a single target (don't know why exactly) use the vanilla generator (I think because of a cycle issue).

@ianoc-stripe changed the title from "[wip] Make the scala_protobuf aspect based" to "Make the scala_protobuf aspect based" on Mar 5, 2019
@ianoc-stripe
Contributor Author

So we will require bazel 0.23.0 for this, but now the tool can properly be configured on the toolchain.

Needs bazelbuild/bazel@5a20a44; before that commit it doesn't seem to have worked.

I've also made the changes to just use the Google protoc now.

With this change we can look closer at what would be required to support a custom generator for some specific labels

@ianoc
Contributor

ianoc commented Mar 5, 2019

ok @ittaiz see what you think now maybe. With this you can specify a second code generator, and then a set of targets to give to that code generator. Would that work for Wix?
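
A minimal sketch of the "second code generator for a set of targets" idea described above, written so that it is valid as both Python and Starlark. Plain strings stand in for Bazel labels; the function name, its parameters, and the `//tools:...` and `//proto/...` labels are hypothetical and are not taken from the PR's implementation.

```python
# Sketch of per-target generator selection. Strings stand in for Bazel labels
# and generator targets; names here are hypothetical, not the rule's real API.

def pick_code_generator(label, default_generator, override_generator, override_targets):
    """Return the override generator for listed targets, else the default."""
    if override_generator and label in override_targets:
        return override_generator
    return default_generator


default_gen = "@io_bazel_rules_scala//src/scala/scripts:scalapb_generator"
second_gen = "//tools:second_generator"  # hypothetical label
overrides = ["//proto/bootstrap:descriptor_proto"]  # hypothetical label

chosen_for_override = pick_code_generator(overrides[0], default_gen, second_gen, overrides)
chosen_for_other = pick_code_generator("//proto/app:app_proto", default_gen, second_gen, overrides)
```

Only targets named in the override list get the second generator; every other proto_library falls back to the toolchain-wide default, which is what keeps the configuration global rather than per-target.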

[ScalaPBImport],
],
attrs = {
"_protoc": attr.label(executable = True, cfg = "host", default = "@com_google_protobuf//:protoc"),
Member

@simuons this is the same change you have, do you agree?

Collaborator

Yes it is.

Member

@johnynek left a comment

This looks really great to me. Thank you for such a thorough job and for designing a workaround for Wix, even though we don't currently use this approach.

This looks like a major win to me, especially since our proto compilations are pretty slow now. This is exciting!

@ittaiz what do you think? It looks like your concerns were addressed. Would you agree?

@@ -2,48 +2,38 @@ package scripts

import java.nio.file.{Files, Path, Paths}

-class PBGenerateRequest(val jarOutput: String, val scalaPBOutput: Path, val scalaPBArgs: List[String], val protoc: Path)
+class PBGenerateRequest(val jarOutput: String, val scalaPBOutput: Path, val scalaPBArgs: List[String], val includedProto: List[(String, String)], val protoc: Path)
Member

wonder why this isn't a case class and we can remove all the vals...

Contributor

Will try to make that change so it's shorter.

val protoFiles = args.get(4).split(':')
val includedProto = args.get(1).drop(1).split(':').distinct.map { e =>
val p = e.split(',')
(p(0), p(1))
Member

should we just use Path here so we don't have to do the conversion later?

Contributor

Yep will update
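
For readers who want to see the argument format in isolation, here is a rough Python rendering of the parsing in the Scala snippet above, using paths from the start as the review suggests. It is only a sketch: the meaning of the dropped leading character and of the two fields in each pair is not stated in this thread, and the sample value is invented.

```python
from pathlib import Path


def parse_included_protos(raw):
    """Parse a ':'-separated list of 'first,second' entries into Path pairs.

    Roughly mirrors the Scala snippet: drop the first character, split on ':',
    de-duplicate while keeping first-seen order, then split each entry on ','.
    """
    entries = raw[1:].split(":")
    unique_entries = list(dict.fromkeys(entries))
    pairs = []
    for entry in unique_entries:
        parts = entry.split(",")
        pairs.append((Path(parts[0]), Path(parts[1])))
    return pairs


# The leading '-' and the paths are made-up sample values.
sample = "-proto_root,a/b.proto:proto_root,a/b.proto:other_root,c.proto"
parsed = parse_included_protos(sample)
```

Converting to paths at parse time means later code never has to handle the raw strings again, which is the point of the review comment.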

)

scala_proto_srcjar(
Member

can we no longer directly exercise srcjar?

Contributor

Nope. We could bring it back if it was useful, though since the aspect generates one internally they would be duplicates of some form.

code_generator = toolchain.override_code_generator


for lbl in toolchain.override_code_generator_targets:
Collaborator

This is identical to three lines above. I guess it should be removed.

Contributor

👍

@ittaiz
Member

ittaiz commented Mar 6, 2019

@ianoc @johnynek,
The PR currently doesn't work for us.
Previous implementation allowed for very low level free form manipulation that we used.
I think that @ianoc went out of his way to get the PR to work for our needs but because he doesn't have these pains (and because we weren't very good at explaining our needs) I think this is becoming a bit too hard.
Because we don't want to block you more what we suggest is that we'll copy over the existing implementation to our own code and we'll work on a solution to our needs based on rules_scala master (with the aspect).
This will allow you to merge, allow Wix to continue to upgrade rules_scala and allow us to formulate a sound solution to our needs (which we think are general when you use proto in a large organization).
WDYT?
In general our needs are around being able to have multiple and custom generators per target and possibly allowing these generators to have custom flags.

On a more abstract note, I have to say that I'm not sure how this should have turned out. On the one hand there is a significant advancement (written above) but on the other hand this is breaking existing features that work for us. Don't know...

@ianoc-stripe
Contributor Author

@ittaiz if an override generator doesn't quite fit the bill, how about we have a scala_proto_srcjar target similar to what we had before, with just the free form flags for the generator in the rule's args? In that target, though, we would assert that the target is in the toolchain's blacklist.

I'm still not entirely sure of your needs here, tbh, but I think you could then construct the core targets any way you need, followed by adding those targets into the toolchain's dependency set so all the toolchain generated targets depend on them?

@ittaiz
Member

ittaiz commented Mar 6, 2019

I hope @simuons will give a more thorough description of our needs but basically I under-represented the complexity. The core targets are a small issue.
The bigger issue is that we allow for many different additional custom generators. You can add tagless-final for FP oriented code, you can add accord validation and others.
The current approach is via a sort of dynamic decorator which takes a list of arguments (hence the need for free form flags in each target), splits them into scalapb args and a list of additional generators, and runs the additional generators according to that list.
This means that every target decides on the additional generators.
@simuons and his team are now trying to think and evaluate what is the best approach for this in bazel.
Their current approach is getting rid of the decorator generator in favor of multiple top level generators (multiple --out) or maybe having multiple protoc actions each with its own generator.

One core requirement is to allow the introduction of a custom generator without it needing to be blessed by the infra team. Another desired feature (not a must) is to avoid running every generator when a target doesn't want it. Not 100% sure about the latter.

One question though- I seem to remember that the aspect can read the attributes of the target it's inspecting. Why not have custom flags attribute on the rule?
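
The "dynamic decorator" described in this comment can be pictured with a short Python sketch. The actual flag format used at Wix is not given in this thread, so the `generator=` prefix, the function name, and the sample flags below are all hypothetical; the only part taken from the comment is the idea of splitting one free-form list into ScalaPB arguments and a list of extra generators.

```python
# Hypothetical convention: flags of the form "generator=<name>" select extra
# generators; everything else is passed through to ScalaPB unchanged.
GENERATOR_PREFIX = "generator="


def split_flags(flags):
    """Split free-form flags into (scalapb_args, extra_generators)."""
    scalapb_args = []
    extra_generators = []
    for flag in flags:
        if flag.startswith(GENERATOR_PREFIX):
            extra_generators.append(flag[len(GENERATOR_PREFIX):])
        else:
            scalapb_args.append(flag)
    return scalapb_args, extra_generators


scalapb_args, extra_generators = split_flags(
    ["grpc", "generator=tagless_final", "flat_package", "generator=accord"]
)
```

Because the split happens per target, each target can name its own extra generators, which is exactly the per-target freedom that a toolchain-wide configuration does not offer.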

@ianoc-stripe
Contributor Author

@ittaiz there is a tree of generated targets that is shared by multiple calls to the scala proto library rule. So the rule we are building for is the proto_library, not the scala_proto_library. The scala_proto_library then expresses a dependency on one of the aspect-generated targets. So it becomes difficult to say how a shared target should be built when different args are passed for generators.
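
The conflict described here can be illustrated with another toy Python model (again not Bazel code). It assumes, as the comment says, that there is one generated target per proto_library; the labels, flag names, and the `assign_flags` helper are hypothetical.

```python
# Toy model, not Bazel code: with one generated target per proto_library,
# per-consumer generator flags can collide on shared nodes.
# All labels and flag names below are hypothetical.

def assign_flags(requests, deps):
    """Map each proto_library to the flags requested for it; collect conflicts.

    requests: {root proto_library label: tuple of flags wanted by a consumer}
    deps: {label: [direct dependency labels]}
    """
    assigned = {}
    conflicts = set()

    def visit(label, flags):
        # The first consumer to reach a node fixes its flags; a later consumer
        # asking for different flags on the same node is a conflict.
        previous = assigned.setdefault(label, flags)
        if previous != flags:
            conflicts.add(label)
        for dep in deps[label]:
            visit(dep, flags)

    for root, flags in requests.items():
        visit(root, flags)
    return assigned, sorted(conflicts)


deps = {
    "//a:a_proto": ["//common:base_proto"],
    "//b:b_proto": ["//common:base_proto"],
    "//common:base_proto": [],
}
requests = {"//a:a_proto": ("grpc",), "//b:b_proto": ("flat_package",)}
assigned, conflicts = assign_flags(requests, deps)
```

Here `//common:base_proto` is reached with two different flag sets, so there is no single answer for how the shared target should be generated.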

@johnynek
Member

johnynek commented Mar 6, 2019

@ittaiz the aspect mechanism does not allow for custom flags per rule. I think you could attach custom flags to the proto and have them propagate, but the whole idea is that each proto rule gets mapped to a scala proto src jar. There is no place to hook custom args, that I know of, onto the proto.

I don't know why you can't use one generator for everyone and control the behavior by adding annotations to the proto itself (or in some comment).

At twitter, which was a pretty large org when I was there, we only used 1 generator for thrift. We did not have some concept that each rule would pass bespoke options. We tried to push any options into the thrift sources, which you could control. I'm a bit skeptical that this must be solved by passing custom generators to proto rules, and I'm concerned that design clashes with aspects for the reason I mention above, so is somewhat ill-suited to a bazel best practice of aspects for code generation.

# method bodies
compile_scala(
ctx,
Label("%s-fast" % (label)),
Member

starlark really needs some kind of gensym thing that makes sure generated symbols are unique.

default = Label("@io_bazel_rules_scala//src/scala/scripts:scalapb_generator"),
allow_files=True
),
"override_code_generator_targets": attr.label_list(
Member

should we remove this override code if @ittaiz says it won't be useful for them anyway? simplify the code?

@ittaiz
Member

ittaiz commented Mar 6, 2019

@johnynek thanks for your inputs. Given the generic proto toolchain @simuons found I have a feeling we can find a solution in bazel.
We'll see.

@johnynek merged commit 74192b2 into bazelbuild:master on Mar 6, 2019
)
ctx.actions.write(output = argfile, content = worker_content)
ctx.actions.run(
executable = code_generator.files_to_run,

@johnynek even after registering the default toolchain in my WORKSPACE, I'm getting the following error:

ERROR: /private/var/tmp/_bazel_feri/27c97e79ab5b7e8bcdeb0d679a4bc60c/external/com_google_protobuf/BUILD:248:2: in @io_bazel_rules_scala//scala_proto/private:scalapb_aspect.bzl%scalapb_aspect aspect on proto_library rule @com_google_protobuf//:any_proto:
Traceback (most recent call last):
	File "/private/var/tmp/_bazel_feri/27c97e79ab5b7e8bcdeb0d679a4bc60c/external/com_google_protobuf/BUILD", line 248
		@io_bazel_rules_scala//scala_proto/private:scalapb_aspect.bzl%scalapb_aspect(...)
	File "/private/var/tmp/_bazel_feri/27c97e79ab5b7e8bcdeb0d679a4bc60c/external/io_bazel_rules_scala/scala_proto/private/scalapb_aspect.bzl", line 175, in _scalapb_aspect_impl
		proto_to_scala_src(ctx, target.label, code_generator, com..., <4 more arguments>)
	File "/private/var/tmp/_bazel_feri/27c97e79ab5b7e8bcdeb0d679a4bc60c/external/io_bazel_rules_scala/scala_proto/private/proto_to_scala_src.bzl", line 39, in proto_to_scala_src
		ctx.actions.run(executable = code_generator.file..., <6 more arguments>)
expected value of type 'File or string' for parameter 'executable', in method call run(FilesToRunProvider executable, list inputs, list outputs, string mnemonic, string progress_message, dict execution_requirements, list arguments) of 'actions'

@gseitz FYI

Contributor

@tferi-da This needs a fix that comes with a bazel update. The latest rules require bazel 0.23 or above now, I'm afraid.

@ianoc thanks for the tip!

ittaiz added a commit to wix-incubator/rules_scala that referenced this pull request Mar 21, 2019
ittaiz added a commit to wix-incubator/rules_scala that referenced this pull request Mar 21, 2019
7 participants