Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

protoc marking generated C++ classes final as of 3.7 #5869

Closed
JoshDreamland opened this issue Mar 11, 2019 · 21 comments
Closed

protoc marking generated C++ classes final as of 3.7 #5869

JoshDreamland opened this issue Mar 11, 2019 · 21 comments
Assignees
Labels

Comments

@JoshDreamland
Copy link

What version of protobuf and what language are you using?
Version: 3.6.1, 3.7
Language: C++

What operating system (Linux, Windows, ...) and version?
Linux, various (Arch Linux, Ubuntu, Debian)

What runtime / compiler are you using (e.g., python version or gcc version)
Various; this system uses GCC 8.2.1 20181127

What did you do?
Run protoc to generate C++ for a given proto message.

What did you expect to see
Prior to version 3.7, classes generated for proto messages were not final. My project has code extending proto messages to add helper functions which index and operate on field values. The code in question builds successfully through protoc 3.6.1. I expected this to continue to be the case.

What did you see instead?
The build fails in 3.7 because protoc now tags final onto codegen. This change isn't mentioned anywhere that I can find (release notes, documentation) and there doesn't seem to be a flag to control this behavior. To my knowledge, there is also no flag to control the target C++ standard, so we cannot work around this issue by targeting C++03 (before final was introduced).

Anything else we should know about your project / environment
This issue does not seem specific to any environment.

@BSBandme
Copy link
Contributor

I think starting from 3.6 we require C++11 (see #2780).

@JoshDreamland
Copy link
Author

I certainly wouldn't complain about most of the features that came with C++11. But final doesn't seem to add anything to the code, and it breaks our project's build (and probably others, who I expect to chime in after a couple more months as 3.7 rolls out).

@acozzette
Copy link
Member

The documentation here does officially warn against inheriting from generated message types. However, it's still true that we made this change without giving it much of an announcement at all. It is beneficial because it allows the compiler to take advantage of more optimizations, but maybe we need to introduce a temporary opt-out mechanism to give projects time to refactor.

@rchav
Copy link

rchav commented Mar 14, 2019

This is a huge problem for us, would def appreciate an option to disable.

@Arfrever
Copy link
Contributor

Arfrever commented Mar 15, 2019

This change also breaks building of Google Mozc project. (Actually now I have a patch.)
An option to avoid final (at least per-class) might be nice.

@Plegsd
Copy link

Plegsd commented Jun 25, 2020

What version of protobuf and what language are you using?
Version: 3.6.1, 3.7
Language: C++

What operating system (Linux, Windows, ...) and version?
Linux, various (Arch Linux, Ubuntu, Debian)

What runtime / compiler are you using (e.g., python version or gcc version)
Various; this system uses GCC 8.2.1 20181127

What did you do?
Run protoc to generate C++ for a given proto message.

What did you expect to see
Prior to version 3.7, classes generated for proto messages were not final. My project has code extending proto messages to add helper functions which index and operate on field values. The code in question builds successfully through protoc 3.6.1. I expected this to continue to be the case.

What did you see instead?
The build fails in 3.7 because protoc now tags final onto codegen. This change isn't mentioned anywhere that I can find (release notes, documentation) and there doesn't seem to be a flag to control this behavior. To my knowledge, there is also no flag to control the target C++ standard, so we cannot work around this issue by targeting C++03 (before final was introduced).

Anything else we should know about your project / environment
This issue does not seem specific to any environment.

@JoshDreamland
Copy link
Author

JoshDreamland commented Jun 25, 2020

I can appreciate compiler optimizations. But this eliminates a solution to another issue with proto that I know has been discussed a lot in the past: adding helper-like accessors. For instance, an Area() routine on a Rectangle message. Has there been any progress on formalizing that feature request?

I recognize it's no simple request since it must have language-level granularity (ie, the same method can't build for Java and C++ unless we bake an entire language into proto, which seems excessive). Our team would prefer the ability to provide a cut-and-paste block to be moved, verbatim, from the .proto source into the .h output.

@Expurple
Copy link

The C++ project that I'm working on also broke because of this. In case anyone still needs it, here's the hack that I've come up with. Add a build step that calls

sed --in-place 's \bfinal\b /*final*/ ' <AUTOGENERATED_FILE>

on files generated by protoc.

@emenchen
Copy link

emenchen commented May 16, 2022

@acozzette Add me to the list of people that would like an opt-out mechanism. I have a fair bit of existing code that subclasses a protobuf generated class, and many clients of that code which will have to be changed without an opt-out. I will likely employ a hack like @Expurple.

You commented that subclassing has been discouraged and pointed to the documentation which states, "You should never add behavior to the generated classes by inheriting from them. This will break internal mechanisms and is not good object-oriented practice anyway." I question this and wonder if you can provide additional insight as to why this might not be good practice, and what would break given the following:

package MyPackage
message Rectangle {
  int32 x1 = 1;
  int32 x2 = 2;
  int32 y1 = 3;
  int32 y2 = 4;
  int32 line_width = 5;
}

with a subclass like:

class RectHelper : public MyPackage::Rectangle {
  RectHelper& set_coordinates(::google::protobuf::int32 x1, ::google::protobuf::int32 x2,
    ::google::protobuf::int32 y1, ::google::protobuf::int32 y2) {
    set_x1(x1);
    set_x2(x2);
    set_y1(y1);
    set_y2(y2);
    return *this;
  }
};

Here I've added a helper, without adding new data or anything that needs to be serialized. This helper (or other code like it) may eliminate a lot of repetitive lines of code elsewhere. Yes, I could make a wrapper class, but I still want to expose all of the original functionality of the original protobuf class. To do this would require repeating all of those methods, or exposing the contained object, neither of which seem desirable or better than this. I could make a standalone function, and that might be the route I explore in the future, but that also takes away a lot of the convenience of the subclass in other situations.

@acozzette
Copy link
Member

Using a sed script sounds to me like a good workaround, at least as a temporary solution. For the Rectangle example, I think a standalone function would be the best way to go since that would provide all the necessary functionality without needing to inherit from the generated message.

@emenchen
Copy link

@acozzette Thanks for the quick reply. Of course my rectangle example was perhaps too simple. My current derived classes have other methods that implement something of a "builder" interface to allow chaining and simple code on the client side, but not always so simple in the derived class. I'm thinking of how to implement something with a set of standalone functions, but it isn't as straightforward as this simple case.

@XplicitComputing
Copy link

It is not okay to artificially constrain the library by implementing final, because it precludes downstream (advanced automation) implementations to specialize. For this reason, Xplicit Computing Inc is internally constrained to version 3.7 for building its code-generated libxccommon libraries, which provide server-client functionalities. Luckily, my understanding is that customer can still us any Protoc 3 compiler... but we're stuck on this old version that works -- and frankly are starting to assess alternative serialization techniques. Google Protobuf team, please revert this feature back and re-assess such design constraints.

@XplicitComputing
Copy link

at least give us the option --no-final

@XplicitComputing
Copy link

Bump from the international computational fluid dynamics community...

@fowles
Copy link
Contributor

fowles commented Jul 13, 2022

This change was made quite deliberately and provides a meaningful performance improvement when applied at scale. The classes themselves have always been documented as not being suitable for derivation. I would encourage you to consider containment over inheritance as a more efficient paradigm.

@XplicitComputing
Copy link

Significant investments have been made applying the Protobuf3 infrastructure. A better paradigm as library maintainers would be to not functionally change the spec after it has been deployed... and if so, allow the option for --no-final. I will discuss with our meta-programming experts, but as mentioned there are implications of a class being of a type, verses containing or having said type. There may literally not be a computer science opportunity for containment.

From a science and engineering professional perspective, it is generally not acceptable to do what has been done here. This may force us to migrate to Cern::Root sooner than initially planned.

@XplicitComputing
Copy link

At ICCFD11 two days ago, I just presented protobuf as a viable method for communicating messages between science and engineering teams. Perhaps I was wrong.

@fowles
Copy link
Contributor

fowles commented Jul 14, 2022

For what little it is worth, I agree with you that this should not have been done on a minor release. We have spent a lot of time over the past year creating and publishing more rigorous policies that we intend to follow going forward.

https://developers.google.com/protocol-buffers/docs/news/2022-05-06#versioning
https://opensource.google/documentation/policies/library-breaking-change
https://opensource.google/documentation/policies/cplusplus-support

Revisiting this particular decision also comes with costs. Inside Google, that use of final provides a significant cost savings in terms of fleet resources and code evolution. Making it configurable greatly complicates the ability of google to pull in external code that uses protobuffer because of the divergence in semantics exposed by that configuration.

As alluded to, there are workarounds of varying complexity. The simplest and hackiest being to preprocess the generated sources. A custom generator that wraps existing protobuf messages in custom APIs could provide you this flexibility and additionally might add significant power that you might find valuable.

@fowles fowles assigned fowles and unassigned acozzette Jul 14, 2022
@XplicitComputing
Copy link

Thank you for the info.
I've heard back from our protocol library lead and apparently we were able to sidestep this issue.
We are now tracking what appears to be a separate issue that appeared in v3.8 but nothing concrete yet. Likely unrelated and probably not your problem

@fowles fowles closed this as completed Jul 14, 2022
@emenchen
Copy link

@fowles I am going to use the simplest and hackiest solution you mention, which is process the generated sources. :-(

Using composition isn't particularly feasible. I looked at some of the protobuf generated classes and they have ~100 getter/setter type methods that I would have to replicate so that clients can use all the functionality currently available. I suppose I could just expose the underlying protobuf class, but that and some of the other alternatives seem to violate good software principles as well, e.g. encapsulation. Also, these alternatives also mean changing the APIs I supply to clients, which isn't desirable either.

@fowles
Copy link
Contributor

fowles commented Jul 18, 2022

@emenchen you can sometimes get away with defining an operator* and operator-> that delegate to avoid replicating the methods, but yeah, it can be pretty gross.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

10 participants