Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for friend functions #649

Merged
merged 6 commits into from
Feb 14, 2023

Conversation

HGuillemet
Copy link
Contributor

@HGuillemet HGuillemet commented Jan 25, 2023

This PR is a proposal to add support for friend functions to the parser.
Currently, these functions are simply ignored.

Friend functions are non-member functions but are declared in a class body. They have access to the class private and protected members.

This PR does the following:

  1. When a friend function is parsed, instead of being added to the current list of declarations, it’s added to a special list of non-member declarations.
  2. When we finish parsing an entire group, and if the declaration of the group has been successfully added to the parent declaration list, we also add the non-members declarations to the global declaration list.

Tested on Pytorch : this patch allows to map about 25 functions that were missing, and doesn’t alter anything else. Most of these functions are operator overloads. The generated new JNI code compiles, baring the removal of this info. It corresponds to a template instantiation apparently not meant to exist. Any idea why it was added ? The second infoMap c10::Dict<c10::Ivalue,c10::Ivalue>::iterator is correct.

One minor caveat:
C++ allows operators to be declared either as friend functions or as member functions, e.g.:

struct Foo {
    private:
      int n;
    public:
      Foo(int x) {
        n=x;
      }
      bool operator<=(Foo& rhs) {
        return n <= rhs.n;
      }
   friend bool operator<(Foo& lhs,  Foo& rhs) {
     return lhs.n < rhs.n
   }
};

In both case, the C++ user will write the same operation : x < y or x ≤ y but in Java it will map as an instance method x.lessThanEquals(y) and as a static method lessThan(x, y). Not a big deal.

Another caveat is about accessibility: C++ has some complicated rules for locating a function in the tree of namespaces based on the type of the arguments: the ADL (argument-dependent lookup).

namespace ns {
  struct Bar {
    …
  };
  struct Foo {
    friend bool operator<(Foo& lhs, Foo& rhs) { … }
    friend bool operator<(Bar& lhs, Bar& rhs) { … }
  };
}

in the exampe above, x < y works if x and y are Foo, but not it they are Bar, because the declaration of < on Bars cannot be found from type Bar. Ideally the first operator must be mapped to Java, because it is accessible, but not the second one.
In this PR we map the function only if it has an argument of the same type that the one it’s declared in. That covers only a subset of the ADL and some theoretically accessible friend functions could be missed.

@saudet
Copy link
Member

saudet commented Jan 26, 2023

Looks good, but let's make this optional so it doesn't break backward compatibility, probably with some new boolean flag like Info.friendify or something, that would work like Info.enumerate works for enums, but that activates your new code here instead.

@HGuillemet
Copy link
Contributor Author

It only adds new mappings. So the only way to break something is if the new mapping needs some Info to be mapped correctly. But I guess this will be the case for some presets, like it is for Pytorch with GenericDict.

Looking at how Info is working, I understand I have to write something like this.
Or shall we do something more generic and makeInfoMap.getFirst(null) always act as defaults for more specific InfoMap.get(cppName) ?

@saudet
Copy link
Member

saudet commented Jan 26, 2023

It only adds new mappings. So the only way to break something is if the new mapping needs some Info to be mapped correctly. But I guess this will be the case for some presets, like it is for Pytorch with GenericDict.

Like you said yourself, the ADL rules are complicated, so if we enable this by default this is going to break something somewhere.

Looking at how Info is working, I understand I have to write something like this. Or shall we do something more generic and makeInfoMap.getFirst(null) always act as defaults for more specific InfoMap.get(cppName) ?

Yes, something like that.

@HGuillemet
Copy link
Contributor Author

Yes, something like that.

Actually I was asking your preference between 2 alternatives :)

@saudet
Copy link
Member

saudet commented Jan 26, 2023

Ah, what's wrong with doing it like enumerate? We might need to add a couple more calls for existing Info, but that's about it, right?

@HGuillemet
Copy link
Contributor Author

If you define a global setting : new Info().mapFriends(), thus enabling the mapping for all friend functions.
And then an Info targetting a specific friend function but concerning some other setting, you have to also specify .mapFriends() or the function won't be mapped. That's not critical though.

@saudet
Copy link
Member

saudet commented Jan 27, 2023

Right, let's not break consistency and/or backward compatibility for small things like that.

// Only add the friend functions that we know will be visible through ADL.
for (int i=0; i<dd.declarator.parameters.declarators.length; i++)
if (dd.declarator.parameters.declarators[i].type.cppName.equals(type.cppName)) {
globalDeclList.add(dd);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I missing something or can this not be done without modifying DeclarationList? It seems like we should be able to check for type.friend here and there, resetting as necessary to pass it as a normal static function, and do what's needed without having to keep a list of "nonMemberDeclarations".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why do we need to put this in the "global" list? We can put static methods anywhere in any class...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nonMemberDeclarations list is here to postpone the addition of the friend functions after the addition of the whole group AND if the addition of the group worked (declList.add returned true). Two reasons for this :

  1. if we directly add the friend functions when they are parsed, we end up with template types not replaced by java types. Apparently it's only when the whole group can be successfully added that we have the context info needed to replace the template types. Maybe there is a better way, I'm far from mastering how the parser handles templates.
  2. if we add the friend function before the group to the global decl list, it will follow the class comment declaration, if any, and the comment will be attached to the friend function instead of correcty copied to the class file.

Concerning whether to translate the friend functions as static function of the global class or of the class they are friend of: both are possible, but I found more logical to make them methods of the global class. For example a function or operator combining 2 instances of 2 classes could be friend of both. Also many friend functions are also declared outside the class, to allow normal lookup instead of ADL only, in this case they end up in the global class. There are already some equals, notEquals, or shiftLeft operators in the global class for this reason. Generally speaking friend functions are global functions, not member of the class. They appear in the class body only to be able to access private and protected members. Finally the user will often make a static import of the global class, allowing to not prefix the method like in equals(iterator, end), instead of named_module_iterator.equals(iterator, end).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, to make this more consistent, I think friend functions should appear in the class where they are declared. If they are also supposed to be part of the global namespace, they will appear as such as normal non-friend static functions, too, right?

Copy link
Member

@saudet saudet Jan 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides, it makes it easier to pair them with non-static methods like this, and also makes it clearer to users that these two methods are related:

public class module_iterator extends Pointer {
...
@Namespace public static native @Name("operator !=") boolean notEquals(@Const @ByRef module_iterator a, @Const @ByRef module_iterator b);
@Override public boolean equals(Object o) { return o != null && o instanceof module_iterator && !notEquals(this, (module_iterator)o); }
...
}

I'm not saying you need to do that right away, but if we start moving function declarations around, it makes it harder to bring them back later like this, and that's what needs to be done anyway because that's usually the intent of the original C++ code when declaring friends functions like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you map the friend function to Object.equals() ?

Something like I've done for multiple inheritance around here: https://github.com/bytedeco/javacpp/blob/master/src/main/java/org/bytedeco/javacpp/tools/Parser.java#L3386 That all ends up in Declaration.text. We can literally put anything we want there.

Ok, so with this trick we could map as instance methods all operators declared as friends and taking an instance of the class or struct as its first argument. Can we agree on that ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so with this trick we could map as instance methods all operators declared as friends and taking an instance of the class or struct as its first argument. Can we agree on that ?

Sure, that sounds fine, although apart from equals(), I don't think we gain much by doing so...?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I implemented this: map the friend function to a private static native method of the class the function is friend of, and add a Java wrapper as an instance method. It's what you had in mind, right ?
I had to force the use of ADL instead of namespace lookup with a @Namespace annotation without argument or it didn't compile. So now all friend functions accepting at least 1 argument of the class it is friend of are accessible as instance method.

Example:

  public boolean notEquals(StringView rhs) { return notEquals(this, rhs); }
  private static native @Namespace @Cast("bool") @Name("operator !=") boolean notEquals(@Const @ByRef StringView lhs, @Const @ByRef StringView rhs);

  public Pointer shiftLeft(Pointer os) { return shiftLeft(os, this); }
  private static native @Namespace @Cast("std::ostream*") @ByRef @Name("operator <<") Pointer shiftLeft(@Cast("std::ostream*") @ByRef Pointer os, @Const @ByRef StringView dt);

Do you agree with this solution ? Any remark before I repost a PR ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a library developer implements an operator, (s)he usually has the choice between implementing it as a friend function or as an instance method. It doesn't change anything for the C++ user.

The good thing about this solution is that, with it, it doesn't change anything for the Java user either.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you agree with this solution ? Any remark before I repost a PR ?

Yes, that looks exactly like what I had in mind. You can update this PR if you want, GitHub keeps all the commits either way. Thanks!

// skip over non-const function within const class
if (!decl.constMember && context.constName != null) {
if (type.friend && !mapFriends || !decl.constMember && context.constName != null) {
Copy link
Member

@saudet saudet Feb 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make sure we don't have any regressions, in addition to this one, let's keep the skip condition for friend above too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed proposed changes in that direction. Can you give it a try? If it works, I think this is good to merge.

@HGuillemet
Copy link
Contributor Author

HGuillemet commented Feb 11, 2023

  1. The wrapper instance method is added just as a string to the static method declaration. We thus bypass any chance to detect if the method already exists. This case happens in pytorch: in ivalue.h, an equals member function is defined and then a friend operator ==. Do you see a simple way to detect this ? If we add a true declaration for the instance method, will it work ? And can we give priority to the existing member function ?
  2. The Info inheritance system is not well suited for this kind of setting: if we set a global new Info().frendly(), each time an info is created that applies to a function, the friendly setting is silently reset to false and the friend function is not parsed. What about adding a true set of default values for all new Info ? For instance, we could add this to InfoMap:
  Info defaultInfo = new Info();
  public Info getDefault() { return defaultInfo; }
  public Info add(String... cppNames) { 
    Info i = new Info(defaultInfo).cppNames(cppNames);
    put(i);
    return i;
 }

So instead of doing:

infoMap.put(new Info().enumerate().friendly())
       .put(new Info(...)....)
       .put(new Info(...)....)

We would do:

infoMap.getDefault().enumerate().friendly();
infoMap.add(...)...
infoMap.add(...)...

Nothing broken since the current style would be unchanged.

Another option, if you prefer to keep the current style, is to put the default Info in a static customizable field of Info:

Info.getDefault().enumerate().friendly();
infoMap.put(new Info(...)....)
       .put(new Info(...)....)

@saudet
Copy link
Member

saudet commented Feb 11, 2023

  1. The wrapper instance method is added just as a string to the static method declaration. We thus bypass any chance to detect if the method already exists. This case happens in pytorch: in ivalue.h, an equals member function is defined and then a friend operator ==. Do you see a simple way to detect this ? If we add a true declaration for the instance method, will it work ? And can we give priority to the existing member function ?

To take care of that, you just have to make sure their Declaration.signature is the same. Duplicates are removed here:
https://github.com/bytedeco/javacpp/blob/master/src/main/java/org/bytedeco/javacpp/tools/DeclarationList.java#L174

  1. The Info inheritance system is not well suited for this kind of setting: if we set a global new Info().frendly(), each time an info is created that applies to a function, the friendly setting is silently reset to false and the friend function is not parsed. What about adding a true set of default values for all new Info ? For instance, we could add this to InfoMap:

We can already do that, just keep a "Info defaults" objects somewhere, and then create your new Info(defaults)... from it. You can also add any number of helper methods such as add() if you really want it. What's wrong with that?

Nothing broken since the current style would be unchanged.

It's good that nothing breaks, but still, I don't see the point of maintaining this kind of code. It doesn't seem to add any value. Please prove me wrong.

@HGuillemet
Copy link
Contributor Author

HGuillemet commented Feb 11, 2023

To take care of that, you just have to make sure their Declaration.signature is the same. Duplicates are removed here:
https://github.com/bytedeco/javacpp/blob/master/src/main/java/org/bytedeco/javacpp/tools/DeclarationList.java#L174

Ok, so we need to create a separate Declaration for the instance method with the proper signature. The declaration that will be kept is the one appearing first in the C++ source I guess. It's a bit random but at least it will compile. The situation should not often happen. Additional commit to come.

It's good that nothing breaks, but still, I don't see the point of maintaining this kind of code. It doesn't seem to add any value.

Turning on new features like enumerate and friendly is supposed to be common for all new presets. So you'll asks users to write:

Info def = new Info().enumerate().friendly();
infoMap.put(new Info(def).cppNames(...).pointerTypes(...))
       .put....

While you could allow them to write with less noise, like they are used to:

Info.def.enumerate().friendly();
infoMap.put(new Info(...).pointerTypes(...))
       .put....

By adding just one more line to maintain in Info.java and changing another one:

static final public Info def = new Info();
public Info() { this(def); }

Please prove me wrong.

Done ? 😆

@HGuillemet
Copy link
Contributor Author

By adding just one more line to maintain in Info.java and changing another one.

Not counting the lines in parser where we fetch Info(null) that could be removed !

But this change of the info system, if any, can wait. I have a related issue to point out and there may be a better alternative to this default info.

…plicate signatures.

Map friends even if declared in private/protected section.
@HGuillemet
Copy link
Contributor Author

Committed. I also added a fix to bypass the inaccessible check in group for friends.
Tested successfully on Pytorch presets, although I had to manually skip some operators of DictIterator for some reason I failed to identified related to a template that couldn't be instantiated. If you can have a look, I'll be interested to understand with your help. I will push a PR of the modified Pytorch presets once this one is merged.

@saudet saudet merged commit 5f81c23 into bytedeco:master Feb 14, 2023
@HGuillemet HGuillemet deleted the add_friend_support branch February 14, 2023 10:44
@HGuillemet HGuillemet mentioned this pull request Feb 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants