Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++domain: Idiomatic spacing of ptr-operator and ref-qualifier #7491

Open
jbab opened this issue Apr 16, 2020 · 6 comments
Open

C++domain: Idiomatic spacing of ptr-operator and ref-qualifier #7491

jbab opened this issue Apr 16, 2020 · 6 comments
Labels
domains:c domains:cpp type:enhancement enhance or introduce a new feature

Comments

@jbab
Copy link
Contributor

jbab commented Apr 16, 2020

Problem

When declaring objects (e,g, types, members, functions) in the cpp domain, sphinx accepts
arbitrary spacing between types, cv-qualifiers and ref-qualifiers. It then produces output
in a consistent way (which is a plus) but which cannot be influenced by the user (which is a problem).

For example

cpp:function:: int& foo(const Type & value, Resource * resource)

leads to output

int &foo(const Type &value, Resource *resource)

The problem here is that the spacing chosen by sphinx is not the idiomatic way used by the C++ community (I am aware that this is a bold statement, but https://en.cppreference.com and https://github.com/cplusplus/draft are good support for this.)

An idiomatic spacing for the example above should look somewhat like

int& foo(const Type& value, Resource* resource)

This is important, since IMO documentation should look like the code we write and it should follow conventions which are common in the specific language.

Supporting this kind of spacing could lead more C++ developers to consider using sphinx for documentation.

Possible solutions

  1. All spacing added by the user in the sphinx declarations are transfered as is to the output.
  2. A new option for the C++ domain, say cpp_idiomatic_spacing is added which produces the desired output. The default leaves the output as is.
  3. Sphinx only generates the "idiomatic" output and nothing else.

Option 1

This probably requires huge changes to the output generating functions in cpp.py.
It would make the output depend entirely on the user input. This can be seen as an advantage (full control for the user), but also as a disadvantage (potential for inconsistent output).

Option 2

Feasible, but would lead to lots of branching and duplication in the code and the tests.
The question is whether someone would still want the default behavior once the option is available and if the effort maintaining two output variants is worth while.

Option 3

Implementable with relatively low effort.

Since the current choice of spacing by sphinx is somewhat arbitrary (or taken over from Python or C), a different spacing should be just as good. I am not sure, however, if this would still be considered a breaking change.

Suggested solution

I would appreciate feedback to the above suggestions.
Personally I favor option 3 and would be willing to contribute a PR.

@jbab jbab added the type:enhancement enhance or introduce a new feature label Apr 16, 2020
@jakobandersen
Copy link
Contributor

The problem here is that the spacing chosen by sphinx is not the idiomatic way used by the C++ community (I am aware that this is a bold statement, but https://en.cppreference.com and https://github.com/cplusplus/draft are good support for this.)

It is indeed a bold statement which I (perhaps not surprisingly) do not agree with :-). This is perhaps a semi-religious issue like east-vs-west const, so I fully agree that it should be up to the user to decide.

  1. All spacing added by the user in the sphinx declarations are transfered as is to the output.

I consider this practically impossible to implement.

  1. A new option for the C++ domain, say cpp_idiomatic_spacing is added which produces the desired output. The default leaves the output as is.

Sounds good to me, though the option should have a less contentious name :-). From a technical perspective the current spacing is grammar-centric (the declarators (e.g., * and &) are grouped with the declarator-name), while the other style groups the declarators with the decalration-specifiers. How about the slightly long-winded cpp_grammar_centric_declarator_spacing = True?

  1. Sphinx only generates the "idiomatic" output and nothing else.

As I object to "idiomatic" I am definitely against this :-).

Since the current choice of spacing by sphinx is somewhat arbitrary (or taken over from Python or C), a different spacing should be just as good. I am not sure, however, if this would still be considered a breaking change.

As mentioned, the current style is based on how the C and C++ grammars are written, though in all fairness, biased towards my personal formatting opinion in more complicated cases involving parameter packs.

I don't mind implementing option 2 at some point, but what would very helpful would be a set of test cases, with the result in both styles. The more complicated issues comes with multiple qualified pointers, parameters packs, and differentiation of whether there is a declarator-name or not (e.g., function declarations without parameter names).

@jbab
Copy link
Contributor Author

jbab commented Apr 17, 2020

First of all, thank you very much for your comments and your appreciation of my request. I certainly do not want to rage any pseudo-religious wars. But I am also convinced that the issue is not of the same character as east-const vs. west-const.
My issue is that in 20+ years of experience with C++ I cannot remember having encountered a serious book, article, talk, co-worker etc. which/who uses the grammar_centric style you are proposing.

I would be curious to learn about the cases involving parameter packs or other cases you are having in mind where this style may be of advantage.

I could also be interesting to collect more opinions about this, maybe a poll among users?

Of course I could live with option 2 (and I do not insist on a particular name). I am just afraid that all the effort of implementing and maintaining it will be futile since hardly anyone would be using one of the two styles 😉.

@jbab
Copy link
Contributor Author

jbab commented Apr 17, 2020

@vector-of-bool
Copy link

vector-of-bool commented Apr 21, 2020

I decided to cut my teeth on Sphinx, and I like having this option, so I opened PR #7507 that allows customization of the placement of the ptr/ref token via cpp_ref_placement and cpp_ptr_placement. (Exact details can be bikeshed, of course.)

@jakobandersen
Copy link
Contributor

@vector-of-bool, thanks for your work in #7504.
Before going into the actual implementation I tried to get an overview of the situation. There are quite a few cases of declarator'ish things where the spacing style probably should be applied. For each case there are two situations: named and unnamed. The unnamed case appears for example in function qualifiers, trailing return types, template arguments, and function argument types without a parameter name.

A Single Declarator

Left

I personally don't prefer this style so there are a few cases where it's not clear to me what spacing people use. Do you agree with the table?

Declarator Named Unnamed
Ptr. A* a A*
Ptr., qual. A*const a (?) A*const (?)
L-val. ref. A& a A&
R-val. ref. A&& a A&&
Param. pack A... a A...
Func. ptr. A (*a)(B arg) A (*)(B arg)
Ptr. to mem. A C::* a A C::*
Ptr. to mem., qual. A C::*const a (?) A C::*const (?)
Ptr. to mem. func. A (C::* a)(B arg) A (C::*)(B arg)
Ptr. to mem. func., qual. A (C::*const a)(B arg) (?) A (C::*const)(B arg) (?)
  • (?): is the spacing correct here?

Right (current Sphinx)

While this is my preferred style, it looks like some of the cases are not rendered as I usually write code.

Declarator Named Unnamed
Ptr. A *a A*
Ptr., qual. A *const a A*const
L-val. ref. A &a A&
R-val. ref. A &&a A&&
Param. pack A ...a (1) A...
Func. ptr. A (*a)(B arg) A (*)(B arg)
Ptr. to mem. A C::* a (2) A C::*
Ptr. to mem., qual. A C::*const a A C::*const
Ptr. to mem. func. A (C::* a)(B arg) (3) A (C::*)(B arg)
Ptr. to mem. func., qual. A (C::*const a)(B arg) A (C::*const)(B arg)
  • (1): considering rendering as A... a because somehow I find the ...
    more related to the type pack than the name pack.
  • (2) and (3): perhaps the rendering should be like C::*a

Combinations of Declarators

Note all of these seem to be trivial extensions of the single-declarator cases. I have a few example tables for the "right" style, but we should probably agree on the single-declarator cases first.

Other Wonderings

  • I strongly believe that there should be only a single configuration variable to control the spacing style. Otherwise, with multiple declarators it may be unclear which one "wins" (e.g., what should A *& b become?).
  • There are 2 different function families this preferably should implemented in (and perhaps also in the C domain): the describe_signature functions generating docutils nodes and the _stringify functions that generate both raw string representations and display string representations (the difference being how anon entities are shown (@something vs. [anonymous])).
    The last part is the tricky one as I the method is called from __str__ methods, with no way to access the configuration variable, unless it is copied into a global variable.
  • I started finding out how to make tests for all cases to avoid missing some now or breaking them in the future.

@jbab
Copy link
Contributor Author

jbab commented Jun 7, 2020

About your table Left

In all rows with (?) the const refers to the variable being declared, i.e. a is const. The const should therefore be separate from the type (pointer to A or to a member of C):

Declarator Named
Ptr., qual. A* const a
Ptr. to mem., qual. A C::* const a
Ptr. to mem. func., qual. A (C::* const a)(B arg)

In the unnamed case, just leave out the name and its preceding space.

There are more cases involving cv-qualification: The object pointed to can be const/volatile, the member function can be cv-qualified. All of these should not be affected by the question on how to place the *.

Another thing of interest for this discussion is the placement for member function ref qualifiers.
For the left style I suggest:

A f() &;
A f() &&;
A f() const&;
A f() const&&;

About your table Right

Again, I would add a space between the const (which refers to the variable) and the type, only now the * is attached const a instead of the "type" (which is not quite correct, since in C++ the type
is pointer to something) . Thus

A C:: *a
A C:: *const a
A (C:: *a)(B arg)
A (C:: *const a)(B arg)

This would be totally in line with A *a and A *const a.
Unnamed would just mean to leave out the name.
The results looks awkward, especially for the unnamed and the pointer to member cases. Well, this may be the reason why this is a rather unusal style in C++.

A function returning a reference is currently rendered by Sphinx as

A &f();

Thus I think that the unnamed reference cases in your table should be

A &
A &&

For ref-qualified member functions in the right style I suggest:

A f() &;
A f() &&;
A f() const &;
A f() const &&;

(1) I agree. What is the current behaviour in the template declaration?

template<typename... Args> ...

or

template<typename ...Args> ...

About your other wonderings

  • I totally agree with one parameter only.
  • Maybe an option is to pass the config in the constructors of the AST objects. But then of course the question arises why it is again passed in the describe_signature function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domains:c domains:cpp type:enhancement enhance or introduce a new feature
Projects
None yet
Development

No branches or pull requests

4 participants