Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Introduce exchangeable InterlinkParser #791

Merged
merged 1 commit into from
Dec 31, 2023

Conversation

linawolf
Copy link
Contributor

In TYPO3 we want to allow additonal special signs, so that interlinks to composer names can be made.

Additionally, this also removed duplicate code

Copy link
Member

@jaapio jaapio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this improvement, just a minor request to improve the code a bit.

@wouterj
Copy link
Contributor

wouterj commented Dec 30, 2023

Just sharing an idea I had with intersphinx: I think we should decouple it from the nodes. Intersphinx in Sphinx itself is purely handled in the reference resolver logic. This makes some_inventory:some_anchor just a "reference target" and the Intersphinx reference resolver detects it can handle links like that (just like ExternalReferenceResolver detects it can handle reference targets like http://github.com).

Then, TYPO3 can implement another reference resolver for your custom sign and composer links. By using reference resolver priorities, you can make sure that it's prioritized over other reference resolvers.

I started working on this locally (I had it stashed away, that's why the commit is of today) in https://github.com/wouterj/guides/tree/intersphinx-decouple I remember I was fighting with the anchor reducer, I think it should be moved to the render stage somehow (I can also imagine we need a different anchor reducer rendering non-HTML, so it seems to be related more to the render phase I think).

Just throwing the idea. If you like it, I'm happy to continue this work (or collaborate on it). But otherwise, I'll give up on the idea of decoupling Intersphinx :)

Btw, this is related to this PR as it'll also centralize parsing the intersphinx reference to a single place (the intersphinx reference resolver). We might not need the intersphinx parser abstraction if we decide to go for this other way.

@linawolf
Copy link
Contributor Author

The thing is the Anchor-Reducer is not nessesaryly related to rendering. It defines weather :ref:My_Anchor should link to .. :my-anchor: or not. So it is a feature of rest.

@linawolf
Copy link
Contributor Author

I would have to have a look at your implementation. However the issue with interlinks is that they can be applied to different kind of links that have different kind of resolvers. Therefore just putting them in one link resolver would not work for all cases. For example if you can also use interlinks to link to a php classs in a different manual with :php:class:`some-manual:\FQN\MyClass`

@linawolf
Copy link
Contributor Author

The other thing I am wondering: If we have a different Input formal like Markdown or latex or whatever. If we want to introduce interlinks there they might have a different format as the doublepoint syntax might collide with some other language features? So I would see the interlink parsing in the parser, not in the compiler (as the compiler works for all formats. And also it should do no parsing)

@wouterj
Copy link
Contributor

wouterj commented Dec 30, 2023

Therefore just putting them in one link resolver would not work for all cases. For example if you can also use interlinks to link to a php classs in a different manual with :php:class:`some-manual:\FQN\MyClass`

This can work with another reference resolver that checks for a specific php class node. But indeed, this would introduce parsing the domain:... syntax in multiple classes again without this PR.

If we have a different Input formal like Markdown or latex or whatever. If we want to introduce interlinks there they might have a different format as the doublepoint syntax might collide with some other language features?

Interesting, I hadn't thought about other input formats. At least in Markdown it can still work ([link](domain:something)) but that might not be the case for other formats. Let's hear what Jaap thinks about this, it would be an argument in favor of making interlink a core concept indeed.

@linawolf
Copy link
Contributor Author

With the introduction of the centralized interlink parser in this PR you should still be able to exchange it according to your needs?

@jaapio
Copy link
Member

jaapio commented Dec 30, 2023

Let me first explain what I had in mind, this is not how it is implemented currently, but it helps to understand the purpose of the elements we have and might also help you to understand why some steps are taken.

  1. parser, as the parser knows the input format and is able to produce the nodes. The idea about the parser was to create generic nodes that can be used between formats. However if a certain format has the need for more specific nodes it may provide these nodes. For example, in phpDocumentor we have a PHPReference Text role which is specific for phpDocumentor. It does produce a PHPReferenceNode, this requires a specific resolver because it will have to do a look up in the internals of phpDocumentor to find the corresponding object to link.

  2. Resolvers, the resolvers are responsible to resolve the correct target of the node, we use the resolvers because some link targets are unknown at the moment of parsing. Ideally we would have solved this in the compiler. Because we should link to objects. A link points to another node in the AST. The resolvers are not a compiler step because we are kind of mixing it with the output format. Which is unfortunately kinda required. In the ideal world every link node would target a node that lives somewhere in the AST, and can be found by a lookup by a resolver.

  3. The last step would be to transform the linknode to a real link in html, which is of course part of the renderer. Because we need to know the location, output format ect which is needed to compose the real target.

The reason to split parsing, resolving and rendering is to make it easier to introduce cache, and reduce the time we need during rendering different formats. A link to a node would be the same in html as it would be in JSON. except the format is different. So until the rendering step we should not care about the actual format, just about nodes.


Back to this PR, the issue with interlinks is that any link can be an interlink which would mean to me that Interlinks are some kind of decoration on existing links. The issue is, if we would talk about decoration it would be a decoration on the LinkNode, so the easiest way to handle this would have been if the input format was written like a decoration... however sphinx decided to put the decoration in the target... which makes it impossible to have a parser that processes the interlink node, with a specific link node as a child.

Visualized:

  • InterlinkNode
    • LinkNode
      • Reference

However we could turn it around:

  • LinkNode

    • InterlinkReference
      • Reference

It becomes more interesting because we have a second layer of deviation, which are domains. Right now domains are modeled as part of a LinkNode, which is not necessarily true. We could have introduced a new layer around the reference.

  • LinkNode
    • InterlinkReference
      • Domain
        • Reference

I think the advantage of having this modeled in objects is that resolvers and compiler passes can be more generic, a php domain reference is still a php reference, and when interlink is added it just tells that the intersphinx handler that it should to a lookup in the source of the interlinks. Next the phpresolver kicks in, which will resolve the link in the given context.

So I do agree with Lina that the interlink handling should be part of the parser, as it is depending on the input format. And I agree with Wouter that interlink doesn't have to be part of the core, however I do see this as a core feature of a documentation tool to link between documentation sets.

What I do not like about the current implementation is that I would need to implement al kind of logic to support interlinks in my custom link formats. Which could be resolved by splitting the parsing of links to a structure as suggested. And think more about the way to structure this.

@jaapio
Copy link
Member

jaapio commented Dec 30, 2023

As far as I'm concerned right now, I do not think the structural change should be done directly. We are still learning about what we actually need to support. And new insights are part of what we are doing.

In order to keep learning it's more important to iterate, rather than have long discussions about what the best technical way of implementing stuff would be. There are no requirements rather than that we figure out on a daily basis that there are things that need to be improved.

@linawolf linawolf force-pushed the feature/interlinkparser branch from 738079c to 4cccf02 Compare December 31, 2023 07:08
In TYPO3 we want to allow additonal special signs, so that interlinks to composer names can be made.

Additionally, this also removed duplicate code
@linawolf linawolf force-pushed the feature/interlinkparser branch from 4cccf02 to b1c2e0a Compare December 31, 2023 07:11
@linawolf
Copy link
Contributor Author

@jaapio should we stick with the minor improvement and merge this as-is for now?

@jaapio
Copy link
Member

jaapio commented Dec 31, 2023

I can live with that. 😁

@jaapio jaapio merged commit 21fa962 into main Dec 31, 2023
38 checks passed
@jaapio jaapio deleted the feature/interlinkparser branch December 31, 2023 07:29
@jaapio
Copy link
Member

jaapio commented Dec 31, 2023

I will move the discussion we had here to a issue, so we can find the final solution together

@wouterj
Copy link
Contributor

wouterj commented Dec 31, 2023

Thanks for the extensive comment! I can see, if you take into account the "multiple input formats" aspect of guides, how having some sort of "inter project linking" concept makes sense now.

Btw, what you say about domains is something that Sphinx also knows for roles and directives (https://www.sphinx-doc.org/en/master/usage/domains/index.html). This refers to coding languages, where e.g. :php:func: roles link to specific language docs. As far as I can tell, this domain is a different concept from the intersphinx domain concept, which exist in reference targets.

It would be interesting to brainstorm, for the future, if we can somehow find a way to combine the concepts. I think it'll be very useful if you can use e.g. .. php:class:: InterlinkParser to render the phpDocumentor API docs for a class (and use :php:class:`InterlinkParser` to create a reference to it), maybe even cross project.

@linawolf
Copy link
Contributor Author

It would be interesting to brainstorm, for the future, if we can somehow find a way to combine the concepts. I think it'll be very useful if you can use e.g. .. php:class:: InterlinkParser to render the phpDocumentor API docs for a class (and use :php:class:`InterlinkParser` to create a reference to it), maybe even cross project.

@wouterj we have actually already implemented something like that in a standalone guides extension: https://github.com/TYPO3-Documentation/guides-php-domain

@jaapio
Copy link
Member

jaapio commented Dec 31, 2023

phpDocumentor has this since yesterday 🤓

https://phpc.social/@jaapio/111670163247164175

@linawolf
Copy link
Contributor Author

can I interlink to it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants