-
Notifications
You must be signed in to change notification settings - Fork 604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[api-documenter] Fix ae-ambiguous-ids meta issue #1308
Comments
Here's a pathological input that illustrates most of the sources of ambiguity, along with the corresponding TSDoc declaration reference expressions that would uniquely identify these members (without relying on // TSDoc: '(A:class)'
export class A {
// TSDoc: '(A:class).(:constructor,0)'
public constructor(c: number, d: number);
// TSDoc: '(A:class).(:constructor,1)'
public constructor(c: string);
// not visible API
public constructor(c: number|string, d?: number) {
}
// TSDoc: '(A:class).(constructor:instance)'
public "constructor": number; // TypeScript keyword confusingly used as member name
// TSDoc: '(A:class).("(constructor:instance)":instance)'
public "(constructor:instance)": number; // TSDoc expression confusingly used as member name
// TSDoc: '(A:class).(memberA:instance,0)'
public memberA(e: number, f: number): void;
// TSDoc: '(A:class).(memberA:instance,1)'
public memberA(e: string): void;
// not visible API
public memberA(e: number|string, f?: number): void {
}
// TSDoc: '(A:class).(memberA:static,0)'
public static memberA(g: boolean): void { // static member conflicting with instance member
}
// TSDoc: '(A:class).(MemberA:static,0)'
public static MemberA(): void { // differs by case only
}
}
// TSDoc: '(B:namespace)'
export namespace B {
// TSDoc: '(B:namespace).simpleSymbol'
export const simpleSymbol: unique symbol = Symbol("B.simpleSymbol");
}
// TSDoc: '(B:enum)'
export enum B { // merged declaration
// TSDoc: '(B:enum).One'
One
}
// TSDoc: '(C:namespace)'
export namespace C {
// TSDoc: '(C:namespace).(memberC:0)'
export function memberC(): void {
}
}
// TSDoc: '(C:interface)'
export interface C { // merged declaration
// TSDoc: '(C:interface).memberC'
memberC: number; // "instance side" conflicts with "static side" from namespace
// TSDoc: '(C:interface).(:new,0)'
new(h: number): number;
// TSDoc: '(C:interface).(:call,0)'
(i: string): string;
// TSDoc: '(C:interface).(:index,0)'
[j: string]: number;
// TSDoc: '(C:interface).[(B:namespace).simpleSymbol]'
[B.simpleSymbol](k: boolean, l: string): void;
} |
Thinking about the above code, it's obvious that we can't have both 2 and 3. Maybe we can relax the "stable" requirement like this: "If a new API member is added that makes an existing ID ambiguous, it's okay to change the ID, but an old ID should never get reassigned to a different target." For example, if we use ID |
Would it be better to form ids for method signatures based on their parameter types, not their overload order? You see this naming scheme in the .NET documentation for methods with overloads, i.e.:
Adding a net-new overload to an existing overload list would break the "stable" requirement: declare class Foo {
method(): void; // (Foo:class).method
// method(x: string): void;
method(x: number): void; // (Foo:class).method_1
} Uncommenting the above line would give that overload the name |
People seemed to like this idea when we discussed it in #881 (comment). However, by itself it's not a complete solution. It doesn't handle other sources of ambiguity (e.g. static vs instance, character case differences). It doesn't even handle all forms of overloading, for example: export class Foo {
/**
* First overload
* @param x - a boolean
*/
public method(x: boolean): void;
/**
* Second overload
* @param x - an array of strings
*/
public method(x: string[]): void;
public method(x: boolean | string[]): void {
// . . .
}
} Perhaps we can think of ID assignment as a kind of mapping function: f(item, context, history) --> identifier The output of We could also consider enhancing API Documenter to remember a history of previous outputs (e.g. saved as a JSON file), which could be consulted by the function to avoid "broken links" (i.e. changing an identifier to something new) and/or "shuffled links" (i.e. reusing an existing identifier to reference a different target). This additional tech might make it easier to provide nice-looking identifiers, while still providing a solution for high traffic websites who are concerned about URLs getting broken or shuffled. |
I'd like to register a preference for overloaded functions specifically, to simply merge all of the identifiers that collide into a single file. This is natural (and in my opinion, preferable) for overloads because they are documenting the behavior of a single function that may take different inputs -- duplicating the documentation for each overload is not user friendly. |
@acchou I suspect this depends on the how the overloads are used in your API. We've heard from others that documenting individual overloads was an important need. FYI we recently started work on an api-documenter.json settings file, so maybe the default handling of overloads could be configurable. (Feel free to open a GitHub issue tracking this, since it's somewhat offtopic here.) We also proposed to add a new TSDoc tag /**
* Adds some numbers together
*/
declare function add(x: number, y: number): number;
/** {@partOf (add:1)} */
declare function add(x: number, y: number, z: number): number; |
It's theoretically possible to provide a "no shuffling" guarantee without resorting to history. But every approach I can come up with leads to awkward, ugly identifiers. It seems that (optionally) tracking history in a JSON file is the right way to go: It's "opt-in" for people who don't care about broken links, and it's relatively easy to incorporate into a mapping algorithm. Regarding the mapping algorithm, here's some more thoughts about how we can implement
The transformations will depend on the particular identifier scenario. Recall that we're tackling 4 scenarios:
Since the URLs come from filenames, they shouldn't use characters that are bad for filenames. And although the |
What characters should be allowed in the URLs? In #1303 @acchou found that I found this thread discussing safe filename characters, and it concludes that Suppose we start with a TSDoc canonical declaration reference like this:
The
If there are no ambiguities, we can reduce the canonical reference by removing the redundant parts before applying the character substitutions. Or, if there still ambiguities after replacing characters, let's resolve them by reintroducing numeric suffixes after each dotted part, like this:
Lastly, for the 4 anonymous member kinds ( There's other possible improvements, but this actually seems like a pretty reasonable starting point. BTW I thought about appending function parameter names, but then I realized that it means the URL will break if someone renames a parameter, which maybe isn't worth it. (?) Here's a fully worked out result of the above algorithm: // URL '/api/a.md'
export class A {
// URL '/api/a.constructor-1.md'
public constructor(c: number, d: number);
// URL '/api/a.constructor-2.md'
public constructor(c: string);
// not visible API
public constructor(c: number|string, d?: number) {
}
// URL '/api/a.constructor-3.md'
public "constructor": number; // TypeScript keyword confusingly used as member name
// URL '/api/a.constructorinstance.md'
public "(constructor:instance)": number; // TSDoc expression confusingly used as member name
// URL '/api/a.membera-instance-1.md'
public memberA(e: number, f: number): void;
// URL '/api/a.membera-instance-2.md'
public memberA(e: string): void;
// not visible API
public memberA(e: number|string, f?: number): void {
}
// URL '/api/a.membera-static-1.md'
public static memberA(g: boolean): void { // static member conflicting with instance member
}
// URL '/api/a.membera-static-2.md'
public static MemberA(): void { // differs by case only
}
}
// URL '/api/b-namespace.md'
export namespace B {
// URL '/api/b-namespace.simplesymbol.md'
export const simpleSymbol: unique symbol = Symbol("B.simpleSymbol");
}
// URL '/api/b-enum.md'
export enum B { // merged declaration
// URL '/api/b-enum.one.md'
One
}
// URL '/api/c-namespace.md'
export namespace C {
// URL '/api/c-namespace.memberc.md'
export function memberC(): void {
}
}
// URL '/api/c-interface.md'
export interface C { // merged declaration
// URL '/api/c-interface.memberc.md'
memberC: number; // "instance side" conflicts with "static side" from namespace
// URL '/api/c-interface.new.md'
new(h: number): number;
// URL '/api/c-interface.call.md'
(i: string): string;
// URL '/api/c-interface.index.md'
[j: string]: number;
// URL '/api/c-interface.bnamespacesimplesymbol.md'
[B.simpleSymbol](k: boolean, l: string): void;
} The handling of ECMAScript symbols may get awkward (e.g. |
Lemme know if you guys like this direction. If so, I can start work on a PR when I get some time. It's tempting to put the mapping implementation in api-extractor-model so it can be reused by other tools besides API Documenter... |
@rbuckton I just realized I misread your sentence as "parameter names". Parameter types won't work in TypeScript because the parameter types can be arbitrarily large expressions, whereas in export function mergeStyles(...args: (IStyle | IStyleBaseArray | false | null | undefined)[]): string;
export function mergeStyleSets<
TStyleSet1 extends IStyleSet<TStyleSet1>,
TStyleSet2 extends IStyleSet<TStyleSet2>,
TStyleSet3 extends IStyleSet<TStyleSet3>,
TStyleSet4 extends IStyleSet<TStyleSet4>
>(
styleSet1: TStyleSet1 | false | null | undefined,
styleSet2: TStyleSet2 | false | null | undefined,
styleSet3: TStyleSet3 | false | null | undefined,
styleSet4: TStyleSet4 | false | null | undefined
): IProcessedStyleSet<TStyleSet1 & TStyleSet2 & TStyleSet3 & TStyleSet4>;
export function serializeRuleEntries(ruleEntries: { [key: string]: string | number }): string; |
What about cases where this causes a collision, since JavaScript/TypeScript property names are case sensitive?
Not opposed, but this can introduce collisions as well. For methods, docfx is very specific about how uids for methods should be derived for ManagedReference documents (joining the uids of the parameter types). It seems best to follow suit to avoid the shuffled links concern.
I concur with this, as it is something I had to implement myself in a custom YamlDocumenter for one of my projects: https://github.com/esfx/esfx/blob/master/scripts/docs/yamlDocumenter.js#L601 Keep in mind that the canonical UID you generate does not necessarily have to be a valid path/URL, but you could instead pass it through another function to generate a valid URL. Also, URLs already have an encoding strategy for invalid characters.
This worries me because it is perfectly valid in a quoted method/property name. You would need some mechanism for "quoting" a name with invalid characters to avoid collisions. Another option is to double up
I would argue for parameter types rather than names, as that is the DocFX standard for ManagedReference documents. Another way to deal with the shuffled links concern would be to have something like a
C# UIDs can be quite large as well. For some cases, an arbitrarily complex type (such as an anonymous type) is better broken down into an alias and a reference: …
- uid: some-package.some-class.some-method
name: method()
fullName: method()
syntax:
return:
type:
- some-package.some-class.some-method.anonymous-0
...
references:
- uid: some-package.some-class.some-method.anonymous-0
name.typeScript: '{ a: A, b: B, c: C, d: D }`
fullName.typeScript: '{ a: A, b: B, c: C, d: D }`
spec.typeScript:
- name: '{ a: '
fullName: '{ a: '
- uid: A
name: A
fullName: A
… |
Regarding the parameter types and complex aliases, I posted a comment on an issue about complex types with additional context here: #867 (comment) |
I understand this all is still in flux, but may I ask what is the currently supported syntax to link to an overloaded function? It seems there's some missing information and discrepancies between different parts of the documentation and the actual tooling/implementation. E.g. using the canonical references created by Example: Generated {
"kind": "Function",
"canonicalReference": "@thi.ng/arrays!quickSort:function(1)",
...
} Neither of these work (excluded from
Any (interim) advice is highly appreciated! Thanks! |
@postspectacular Let's discuss your question with #1240 since it's really about |
Is there any news on this or something I can help with to move things along? |
@kasperisager which subproblem were you interested in? |
This issue is affecting us quite a bit in our attempts to land Siteimprove/alfa#290, especially in relation to #1830 and #1921 that both seem to be manifestations of this underlying issue. I don't believe there are any patterns in the Alfa codebase that haven't already been outlined in the previous comments, so there's probably nothing new I can add on that front. Also, I do apologise for this new "Tracked in [...]" notice now appearing fairly prominently on all the Rushstack issues we've referenced from Siteimprove/alfa#290. I don't know why GitHub all of a sudden decided it would be a good idea to broadcast that to the world 🙈 |
This issue tracks the fix for a set of related bugs tagged by ae-ambiguous-ids. These issues all involve IDs that are generated using a surjective mapping that results in naming collisions (e.g. a file being ovewritten) or instability (e.g. a hyperlink being broken after a SemVer MINOR change).
The problem was summarized in #1252 (comment):
Certain IDs are case-insensitive, which means that e.g.
noTallPeople
andNotAllPeople
may collide.And here's a summary of the specific kinds of IDs that we need to solve this for:
The text was updated successfully, but these errors were encountered: