Alternative method using transformers #5

Open
HoldYourWaffle opened this issue Dec 4, 2020 · 3 comments
Labels
investigation Further investigation is needed

Comments

@HoldYourWaffle
Owner

HoldYourWaffle commented Dec 4, 2020

It might be a good idea to investigate using transformers as an alternative method for getting runtime caller information.
This would avoid the need to keep the TS source code around, and would replace source maps and stack analysis, which have proven to have finicky edge cases (#2).

However, I predict there will be some logistical issues when implementing this within the limits of the compiler API. More investigation is needed to see if this is actually a suitable alternative.

Concept approach

First pass

  1. Find every usage of getCallSite.
  2. Replace it with a reference to an extra (generated) parameter.
  3. Mark this function as "needs caller information".

Second pass

  1. Find every usage of functions that "need caller information".
  2. Supply caller information to the generated extra parameter.
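
To make the two passes concrete, here is a hand-written sketch of what the emitted code might look like. This is not output from an actual transformer: the `CallSiteInfo` shape, the `__callSite` parameter name and the file/line values are all assumptions made up for illustration.

```typescript
// Hypothetical shape of the compile-time information that gets inlined.
interface CallSiteInfo {
    file: string;
    line: number;
    column: number;
}

// Before transformation (as written by the user):
//
//     function whoCalledMe() {
//         return getCallSite();
//     }
//     whoCalledMe();

// After the first pass: getCallSite() has been replaced by a reference to a
// generated parameter, and the function is marked as "needs caller information".
function whoCalledMe(__callSite?: CallSiteInfo): CallSiteInfo | undefined {
    return __callSite;
}

// After the second pass: each call to a marked function is given an inlined
// literal describing where that call appeared in the original source.
const result = whoCalledMe({ file: "src/example.ts", line: 12, column: 1 });
```

The generated parameter is optional here so that calls the second pass fails to find still type-check and simply see `undefined`.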

Issues & limitations

  • The main issues stem from the fact that type information (including "get declarations" and "get references") is not available at transformation time (see microsoft/TypeScript#25147, "Include TypeChecker in TransformationContext so that custom transformers can use type information").
    • This means that multiple passes are necessary (as outlined in the approach above).
    • How do we know if a certain call is to a "marked" function (step 3 of the first pass)? There's no reference information yet, so we'd have to emulate full module + alias resolution.
    • The obvious solution would be to create a Program instance for both passes, but that's a big performance hit.
  • Since the AST node doesn't exist at runtime, a selection of compile-time information will have to be "inlined".
  • Functions using arguments could break, which could be a very big problem for functions that do funky overloads.
    • This could potentially be fixed by (temporarily) storing caller information in some global location instead of passing it as a parameter. Stackframes could be used as keys.
      • Node doesn't appear to support a truly 'global' store. We'd have to use some runtime-module to store our caller information. It looks like this can't be automagically generated at compile-time, because transformers can't add new source files.
    • Unfortunately, it's not possible to pre-store caller information by stackframe, because (without a source map) it's impossible to know what the new location information will be. It might be possible to know what a "caller stackframe" will look like when using an after transformation.
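
The `arguments` problem can be illustrated with a small sketch. The function names and the `__callSite` parameter are hypothetical, and the padding with `undefined` is only one assumed way a transformer could reach a trailing generated parameter.

```typescript
interface CallSiteInfo {
    file: string;
    line: number;
}

// Original function: emulates overloads by counting the arguments passed.
function overloaded(_a?: string, _b?: string): string {
    return arguments.length === 1 ? "single" : "pair";
}

// The same function after a generated parameter has been appended.
function overloadedTransformed(_a?: string, _b?: string, __callSite?: CallSiteInfo): string {
    return arguments.length === 1 ? "single" : "pair";
}

const before = overloaded("x");

// To reach the generated parameter, the call site has to pad the optional
// parameter, which changes arguments.length from 1 to 3.
const after = overloadedTransformed("x", undefined, { file: "src/example.ts", line: 3 });
```

With the extra parameter the same logical call takes the other branch, which is the kind of breakage that storing caller information outside the parameter list would avoid.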

More investigation is needed.

HoldYourWaffle added the "investigation" (Further investigation is needed) label Dec 4, 2020
@HoldYourWaffle
Owner Author

It's near-impossible to detect all usages of a function X that uses getCallSite, because this function can be passed around in anonymous and effectively untraceable ways, such as a parameter or an element of an array.
In many cases it's therefore impossible to "inline" caller information at compile time.

In theory it could still be possible to build an "index" of caller information keyed by stackframes, sort of like a reduced version of the program's AST. This would require indexing a project's entire tree including dependencies.
This index would then be accessed using stackframes, taking into account the possible use of source maps. To allow both, a double-keyed index must be built.

Even if this index could be built in a reasonable time & size, it would still be problematic to actually use it at runtime, because (as far as I know) it's impossible to emit extra files using a transformer.
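
As a rough sketch of the double-keyed index described above, assuming frames are identified by a "file:line:column" string (the class, the key format and the stored fields are all hypothetical):

```typescript
// Hypothetical compile-time information stored per call.
interface CallSiteInfo {
    callee: string;
    typeArguments: string[];
}

// A frame is identified by "file:line:column".
type FrameKey = string;

function frameKey(file: string, line: number, column: number): FrameKey {
    return `${file}:${line}:${column}`;
}

class CallSiteIndex {
    // One map per key space: original (source-mapped) and emitted positions.
    private readonly bySource = new Map<FrameKey, CallSiteInfo>();
    private readonly byEmitted = new Map<FrameKey, CallSiteInfo>();

    add(source: FrameKey, emitted: FrameKey, info: CallSiteInfo): void {
        this.bySource.set(source, info);
        this.byEmitted.set(emitted, info);
    }

    // Works whether or not the runtime stackframe was source-mapped.
    lookup(frame: FrameKey): CallSiteInfo | undefined {
        return this.bySource.get(frame) ?? this.byEmitted.get(frame);
    }
}

const index = new CallSiteIndex();
index.add(
    frameKey("src/app.ts", 14, 1),
    frameKey("dist/app.js", 9, 1),
    { callee: "foo", typeArguments: ["string"] },
);
```

How such an index would be shipped to the runtime remains the open problem, since the transformer itself can't emit it as an extra file.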

@HoldYourWaffle
Owner Author

HoldYourWaffle commented Dec 6, 2020

The scope (and therefore size) of the index could be limited by specifying the modules that should be indexed.

However, because this index is created at compile-time, any usage from code not known at compile-time will fail. Therefore libraries won't effectively be able to use this method, since their downstream users aren't compiled (thus indexed) with them.
This last issue might be possible to circumvent by executing the transformer at downstream-compilation too (expanding/augmenting the index).

@HoldYourWaffle
Owner Author

Perhaps it's possible to store caller information on the function object itself, and then retrieve it using this principle.
This would keep the transformation scope limited, and would solve the massive index & emitting issue. There'd also be no fiddling with stack traces (and all the troubles that come with it).

However, this method still has two major remaining issues:

  1. This 'metadata' has to be actually added to the function object and, more troublesome, removed cleanly.
    This could lead to massive performance concerns: since we can't know at compile-time which functions actually use getCaller, every single function call in the source code would need to be "wrapped". This effectively moves the storage of "caller information" from a central (compile-time) index to a runtime call on every single invocation.

  2. If a function was called from an "external location", no metadata will (nor can) be added. Therefore there can be no guarantee that the information is always available.
    In principle this is a reasonable concession, especially given that the current stackframe-based method fails in this case too.
    However, it might still be frustrating in cases like this:

// JS source + definitions
function execute<T>(x: T, func: (x: T) => any) {
    func(x);
}

// TS source
function foo<T>(x: T) {
    const callSite = getCallSite();
    // Use "reified" generic information on T, see ts-reified-generics
}

// It's very clear that in this line 'foo' has T = string, but this can't be detected
execute("bar", foo);
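
A minimal sketch of the wrapping this would require, assuming the metadata lives under a symbol property on the function object. `callWithSite`, `getCallSiteOf` and `CALL_SITE` are hypothetical names, not part of any existing API, and `execute` stands in for an untransformed external function.

```typescript
interface CallSiteInfo {
    file: string;
    line: number;
}

// Hypothetical property key under which the metadata is stored.
const CALL_SITE = Symbol("callSite");
type Tagged = { [CALL_SITE]?: CallSiteInfo };

// What every transformed call would be rewritten into.
function callWithSite<A extends unknown[], R>(
    fn: (...args: A) => R,
    site: CallSiteInfo,
    ...args: A
): R {
    const tagged = fn as typeof fn & Tagged;
    const previous = tagged[CALL_SITE];
    tagged[CALL_SITE] = site;
    try {
        return fn(...args);
    } finally {
        // Remove the metadata cleanly, restoring any outer (recursive) value.
        if (previous === undefined) delete tagged[CALL_SITE];
        else tagged[CALL_SITE] = previous;
    }
}

function getCallSiteOf(fn: Function): CallSiteInfo | undefined {
    return (fn as Function & Tagged)[CALL_SITE];
}

function foo(_x: string): CallSiteInfo | undefined {
    return getCallSiteOf(foo);
}

// Stand-in for an external function that was never transformed.
function execute<T>(x: T, func: (x: T) => unknown): unknown {
    return func(x);
}

const direct = callWithSite(foo, { file: "src/main.ts", line: 20 }, "bar");
const indirect = execute("bar", foo);
```

The direct call sees its metadata, while the call routed through `execute` sees nothing, which matches the second issue above. The `try`/`finally` runs on every wrapped invocation, which is the runtime cost described in the first issue.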
