Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple users of asyncWrap in a single application? #40

Closed
mike-kaufman opened this issue Feb 25, 2016 · 29 comments
Closed

Multiple users of asyncWrap in a single application? #40

mike-kaufman opened this issue Feb 25, 2016 · 29 comments

Comments

@mike-kaufman
Copy link
Contributor

Hi -

I'm not clear on how two independent modules using asyncWrap will co-exist in the same application.Is this a supported scenario?

My current understanding is setupHooks() will overwrite any existing callbacks. Consequently two modules independently using asyncWrap and linked into the same app will conflict and only one module will work correctly. Is this an accurate understanding? Has this been discussed?

Thanks much!

Mike

@AndreasMadsen
Copy link
Member

You are correct that AsyncWrap only support one setupHooks() call and will conflict with other modules.

It is something we are aware of, but it is not yet decided if it is something we want to support. There is some discussion here: nodejs/Release#59 (comment)

@mike-kaufman
Copy link
Contributor Author

Thanks Andreas. I'm not clear on the results of what was discussed by the TSC on this topic. Can you elaborate? Going through the thread you cite above, I'm having a hard time following the reasoning for not fixing. Assuming this is a valuable & useful API (and I believe it is), then it clearly follows that multiple libraries will start using it. Not supporting multiple callers in the same process is clearly broken. Am I missing something?

Thanks,

Mike

@mike-kaufman
Copy link
Contributor Author

@trevnorris - would love to hear your reasoning on this.

@AndreasMadsen
Copy link
Member

I'm not clear on the results of what was discussed by the TSC on this topic.

The discussion in nodejs/Release#59 was about the stability of AsyncWrap and if it was acceptable to land breaking changes in an LTS release. Thus not the appropriate place to discuss the finer details about the future of conflict management for AsyncWrap. This is actually something we haven't done yet.

@trevnorris was at the time (and may still be) primarily against supporting multiple users for the sake of keeping the complexity in node core low. But I will let Trevor elaborate on that.

I still hope to support multiple users. I suggest you look at https://github.com/AndreasMadsen/async-hook and give us (me) feedback about the API. That way we can get an idea about how much is needed in an multi-user implementation, if we decide to implement that in node core.

@mike-kaufman
Copy link
Contributor Author

So, yeah, async-hooks is almost exactly the API I was expecting to see from a client point-of-view. IMO it makes sense to pull this (or something analogous) as the public API surface here.

Only difference between what you have and what I was expecting is I expected something like:

var cookie = asyncHooks.add(...);  // returns an identifier that uniquely identifies my set of callbacks
asyncHooks.remove(cookie);  // allows for deterministic and constant-time de-registration of my callbacks.

Minor feedback:

  • index.js, 21-17 - not clear to me why you're doing this.
  • I'm not clear on why you need to monkey-patch timers, promises, nextTick. Probably my ignorance.
  • async-hook.js, 83-88. First concern here is this is a linear scan (maybe this is OK). Second concern here is that if two callers (say c & c+2) register the same function (e.g., same init(), different pre()), and then caller c+2 calls remove(), the ordering of callbacks is no longer pairwise. That is, the assumptions that initFns[i], preFns[i], etc are all the callbacks from the same call to add() breaks. Not sure if this even matters, or how realistic a scenario this is, but it stands out to me.

@AndreasMadsen
Copy link
Member

@mike-kaufman yep. Point 1 and 2 should not be a concern in a node core implementation. However point 3 highlights some of the complexities when supporting multiple users (another is the global effect of .disable()). Are these issues acceptable or is a more complex implementation required? Keep in mind that AsyncWrap is perhaps the hottest code path in node, thus API is not the only concern.

@mike-kaufman
Copy link
Contributor Author

So, I think point 3 is a minor point. You can probably ship what you have and my guess is you'll be fine. If it does prove to be problematic, you would likely need to introduce new APIs to keep a deterministic grouping of callbacks.

Now, you could fix this relatively easily today by using single hash table that maps cookie->set of callbacks (in lieu of storing callbacks in four distinct arrays). You'll want to be careful in how you choose cookies, as my guess is you want to keep the order of callback execution deterministic, specifically ith caller if add() is the ith to have their callbacks invoked. You could just choose cookies in a monotonically increasing sequence (0, 1, 2...).

I'm assuming in above statement, by "hot", you mean performance sensitive, and I'm assuming the hot path here is callback invocation path and not the add/remove path. Let me know if not the case, but I don't think storing callbacks as mentioned above would impact perf in any meaningful way. Would love to know why if not the case.

Last bit of feedback - for safety, consider wrapping callback invocations inside a try/catch, eg. here. If a mis-behaving callback throws, it will prevent all other hooks from running.

@avanderhoorn
Copy link

If it helps we have had similar discussions around some of these concerns (i.e. having multiple listeners and the cost/complexity it adds) in .net. The resolution that we came to was that if the code is optimized for 0 or 1 listeners, then that fits the majority of real world use cases but it should still support multiples. If the user decided to add packages which results in more than 1 listeners than that is a decision that they have made. Otherwise it limits a whole number of use cases and pushes the responsibility to authors using asyncwrap as to how they will try and play nicely with others.

@chrisdickinson
Copy link

Would there be a downside to implementing the multiple listeners approach at the JS level, and leaving the internal C++ implementation single-listener?

@mike-kaufman
Copy link
Contributor Author

@chrisdickinson - as an end-user, all I care about is if I register callbacks, they get called. :) As an end-user, I don't care whether this is implemented in JS or native. If implemented in JS (e.g., async-hook), then I would hope that the async-wrap setupHooks() API is hidden such that it can't be abused, and there's one official JS module that supports add/remove of multiple callbacks.

I'm curious to know if there are perf concerns that would suggest implementing at the native level.

@mike-kaufman
Copy link
Contributor Author

Or perhaps this API is useful to other native modules? And that is a reason to implement in c++?

@AndreasMadsen
Copy link
Member

@chrisdickinson there will be an extra function evaluation which likely will affect performance and there will be an extra callSite in error.stack.

I did play around with it, to measure the performance (AndreasMadsen/node@dc1403f...AndreasMadsen:cbbc48e), but discovered that node doesn't provide the necessary benchmark tooling (statistical significance) to say much about it. But I'm currently fixing that.

@AndreasMadsen
Copy link
Member

@mike-kaufman By the way, the reason why async-hooks needs to patch timers, nextTick and promises is documented here: https://github.com/nodejs/tracing-wg/blob/master/docs/AsyncWrap/README.md#things-you-might-not-expect

The reason behind https://github.com/AndreasMadsen/async-hook/blob/master/index.js#L21-L27 is just to prevent async-hook from showing up in peoples stack trace.

@AndreasMadsen
Copy link
Member

I'm assuming in above statement, by "hot", you mean performance sensitive, and I'm assuming the hot path here is callback invocation path and not the add/remove path. Let me know if not the case, but I don't think storing callbacks as mentioned above would impact perf in any meaningful way. Would love to know why if not the case.

@mike-kaufman yes, it is the callback (hook) invocation. This happens a few times for every async operation in node. This is one of the performance sensitive parts: https://github.com/nodejs/node/blob/master/src/async-wrap.cc#L210L220

Also, the reason why I choice a simple array for holding callbacks in async-hook, is for a fast iteration in the hook call part, not for a fast registration. This may be a false assumption, I haven't benchmarked it and it's too theoretical for me to discuss (I lack knowledge about the v8 internal data structures).

@trevnorris
Copy link

The reason for not implementing support for multiple hooks is the same reason AsyncListener was completely removed (nodejs/node@0d60ab3ef). As you can see from lib/tracing.js there is more to take into account than simply running multiple callbacks. For example it should guarantee the "parent" most handle running at a given time so that each callback has no need to rely on side effects to operate. e.g. if two callbacks happen to temporarily set 'unhandledException'.

In short, simply allowing multiple callbacks to fire isn't the correct solution. They need to be able to run without affecting each other.

@mike-kaufman
Copy link
Contributor Author

Thanks @trevnorris, but I'm not really following. Is a library like async-hook problematic? Does async-hook fail to guarantee "no side effects"?

@trevnorris
Copy link

@mike-kaufman My intent of only allowing a single set of callbacks to be registered was to pretty much force a community driven module that experiments with and attempts to solve these issues. This eventually was the reason AsyncListener was removed. Whether async-hook properly handles side effects can come from user testing. I haven't done a full review of the module so can't give my feedback ATM.

Once a module that has shown can deal with the issues I've mentioned then it won't be unreasonable to bring this into core. But until it's considered stable enough. This is another reason why the aysnc wrap callbacks will abort if on unexpected exception. Again to force module authors to properly handle side effects of possibly setting bad state, etc.

@avanderhoorn
Copy link

I guess the question is what happens between now and then? What happens when a user wants to use two packages which both want to use AsyncWrap, is the recommendation that only those seeking to create abstractions on AsyncWrap use AsyncWrap directly (what happens when multiple of those are in play), etc?

@AndreasMadsen
Copy link
Member

@mike-kaufman async-hook definitely fails to guarantee against side effects. .disable() has global side effects, hooks aren't wrapped in a try catch, the order of execution is also questionable. There are likely more issues.

@avanderhoorn AsyncWrap is undocumented and will change API in a patch update. When either you are using an abstraction or AsyncWrap directly, you are in for trouble. I will encourage you to either use or create an abstraction, it will benefit the project and it also help you manage a changing API.

@avanderhoorn
Copy link

@AndreasMadsen I see what you are saying and completely agree. But the point still remains about what happens when a user wants to use two packages which both want to use AsyncWrap.

@AndreasMadsen
Copy link
Member

"unanticipated and unpredictable side effects" is the only general answer I can give. Specifically there are two situations:

  • The simple case is that AsyncWrap is enabled by the first user (module). If setupHooks() is call by the second user (module) when AsyncWrap is enabled it will throw the error "hooks should not be set while also enabled". I think this is the most likely case.
  • If AsyncWrap is disabled, the second setupHooks() call will overwrite the current hooks. This will obviously leave the first AsyncWrap user (module) in an unknown state. Furthermore because the first module had AsyncWrap disabled, we can expect it to enable AsyncWrap again later. This may affect the second user (module). Thus the two modules can ultimately conflict so much that neither of them works.

@mike-kaufman
Copy link
Contributor Author

My intent of only allowing a single set of callbacks to be registered was to pretty much force a community driven module that experiments with and attempts to solve these issues

@trevnorris - The concern I have with above (as shared by @AndreasMadsen in nodejs/LTS#59) is if you have competing libraries, you introduce a module incompatibility problem that is non-obvious and could be quite frustrating for end users.

For clarity, and to better communicate my point of view, I'll try to explain what I think the key principles are for this "async-event" API. Please let me know if there's any disagreement on things below, or if there's anything to add:

  1. A callback mechanism that notifies clients about scheduling events of async calls is generally useful to node applications and modules, and would therefore be broadly used.
  2. If a client registers callbacks with this "async-event" API, those callbacks should always be called. More specifically, if an app takes a dependency on this API (e.g., to implement some "continuation-local-storage"), the app is dependent on consistent and reliable behavior.
  3. The semantics/behavior of the API (e.g., what is "allowed behavior" in registered callbacks) should be explicitly defined and clearly communicated.

@trevnorris
Copy link

@mike-kaufman

if you have competing libraries, you introduce a module incompatibility problem that is non-obvious and could be quite frustrating for end users.

I understand the concern here. Though we should also consider that Node has several Promise libraries. Assigning any of these to global Promise would cause the same frustration. Even so, things seemed to work out well in the end.

We could ease this for end users by throwing or printing a warning on a second attempt of calling setupHooks(). It would be possible to store the information for the initial call site so the user knows where it was already hooked. Thoughts on this?

A callback mechanism that notifies clients about scheduling events of async calls is generally useful to node applications and modules, and would therefore be broadly used.

Mostly true. But this is also in a performance sensitive spot. These will be called hundreds of thousands of times. So the optimal would be a single library that stores information about application status that can be queried by others. To reduce the overhead. Though this is the optimal case, not saying it should be the only case.

If a client registers callbacks with this "async-event" API, those callbacks should always be called. More specifically, if an app takes a dependency on this API (e.g., to implement some "continuation-local-storage"), the app is dependent on consistent and reliable behavior.

That sounds right. Though if there were multiple callbacks and one threw, would you continue execution of the other callbacks? Just example of edge case that needs to be thought through.

In a similar vein, since all callbacks will have access to the context of the handle, it's technically possible to clobber values on the context. Even things such as the oncomplete callback. There's really no way to force users to not do this, except not give them access to the context handle. What would be the best option in this case?

The semantics/behavior of the API (e.g., what is "allowed behavior" in registered callbacks) should be explicitly defined and clearly communicated.

Definitely. As you've seen above there are definitely things that shouldn't be done. Now, we can't enforce these, but it's a good start.

What I'm getting at is that there are enough edge cases that need to be solved before the solution is integrated into core. I'd love to come out of the gate w/ a module that can handle most of what we've talked about. It'd be possible to repurpose some of the AsyncListener code, or take what @AndreasMadsen has already done..

@AndreasMadsen
Copy link
Member

We could ease this for end users by throwing or printing a warning on a second attempt of calling setupHooks(). It would be possible to store the information for the initial call site so the user knows where it was already hooked. Thoughts on this?

I think this is something we should do.

@mike-kaufman
Copy link
Contributor Author

@trevnorris - Thanks. Incorporating your input, I think the principles around this API become the following.

  1. A callback mechanism that notifies clients about scheduling events of async calls is generally useful to node applications and modules, and would therefore be broadly used.
  2. Allowing user-hooks to plug into the event-loop lifecycle has potentially sever performance impacts, and alternatives should be preferred.
  3. If a client registers callbacks with this "async-event" API, those callbacks should always be called. More specifically, if an app takes a dependency on this API (e.g., to implement some "continuation-local-storage"), the app is dependent on consistent and reliable behavior.
  4. The semantics/behavior of the API (e.g., what is "allowed behavior" in registered callbacks) should be explicitly defined and clearly communicated.
  5. Any callbacks run through this hook should not overwrite existing program state.

Anything to add/change? I think that once we agree on a set of principles, it will become much easier to answer questions around API shape/behavior. For example:

We could ease this for end users by throwing or printing a warning on a second attempt of calling setupHooks(). It would be possible to store the information for the initial call site so the user knows where it was already hooked. Thoughts on this?

Disagree with this. I think given principles 1 & 3 above, the user-facing API must support multiple callbacks.

Though if there were multiple callbacks and one threw, would you continue execution of the other callbacks? Just example of edge case that needs to be thought through.

Given principle 3, all callbacks should run independently, and if one callback throws, then all other registered callbacks should run (assuming non-catastophic failure modes, e.g., available memory, heap corruption,...).

Even things such as the oncomplete callback. There's really no way to force users to not do this, except not give them access to the context handle. What would be the best option in this case?

Bear with me, as I'm not super familiar with node/v8 internals, but can you clarify precisely what a "handle" is? I'm happy to read through source if you can point me in the right direction.

Also, I'd be interested in seeing an explicit list of edge cases. E.g., you previously mentioned a callback setting "unhandledException". (I could also use an explanation of precisely what unhandledException is? i.e., where is it defined? Is it readable/writable from user code?).

Lastly, I should say what my motivation is here. I'm working on a project in node and we need some reliable form of continuation-local-storage. So, I want to see an API here that will "just work" from the user's point of view.

Thanks,

Mike

@AndreasMadsen
Copy link
Member

@mike-kaufman there is an AsyncWrap documentation that also explains what Handle objects are: https://github.com/nodejs/tracing-wg/tree/master/docs/AsyncWrap

@joshgav
Copy link
Contributor

joshgav commented Sep 21, 2016

closing old AsyncWrap issues, please start a new thread if appropriate

@joshgav joshgav closed this as completed Sep 21, 2016
@Jeff-Lewis
Copy link
Contributor

Jeff-Lewis commented Sep 21, 2016

@joshgav are we keeping track of async_wrap issues somewhere else? #29 still has this issue and others open.

@joshgav
Copy link
Contributor

joshgav commented Sep 21, 2016

@Jeff-Lewis I closed these to get your attention :)

If this isn't addressed in the PR to core (nodejs/node#8531) or the Node-EP (nodejs/node-eps#18) let's re-open.

Perhaps we should go through all the bullets in #29?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants