
Fix resolution performance issues #84

Closed · Mystere opened this issue Feb 20, 2013 · 55 comments

@Mystere

Mystere commented Feb 20, 2013

I've seen several comments from people that Ninject is extremely slow compared to other containers. There have also been several benchmarks that seem to support this. For instance:

http://www.palmmedia.de/Blog/2011/8/30/ioc-container-benchmark-performance-comparison

(note that the tests have been updated to include Ninject 3.0.1.10)

Is there a problem with the way Ninject is being benchmarked, or is it really that much slower than everything else? Is there a good reason to accept this performance? Can we address these issues? Are there workarounds?

Many of us love Ninject, but I will soon be working in a high-performance, large-transaction environment, and if those tests are accurate I may have to go with something else unless there are ways to mitigate the problems.

I'd like to see these performance issues fixed.

@remogloor (Member)

If you just look at the performance of the IoC container in isolation, those performance tests are correct. What none of them does is investigate the actual impact on a real application.

E.g. if I add MiniProfiler to an ASP.NET MVC application that doesn't even have any db access, I can see on my machine that the impact of the IoC container is far less than 10%, more like 1% or even less. At that level it is not even worth thinking about which IoC container to use, because if you ever get to the point where you have performance issues, you will have them with all the other IoC containers as well.

Ninject has some unique features, like conditional bindings and scoping to the lifetime of other objects, that cost a lot of performance. Removing them would speed Ninject up by orders of magnitude, but then we would have just the features of any of the tiny DI frameworks. Without those features there would be no reason to continue developing Ninject, because you could take many others instead. Therefore we prefer to keep those advanced features and accept the disadvantage of being slower.

What would be possible, though, is a complete rewrite of the resolving mechanism to do precalculated resolves for all those subtrees that do not use any conditional bindings. But that would be a huge amount of work, and many other things have a higher priority.
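To make the "precalculated resolves" idea concrete, here is a minimal, purely hypothetical sketch (none of these types exist in Ninject): subtrees without conditional bindings get a factory delegate that is composed once, so resolving those services becomes a dictionary lookup plus a delegate call.

using System;
using System.Collections.Concurrent;

// Hypothetical illustration only, not Ninject's API: precomputed factory delegates for
// service types whose whole dependency subtree uses no conditional bindings.
public sealed class PrecomputedResolver
{
    private readonly ConcurrentDictionary<Type, Func<object>> factories =
        new ConcurrentDictionary<Type, Func<object>>();

    // The factory is composed once up front, e.g. () => new Service(new Repository(new Db())).
    public void Precompute(Type service, Func<object> factory)
    {
        factories[service] = factory;
    }

    // Fast path: no binding search, no constructor scoring, no global lock.
    public bool TryResolveFast(Type service, out object instance)
    {
        if (factories.TryGetValue(service, out Func<object> factory))
        {
            instance = factory();
            return true;
        }

        instance = null;
        return false;
    }
}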

Sure, the current implementation has potential for improvement too, but you will never get near the tiny DI frameworks.

In conclusion: you have to choose what matters more, squeezing out the last 1% of additional performance or paying a bit and getting more advanced features.

@danielmarbach (Member)

@remogloor could you blog this on planetgeek?


@Mystere (Author)

Mystere commented Mar 1, 2013

While I appreciate your point, I think it is a bit short-sighted. Yes, Ninject may account for only 1% of an application run on normal hardware, but many of us are running apps on very high-end servers that need fast response times. We tune the apps to remove all the normal bottlenecks, and doing that raises the percentage of time spent resolving objects by a significant amount.

Add in the fact that you may have rather complex call graphs, each with their own sets of objects that need to be resolved, and it can become quite extensive. If Ninject is truly 10-100x slower than other common containers (and I'm not talking about the tiny ones, I'm referring to Unity, Windsor, etc.), then that overhead compounds as the number of objects you resolve grows.

Hell, I'd be happy if we could improve it by 50%. Please, let's keep this open so that it can be something that eventually gets addressed.

@jsmarsch

jsmarsch commented Mar 5, 2013

For what it's worth, microbenchmark or not, performance differences like these are enough to scare me away. I'm at the start of a large enterprise application. We're not doing anything realtime, but I can't afford timeouts or a slow app. I really like the Ninject API. Heck, I even like the name! But when we're talking orders of magnitude of performance difference, I'm just not willing to take the risk that it will only be a 1% hit.

@remogloor (Member)

Do you have many implementations of one or more interfaces bound with conditional bindings, or are these all unique service types?

@martinkoslof

Hmmm, this topic seems to be gaining momentum, and I'm curious which specific areas or "advanced" features are the most costly in terms of performance. For example, we use Ninject heavily within an MVC application and most objects are request-scoped. We have leveraged some additional add-ins, such as Ninject.Extensions.Conventions, and I'm OK with extensions and advanced assembly scanning being "slower". In those scenarios you are choosing flexibility over performance (within reason).

Are there particular things we should "avoid" with Ninject to help boost our performance? I'd really like to see more time or conversation around this topic. We have been doing some heavy performance tuning recently; we've already done a massive amount of Entity Framework/LINQ optimization, Razor view pre-compilation, and CSS and JS optimization, and now Ninject/IoC needs to be evaluated as well.

Thanks

@idavis (Member)

idavis commented May 7, 2013

What are you seeing that is slow enough to impact your application? Do you see timeouts? Do you have slow response times? Is it the IoC container? Is the IoC container being used improperly? Have you profiled your app to isolate the cause?

I have never seen a valid case where the container, any container, was the actual cause of performance issues. Almost every issue I ever see is people misusing the container. If someone presents an actual case where Ninject has a real-world performance issue, I'd be happy to look into it. Until then, it's all FUD and microbenchmarks.

-Ian


@martinkoslof

I'm not sure if your comments are directed at me or someone else in the thread, but I'm not sure what in my last post led you to believe that I am having "timeouts" or that I'm misusing the IoC container. I also didn't say Ninject was slow or unusable to the point where my application was worse off. There have been several benchmarking articles written, and other people I talk to have also commented that Ninject is "slower" than other IoC containers.

If 10 solutions exist and yours is deemed the slowest, it doesn't mean it's time to stop using it; it means it is the SLOWEST of those tested. Yes, there are many advantages and extensions to Ninject which I find useful (also noted in my comments). "Real-world performance" also comes down to your expectations. If you are trying to get as optimized as possible, and another IoC container is x times faster, the question isn't "what's wrong" but "how can this be faster as well".

I also didn't say Ninject was too slow for my existing MVC application. ALL aspects are being performance-tested and tweaked, and our IoC implementation is going to be tested next. If another IoC container is faster and provides all the functionality we currently use, we might switch over. Overall I find nothing about this topic irrational or riddled with FUD.

@idavis (Member)

idavis commented May 7, 2013

I was referring to the thread in general; it wasn't directed at you, and I'm sorry that wasn't clear. People have been clamoring for a long time about how slow Ninject is, without any substantial claim that there is actually a problem. The team I work on now uses Castle, and I really don't care. Use the tooling that solves your needs.

-Ian


@ssakharov

I think I have a real-world example of when performance is important for an IoC container.

Some background:
When I joined my current company and started working on the project, it was a big, messy mix of all layers (originally it was designed as active record, but later it grew into something big, with UI-specific stuff leaking through the business logic and data access straight into the DB). On top of that, the system uses a no-longer-supported ORM (Gentle.NET), and not even the latest version of it. One of the main reasons it became like this is that the company grew too fast and the project had to catch up with whatever human resources were available.
By the time I joined, the company had been bought by a bigger one, and change requests grew dramatically in scope and complexity without allowing for a major rewrite of the system itself.

At that point the decision was made that we should start extracting layers of the system without breaking it. The most obvious/easiest one was data access (which is the opposite of where you'd usually start introducing DI). That allowed us to decrease the amount of mess and leave only the service/domain layer and parts of the presentation mixed up. Additionally, we now had a chance to start a transition to a different ORM.

NOTE: all of those changes had to be made to the live system, without breaking it, while it underwent business-related transformation.

To cut out that part, the decision was made to extract data access and inject it using service location backed by Ninject.

Fast forward to the present: while the new projects are designed with DI in mind and do not need service location, the core of the system still uses it very extensively, meaning e.g. that every HTTP request from most of the websites causes thousands of resolution calls. (Sorry, I don't have any numbers at hand, but I saw profiles at some point and they were much larger than 1%, and comparable to some IO calls by total time per business transaction.)

I am not trying to pretend that this is a "correct" usage of the container, whatever that word means. But it is real-world enough for me and was driven by imperfect real-world decisions.

@bartelink (Contributor)

@ssakharov "much larger than 1%". How much larger? % of what? Are you sure the request processing time is definitely not IO bound?

I'd spend lots more time measuring (though really I'd invest the time in reading and re-reading http://blog.ploeh.dk/2011/03/04/Composeobjectgraphswithconfidence/ )

@remogloor (Member)

@ssakharov

I just did live profiling for 5 minutes. It showed that Ninject.Activation.Context.Resolve had 1.1M calls that took 51k seconds to complete, which is close to 66% of CPU time.
I haven't had a chance to look into the problem in more detail yet. Regardless, that does not seem like normal performance to me.

Ninject version: 3.0.1.10

@ssakharov

So, after trying out the perf tryout branch, the following are the profile summaries for the same set of load tests on the app, run on my dev machine. The branch indeed brings some improvement in performance, but nowhere near enough to address the issue.

As far as I can tell, the problem here is lock contention, since when profiling with "Thread Cycle Time" measurements, Context.Resolve times become negligible.

Therefore, at the moment we have decided to follow two routes to try to fix our app's performance:

  1. Change Ninject's Multimap implementation to use ConcurrentDictionary and remove the corresponding locks (a rough sketch of this idea follows below)
  2. Get rid of statically scoped bindings in our web apps and have a kernel per thread for service location
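As a rough illustration of route 1 (using a simplified stand-in rather than Ninject's actual Multimap class), a multimap backed by ConcurrentDictionary replaces the single global lock with short per-key locks:

using System.Collections.Concurrent;
using System.Collections.Generic;

// Simplified sketch only: readers and writers contend on a short per-key lock
// instead of one global lock around the whole map.
public sealed class ConcurrentMultiMap<TKey, TValue>
{
    private readonly ConcurrentDictionary<TKey, List<TValue>> map =
        new ConcurrentDictionary<TKey, List<TValue>>();

    public void Add(TKey key, TValue value)
    {
        List<TValue> values = map.GetOrAdd(key, _ => new List<TValue>());
        lock (values)
        {
            values.Add(value);
        }
    }

    public IReadOnlyList<TValue> Get(TKey key)
    {
        if (!map.TryGetValue(key, out List<TValue> values))
        {
            return new TValue[0];
        }

        lock (values)
        {
            // Copy out under the per-key lock so callers never see a list mid-mutation.
            return values.ToArray();
        }
    }
}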

[profiler screenshot]

Update:
This is how the profile looks with a proof-of-concept implementation of thread-static kernels:

[profiler screenshot]

@Plasma

Plasma commented Aug 18, 2013

Hi,

I've recently benchmarked a few of our services and threw hundreds of requests against a server.

Using Ninject 3.0.10.1, Context.Resolve() takes up an abnormally large amount of time.

I can't see any of my own code in the profile (the page being loaded is supposed to be a simple blank page, but it has dependencies to be resolved).

High-level time taken:
[profiler screenshot]

And when diving into Context.Resolve:
[profiler screenshot]

@ssakharov does this look similar to yours? Any thoughts?

I also notice there's a lot of GC time; I'm not sure if this is Ninject's fault.

@Mystere (Author)

Mystere commented Aug 19, 2013

ssakharov is definitely on to something here. The locks seem to be a huge part of the problem.

As an unsafe test, I did nothing but remove the locks around the MultiMap objects and re-ran the Palmmedia benchmark test, and performance more than doubled across the board.

I think we need to look into the locking strategy in use here and figure out why it's blocking so much.

@ssakharov

Could you try #97 on your solutions? It seems pretty relevant to me, and apparently it was not merged into the perf tryouts branch when I tried it out. If I understand the code correctly, it should reduce locking a lot if you're mostly using transient scope.

@Plasma

Plasma commented Aug 19, 2013

#97 is interesting. As part of my load testing - using a mix of transient and singleton scopes only - I sometimes seemed to get deadlocks (CPU was at 0%, but IIS threads were all backed up).

@ssakharov do you think #97 would also affect non-request scopes?

@danielmarbach (Member)

By the way, guys, Remo Gloor is on vacation for the next two weeks and will not be very responsive. Just so you know.


@danielmarbach (Member)

The major pain point we currently have is that Ninject's architecture allows bindings to be added at runtime. That is why it requires locks. Of course there is room for improving the current performance, as you have already spotted, but if we could somehow change the kernel internally so that it builds up a "read-only" kernel, we wouldn't require locks, and that would dramatically improve performance. We would like to build this without breaking the public API and behavior. So the current kernel would need to build up a read-only internal kernel, and as soon as someone adds a binding after that build-up, it would throw that one away and rebuild the whole read-only kernel plus the new binding.
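A minimal sketch of that rebuild-on-add idea (illustrative only; these names are made up and this is not the real kernel internals): additions are rare and pay for a full rebuild under a lock, while the hot resolution path reads the last published map without taking any lock.

using System;
using System.Collections.Generic;

// Illustrative sketch: the "read-only internal kernel" is modeled as a dictionary that is
// rebuilt and republished whenever a binding is added after the initial build-up.
public sealed class RebuildingBindingStore
{
    private readonly object writeLock = new object();
    private readonly List<KeyValuePair<Type, Func<object>>> allBindings =
        new List<KeyValuePair<Type, Func<object>>>();

    private volatile Dictionary<Type, Func<object>> snapshot =
        new Dictionary<Type, Func<object>>();

    public void AddBinding(Type service, Func<object> factory)
    {
        lock (writeLock) // adding bindings is rare, so the rebuild cost is paid here
        {
            allBindings.Add(new KeyValuePair<Type, Func<object>>(service, factory));

            var rebuilt = new Dictionary<Type, Func<object>>();
            foreach (var binding in allBindings)
            {
                rebuilt[binding.Key] = binding.Value;
            }

            snapshot = rebuilt; // publish by reference swap; readers never see a half-built map
        }
    }

    public Func<object> TryGetFactory(Type service)
    {
        // Lock-free read of the last published read-only map.
        Func<object> factory;
        return snapshot.TryGetValue(service, out factory) ? factory : null;
    }
}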


@Plasma

Plasma commented Aug 21, 2013

I'm not sure how frequently other people add bindings at runtime, but I do it once, during a static constructor; it adds about 30 bindings at once and never again.

@Plasma

Plasma commented Sep 20, 2013

I have made some performance improvements using ConcurrentDictionary in #102 (all tests pass).

@BrunoJuchli (Contributor)

During the boot-up phase we are frequently adding bindings. The boot-up phase can last several seconds. With our bootstrapping we do (or should) know when all bindings are complete, but by that point Ninject has already handled quite a few requests.

@danielmarbach
So I would suggest that the kernel initially has a modifiable "binding repository" which is then converted to an unmodifiable "binding repository" on an external call, .Snapshot(). Like .Compact() on a database ;-)

Of course, even this does not work entirely without locking. But instead of locking read access to the "binding repository", one can lock Snapshot() against write access to the "binding repository" (i.e. creating a binding).

Advantages:

  • full backwards compatibility (if one doesn't call Snapshot(), the behavior is the same as before).
  • Ninject does not need elaborate code to figure out at which point it makes sense to convert the "binding repository" to an unmodifiable one
  • converting from an unmodifiable back to a modifiable repository might be costly?

Drawbacks:

  • the user cannot Snapshot and then add bindings again. Or one adds support for that, but that would make the implementation more complex.
  • For users who don't use Snapshot(), performance will decrease, since there are now locks on both binding creation and binding resolution.

For most applications, binding resolution is probably executed far more often than binding creation, so the change should result in a performance improvement.
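As a minimal sketch of the Snapshot() idea in the first (throwing) variant, with all type names invented for illustration: before Snapshot() the repository behaves as today (locked reads and writes); after Snapshot() reads go against a frozen copy without any lock.

using System;
using System.Collections.Generic;

// Illustrative sketch: Snapshot() and binding creation lock against each other;
// binding resolution is lock-free once Snapshot() has been called.
public sealed class SnapshotBindingRepository
{
    private readonly object writeLock = new object();
    private readonly Dictionary<Type, List<Func<object>>> mutable =
        new Dictionary<Type, List<Func<object>>>();

    private volatile Dictionary<Type, Func<object>[]> frozen; // null until Snapshot() is called

    public void Add(Type service, Func<object> factory)
    {
        lock (writeLock)
        {
            if (frozen != null)
            {
                throw new InvalidOperationException("Bindings cannot be added after Snapshot().");
            }

            List<Func<object>> list;
            if (!mutable.TryGetValue(service, out list))
            {
                mutable[service] = list = new List<Func<object>>();
            }

            list.Add(factory);
        }
    }

    public void Snapshot()
    {
        lock (writeLock)
        {
            var copy = new Dictionary<Type, Func<object>[]>();
            foreach (var pair in mutable)
            {
                copy[pair.Key] = pair.Value.ToArray();
            }

            frozen = copy;
        }
    }

    public IReadOnlyList<Func<object>> Get(Type service)
    {
        Dictionary<Type, Func<object>[]> map = frozen;
        if (map != null)
        {
            // Fast path after Snapshot(): no lock at all.
            Func<object>[] factories;
            return map.TryGetValue(service, out factories) ? factories : new Func<object>[0];
        }

        lock (writeLock)
        {
            // Before Snapshot(): keep the current locked behavior.
            List<Func<object>> list;
            return mutable.TryGetValue(service, out list)
                ? (IReadOnlyList<Func<object>>)list.ToArray()
                : new Func<object>[0];
        }
    }
}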

@danielmarbach (Member)

We thought about having a read-only kernel, similar to what you describe, but determined by the kernel class you instantiate. That would certainly improve perf. By the way, there is a performance branch Remo did; you can try that out. The largest issues are actually around conditional bindings: there is almost no way to precompute and cache them. So not only would users be unable to add bindings once the kernel has all bindings loaded and is armed, there could also be no conditionals. Urs and I did a lot of evangelizing internally, and it looks like we got budget to get more love into Ninject. We'll keep you posted.


@danielmarbach (Member)

Forgot to mention:

Another approach would be to strip the binding builder out of the kernel. The user uses the binding builder and passes the built bindings to the kernel. The kernel never modifies them.


@chaim1221

Any updates on this? I'm thinking about using the -pre and was wondering if the performance changes made it in.

@Telavian

Telavian commented Apr 2, 2014

I'm also very curious whether there has been any work on this. I used Ninject for a WPF project a while ago and never saw any issues, but I have heard from other developers that they won't use Ninject because it is "slow".

@Plasma

Plasma commented Apr 2, 2014

I switched to another dependency injection framework, as my improvements still weren't enough to stop Ninject from showing up in the dotTrace profiler as the first- or second-longest-running code point when my ASP.NET app was under load.

I just bit the bullet and spent a few hours moving everything across, and I've had no problems since; even under load I don't see the DI system appearing in any traces.

@Plasma

Plasma commented Apr 2, 2014

Also, with regard to being "slow", I really mean it: it's not 2 ms slower, it's 1000+ ms slower under load.

@michaelaird

@Plasma out of curiosity, what did you switch to?

@Plasma

Plasma commented Apr 4, 2014

Simple Injector; it had near feature parity and was benchmarked as one of the fastest (google for .NET IoC benchmarks).


@martinkoslof

I also started using Simple Injector for my MVC web apps. I think it does a good job while remaining performant.

@Telavian

Telavian commented Apr 5, 2014

I looked at Autofac; is it similar?

@danielmarbach (Member)

You guys know that there is a feature branch with perf improvements, and that we are working on more improvements?


@chaim1221

Daniel, does the -pre package from NuGet include those improvements?

@BrunoJuchli (Contributor)

@scott-xu
Regarding #176 and your remark that there would be room for joining work on this "ReadOnlyKernel" feature:
What do you think about my earlier comments on a read-only kernel achieved via something like .Snapshot()? I think we could retain a lot of the performance improvements and still offer flexibility for everyone who doesn't want to use a read-only kernel.

Would that be an option? If so, I would try to get started on an implementation. Should I base it on master or another branch?

@cr7pt0gr4ph7

@BrunoJuchli @scott-xu
I'm very much in favor of something like .Snapshot(), because that's what is really wanted. Automatic change detection would be even better, e.g. using optimistic version numbers; see below.
Most of the time in Resolve() is probably spent searching for bindings that match the constructor, then searching for bindings that match the parameters of those bindings, and so on.

The current process goes like this (I'm concentrating on constructor injection using StandardProvider for now):

  1. KernelBase.Resolve:
    • Looks for a binding that matches in the current context
    • Creates an instance of Ninject.Activation.Context and calls Context.Resolve()
  2. Context.Resolve
    • Determines the scope by looking at the request or, if the former has no associated scope, the matching binding (using IBinding.ScopeCallback, passing in the current context).
    • Lock on the scope and call Context.ResolveInternal
  3. Context.ResolveInternal (executed while a lock on the scope is held!):
    • Queries Ninject.Activation.Caching.ICache to get the already activated instance for the current scope
    • Gets the provider that matches the current context (using IBinding.ProviderCallback)
    • Call IProvider.Create(IContext)
  4. StandardProvider.Create:
    • ...uses the ConstructorScorer to select an appropriate constructor, thus yielding an ConstructorInjectionDirective.
    • ...get the injection targets (i.e. the constructor arguments) from the ConstructorInjectionDirective and call StandardProvider.GetValue to resolve them, which in turn:
      • Looks if the context contains an IConstructorArgument,
      • Or, if not, calls ITarget.ResolveWithin.
  5. Target.ResolveWithin
    • Calls IRequest.CreateChildRequest, and
    • IKernel.Resolve(IRequest) to resolve the request.

So, to improve performance, much of the information from the steps above can be cached:

  1. KernelBase.Resolve
    • Look for an ICachedActivationPlan for the given type and context, which already contains:
      • The binding to activate.
      • The ICachedActivationPlan instances for the parameters required by the binding.
  2. Context.Resolve
    • (Use a special case for the transient scope that avoids a delegate call to ScopeCallback when the callback always returns null anyway (e.g. via delegate == null)?)
    • Do not take a coarse lock here; instead, take more granular locks in Context.ResolveInternal.
  3. Context.ResolveInternal
    • For non-scoped singletons:
      • Note that locks are taken on this instance of ICachedActivationPlan
      • Note that the cached instance for a binding is stored in an instance field of ICachedActivationPlan, which is protected using the double-null-check idiom (see below):
    • For transient bindings:
      • No locking of the cachedInstance field needed.
    • Bindings with custom scopes (e.g. HttpContext):
      • These currently pose a problem.
  4. StandardProvider.Create
    • Determine the constructor to be used at the time when ICachedActivationPlan is created.
    • For each constructor parameter, use the binding for that injection target that is already cached in the ICachedActivationPlan, and also represented as an ICachedActivationPlan.

For change detection, one could use an optimistic versioning approach, such that a counter is incremented whenever the kernel transitions from "unmodified" to "dirty". This way, binding caches (that contain instances of ICachedActivationPlan) can check "their" version number against the current one, and only update their data (or take a lock on the kernel's data structures) when really necessary.

The main point is to cache the activation plan for a binding, and let it contain pointers to the activation plans of other related bindings, so that searches in the internal data structures are avoided, as well as avoiding coarse locks over the whole kernel / the whole scope / a certain binding, even when viewed from different scopes.

@BrunoJuchli
I've got a version that contains the latest changes from #142, but excludes the readonly kernel, over at [https://github.com/cr7pt0gr4ph7/Ninject/tree/pull-request/old-kernel], if you want to experiment with the design as it was before the introduction of IReadonlyKernel.


Double null check

if (instance == null)            // first check, done without taking the lock
{
    lock (this)                  // (a dedicated private lock object is usually preferable to "this")
    {
        if (instance == null)    // second check, now under the lock
        {
            instance = new ...();
        }
    }
}
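And a small, hypothetical sketch of the optimistic version check described above (TPlan stands in for a cached activation plan; none of this is existing Ninject code): the plan is stored together with the kernel version it was built against in one immutable holder, so a stale plan can never be paired with a fresh version number.

using System;
using System.Threading;

// Hypothetical sketch: a per-binding cache that rebuilds its plan only when the
// kernel's version counter has changed since the plan was built.
public sealed class VersionedPlanCache<TPlan> where TPlan : class
{
    private sealed class Entry
    {
        public readonly int Version;
        public readonly TPlan Plan;

        public Entry(int version, TPlan plan)
        {
            Version = version;
            Plan = plan;
        }
    }

    private Entry current;

    public TPlan GetOrBuild(int currentKernelVersion, Func<TPlan> buildPlan)
    {
        Entry entry = Volatile.Read(ref current);
        if (entry != null && entry.Version == currentKernelVersion)
        {
            return entry.Plan; // fast path: the kernel has not changed, reuse the plan
        }

        // Slow path: the kernel was modified; rebuild and publish, losing races gracefully.
        var rebuilt = new Entry(currentKernelVersion, buildPlan());
        Interlocked.CompareExchange(ref current, rebuilt, entry);
        return rebuilt.Plan;
    }
}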

@cr7pt0gr4ph7

One idea for realizing snapshots would be to use System.Collections.Immutable, smartly combining immutable collections and their builders, so that locks can be completely avoided in the fast path.

The idea would be that, at the start of a resolution operation, the current state of the kernel's data structures is stored in the IRequest (or IContext, maybe) and passed down to child resolution operations.

Modifications of the kernel are done by swapping out the pointer to the data structure with an atomic operation (note that loading and storing a variable that holds an object reference is always atomic according to the CLR specification).

Possibly problematic is the scenario of concurrent modifications, but this could be handled by taking a lock only for threads wishing to modify the IKernel. The other problematic area is the resolution cache; see my comment above for more information.
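A rough sketch of that idea, assuming the System.Collections.Immutable package (the store and member names here are invented for illustration): each resolution captures the current immutable map once, writers serialize among themselves, and readers never block.

using System;
using System.Collections.Immutable;
using System.Threading;

// Illustrative sketch: modifications publish a new immutable map via an atomic reference
// swap; resolution operations pin the map they started with.
public sealed class ImmutableBindingStore
{
    private readonly object writerLock = new object();
    private ImmutableDictionary<Type, Func<object>> bindings =
        ImmutableDictionary<Type, Func<object>>.Empty;

    // Called at the start of a resolution operation; the returned map would be stored in
    // the request/context and reused for the whole child-resolution tree.
    public ImmutableDictionary<Type, Func<object>> CaptureSnapshot()
    {
        return Volatile.Read(ref bindings);
    }

    public void AddBinding(Type service, Func<object> factory)
    {
        lock (writerLock) // only concurrent writers take this lock; readers never do
        {
            ImmutableDictionary<Type, Func<object>> updated = bindings.SetItem(service, factory);
            Volatile.Write(ref bindings, updated); // atomic publish of the new map
        }
    }
}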

@danielmarbach (Member)

What about conditional bindings?


@danielmarbach (Member)

Interesting read http://ayende.com/blog/164739/immutable-collections-performance


@BrunoJuchli (Contributor)

Sadly, I wasn't able to spend as much time on this topic as I'd wished. The good side, however, is that it gave me some more time to think things over.

So the following section contains some more or less coherent ramblings. Feel free to jump to the next section to read my (preliminary) conclusions!

So regarding requirements, I've come to the conclusion that an interface like

IKernel newKernel = kernel.Compact();

would be very cumbersome to maintain, because the application would need to update all references to the kernel. So I think the kernel itself should not change; rather, the binding information it uses should change. An interface like

kernel.Compact();

which doesn't replace the kernel would still be an option. There would then be two variants:

  • after a Compact(), attempting to change or add bindings throws an InvalidOperationException (so users who want to keep changing the kernel at runtime can't make use of Compact(); the same goes for users who don't have a clear point in time where kernel build-up is complete).
  • Compact() is allowed to be called multiple times. Adding bindings after a Compact() basically "uncompacts" the kernel until the next Compact() is called.
    • we could also potentially perform the Compact() call internally instead of leaving that responsibility to the user. Issue: when? After each added binding? After each loaded module?

In the end, I think this is a rather expensive solution to implement, because binding retrieval would need to support the "uncompacted" as well as the "compacted" state. Except, of course, if we create a fully transactional system where only the "committed" bindings can be used. This poses another issue with the current Ninject interface: IKernel inherits IBindingRoot and IResolutionRoot. So the atomicity of a transaction would need to be a single binding, otherwise it's unclear which state you're retrieving a binding from (warning: such a small atomicity would be detrimental to application startup performance!). Imagine:

class MyModule : NinjectModule
{
    public override void Load()
    {
        Bind<IFoo>().To<Foo>();
        var fooInstance = Kernel.Get<IFoo>(); // resolve while the module is still loading
    }
}

This can only work if the Bind is committed immediately. And here comes the next issue: is the binding even finished at that point?

    public override void Load()
    {
        var syntax = Bind<IFoo>().To<Foo>();
        var fooInstance = Kernel.Get<IFoo>();
        syntax.WhenInjectedInto<SomeType>(); // the binding is modified after it was already used
    }

There's actually no clear definition of when a binding is complete!

Conclusion

Separating "kernel usage" from "kernel buildup" phases would be beneficial because:

  • buildup: allows for more coarse transactions, which improves startup performance
  • buildup: transaction boundaries are clearly defined. It is well defined when a binding definition is complete. Once the buildup ends, you can't change the bindings any more.
  • usage: improves the "default" performance, because there are no explicit calls like Compact() to be made to get the improvement. You also don't really have to think much about this topic, because it comes naturally.

How about something like:

public interface IBindingRoot
{
    void Load<TModule>() where TModule : NinjectModule;
    /* more overloads for loading modules */

    IBindingSyntax<T> Bind<T>();
    /* more overloads for creating bindings */
}

public interface IInitialKernelBuilder : IBindingRoot
{
    /* creates the kernel with the initial binding map */
    IKernel Build();
}

public interface IKernelExpander : IBindingRoot
{
    /* replaces the kernel's binding map with Interlocked.Exchange */
    void Commit(); // alternative: inherit from IDisposable to enable use of "using"
}

public interface IKernel : IResolutionRoot
{
    IKernelExpander AddBindings();
}

public class InitialKernelBuilder : IInitialKernelBuilder
{
    public InitialKernelBuilder(KernelConfiguration configuration)
    {
        // similar to new StandardKernel(config);
    }
}

usage:

var kernelBuilder = new InitialKernelBuilder(new KernelConfig { LoadExtensions = true });
kernelBuilder.Load<BusyIndicationModule>();
kernelBuilder.Load<FooModule>();
kernelBuilder.Bind<IBar>().To<Bar>();

var kernel = kernelBuilder.Build();

var bar = kernel.Get<IBar>();
bar.DoSomeBarStuff();

var kernelExpander = kernel.AddBindings();
kernelExpander.Bind<IDelicious>().To<Delicious>();
kernelExpander.Commit();

// this should throw:
kernelExpander.Bind<IFoolicious>().To<WontDo>();

I think this way we can keep on supporting the same scenarios the old kernel supported but introduce better performance without making the new implementation overly complex.

What do you guys think?

@scott-xu (Member)

still thinking...

@scott-xu (Member)

Hi folks, I'd propose backing out ReadOnlyKernel, as it still has a long way to go.
I'm thinking of drafting a release called 3.3.0 that is mainly focused on bug fixes and platform updates.

@BrunoJuchli (Contributor)

BrunoJuchli commented Sep 21, 2017

@scott-xu
I would agree that for now bug fixes and a platform update (which I interpret as .NET Standard/.NET Core support) are much more important.

Regarding performance optimization, I believe it's highly likely that an "optimal" solution will also require some interface changes, which means a new major release. The question is whether the platform change should "politically" be 4.0.0, with the performance optimizations then becoming 5.0.0.
With a .NET Standard 4.0.0 release one would show "progress". Given a new 3.3 release, people might believe it's just a minor bugfix update with no major new things, and I believe .NET Standard support is a major new thing.

@Mystere (Author)

Mystere commented Sep 22, 2017

Would it maybe make more sense to do a re-write from scratch?

@scott-xu (Member)

@BrunoJuchli There's no breaking API change. According to Semantic Versioning, there's no need to bump the major version.

@qstarin

qstarin commented Apr 21, 2018

Has anyone found a registration/dependency pattern with Ninject that's performant for resolving conditional dependencies?

We have interfaces that have multiple implementations where the specific implementation resolved depends on things like configuration values that can change at any time or based on parameters in the web request. Most of these are registered ToMethod which checks a condition and resolves a specific class with IContext.Kernel.Get. Many of these are also registered InScope to the current thread or http context object.

Resolving these dependencies consumes a lot of time and literally costs us real money in increased compute resource needs. Our main application serves 20M-30M requests per day, and over the years I've profiled multiple pieces of functionality in our app where Ninject resolution consumes up to 50% of CPU time.

Here's an example of the types of registrations that resolve especially slowly.

builder.Kernel.Bind<CacheDiagnostics>().ToSelf().InRequestOrTransientScope();
builder.Kernel.Bind<NullCacheDiagnostics>().ToSelf().InSingletonScope();
builder.Kernel.Bind<Func<ICacheDiagnostics>>()
        .ToMethod(c => {
            var ctx = c;
            return () => CacheDiagModeFromHttpContext()
                            ? ctx.Kernel.Get<CacheDiagnostics>()
                            : (ICacheDiagnostics)ctx.Kernel.Get<NullCacheDiagnostics>();
        })
        .InSingletonScope();

builder.Kernel.Bind<ICache>().ToMethod(ctx => {
    switch (CacheTypeFromHttpContext()) {
        case "redis": return ctx.Kernel.Get<IRedisCache>();
        case "memory": return ctx.Kernel.Get<IMemoryCache>();
        default: return ctx.Kernel.Get<MultiLayerCache>();
    }
}).InRequestOrTransientScope();
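One mitigation, shown only as a hedged sketch that reuses the hypothetical types from the example above (and only valid where the candidate implementations can safely be treated as singletons, which the request-scoped registrations above are not guaranteed to be): resolve each candidate once when the Func binding is activated, so the per-call choice becomes a plain delegate dispatch rather than a ctx.Kernel.Get<T>() on every resolution.

// Sketch only: assumes IRedisCache, IMemoryCache and MultiLayerCache are safe as singletons.
// Each candidate is resolved exactly once, when this binding is activated; the per-call
// configuration check remains, but no container resolution is paid for it.
builder.Kernel.Bind<Func<ICache>>()
    .ToMethod(ctx =>
    {
        var redis = ctx.Kernel.Get<IRedisCache>();
        var memory = ctx.Kernel.Get<IMemoryCache>();
        var layered = ctx.Kernel.Get<MultiLayerCache>();

        return () =>
        {
            switch (CacheTypeFromHttpContext())
            {
                case "redis": return redis;
                case "memory": return memory;
                default: return layered;
            }
        };
    })
    .InSingletonScope();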

@Plasma

Plasma commented Apr 22, 2018 via email

@niwrA

niwrA commented Apr 24, 2018

Same here ...

@drieseng (Member)

We've made great advancements in the 4.0 release cycle.
Please submit a new issue if you have specific areas that need to be improved.

@michaelaird

michaelaird commented Mar 23, 2019 via email

@AndrewSav

Today, in August 2020, I think we can safely assume that Ninject as a whole is no longer relevant.

@Telavian

Telavian commented Aug 17, 2020 via email

@drieseng (Member)

I spent quite a bit of time optimizing (see #276) and refactoring Ninject more than a year ago. Back then I was the only active contributor with spare cycles, and I really needed someone to have (design) discussions with.

If another experienced dev (@scott-xu, I'm looking at you :p) is available and there's enough interest from the community, we can surely revive the effort. I'm OK with a complete reboot.

@michaelaird

There hasn't been a commit in 6 months. There hasn't been a release in 3 years. Even though Ninject is in daily use in a lot of software (including the project I work on), I think we can agree it's "done".
