Fix resolution performance issues #84
If you look only at the performance of the IoC container in isolation, then those performance tests are correct. What none of them does is investigate the actual impact on a real application. For example, if I add MiniProfiler to an ASP.NET MVC application that doesn't even have any db access, I can see on my computer that the impact of the IoC container is far less than 10%, more like 1% or even less. That impact makes it not even worth thinking about which IoC container to use, because if you get to the point where you have performance issues, you will have them with all other IoC containers as well. Ninject has some unique features, like conditional bindings and scoping to the lifetime of other objects, that cost a lot of performance. Removing them would boost Ninject by orders of magnitude, but then we would have just the features of any of the tiny DI frameworks, and without those features there would be no reason to continue developing Ninject at all, because you could take any of many others. Therefore we prefer to keep those advanced features and accept the disadvantage of being slower. What would be possible, though, is a complete rewrite of the resolving mechanism to do precalculated resolves for all the subtrees that do not use any conditional bindings. But that would be a huge amount of work, and many other things have a higher priority. Sure, the current implementation has potential for improvement too, but you will never get near the tiny DI frameworks. In conclusion: you have to choose what matters more, squeezing out the last 1% of additional performance, or paying a bit and getting more advanced features. |
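For readers wondering what "precalculated resolves" could look like in practice, here is a minimal, hypothetical sketch (not Ninject's actual implementation): a subtree whose bindings are all unconditional can be compiled once into a plain delegate using expression trees, so later resolves avoid reflection and dictionary lookups entirely.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical helper (not Ninject code): for a dependency subtree with no
// conditional bindings, the whole "new Service(new Dep(), ...)" chain can be
// compiled once into a delegate so later resolves skip the planning pipeline.
static class PrecompiledResolver
{
    // bindingFor maps a service type to its single, unconditional implementation type.
    public static Func<object> Compile(Type service, Func<Type, Type> bindingFor)
    {
        var body = Expression.Convert(Build(service, bindingFor), typeof(object));
        return Expression.Lambda<Func<object>>(body).Compile();
    }

    private static Expression Build(Type service, Func<Type, Type> bindingFor)
    {
        var ctor = bindingFor(service).GetConstructors().Single();
        var args = ctor.GetParameters().Select(p => Build(p.ParameterType, bindingFor));
        return Expression.New(ctor, args);
    }
}

// Usage (types and map are placeholders):
//   var create = PrecompiledResolver.Compile(typeof(IFoo), t => map[t]);
//   var foo = create(); // no per-resolve reflection, no locks
```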
@remogloor could you blog this on planetgeek?
|
While I appreciate your point, I think it is a bit short-sighted. Yes, Ninject may be only 1% of an application run on normal hardware, but many of us are running apps on very high-end servers that need fast response times. We tune the apps to remove all the normal bottlenecks, and what that does is raise the percentage of time spent resolving objects by a significant amount. Add in the fact that you may have rather complex call graphs, each with their own sets of objects that need to be resolved, and it can become quite extensive. If Ninject is truly 10-100x slower than other common containers (and I'm not talking about the tiny ones, I'm referring to Unity, Windsor, etc.), then that cost compounds as the number of objects you resolve grows. Hell, I'd be happy if we could improve it by 50%. Please, let's keep this open so that it can be something that eventually gets addressed. |
For what it's worth, microbenchmark or not, performance differences like these are enough to scare me away. I'm at the start of a large enterprise application. We're not doing anything realtime, but I can't afford timeouts or a slow app. I really like the Ninject API. Heck, I even like the name! But we're talking orders of magnitude of performance difference; I'm just not willing to bet that it will only be a 1% hit. |
Do you have many implementations of one or more interfaces and bind them with conditional bindings, or are these all unique service types? |
Hmmmm, this topic seems to be gaining momentum, and I'm curious which specific areas or "advanced" features are the most costly in terms of performance. For example, we use Ninject heavily within an MVC application and most objects are request scoped. We have leveraged some additional add-ins, such as Ninject.Extensions.Conventions, and I'm OK with extensions and advanced assembly scanning being "slower"; in those scenarios you are choosing flexibility over performance (within reason). Are there particular things we should avoid with Ninject to help boost our performance? I'd really like to see more time and conversation around this topic. We have been doing some heavy performance tuning recently: we've already done a massive amount of Entity Framework/LINQ optimization, Razor view pre-compilation, and CSS and JS optimization, and now Ninject/IoC is going to be evaluated as well. Thanks |
What are you seeing that is slow enough to impact your application? I have never seen a valid case where the container, any container, was the bottleneck. -Ian
|
I'm not sure if your comments are directed towards me or someone else in the thread, but I'm not sure what wording in my last post led you to believe I am having "timeouts", or that I'm misusing the IoC container. I also didn't say Ninject was slow or unusable to the point where my application was worse off. There have been several benchmarking articles written, and other people I talk to have also commented that Ninject is "slower" than other IoC containers. If 10 solutions exist and yours is deemed the slowest, it doesn't mean it's time to stop using it; it means it is the SLOWEST of those tested. Yes, there are many advantages and extensions to Ninject which I find useful (also noted in my comments). "Real world performance" also comes down to your expectations. If you are trying to get as optimized as possible, and another IoC container is x times faster, the question isn't "what's wrong" but "how can this be faster as well". I also didn't say Ninject was too slow for my existing MVC application. ALL aspects are being performance tested and tweaked, and our IoC implementation is going to be tested next. If another IoC container is faster and provides all the functionality we currently use, we might switch over. Overall I find nothing about this topic irrational or riddled with FUD. |
I was referring to the thread in general; it wasn't directed at you. -Ian
|
I think I have a real-world example of when performance is important for an IoC container. Some background: at some point the decision was made that we should start extracting layers of the system without breaking it. The most obvious/easiest one was data access (which is the opposite of where you'd usually start introducing DI). That allowed us to reduce the amount of mess and leave only service/domain and parts of presentation mixed up. Additionally, we now had a chance to start a transition to a different ORM. NOTE: all of those changes had to be made to the live system, without breaking it, while it underwent business-related transformation. To cut that part out, the decision was made to extract data access and inject it using service location backed by Ninject. Fast forward to the present: while the new projects are designed with DI in mind and do not need service location, the core of the system is still using it very extensively. Meaning, e.g., every HTTP request on most of the websites will cause thousands of resolution calls. (Sorry, I do not have any numbers at hand, but I saw profiles at some point and they were much larger than 1% and comparable to some IO calls by total time per business transaction.) I am not trying to pretend that this is a "correct" usage of the container, whatever that word means. But it is real-world enough for me and was driven by imperfect real-world decisions. |
@ssakharov "much larger than 1%". How much larger? % of what? Are you sure the request processing time is definitely not IO bound? I'd spend lots more time measuring (though really I'd invest the time in reading and re-reading http://blog.ploeh.dk/2011/03/04/Composeobjectgraphswithconfidence/ ) |
Feel free to use this at your own risk if performance is highly important: https://github.com/ninject/ninject/tree/PerformanceTryouts |
Just did live profiling for 5 minutes. It showed that Ninject.Activation.Context.Resolve had 1.1m calls that took 51k seconds to complete, which is really close to 66% of CPU time. Ninject version: 3.0.1.10 |
So, after trying out the perf tryouts branch, the following are the profile summaries for the same set of load tests on the app, run on my dev machine. The branch indeed brings some improvements in performance, but nowhere close enough to address the issue. As far as I can tell, the problem here is lock contention, since when profiling with "Thread Cycle Time" measurements, Context.Resolve times become negligible. Therefore, at the moment we have decided to follow two routes to try to fix our app's performance:
Update: |
Hi, I've recently benchmarked a few of our services and threw hundreds of requests against a server. Using Ninject 3.0.10.1, Context.Resolve() takes up an abnormally large amount of time. I can't see any of my own code being used in the profiling (the page being loaded is supposed to be a simple blank page, but it has dependencies to be resolved). And when diving into Context.Resolve: @ssakharov does this look similar to yours / any thoughts? I also notice there's a large GC time; not sure if this is Ninject's fault. |
ssakharov is definitely on to something here. The locks seem to be a huge part of the problem. As an unsafe test, I did nothing but remove the locks around the MultiMap objects and re-ran the Palmmedia benchmark test, and performance more than doubled across the board. I think we need to look into the locking strategy in use here and figure out why it's blocking so much. |
Could you try #97 on your solutions? It seems pretty relevant to me, and apparently it was not merged into the perf tryouts branch when I tried it out. If I understand the code correctly, it should reduce locking a lot if you are mostly using transient scope. |
#97 is interesting, as part of my load testing - using a mix of transient and singleton scopes only - I sometimes seemed to get deadlocks (CPU was 0%, but IIS threads were all backed up). @ssakharov do you think #97 would also affect non-request scopes? |
By the way guys, Remo Gloor is on vacation for the next two weeks and will not be very responsive. Just so you know. |
The major pain point we currently have is that Ninject's architecture allows bindings to be added at runtime. That is why it requires locks. Of course there is room for improving the current performance, as you guys already spotted, but if we could somehow change the kernel internally so that it builds up a "readonly" kernel, we wouldn't require locks and this would dramatically improve performance. We would like to build this without breaking the public API and behavior. So the current kernel would need to build up a readonly internal kernel, and as soon as someone adds a binding after that build-up, it would throw that one away and rebuild the whole readonly kernel plus the new binding. |
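A minimal sketch of that rebuild-on-modification idea, with hypothetical types and members (this is not Ninject's internal code): resolution reads an immutable snapshot without locking, and adding a binding after the first resolve simply invalidates the snapshot so it gets rebuilt on the next lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical: maps a service type to its registered implementation types.
public sealed class SnapshotKernel
{
    private readonly object writeLock = new object();
    private readonly Dictionary<Type, List<Type>> bindings = new Dictionary<Type, List<Type>>();
    private volatile ReadOnlyDictionary<Type, Type[]> snapshot; // rebuilt lazily after changes

    public void AddBinding(Type service, Type implementation)
    {
        lock (writeLock)
        {
            if (!bindings.TryGetValue(service, out var list))
                bindings[service] = list = new List<Type>();
            list.Add(implementation);
            snapshot = null; // invalidate: the next resolve rebuilds the readonly view
        }
    }

    public IReadOnlyList<Type> GetBindings(Type service)
    {
        // Readers may briefly see the previous snapshot while a write is in flight;
        // that is acceptable for this sketch.
        var current = snapshot ?? Rebuild();
        return current.TryGetValue(service, out var result) ? result : Array.Empty<Type>();
    }

    private ReadOnlyDictionary<Type, Type[]> Rebuild()
    {
        lock (writeLock)
        {
            var copy = new Dictionary<Type, Type[]>();
            foreach (var pair in bindings)
                copy[pair.Key] = pair.Value.ToArray();
            return snapshot = new ReadOnlyDictionary<Type, Type[]>(copy);
        }
    }
}
```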
I'm not sure how frequently other people are adding bindings at runtime, but I do it once during a static constructor, adding about 30 bindings at once, and never again. |
I have done some performance improvements using ConcurrentDictionary at #102 (all tests pass) |
During the boot-up phase we are frequently adding bindings. The boot-up phase can last several seconds. With our bootstrapping we do (or should) know when all bindings are complete, but that is certainly after Ninject has already handled quite a few requests. @danielmarbach Of course even this does not work completely without locking. But instead of locking read access to the "binding repository", one can lock only binding creation.
Advantages:
Drawbacks:
For most applications, binding resolving is probably executed far more often than binding creation; the change should thus result in a performance improvement. |
We thought about having a readonly kernel, similar to what you described but determined by the kernel class you instantiate. That would certainly improve perf. By the way, there is a performance branch Remo did; you can try that out. The largest issues are actually around conditional bindings: there is almost no way to precompute and cache them. So not only would users be unable to add bindings once the kernel has all bindings loaded and is armed, there could also be no conditionals. Urs and I did a lot of evangelizing internally, and it looks like we got budget to put more love into Ninject. We'll keep you posted
|
Forgot to mention: another approach would be to strip the binding builder out of the kernel. The user uses the binding builder and passes the built bindings to the kernel. The kernel never modifies those.
|
Any updates on this? I'm thinking about using the -pre and was wondering if the performance changes made it in. |
I also am very curious if there has been any work on this. I used Ninject for a WPF project a while ago and never saw any issues, but have heard from other developers that they won't use Ninject because it is "slow". |
I switched to another dependency injection framework, as my improvements still weren't enough to stop Ninject from showing up in the dotTrace profiler as the first or second longest-running code point when my ASP.NET app was under load. I just bit the bullet and spent a few hours moving everything across, and have had no problems since; even under load I don't see the DI system appearing in any traces. |
Also, with regards to being "slow", I really mean it: it's not 2 ms slower, it's 1000 ms+ slower under load. |
@Plasma out of curiosity, what did you switch to? |
Simple Injector; it had near feature parity and was benchmarked as one of the fastest (google for ".NET IoC benchmarks").
|
I also started using Simple Injector for my MVC web apps. I think it does a good job while remaining performant. |
I looked at Autofac, is it similar?
|
You guys know that there is a feature branch with perf improvements, and we are working on more.
|
Daniel, does the -pre package from NuGet include those improvements? |
@scott-xu Would that be an option? If so I would try getting started with an implementation. Should I base this on master or another branch? |
@BrunoJuchli @scott-xu The current process goes like this (I'm concentrating on constructor injection using StandardProvider for now):
So, to improve performance, much of the information from the steps above can be cached:
For change detection, one could use an optimistic versioning approach, such that a counter is incremented whenever the kernel transitions from "unmodified" to "dirty". This way, binding caches (that contain instances of ICachedActivationPlan) can check "their" version number against the current one, and only update their data (or take a lock on the kernel's data structures) when really necessary. The main point is to cache the activation plan for a binding, and let it contain pointers to the activation plans of other related bindings, such that searches in the internal data structures are avoided, as are coarse locks over the whole kernel / the whole scope / a certain binding, even when viewed from different scopes.

@BrunoJuchli Double null check:

```csharp
if (instance == null) {
    lock (this) {
        if (instance == null) {
            instance = new ...();
        }
    }
}
```
|
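As a rough sketch of the optimistic versioning described above (ICachedActivationPlan and the surrounding names are placeholders from this comment, not an existing Ninject API): the kernel bumps a counter whenever it is modified, and a cached plan is only reused while its recorded version still matches.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Placeholder for whatever a precomputed activation plan would contain.
public interface ICachedActivationPlan
{
    object Activate();
}

public sealed class PlanCache
{
    private long kernelVersion; // incremented on every Bind/Unbind

    private readonly ConcurrentDictionary<Type, (long Version, ICachedActivationPlan Plan)> cache =
        new ConcurrentDictionary<Type, (long, ICachedActivationPlan)>();

    public void OnKernelModified() => Interlocked.Increment(ref kernelVersion);

    public ICachedActivationPlan GetOrBuild(Type service, Func<Type, ICachedActivationPlan> build)
    {
        long version = Interlocked.Read(ref kernelVersion);

        // Fast path: the cached plan was built against the current kernel version.
        if (cache.TryGetValue(service, out var entry) && entry.Version == version)
            return entry.Plan;

        // Slow path: (re)build the plan and remember the version it was built for.
        var plan = build(service);
        cache[service] = (version, plan);
        return plan;
    }
}
```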
One idea for realizing snapshots would be to use System.Collections.Immutable (https://www.nuget.org/packages/System.Collections.Immutable/), smartly combining immutable collections and their builders, so that locks can be completely avoided in the fast path. The idea would be that, at the start of the resolution operation, the current state of the kernel's data structures is stored in the IRequest (or IContext, maybe) and passed down to child resolution operations. Modifications of the kernel are done by swapping out the pointer to the data structure using an atomic operation (note that loading and storing a variable that holds a reference to a reference type is always done atomically according to the CLR specification, ECMA-335: http://www.ecma-international.org/publications/standards/Ecma-335.htm; object references are never torn). Possibly problematic is the scenario of concurrent modifications, but this could be avoided by taking a lock only for threads wishing to modify the IKernel. The other problematic area is the resolution cache; see my comment above for more information. |
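A minimal sketch of that approach, assuming the System.Collections.Immutable package is referenced (the class and member names are illustrative, not Ninject's API): readers capture one snapshot per resolution and never lock, while writers serialize among themselves and publish a new snapshot with a single reference assignment.

```csharp
using System;
using System.Collections.Immutable;

public sealed class BindingTable
{
    private readonly object writeLock = new object();

    // Readers only ever see a fully built dictionary; swapping the reference is atomic.
    private volatile ImmutableDictionary<Type, ImmutableList<Type>> bindings =
        ImmutableDictionary<Type, ImmutableList<Type>>.Empty;

    // Captured once at the start of a resolution (e.g. into the IRequest/IContext) and
    // passed down, so nested resolves see a consistent view even if Bind runs concurrently.
    public ImmutableDictionary<Type, ImmutableList<Type>> Snapshot => bindings;

    public void Add(Type service, Type implementation)
    {
        lock (writeLock) // only writers contend; readers never block
        {
            var list = bindings.TryGetValue(service, out var existing)
                ? existing.Add(implementation)
                : ImmutableList.Create(implementation);

            bindings = bindings.SetItem(service, list); // atomic publish of the new snapshot
        }
    }
}
```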
What about conditional bindings? |
Interesting read: http://ayende.com/blog/164739/immutable-collections-performance |
Sadly enough, I wasn't able to spend as much time on this topic as I'd wished. The good side, however, is that it allowed me some more time to think things over. So the following section contains some more or less coherent ramblings about the topic; feel free to jump to the next section to read my (preliminary) conclusions! Regarding requirements, I've come to the conclusion that an interface that replaces the kernel on every binding change
would be very cumbersome to maintain, because the application would need to update all references to the kernel. So I think the kernel should not change; rather, the binding information it uses should be changed. An interface
which doesn't replace the kernel would still be an option. There'd be two options:
In the end, I think this is a rather expensive solution to implement, because binding retrieval would need to support the "uncompacted" as well as the "compacted" state, except of course if we create a fully transactional system where only the "committed" bindings can be used. This poses another issue with the current Ninject interface:
This can only work if the kernel knows when a binding definition is complete.
There's actually no clear definition of when a binding is complete!
Conclusion:
Separating the "kernel usage" phase from the "kernel build-up" phase would be beneficial because:
How about something like:
usage:
I think this way we can keep on supporting the same scenarios the old kernel supported but introduce better performance without making the new implementation overly complex. What do you guys think? |
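Purely as an illustration of the build-up/usage split being proposed (all names are hypothetical and not part of any real Ninject API), a two-phase design could look roughly like this:

```csharp
using System;

// Hypothetical two-phase API: bindings are collected first, then frozen into a
// kernel that can resolve concurrently without taking locks.
public interface IBindingCollection
{
    void Bind<TService, TImplementation>() where TImplementation : TService;

    // Freezes the collection; bindings added afterwards would throw or be ignored.
    IResolutionKernel BuildReadOnlyKernel();
}

public interface IResolutionKernel
{
    T Get<T>();
}

// Build-up phase (single-threaded, at application start):
//   var config = new BindingCollection();                 // hypothetical implementation
//   config.Bind<ICache, MemoryCache>();
//   IResolutionKernel kernel = config.BuildReadOnlyKernel();
//
// Usage phase (concurrent, read-only):
//   var cache = kernel.Get<ICache>();
```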
still thinking... |
Hi folks, I'd propose to back out ReadOnlyKernel as it still has a long way to go. |
@scott-xu Regarding performance optimization, I believe it's highly likely that an "optimal" solution will also require some interface changes, which means a new major release. The question is whether the platform change would make sense as 4.0.0 "politically", and then performance optimizations = 5.0.0. |
Would it maybe make more sense to do a re-write from scratch? |
@BrunoJuchli There's no breaking API change. According to Semantic Versioning, there's no need to bump the major version. |
Has anyone found a registration/dependency pattern with Ninject that's performant for resolving conditional dependencies? We have interfaces with multiple implementations, where the specific implementation resolved depends on things like configuration values that can change at any time, or on parameters in the web request. Most of these are registered ToMethod with a delegate that checks a condition and resolves a specific class with IContext.Kernel.Get. Many of these are also registered InScope to the current thread or HTTP context object. Resolving these dependencies consumes a lot of time and literally costs us real money in increased compute resource needs. Our main application serves 20M-30M requests per day, and over the years I've profiled multiple pieces of functionality in our app where Ninject resolution consumes up to 50% of CPU time. Here's an example of the types of registrations that resolve especially slowly:

```csharp
builder.Kernel.Bind<CacheDiagnostics>().ToSelf().InRequestOrTransientScope();
builder.Kernel.Bind<NullCacheDiagnostics>().ToSelf().InSingletonScope();

builder.Kernel.Bind<Func<ICacheDiagnostics>>()
    .ToMethod(c => {
        var ctx = c;
        return () => CacheDiagModeFromHttpContext()
            ? ctx.Kernel.Get<CacheDiagnostics>()
            : (ICacheDiagnostics)ctx.Kernel.Get<NullCacheDiagnostics>();
    })
    .InSingletonScope();

builder.Kernel.Bind<ICache>().ToMethod(ctx => {
    switch (CacheTypeFromHttpContext()) {
        case "redis": return ctx.Kernel.Get<IRedisCache>();
        case "memory": return ctx.Kernel.Get<IMemoryCache>();
        default: return ctx.Kernel.Get<MultiLayerCache>();
    }
}).InRequestOrTransientScope();
```
|
Not to derail this project, but I ended up replacing Ninject with SimpleInjector and DI stopped showing up in profiling sessions; I'd recommend it. |
Same here ... |
We've made great advancements in the 4.0 release cycle. Please submit a new issue if you have specific areas that need to be improved. |
Fantastic! Please publish 4.0 to NuGet and submit it here (https://danielpalme.github.io/IocPerformance/) so we can see the improvements. |
Today, in August 2020, I think we can safely assume that ninject as a whole is no longer relevant. |
I wouldn't say that. The project is amazing and I used it for several years. I would say that currently there are better and easier to use tools in the .NET world. As a whole, the project still works if anyone wanted to use it. |
I spent quite a bit of time optimizing (see #276) and refactoring Ninject more than a year ago. Back then I was the only active contributor who had spare cycles, and I really needed someone to have (design) discussions with. If another experienced dev (@scott-xu, I'm looking at you :p) is available and there's enough interest from the community, we can surely revive the effort. I'm OK with a complete reboot. |
There hasn't been a commit in 6 months. There hasn't been a release in 3 years. Even though Ninject is in daily use in a lot of software (including the project I work on), I think we can agree it's "done". |
I've seen several comments from people that Ninject is extremely slow compared to other containers. There have also been several benchmarks that seem to support this. For instance:
http://www.palmmedia.de/Blog/2011/8/30/ioc-container-benchmark-performance-comparison
(note that the tests have been updated to include Ninject 3.0.1.10)
Is there a problem with the way that Ninject is being benchmarked? Or is it really that much slower than everything else? Is there a good reason we should accept this performance? Can we address these issues? Are there workarounds?
Many of us love Ninject, but I will soon be working in a high-performance, large-transaction environment, and if those tests are accurate I may have to go with something else, unless there are ways to mitigate the problems.
I'd like to see these performance issues fixed.