
Add Pomelo to the TechEmpower Framework Benchmarks #1408

Closed
lauxjpn opened this issue Apr 26, 2021 · 53 comments

Comments

@lauxjpn
Collaborator

lauxjpn commented Apr 26, 2021

Similar to Npgsql, we should add a Pomelo EF Core scenario as well, so we can track where we stand.
MySqlConnector is already being used for the RAW ADO.NET scenarios targeting MySQL.

@lauxjpn lauxjpn added this to the 6.0.0 milestone Apr 26, 2021
@lauxjpn lauxjpn self-assigned this May 3, 2021
@lauxjpn
Collaborator Author

lauxjpn commented May 3, 2021

The necessary PR has been added to the TFB repo as TechEmpower/FrameworkBenchmarks#6565 (currently still targeting EF Core 5.0).

According to a test warning, we currently seem to generate three queries per update instead of two, so we should look into that.

@lauxjpn
Collaborator Author

lauxjpn commented May 3, 2021

@roji Is there anything for us to do in regards to aspnet/Benchmarks?

@roji

roji commented May 3, 2021

@lauxjpn great to see this!

Re aspnet/Benchmarks, I don't think we run any MySQL benchmarks there at the moment - are you looking for anything in particular from there?... Generally only packages from nuget.org can be consumed, so this wouldn't give you continuous perf info at the moment...

@sebastienros any thoughts?

/cc @ajcvickers

@lauxjpn
Collaborator Author

lauxjpn commented May 3, 2021

I don't think we run any MySQL benchmarks there at the moment - are you looking for anything in particular from there?

@roji Nah, just saw you were recently active on there (and it said they have some TFBs over there), so I thought before I spend time looking into it myself, I'd just ask the "perf guy" instead. If there is nothing for us to do there, I will just start looking at some optimization potential for us.

BTW, are providers supposed to (or should feel encouraged to) implement the new EF Core benchmark projects from the EF Core repo?

@roji

roji commented May 4, 2021

BTW, are providers supposed to (or should feel encouraged to) implement the new EF Core benchmark projects from the EF Core repo?

Sure, these can be helpful as a way of measuring the perf impact before and after a change. We also run these continuously on Sqlite as a way of detecting regressions and seeing the overall perf trend. You could in theory do the same for MySQL, though you'd need to store data historically, plot it, etc. etc....

Note also that I plan to take a close look at exactly what we're benchmarking and how, so these benchmarks may change significantly...

@lauxjpn
Collaborator Author

lauxjpn commented May 4, 2021

@bgrainger I started by doing some ORM-less benchmarking (so raw ADO.NET), to see where we currently stand at the ADO.NET and database server level.

MySqlConnector (asp.net core [platform, my]) is nearly 4x as performant as Oracle's Connector/NET (asp.net core [platform, my, oracle]), both running against MySQL, but only half as performant as PostgreSQL (asp.net core [platform, pg]):

https://www.techempower.com/benchmarks/#section=test&shareid=2a313a96-363c-4863-a366-88ca68707848

(Run in the Vagrant/VirtualBox VM that comes with TFB on my decent desktop system.)

@mguinness
Collaborator

Interesting to see the head-to-head comparisons. Not surprised about the Oracle connector (being synchronous), but I always thought that MySQL had a slight edge over Postgres on read-heavy operations. Though that has probably changed with the latest releases and configurations of both database systems. Hope to learn more about this, but I guess for the average application the performance difference would be negligible.

@lauxjpn
Collaborator Author

lauxjpn commented May 4, 2021

@mguinness Yes, in the past, MySQL with MyISAM was faster than PostgreSQL, but they did some serious tuning over the years.

I will add a MariaDB benchmark next, which I expect to perform significantly better than MySQL.

I should also mention that I ran this with all the already configured settings, which I suppose are optimized for the official run. I will ramp up the Vagrant VM to use 6 CPU cores and 20 GB of RAM of my host system for my next local run, to close the hardware gap a bit (the previous local run used the default values of 2 CPU cores and 3 GB of RAM).

@bgrainger
Collaborator

bgrainger commented May 4, 2021

but only half as performant as PostgreSQL

The top MySQL entrant at https://www.techempower.com/benchmarks/#section=data-r20&hw=ph&test=fortune comes in at 44th place with 41.1% the speed of the top PG entrant.

I've always just assumed that PG is either just faster than MySQL, or much more suited to the TFB benchmarks, and speed parity should not be expected.

@lauxjpn
Collaborator Author

lauxjpn commented May 4, 2021

So here are some updated results including MariaDB, run on the beefed-up VM:

https://www.techempower.com/benchmarks/#section=test&shareid=089bcffa-dccd-4eac-8719-9d4848718e3d

The performance benefit of MariaDB over MySQL is actually pretty small.

Looking at the Fortunes benchmarks, it can also be seen that with 3x the CPU count and 6.66x the amount of RAM, the Connector/NET throughput only doubles, the MySqlConnector throughput triples (so it appears to grow linearly with the number of CPU cores), and PostgreSQL even widens the gap to MariaDB and MySQL by a couple of percent.

@bgrainger Great job making MySqlConnector achieve 5x the throughput of Oracle's Connector/NET.

I will use that as a baseline for now and take a look at the micro and full ORM benchmarks next.

(Though I think I have to repeat this run, because I forgot to include the explicit PostgreSQL update benchmarks, which use settings distinct from the rest of the benchmarks.) [Updated]

@lauxjpn
Collaborator Author

lauxjpn commented May 5, 2021

I reported a concurrency bug for MariaDB and MySQL that can result in an imprecise count of updated rows under heavy load/high concurrency (in our case, the 20-update test with 512 concurrent operations).

@lauxjpn
Collaborator Author

lauxjpn commented May 5, 2021

We should run some non-concurrent operation benchmarks to get an idea of how much of the gap between MySQL and Postgres is just the engine (and communication) itself, and how much can be attributed to concurrency handling/locking/resource management.

@bgrainger
Collaborator

I know aspcore-mvc-ado-my scores extremely poorly for "Data Updates" (https://www.techempower.com/benchmarks/#section=data-r20&hw=ph&test=update) but haven't ever looked into where the primary source of the slowdown is. (But, in general, all MySQL entrants on that benchmark are very slow.)

@lauxjpn
Collaborator Author

lauxjpn commented May 5, 2021

I ran some quick benchmarks (using BenchmarkDotNet) for non-concurrent SELECT operations against MySQL and PostgreSQL on my development laptop (a couple-of-years-old machine).

Independent of the number of SELECTs per benchmark run (1, 5, 10; using the same connection) and the number of rows per SELECT (1, 20, 40, 80), MySQL 8.0.21 (using MySqlConnector, utf8mb4) is about 25% slower than Postgres (using Npgsql, UTF8) on my unoptimized local database server instances. (I can post the actual data if anybody is interested.)

Since the SELECT TFBs for MySQL achieve only about 25% to 35% of the throughput of Postgres in their (concurrent) scenarios, but MySQL is only about 25% slower than Postgres when accessed synchronously, about 35% to 50% could be subtracted from the 75% to 65% difference, and the remainder could roughly be attributed to concurrency/locking/waiting etc. (that is not mathematically rigorous, but I think good enough for a rough estimate here). Since these are only read operations, there should not be much locking necessary, but I don't know how optimized MySQL/InnoDB is for this case.
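
For reference, the benchmark was structured roughly like the following sketch (the table name, connection string and query shape are placeholders, not the exact code I ran):

using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using MySqlConnector;

public class NonConcurrentSelectBenchmarks
{
    [Params(1, 5, 10)]
    public int SelectsPerRun { get; set; }

    [Params(1, 20, 40, 80)]
    public int RowsPerSelect { get; set; }

    private MySqlConnection _connection = null!;

    [GlobalSetup]
    public void Setup()
    {
        // Placeholder connection string; the real runs used my local, unoptimized server instances.
        _connection = new MySqlConnection("Server=localhost;Database=benchmarkdb;Uid=benchmark;Pwd=benchmark");
        _connection.Open();
    }

    [GlobalCleanup]
    public void Cleanup()
        => _connection.Dispose();

    [Benchmark]
    public int SynchronousSelects()
    {
        var rows = 0;
        for (var i = 0; i < SelectsPerRun; i++)
        {
            // Same connection for all SELECTs of a run, no concurrency involved.
            using var command = new MySqlCommand($"SELECT * FROM `IceCreams` LIMIT {RowsPerSelect};", _connection);
            using var reader = command.ExecuteReader();
            while (reader.Read())
                rows++;
        }

        return rows;
    }
}

public static class Program
{
    public static void Main()
        => BenchmarkRunner.Run<NonConcurrentSelectBenchmarks>();
}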

@lauxjpn
Collaborator Author

lauxjpn commented May 5, 2021

I know aspcore-mvc-ado-my scores extremely poorly for "Data Updates"

@bgrainger Yeah, that is indeed odd. I'll take a look at that over the next few days.

@lauxjpn
Collaborator Author

lauxjpn commented May 10, 2021

@bgrainger Short update. I did some local benchmarking and optimizing over the last couple of days, and I was able to double the (RAW) throughput of the UPDATE statements against MySQL.

I'll link the benchmarks tomorrow.

@lauxjpn
Collaborator Author

lauxjpn commented May 12, 2021

Here are the locally run benchmarks:

Both are RAW ADO.NET runs.

The main improvement for the optimized run was gained by using innodb_flush_log_at_trx_commit = 0 instead of innodb_flush_log_at_trx_commit = 2 in the my.cnf file. In addition to the settings used in the benchmark here, there also seem to be slight improvements from using innodb_flush_method = O_DIRECT_NO_FSYNC over innodb_flush_method = O_DIRECT.
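
For reference, the relevant my.cnf lines look roughly like this (only a sketch of the settings discussed above, not the full config file):

# InnoDB flush settings for the optimized local run (trades durability for throughput)
innodb_flush_log_at_trx_commit = 0       # previously 2; the redo log is flushed to disk only about once per second
innodb_flush_method = O_DIRECT_NO_FSYNC  # not part of the runs above; seems to give a slight additional improvement over O_DIRECT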

@roji

roji commented May 12, 2021

@lauxjpn does this make the database commit less reliable (e.g. in case the server crashes)? I wonder if there are TechEmpower restrictions around this kind of tweak - ideally the different databases would be configured similarly, so as not to compare apples to oranges...

@lauxjpn
Collaborator Author

lauxjpn commented May 12, 2021

@lauxjpn does this make the database commit less reliable (e.g. in case the server crashes)? I wonder if there are TechEmpower restrictions around this kind of tweak - ideally the different databases would be configured similarly, so as not to compare apples to oranges...

@roji Yes, it makes commits less reliable than the current default settings of MySQL.

However, I explicitly checked the TFB requirements for ACID compliance or any restrictions regarding the transaction isolation level and did not find any current restrictions (which kind of makes sense, as there are databases out there that might be unable to comply otherwise).

I think as the bare minimum, this makes sense for the platform tests, which should be as fast as humanly possible.

But even in realistic production scenarios, this is a viable option. If my application is update-heavy, needs to perform as fast as possible, and I can live with losing up to the last second of transactions in case of a crash, then this is a good way to go.

@roji

roji commented May 12, 2021

@lauxjpn I'm not really questioning whether this MySQL config makes sense in a real-world application or not - my only concern is for different databases in TE to be configured with the same (or similar) levels of reliability, otherwise we're not comparing apples to apples any more...

@lauxjpn
Collaborator Author

lauxjpn commented May 12, 2021

@roji I think it's generally a good idea to define some basic capabilities that databases have to comply with, so they can be part of the TFBs. For example, they have to be fully ACID compliant and have to use transactions with a given isolation level for such and such benchmark types. This will however close the door for some database server implementations out there (which might be fine).

Whether these make sense for the platform benchmarks might be debatable and depends on their intention:

Are they primarily about speed/throughput? Then speed/throughput should be all that matters (and common abstractions should be introduced with care, to keep them as fast as humanly possible), and not being ACID compliant is fine. They can still be compared to other platform benchmarks.

Or are they primarily about demonstrating the overhead of common approaches (like MW and MVC) on that platform (in conjunction with those benchmarks)? In that case, all benchmarks (including the platform ones) should have to comply with the same ruleset (e.g. ACID compliance and a READ COMMITTED transaction isolation level). But then, to make the platform benchmarks comparable to their e.g. middleware and MVC counterparts, they should use similar abstractions in their benchmarks when it comes to non-middleware and non-MVC related constructs, or we are again comparing apples to oranges in regards to framework overhead.

@roji

roji commented May 12, 2021

@lauxjpn clearly, the TechEmpower benchmarks are about comparing different web frameworks, languages, OSs, database etc. - that's the point of having a big table showing how much each one scored.

Now, at a very pedantic level, nothing can be compared to anything else - all web frameworks are different from each other, as are all databases and all OSs; no two apples are actually comparable in this world. But that's not very helpful to anyone. On a more pragmatic level, TechEmpower establishes some general, imprecise benchmark categories we know matter - like which database is used, and whether an ORM is being used or not (although what exactly an ORM is can be debated endlessly).

Similarly, we all know that PostgreSQL and MySQL are roughly similar, at least in terms of the Fortunes query. If your specific benchmark for MySQL on .NET tweaks the database to not fsync to disk, and all others don't, then .NET will look artificially faster than other platforms, and MySQL will look artificially faster than PostgreSQL. Now, if this is just an exploratory test you're doing to see what happens to your perf when you spend less time in the database (since no fsync), maybe that's interesting, I don't know. But I do think it would be problematic to merge that without checking what other benchmarks do and/or having a conversation about it with the TechEmpower people...

@mguinness
Collaborator

From Framework Benchmarks Round 17

We discussed the pros and cons of permitting this optimization in our tests. While we have disallowed several performance-optimizing tactics elsewhere as violations of the rules or spirit of our tests, we ultimately arrived at permitting this optimization for two reasons. First, it can be applied "seamlessly" to application code. The application code does not need to be aware this is happening. And second, because this isn't a trick applied at the application tier, but rather an optimization that potentially benefits all applications (once implemented by drivers), we feel this is precisely the sort of improvement this project should encourage.

@bhauer Do you have any comments on this discussion?

@lauxjpn
Collaborator Author

lauxjpn commented May 12, 2021

@mguinness Very interesting. I was wondering about that and even added a work item to my Pomelo-related TODO list to dig into the PostgreSQL code and research where the massive performance differences come from.

/cc @bgrainger

First, it can be applied "seamlessly" to application code. The application code does not need to be aware this is happening.
And second, because this isn't a trick applied at the application tier, but rather an optimization that potentially benefits all applications (once implemented by drivers), we feel this is precisely the sort of improvement this project should encourage.

This is the case for our optimization as well. It is configured for the database server in its config file. It does not depend on the application layer and is an optimization specifically of interest for high-throughput scenarios where you are willing to trade off up to 1 sec. of transaction persistence in case of a crash for a 100% gain in throughput.

I do think it would be problematic to merge that without checking what other benchmarks do [...]

(My response is pretty obvious here. No need actually to write it out.)

@bgrainger
Collaborator

IMO my.cnf should be consistent across all entrants in the TFB (that use MySQL); no individual benchmark should be configuring server parameters to get better throughput. I didn't actually see where @lauxjpn was changing this; given that my.cnf appears to be in a shared location (https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/toolset/databases/mysql/my.cnf) I wouldn't think it could be changed just for .NET.

If @roji is questioning whether MySQL should be configured differently from PG in TFB, I don't have a strong opinion. I personally wouldn't use the benchmarks to pick between the two backends; I think there's a huge amount of tweaking and customising for your specific application's needs that can go into running MySQL/PG in production and it's unrealistic to expect TFB to encompass that; it's just providing a consistent baseline (good or bad) for applications to code against.

@lauxjpn
Collaborator Author

lauxjpn commented May 13, 2021

@bgrainger Yes, that seems similar to MySQL's prepared statements. I did test them extensively over the last weekend for the different types, but ended up keeping the current implementations, as the different variations that I tried did not improve the general performance (a couple of variations are probably still in some feature branches on my fork).

While MySQL got rid of its query cache in MySQL 8, it does still exist in MariaDB (though disabled by default). If I understood it correctly (not tested yet), it only caches full statements (including their parameter values), but I could be wrong about that. I wanted to establish the MySQL raw tests as a baseline first, so that we can measure MariaDB's performance against it.

@roji

roji commented May 13, 2021

First, it can be applied "seamlessly" to application code. The application code does not need to be aware this is happening. And second, because this isn't a trick applied at the application tier, but rather an optimization that potentially benefits all applications (once implemented by drivers), we feel this is precisely the sort of improvement this project should encourage.

This is the case for our optimization as well. It is configured for the database server in its config file. It does not depend on the application layer and is an optimization specifically of interest for high-throughput scenarios where you are willing to trade off up to 1 sec. of transaction persistence in case of a crash for a 100% gain in throughput.

For my 2 cents - the protocol pipelining feature @mguinness referred to above is very different from the MySQL config tweak we're discussing. Pipelining has to do with how the driver communicates with the server; it is a feature which, if done right, transparently increases perf and has no drawbacks (such as reduced reliability). Npgsql's multiplexing feature was partially inspired by this, and led to some important perf improvements - so the driver is better because of it.

On the other hand, tweaking MySQL to not flush on commit, while "seamless/transparent", is just a config tweak that trades perf for reliability. The fact is that this mode isn't enabled by default in MySQL (since it's reasonable to assume reliability is more important), and users can opt into it. At the end of the day, I simply don't see the point of enabling this - yes, it would increase the MySQL results in TE, but to what end? Absolute RPS will be higher, but no component in the stack has actually been improved.

For completeness, here's what I think is the corresponding config flag in PG. The docs mention the following:

While turning off fsync is often a performance benefit, this can result in unrecoverable data corruption in the event of a power failure or system crash. Thus it is only advisable to turn off fsync if you can easily recreate your entire database from external data.

Assuming the same applies to MySQL, well...

@lauxjpn
Collaborator Author

lauxjpn commented May 13, 2021

Yes, it makes commits less reliable than the current default settings of MySQL. [...] I think as the bare minimum, this makes sense for the platform tests, which should be as fast as humanly possible.

@roji As already stated above, I would currently at least strongly consider using this configuration tweak for the platform benchmarks, which, as you stated before, are not supposed to be realistic, but rather the fastest implementation that could possibly work:

I know it sounds a bit extreme, but the platform benchmarks are extreme (e.g. see the totally stripped-down ASP.NET, something sane applications generally won't do).

So delaying transaction persistence by up to 1 sec. (which is what the configuration tweak does) seems to be exactly in the spirit of the platform benchmarks.


I think it's generally a good idea to define some basic capabilities that databases have to comply with, so they can be part of the TFBs. For example, they have to be fully ACID compliant and have to use transactions with a given isolation level for such and such benchmark types. This will however close the door for some database server implementations out there (which might be fine).

As already stated above, I am fine with adding some restrictions on databases and how they need to be used. But as long as there are none, I don't see a problem with optimizing the platform benchmarks for MySQL.

@roji

roji commented May 13, 2021

Yes, it makes commits less reliable than the current default settings of MySQL. [...] I think as the bare minimum, this makes sense for the platform tests, which should be as fast as humanly possible.

@roji As already stated above, I would currently at least strongly consider using this configuration tweak for the platform benchmarks, which, as you stated before, are not supposed to be realistic, but rather the fastest implementation that could possibly work:

I disagree... It's very different to write benchmark code in a very optimized way, excluding various ASP.NET machinery (which is what the platform benchmarks do), and to turn off fsync at the database. Once again, I don't see what we would learn from this - we would just have higher numbers (and so would everyone else, so the relative position most probably wouldn't change). Stripping out the ASP.NET machinery - as the platform benchmarks do - teaches us something about the overhead of those components, and guides ASP.NET work for better perf.

When I wrote that we want the fastest implementation for the platform benchmarks, the context was to avoid slowing them down with the changes you were proposing in TechEmpower/FrameworkBenchmarks#6565 (in order to support running against MySQL). That would have made .NET perf drop relative to everyone else, just in order to make the benchmark ADO.NET-generic. Disabling database fsync is very different.

Finally, I think disabling fsync only for platform benchmarks is even more problematic, since it artificially inflates the difference between platform and non-platform. The difference would no longer express the impact of ASP.NET machinery, it would also express MySQL fsync overhead. In any case, unless I'm mistaken, we can't have one MySQL setup for platform and another for non-platform - there's just one MySQL docker image for everyone, no?

@lauxjpn
Collaborator Author

lauxjpn commented May 13, 2021

In any case, unless I'm mistaken, we can't have one MySQL setup for platform and another for non-platform - there's just one MySQL docker image for everyone, no?

The setting is dynamic, so it can be enabled/disabled at will (though only at the MySQL global level).


Finally, I think disabling fsync only for platform benchmarks is even more problematic, since it artificially inflates the difference between platform and non-platform. The difference would no longer express the impact of ASP.NET machinery [...]

That is exactly what I asked about earlier; it is the second scenario in the text quoted below:

Whether these make sense for the platform benchmarks might be debatable and depends on their intention:

Are they primarily about speed/throughput? Then speed/throughput should be all that matters (and common abstractions should be introduced with care, to keep them as fast as humanly possible), and not being ACID compliant is fine. They can still be compared to other platform benchmarks.

Or are they primarily about demonstrating the overhead of common approaches (like MW and MVC) on that platform (in conjunction with those benchmarks)? In that case, all benchmarks (including the platform ones) should have to comply with the same ruleset (e.g. ACID compliance and a READ COMMITTED transaction isolation level). But then, to make the platform benchmarks comparable to their e.g. middleware and MVC counterparts, they should use similar abstractions in their benchmarks when it comes to non-middleware and non-MVC related constructs, or we are again comparing apples to oranges in regards to framework overhead.

@roji These were not addressed at all when I posted them, but instead you posted a mostly unrelated general answer.

However, since we now know that the platform tests are about showing the overhead of component choices like middleware and MVC, we now agree that delaying transaction persistence should not be valid for all benchmarks, and we can move the discussion in that direction.

(I propose a dedicated issue on TFB for that, as this is not MySQL specific anymore from this point onwards.)

@lauxjpn
Collaborator Author

lauxjpn commented May 13, 2021

@bgrainger Now back to MySQL configuration optimizations in general, there are a couple of things regarding the config for the TFBs that we should check:

For example, innodb_buffer_pool_instances is currently set to 14 (the number of physical CPU cores of the machine), while each core can run 2 threads (so 28 in total). Increasing the number from 14 towards 28 is something that should be tested.

Also, the related innodb_buffer_pool_size is not being explicitly set, so this should be optimized in relation to the innodb_buffer_pool_instances number we end up with (e.g. a good starting point would be around 20 GB, which would then be divided into n pool instances, depending on the innodb_buffer_pool_instances setting).

In this context, we should also check the general innodb_buffer_pool_chunk_size (and that the total chunk number is reasonable).
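
As a rough sketch of the kind of my.cnf change I have in mind (the numbers are only the starting values mentioned above and would still need to be validated by benchmarking):

innodb_buffer_pool_size = 20G        # currently not set explicitly; ~20 GB as a starting point
innodb_buffer_pool_instances = 28    # currently 14 (physical cores); values up to 28 (logical cores) should be tested
# innodb_buffer_pool_chunk_size stays at its default for now; just verify that the resulting
# total chunk count (innodb_buffer_pool_size / innodb_buffer_pool_chunk_size) is reasonable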

@bhauer

bhauer commented May 13, 2021

@bhauer Do you have any comments on this discussion?

Thanks for pinging me, this is a very interesting discussion. @bgrainger and @roji have already expressed viewpoints that align with mine, so I don't have a whole lot to add. But to echo them:

The TechEmpower Framework Benchmarks project is intended to provide an objective measure of web application framework and platform performance. It's an imperfect proxy for the impossible task of measuring the performance of your real-world application as built on several dozen frameworks. We've aimed to provide what we call "high water marks" of performance, to which you can apply a rough coefficient to take a guess at your application's performance as realized on various frameworks and platforms.

We include database tests, not expressly to measure and compare the performance of various databases, but rather because web applications very commonly include the use of databases. We want to understand the overhead costs of having your web framework interact with a database (including cost areas such as query protocol traffic and object-relational mapping, where applicable), rather than the features and raw performance of the database platforms themselves.

You might ask then, why not have the database on a ramdisk or set up the database configuration to completely minimize the cost of everything outside those parts we want to measure? (E.g., avoid the cost of actually writing changes to disk.) We have debated this and revisited our thoughts here a few times in the project's history. What we often return to is our goal to have our configuration and deployment mimic mainstream production-grade deployments. Although "production-grade" is subject to debate, in the case of database configuration it means I lean toward keeping the default file-system flushing behavior of MySQL.

As @bgrainger says, we'd prefer to have the my.cnf configuration as consistent as possible across all tests that use MySQL. Otherwise, the configuration becomes another permutation attribute or "dimension" of variability.

That said, if there are my.cnf configuration changes that can be made globally and increase the performance of our MySQL tests without appreciably moving the configuration away from being "production-grade," we'd like to consider them. It will be interesting to see what you find on that front, @lauxjpn. The way I see it, as long as those changes retain an overall production-grade configuration, diminishing the bottleneck of MySQL as seen from the perspective of the web application allows the frameworks to better distinguish their performance characteristics among one another.

Perhaps most importantly, we welcome feedback, counterpoints, and debate on these subjects. Ultimately, we have to make decisions that balance a bunch of different goals such as providing value to both application developers and framework maintainers. It's very helpful to have opinions from several experts!

@lauxjpn
Collaborator Author

lauxjpn commented May 14, 2021

[...] we now agree that delaying transaction persistence should not be valid for all benchmarks, and we can move the discussion in that direction.

(I propose a dedicated issue on TFB for that, as this is not MySQL specific anymore from this point onwards.)

I extracted what I think is the remaining point of the previous sub discussion (ACID compliance) and filed it as TechEmpower/FrameworkBenchmarks#6587.

@lauxjpn
Collaborator Author

lauxjpn commented May 18, 2021

@bgrainger I have a rough idea for a perf-related feature for MySqlConnector (or otherwise for Pomelo) in mind. Basically some kind of "auto prepare" that you can opt into.

The basic idea is to automatically prepare a statement once a certain number of queries with the same text have been executed. MySqlConnector would then execute the prepared statement for the current query and future ones (as long as they are the same), until e.g. an auto-prepare-specific timeout is triggered, which would then free the resources.

This would need to be accompanied by a couple of options to specify the threshold, possibly the cache size, the number of past queries to track, and the timeout.

This could also be quite helpful to Pomelo, so we don't necessarily have to implement our own prepare-specific logic to profit from its performance gains.

(Though it might be possible on the Pomelo level to cache prepared command objects (since MySqlConnector couples prepared statements to commands) and reuse them later for similar ones, effectively implementing the feature there.)

What are your initial thoughts on this?

@roji

roji commented May 18, 2021

In case you guys are interested in Npgsql automatic preparation: docs, blog post. I'd obviously be happy to discuss details if you want.

@bgrainger
Collaborator

I think that's been suggested (in passing) before, but I can't find a relevant issue right now.

I know npgsql supports this: https://www.npgsql.org/doc/prepare.html#automatic-preparation. Its connection string options, API, and internal implementation would be good reference material when designing/implementing this.
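
If I remember the docs correctly, it boils down to two connection string settings, roughly like this (values illustrative):

Host=localhost;Database=fortunes;Username=benchmark;Password=benchmark;Max Auto Prepare=20;Auto Prepare Min Usages=5

i.e. a statement gets auto-prepared after it has been seen 5 times, and at most 20 statements are kept prepared per connection.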

In MySQL (don't know about PG), prepared statements are per-connection. We would have to decide if statement execution is tracked at the "connection pool" level (which would imply contention over a shared data structure tracking execution counts), or just per connection (simpler, but would take much longer to "kick in"). Either way, some kind of LRU cache would be necessary to avoid caching every single SQL statement that's executed on a session / connection pool.

@lauxjpn
Collaborator Author

lauxjpn commented May 18, 2021

We would have to decide if statement execution is tracked at the "connection pool" level (which would imply contention over a shared data structure tracking execution counts), or just per connection (simpler, but would take much longer to "kick in").

Good point! The connection pool level would be preferable of course.

[...] which would imply contention over a shared data structure tracking execution counts [...]

Yeah, that's the main reason I think this should probably be opt-in.
(Locking would probably not be that much of an issue with some concurrent collections etc.).

@roji

roji commented May 18, 2021

In PostgreSQL, prepared statements are also per-connection. Npgsql tracks prepared statements on each physical connection - what kind of pool-level tracking are you guys thinking about, given that at the database level prepared statements are per connection?

Besides that, yeah, LRU mechanics at each connection level. The theory behind it is that an application will pretty quickly reach stable state, where all connections have the same set of hot queries auto-prepared etc.

@lauxjpn
Collaborator Author

lauxjpn commented May 18, 2021

Hmm, that's right. If it is a per-connection feature, it doesn't make sense to track it at the pool level, since we cannot make an informed decision about which available connection might be preferable to a caller, as we cannot know which queries will be executed later that might already be prepared on a pooled and available connection.

We could at least optimize it so that the least recently used connection is reused first (if this isn't already the current implementation). (@roji Is that what you mean by LRU at the connection level, or is that only meant in regards to the prepared query cache?)

I guess then that prepared statement resources would only be released when the connection is actually released, instead of at some timeout before that?

@bgrainger
Collaborator

Npgsql tracks prepared statements on each physical connection - what kind of pool-level tracking are you guys thinking about, given that at the database level prepared statements are per connection?

I'm considering the case where the user says "auto-prepare SQL if it's executed 5 times". One interpretation of that request is that new MySqlConnection(constantConnectionString).Execute(constantSql) would start preparing (on each physical session) after it's executed 5 times, even if those executions took place on 5 separate physical connections.

The other interpretation (which I think npgsql uses, and which is simpler, but may not match user expectations?) is that a SQL statement is only auto-prepared when it's been executed 5 times on the same physical connection. In a "busy" connection pool, that might mean it's been executed 50+ times on different physical connections before it starts getting prepared. Maybe for frequently-executed SQL statements, that difference isn't material over the average lifetime of a typical physical session? But if you had a low wait_timeout value and were closing idle connections aggressively, you could end up with a situation in which auto-preparation happens very seldom if ever. OTOH, if the physical connection only executes that SQL once or twice (because it gets closed due to idleness), preparing the SQL could be entirely extra overhead that's not worth it.

I think I'm talking myself into tracking SQL execution per-physical-connection, which sounds like a simpler problem to solve.

@roji

roji commented May 18, 2021

We could at least optimize it so that the least recently used connection is reused first (if this isn't already the current implementation). (@roji Is that what you mean by LRU at the connection level, or is that only meant in regards to the prepared query cache?)

Yeah, I only meant LRU in regards to (each connection's) prepared query cache. For a given connection, we track the currently auto-prepared statements, and each one has a LastUsed date. If a new SQL needs to be auto-prepared (because it has been executed enough times, passing the threshold), either we find a free slot (if we haven't reached the maximum number of auto-prepared for that connection), or we evict the statement with the oldest LastUsed. Fairly basic stuff.
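
To sketch the mechanics in C# (illustrative only - not Npgsql's actual internals; the type and member names are made up):

using System;
using System.Collections.Generic;

// Illustrative sketch of per-connection auto-prepare tracking with LRU eviction.
sealed class AutoPreparedStatement
{
    public string Sql = "";
    public DateTime LastUsed;
}

sealed class ConnectionAutoPrepareCache
{
    private readonly AutoPreparedStatement?[] _slots;           // max auto-prepared statements per connection
    private readonly Dictionary<string, int> _usageCounts = new(); // would also need bounding in a real implementation
    private readonly int _minUsages;                            // threshold before a statement gets prepared

    public ConnectionAutoPrepareCache(int maxAutoPrepared, int minUsages)
    {
        _slots = new AutoPreparedStatement?[maxAutoPrepared];
        _minUsages = minUsages;
    }

    // Called before executing a SQL string on this physical connection.
    // Returns the slot to execute as a prepared statement, or null to execute unprepared.
    public AutoPreparedStatement? GetOrPromote(string sql)
    {
        // Already auto-prepared on this connection? Just refresh LastUsed.
        foreach (var slot in _slots)
        {
            if (slot is not null && slot.Sql == sql)
            {
                slot.LastUsed = DateTime.UtcNow;
                return slot;
            }
        }

        // Not prepared yet: count executions until the threshold is passed.
        var count = _usageCounts[sql] = _usageCounts.GetValueOrDefault(sql) + 1;
        if (count < _minUsages)
            return null;

        // Threshold passed: take a free slot, or evict the statement with the oldest LastUsed.
        var victim = 0;
        for (var i = 0; i < _slots.Length; i++)
        {
            if (_slots[i] is null)
            {
                victim = i;
                break;
            }

            if (_slots[i]!.LastUsed < _slots[victim]!.LastUsed)
                victim = i;
        }

        // (A real driver would also close/unprepare the evicted statement on the server here.)
        return _slots[victim] = new AutoPreparedStatement { Sql = sql, LastUsed = DateTime.UtcNow };
    }
}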

Regarding which connection is returned from the pool - as you wrote, I don't think there's a way of knowing at that point what SQL will be executed.. I don't particularly see any utility in doing LRU over connections (it would probably increase complexity and maybe hurt perf)...

Irrelevant ramblings

(there is one interesting thought... in some DB APIs (non-.NET) you just ask the driver to execute a SQL, rather than open a connection and execute on that. In that world, it does make sense to have some sort of global tracking, and to internally select a physical connection on which the SQL has already been prepared... I have some wild thoughts in that direction, but it's probably not super relevant here...)

I guess then that prepared statement resources would only be released when the connection is actually released, instead of at some timeout before that?

Npgsql actually... never releases auto-prepared statements - until they get evicted by another statement via LRU. The reasoning is that if we release when a connection is closed (i.e. returned to the pool), then short-lived connection scenarios (i.e. most web apps) cannot benefit from it. So prepared statements are "persisted" across pooled open/close, with the assumption that the application tends to perform the same workload (I wrote a bit about it here). There's a custom UnprepareAll on NpgsqlConnection which clears everything if users need it.

@lauxjpn
Collaborator Author

lauxjpn commented May 18, 2021

Npgsql actually... never releases auto-prepared statements - until they get evicted by another statement via LRU. The reasoning is that if we release when a connection is closed (i.e. returned to the pool), then short-lived connection scenarios (i.e. most web apps) cannot benefit from it. So prepared statements are "persisted" across pooled open/close, with the assumption that the application tends to perform the same workload (I wrote a bit about it here).

Yeah, that's what I meant by releasing the "actual" connection (in contrast to returning it to the pool).

There's a custom UnprepareAll on NpgsqlConnection which clears everything if users need it.

Sounds good enough.

Irrelevant ramblings [...]

Could be an interesting scenario for MySqlDirect? (Though sounds much more complicated, when transactions come into play as well.)

@roji

roji commented May 18, 2021

when transactions come into play as well.

Yeah, any sort of connection state (transactions are the best example) makes this problematic... Typically that requires falling back to the more traditional pattern of getting a connection and executing on that instead. At that point we're back to what ADO.NET looks like today.

In any case, my general assumption is that the vast majority of apps won't really benefit from such centralized connection/prepared-statement tracking, since typically all connections are used to run all (hot) queries. So after an initial warmup, a stable state should be reached where everything is prepared everywhere anyway.

@lauxjpn
Collaborator Author

lauxjpn commented May 18, 2021

In any case, my general assumption is that the vast majority of apps won't really benefit from such centralized connection/prepared-statement tracking, since typically all connections are used to run all (hot) queries. So after an initial warmup, a stable state should be reached where everything is prepared everywhere anyway.

Makes sense. Is this opt-in or -out for Npgsql?

@mguinness
Collaborator

A little off topic but maybe pertinent to benchmarks, are Compiled queries in Entity Framework Core still a thing?

I know there are some limitations, but I haven't seen much written about it since it was announced in 2.0 release.

@lauxjpn
Collaborator Author

lauxjpn commented May 18, 2021

[...] are Compiled queries in Entity Framework Core still a thing?

Yes, they are used in the TFBs for example.

Though the automatic query cache of EF Core is pretty effective (enough) for most applications.
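
For illustration, a compiled query is essentially a cached delegate, roughly like the following sketch (the entity and context names are only illustrative, not the actual TFB code):

using System;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Fortune
{
    public int Id { get; set; }
    public string Message { get; set; } = "";
}

public class ApplicationDbContext : DbContext
{
    public DbSet<Fortune> Fortunes => Set<Fortune>();
}

public static class CompiledQueries
{
    // Compiled once and reused for every request; skips the expression tree processing
    // that otherwise happens before the query cache lookup.
    public static readonly Func<ApplicationDbContext, int, Fortune?> GetFortuneById =
        EF.CompileQuery((ApplicationDbContext db, int id) =>
            db.Fortunes.FirstOrDefault(f => f.Id == id));
}

// Usage: var fortune = CompiledQueries.GetFortuneById(dbContext, 42);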

@roji

roji commented May 18, 2021

Is this opt-in or -out for Npgsql?

It's opt-in. There's still also the possibility to explicitly prepare via DbCommand.Prepare (or use both mechanisms simultaneously)... Though since upper layers like Dapper/EF Core don't use (or expose) Prepare, it's kinda useless for most scenarios.

A little off topic but maybe pertinent to benchmarks, are Compiled queries in Entity Framework Core still a thing?

Though the automatic query cache of EF Core is pretty effective (enough) for most applications.

Yeah, they're a thing, though @lauxjpn is right that the benefit isn't huge. They allow jumping over the very first phase of the query pipeline, which normally runs before the query can be looked up in the query cache. The bigger the expression tree, the more this feature helps. I ran the numbers on TE Fortunes recently and got a 3.8% improvement by using compiled queries - though the Fortunes query is about the most minimal one could imagine (no query operators at all), so this is more of a "lower bound".

But note that compiled queries have nothing to do with ADO.NET command preparation, at least at the moment. There's dotnet/efcore#5459 for tracking this, but at this point I really think it makes more sense for ADO.NET providers to do auto-preparation instead.

@lauxjpn
Collaborator Author

lauxjpn commented Jun 3, 2021

@bgrainger Could MySqlConnector skip statement parsing if no parameters are used by the command (at least for single statements)?
According to my benchmarks, it would increase performance for those cases by about 10%.

@bgrainger
Collaborator

It would be a behavioural change. Currently SELECT @variable; is not allowed unless AllowUserVariables=true. But maybe a test like if (Command.Parameters.Count > 0 || CommandText.Contains('@') || CommandText.Contains('?')) { DoExpensiveParse } would work?

@lauxjpn
Collaborator Author

lauxjpn commented Jun 3, 2021

But maybe a test like if (Command.Parameters.Count > 0 || CommandText.Contains('@') || CommandText.Contains('?')) { DoExpensiveParse } would work?

Looks good to me. I'll do another benchmark later, to see whether it's worth it or not.

@lauxjpn
Collaborator Author

lauxjpn commented Jun 3, 2021

With the following code path and AllowUserVariables=true, 10_000 simple SELECT statements without parameters (SELECT count(*) FROM `IceCreams` where `IceCreamId` = 1) run in 3.451 sec instead of 4.165 sec, so about 20% faster:

internal sealed class SingleCommandPayloadCreator : ICommandPayloadCreator
{
    // ...

    private static bool WriteCommand(IMySqlCommand command, ByteBufferWriter writer)
    {
        // ...

        if (command.CommandText is not null)
        {
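            // Only pay for the expensive parse when user variables are disallowed (so "@" usage
            // must be validated), when parameters are present, or when the text contains a
            // potential parameter placeholder; otherwise the command text is written verbatim.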
            if (!(command.Connection?.AllowUserVariables ?? false) || command.RawParameters?.Count > 0 || command.CommandText.Contains("@") || command.CommandText.Contains("?"))
            {
                var preparer = new StatementPreparer(command.CommandText!, command.RawParameters, command.CreateStatementPreparerOptions());
                var isComplete = preparer.ParseAndBindParameters(writer);
                if (isComplete && (isSchemaOnly || isSingleRow))
                {
                    ReadOnlySpan<byte> clearSqlSelectLimit = new byte[] {10, 83, 69, 84, 32, 115, 113, 108, 95, 115, 101, 108, 101, 99, 116, 95, 108, 105, 109, 105, 116, 61, 100, 101, 102, 97, 117, 108, 116, 59}; // \nSET sql_select_limit=default;
                    writer.Write(clearSqlSelectLimit);
                }

                return isComplete;
            }

            writer.Write(command.CommandText);
        }

        return true;
    }

    // ...
}
