[10.x] Escaping functionality within the Grammar #46558
@taylorotwell This is an implementation of the approach you proposed for the initial database expressions feature, which I had wanted to postpone. The implementation is backwards compatible in the sense that any 3rd-party driver will still work without any code change (because no constructor is added). Those drivers will just not be compatible with the escaping database expressions until they add the
I'm out of my depth with database stuff, so take this with a grain of salt. I'm just looking at this from a security POV and hopefully it's constructive, but feel free to push back if you think I'm being too paranoid. 🙂 It worries me that it "should be theoretically safe", and although all you're doing is exposing a core PHP function, similar to other aspects of Laravel with some extra validation, it's specifically intended to perform a security function. Do you know how widely used the How much extra effort would it be to use a prepared statement instead?
That's why I asked you to comment it ;)
Yes, the "should be theoretically safe" bugs me too. The PHP devs never state why they say it is only theoretically safe. Before prepared statements had been available
Everyone is using prepared statements. And I only found some references:
A lot of stuff within the DB core has to be rewritten for bindings support with expressions. It's a very big effort.
Would there be any issue for a project with a custom connection and grammar (such as Oracle, MongoDB etc.), since we are planning to add this in a minor release?
No problem. When those new drivers don't call
@tpetry Looking at the PHP manual for the PDO quote function, it actually states that it can take 2 parameters: a string and an integer specifying what data type PDO is supposed to quote. Maybe we should consider implementing that, since I can see someone complaining in the future that they cannot escape binary data?
Do you have a use-case or reason for this? Binary data is still just a string with different content within. The And as @crynobone said, there are plans to support MongoDB. Which couldn't work with PDO escaping params either.
Some notes on why I would be reluctant to rely on
Not all drivers have it
The PHP
Not all drivers handle all strings
The sqlite driver apparently "doesn't handle binary strings" (see next). Binary strings may break in this
I'd say that for the purpose of these drivers binary data is a string that can contain a null byte. It's not uncommon (and occasionally it is even recommended) to store some values like UUIDs in binary. The PGSQL driver will truncate such strings:
Btw regarding charsets
Wouldn't this interfere with using different charsets for different columns?
They seem less battle tested
This is a bit more subjective, but to me quotes appear to be an afterthought on many drivers. I'd only really trust the mysql one.
@valorin https://bugs.php.net/bug.php?id=81740 (recently fixed) and https://bugs.php.net/bug.php?id=81053 might provide an insight on what kind of errors have arisen lately.
I haven't really tried as it only seemed really necessary in one place, but I doubt that it would be so bad. Many (all?) cases go through Besides, most of the query methods have their raw counterparts, so a lot of the cases could also be handled at that level in a manner like this:
    public function method($expr)
    {
        if ($expr instanceof Expression)
            return $this->methodRaw($expr->getSql(), $expr->getBindings());
        // the current code of `method`
    }
The main issue with such change would be adding something like
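To make the delegation idea above concrete, here is a minimal sketch. It is not Laravel code: `BoundExpression` and `ToyBuilder` are hypothetical stand-ins, and `getSql()` / `getBindings()` only mirror the method names used in the comment, since Laravel's own expression contract does not expose bindings.

```php
<?php

// Hypothetical expression that carries its own bindings (assumption,
// not part of Laravel's Expression contract).
final class BoundExpression
{
    public function __construct(
        private string $sql,
        private array $bindings = [],
    ) {}

    public function getSql(): string { return $this->sql; }
    public function getBindings(): array { return $this->bindings; }
}

// Toy builder that only illustrates the delegation; not Illuminate's Builder.
final class ToyBuilder
{
    public array $wheres = [];
    public array $bindings = [];

    public function where(string|BoundExpression $column, mixed $value = null): static
    {
        // Expressions are handed to the raw counterpart together with their bindings.
        if ($column instanceof BoundExpression) {
            return $this->whereRaw($column->getSql(), $column->getBindings());
        }

        // The "current code" of the method: a plain column/value comparison.
        $this->wheres[] = "{$column} = ?";
        $this->bindings[] = $value;

        return $this;
    }

    public function whereRaw(string $sql, array $bindings = []): static
    {
        $this->wheres[] = $sql;
        array_push($this->bindings, ...$bindings);

        return $this;
    }
}

$builder = (new ToyBuilder())
    ->where('status', 'active')
    ->where(new BoundExpression('lower(name) = ?', ['taylor']));
```

The sketch only covers a `where` clause. Whether the same pattern carries over to selects, group-bys and other places that accept expressions is exactly what is disputed in this thread.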
The drivers supported by Laravel all implement it. ODBC is not used by Laravel.
That's bad behaviour on their part. I've tested it and the statement is true. But it is no big deal in my opinion. The PostgreSQL and SQLite grammars can just split any string at the zero byte, quote the parts independently and join them again. In reality it doesn't even have to join them; it can generate a concatenated value:
The connection charset and column charsets are different things. Relevant in this case is only the connection charset which specifies how the SQL query will be parsed. The database will then internally transform the value to the column charset for storage. And vice versa retrieve values of the column charset and cast to the connection charset when querying data.
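For orientation, the connection charset mentioned here is the one set per connection in the application's database configuration. The excerpt below is a trimmed sketch in the shape of a default `config/database.php`; the values shown are the usual MySQL defaults as far as I know, and a real project may differ.

```php
<?php

// Trimmed sketch of config/database.php (MySQL connection only).
// Values are assumed defaults, not taken from this pull request.
return [
    'connections' => [
        'mysql' => [
            'driver' => 'mysql',
            // Connection charset: decides how the server parses incoming SQL text.
            'charset' => 'utf8mb4',
            'collation' => 'utf8mb4_unicode_ci',
        ],
    ],
];
```

Column charsets are defined in the schema and are independent of this setting.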
This may sound simple but you're only looking at provided binding values. You could also use database expressions as a column reference, selected value, group by etc. And within a
There are approaches to do it without breaking BC. An
Would it be very hard to support the I have no idea about SQLite, but Postgres doesn't support the null byte in a string field. The contents for a
I am working on a new implementation that will manually fix the zero-byte issues with PostgreSQL/SQLite and also supports embedding binary data. Will take some days.
I've completely rewritten the implementation and also added tests:
@crynobone To support MongoDB in the future the @tontonsb What's your opinion?
Thank you for your work @tpetry! This is a lot more polished now. The case-by-case handling of value types feels like it provides more control, especially if one of the cases will have to be adjusted.
I like this. Although it kind of takes away a MySQL feature, putting binary strings like a string is not a feature that should be encouraged. I will add a couple of minor inline comments in a few moments, but I no longer have any objections against the implementation :)
Further updates based on the good feedback by @tontonsb
After more testing and using it in various ways I can say that the PR has reached its final state. It works really well in escaping the different values specifically for every database. Additionally, there is absolutely no breaking change for existing drivers. The full implementation is opt-in for drivers. While the PR was built to improve the query expressions feature, there are more useful ways it can be used:
Sorry @taylorotwell that the implementation took so long. I promised it the day after Laracon EU but I needed several attempts to get one implementation with the least changes. I am now happy with that one :)
I would rather see support for expressions returning bindings. Rolling our own escaping seems a bit risky when bindings would solve most use cases without adding and maintaining extra code. I agree with @tontonsb that adding support for bindings in expressions would not be extremely difficult. I have drafted several PRs in the past to add support for this and the changes were not that involved. The argument that escaping needs to be done in places where bindings are not allowed is somewhat valid, but I'm not sure if that's currently within the scope of what Laravel should be offering a solution for. If we add support for bindings that would bring expressions into line with the rest of the query builder, whereas it would seem odd to add in a custom escaping solution prior to expressions even supporting basic bindings.
I disagree with that. Adding binding support only looks simple at first. But you need to support more than adding bindings for And at some places, bindings are not even allowed. You can't use bindings for I would have implemented expressions returning bindings if I had seen a way to do it easily without rewriting half the database core. Because that pull request would get closed because of too much complexity. Rightly so!
You are free to disagree. But I have plans for the expressions package and what it should support. Being limited again to only what bindings in the query builder can do will limit the use case massively. Expressions exist because the query builder is too limited in some cases. Not to do the same as the query builder. Because then we wouldn't need it.
Yes, every place that supports expressions would need to be updated to also get any bindings as well, but that's not an impossible task and would make expressions much more powerful. All of those places in the query builder already store their own bindings, so I'm not sure I understand why it would be as complex as you think. You had to update all those places in your last PR as well.
As I said, I agree there are some places that bindings would not work, but I'm not sure if that's something that Laravel should tackle. If you are doing this so that you can build a package, then maybe you could implement the escaping there first and prove out its stability and usefulness.
Including support for bindings in expressions would not make them obsolete, just a bit easier to work with.
I updated just a few places that worked with bindings. There are a ton of places that have no idea yet what expressions are, as they can just cast everything to a string. You underestimate the work involved. I am constantly changing things in the database core. Trust me. A ton of the methods of the query builder would have to be rewritten.
Sure, feel free to disagree. But I implemented it for a purpose. To get rid of many
I already do as much as possible without changing the Laravel core. At this point, it is not really possible to do this within a package. You're all free to agree or disagree. And to work on a different implementation. This one already took 20h to research & implement. And it is the implementation most likely to get included, as it is not changing big things in the core and won't break anything. Not even third-party packages. Implementing custom binding logic, as often requested, will need big changes and will most likely break many 3rd-party packages until they update their builder functions to also add special grammar-binding support. That is a path I don't want to take, because I want adoption of this to be much easier.
Thanks!
Awesome! Thanks Taylor, now I can work on the next extension to expressions.
So in Laravel 10, a PR of mine (#44784) was merged to support database expressions with grammar-specific formatting. The idea is to bring more capabilities for abstracting the SQL flavors of different databases to Laravel. You shouldn't have to use any vendor-specific raw() statements anymore.
The idea is to replace those hardcoded SQL constructs with classes (like tpetry/laravel-query-expressions) that automatically change the generated SQL based on the used database:
But one problem still needs to be solved: For some queries you have to use values with a query:
Embedding numeric values is easy as they can not be used for SQL injections. But any user-provided string could be an SQL injection vector. This is solved in Laravel by using bindings (placeholders) in queries. But these can't be used for expressions as many parts of the Query Builder and Eloquent would have to be rewritten - the needed changes are too broad.
To solve the problem, I am proposing to add support for database grammars to escape any value for safe embedding into SQL queries. PHP provides this natively with the PDO::quote method. But its documentation states that it should be theoretically safe to use a quoted string within a SQL query.
In my opinion (and after extensive research) this implementation is safe because of these reasons:
PDO::quote() are always some strange charset conversion tricks that begin by using invalid UTF-8 sequences. But those attacks are fixed by setting the proper connection charset. And Laravel's configuration has a charset setting that will be used.
What are your opinions?
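For readers comparing the two approaches discussed in this thread, the difference between a binding and an escaped literal can be sketched with plain PDO. This is only an illustration against an in-memory SQLite database (it assumes the pdo_sqlite extension is available); it is not the grammar implementation from this pull request, and the table and variable names are made up.

```php
<?php

// In-memory SQLite database; assumes the pdo_sqlite extension is loaded.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('O''Reilly')");

// A user-provided value containing a single quote.
$input = "O'Reilly";

// Option 1: binding. The value travels separately from the SQL text.
$stmt = $pdo->prepare('SELECT count(*) FROM users WHERE name = ?');
$stmt->execute([$input]);
$viaBinding = (int) $stmt->fetchColumn();

// Option 2: escaping. PDO::quote() returns the value as a quoted SQL literal
// that is then embedded directly into the query string.
$literal = $pdo->quote($input);
$viaQuote = (int) $pdo
    ->query("SELECT count(*) FROM users WHERE name = {$literal}")
    ->fetchColumn();
```

Both queries find the same row. The escaped variant depends on the driver's quoting rules and on the connection charset being set correctly, which is the concern raised about PDO::quote() above.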