Skip to content
This repository has been archived by the owner on Jul 16, 2021. It is now read-only.

Proposal to address confusion between PostgreSQL concepts "schema" and "search_path" #918

Closed
cbj4074 opened this issue Dec 8, 2017 · 10 comments

Comments

@cbj4074
Copy link

cbj4074 commented Dec 8, 2017

After several days of frustrating debugging ( #916 ), I would like to propose a change to the terminology that Laravel employs in relation to PostgreSQL.

In essence, my frustration and confusion stem directly from the fact that the existing code seems to conflate PostgreSQL's schema and search_path concepts. As tempting as it may be to write-off the conflation as a "mundane detail", it feels worthwhile to correct.

Until I discovered that the schema connection parameter could in fact accept an array of schemas (and not merely a single name in the form of a string), via laravel/framework#15535 , I was absolutely baffled as to why the framework would ever set the search_path to a single schema. I discovered this "undocumented feature" only after rooting-around in the source (and found the above PR only much later); I have never seen it mentioned nor documented elsewhere. And I went to the source first, not after struggling for several days to reach this point. I just couldn't find anything in relation to the search_path (because I discovered it to be the problem, on the PostgreSQL side), and now I know why. I wish I had found this Issue much sooner (which inspired the above PR), as it would have saved me a lot of time and angst:

laravel/framework#8639

A schema refers to a single namespace, whereas the search_path is comprised of one or more schemas. Simply put, the PostgreSQL manual describes it as, "a list of schemas to look in". While hugely useful, the search_path is equally complicated, which is why I feel so strongly that this dinstinction needs to be made, and the source corrected accordingly. Even if the source is not corrected, the connection configuration parameter name, schema, should be changed to searchpath. To be clear, I am proposing only that the variable and method names be changed to reflect their actual purpose, with no functional changes (beyond renaming schema to searchpath in the connection configuration array; BC could be preserved if desired).

Somewhat surprisingly, PostgreSQL's search_path is barely documented and even less well understood, and the conflation at issue here is therefore largely understandable.

Many such articles exist, but the following article details just how complex the relationship between schema and search_path is. Many of the complexities don't emerge until more sophisticated use-cases are undertaken, such as those involving functions, stored procedures, views, and large-scale data manipulation, especially in the context of multiple schemas.

http://www.postgresonline.com/journal/archives/279-Schema-and-search_path-surprises.html

In fact, it doesn't even make sense to define a schema on a connection, because a connection has no need for, nor any concept of, a schema; a connection needs a database, but not a schema. It is the search_path that concerns PostgreSQL connections. It is unfortunate that PostgreSQL itself provides the SET SCHEMA ... capability, which is defined as follows:

SET SCHEMA 'value' is an alias for SET search_path TO value. Only one schema can be specified using this syntax.

This confuses the distinction between schema and search_path even further.

Despite the fact that what I was trying to achieve all along could be accomplished simply by using an array for the schema value, there are advantages to be gained from implementing separate connection configuration properties for schema and search_path. The schema can then be restricted to its intended purpose, which is to "address" (prefix) objects in schemas that are not within the search_path, or to disambiguate between objects of the same name within schemas that are within the search_path). At the same time, search_path can be used for its intended purpose, which is wildly complex, and beyond the scope of what can be discussed reasonably in this venue. The point is simply to keep them separate.

"Getting this right" feels instrumental to a flexible and predictable PostgreSQL configuration, especially where PostgreSQL extension usage is concerned, because extensions are enabled on a per-schema basis in PostgreSQL.

I could be wrong (and often am), but the only benefit I see in defining a schema on a connection is to be used as a prefix when addressing a table in an Eloquent model or similar. In other words, prior to this "revelation", I was using the connection's schema only in order to switch between MySQL and PostgreSQL without having to change anything in the application's models.

Something like this:

trait EnhancesModels
{
    public function getDriverName($connectionName = null)
    {
        return DB::connection($connectionName)->getPdo()->getAttribute(\PDO::ATTR_DRIVER_NAME);
    }

    public function setFullyQualifiedTableNames()
    {
        $driver = $this->getDriverName();
        
        $table = $this->tablePrefix . $this->table;
        
        if ($driver === 'pgsql') {
            $this->table = $this->getConnection()->getConfig('schema') . '.' . $table;
        }
        elseif ($driver === 'mysql') {
            $this->table = $this->getConnection()->getConfig('database') . '.' . $table;
        }
    }
}

Maybe there is a better way to implement this, in which case I see no need for the schema connection property at all. My point is only that protected $table = 'myschema.users';, which would be necessary otherwise, won't fly if PostgreSQL is swapped-out for MySQL.

I am curious how other people feel about this, particularly those using PostgreSQL.

Thank you for your time and consideration.

@cbj4074
Copy link
Author

cbj4074 commented Dec 8, 2017

I've cooked-up a PR that I feel addresses this as elegantly as possible (while maintaining BC) and will submit it tomorrow for consideration. Thanks!

cbj4074 added a commit to cbj4074/framework that referenced this issue Dec 11, 2017
This commit seeks to address two fundamental issues: 1) incorrect PostgreSQL terminology and application of concepts; 2) ineffective "search_path" values set on connection due to incorrect prepared statement parameter bindings (entire comma-separated string of schemas is quoted when each schema must be quoted separately).

In the existing code, "schema" is conflated with "search_path", which has significant negative implications for PostgreSQL-backed applications.

This commit corrects the terminology within the source so that the code reflects more accurately its actual behavior, but more importantly, it changes the connection configuration variable name that is used to set the "search_path" from "schema" to "searchpath".

The change allows for "decoupling" in that "schema" can still be used for its intended purpose (to prefix object references for schemas not in the "search_path", or to disambiguate objects of the same name that exist in more than one schema in the "search_path"), while "searchpath" can now be used for *its* intended purpose. In other words, "schema" and "searchpath" are now two separate and unrelated directives, which can be configured independently.

Note that the "schema" connection property is no longer used when configuring the connection, but may still be accessed as needed via getConnection()->getConfig('schema').

The change is backward-compatable in that if the "searchpath" key is not defined on the connection, the "schema" key will be used to set the search_path instead. If "schema" is not defined, PostgreSQL's default schema ("public") will be used.

Note further that "searchpath" may be specified as a string (which allows for a single schema only), or an array (which allows for any number of schemas, excluding only dynamic schemas that have special meaning to PostgreSQL, such as '$user'; support for $-style variable notation may be added in a future revision).

For an exhaustive discussion, see: laravel/ideas#918
cbj4074 added a commit to cbj4074/framework that referenced this issue Dec 11, 2017
This commit seeks to address two fundamental issues: 1) incorrect PostgreSQL terminology and application of concepts; 2) ineffective "search_path" values set on connection due to incorrect prepared statement parameter bindings (entire comma-separated string of schemas is quoted when each schema must be quoted separately).

In the existing code, "schema" is conflated with "search_path", which has significant negative implications for PostgreSQL-backed applications.

This commit corrects the terminology within the source so that the code reflects more accurately its actual behavior, but more importantly, it changes the connection configuration variable name that is used to set the "search_path" from "schema" to "searchpath".

The change allows for "decoupling" in that "schema" can still be used for its intended purpose (to prefix object references for schemas not in the "search_path", or to disambiguate objects of the same name that exist in more than one schema in the "search_path"), while "searchpath" can now be used for *its* intended purpose. In other words, "schema" and "searchpath" are now two separate and unrelated directives, which can be configured independently.

Note that the "schema" connection property is no longer used when configuring the connection, but may still be accessed as needed via getConnection()->getConfig('schema').

The change is backward-compatible in that if the "searchpath" key is not defined on the connection, the "schema" key will be used to set the search_path instead. If "schema" is not defined, PostgreSQL's default schema ("public") will be used.

Note further that "searchpath" may be specified as a string (which allows for a single schema only), or an array (which allows for any number of schemas, excluding only dynamic schemas that have special meaning to PostgreSQL, such as '$user'; support for $-style variable notation may be added in a future revision).

For an exhaustive discussion, see: laravel/ideas#918
@cbj4074
Copy link
Author

cbj4074 commented Dec 11, 2017

PR: laravel/framework#22390

cbj4074 added a commit to cbj4074/framework that referenced this issue Dec 13, 2017
This commit seeks to address two fundamental issues: 1) incorrect PostgreSQL terminology and application of concepts; 2) ineffective "search_path" values set on connection due to incorrect prepared statement parameter bindings (entire comma-separated string of schemas is quoted when each schema must be quoted separately).

In the existing code, "schema" is conflated with "search_path", which has significant negative implications for PostgreSQL-backed applications.

This commit corrects the terminology within the source so that the code reflects more accurately its actual behavior, but more importantly, it changes the connection configuration variable name that is used to set the "search_path" from "schema" to "searchpath".

The change allows for "decoupling" in that "schema" can still be used for its intended purpose (to prefix object references for schemas not in the "search_path", or to disambiguate objects of the same name that exist in more than one schema in the "search_path"), while "searchpath" can now be used for *its* intended purpose. In other words, "schema" and "searchpath" are now two separate and unrelated directives, which can be configured independently.

Note that the "schema" connection property is no longer used when configuring the connection, but may still be accessed as needed via getConnection()->getConfig('schema').

The change is backward-compatible in that if the "searchpath" key is not defined on the connection, the "schema" key will be used to set the search_path instead. If "schema" is not defined, PostgreSQL's default schema ("public") will be used.

Note further that "searchpath" may be specified as a string (which allows for a single schema only), or an array (which allows for any number of schemas, excluding only dynamic schemas that have special meaning to PostgreSQL, such as '$user'; support for $-style variable notation may be added in a future revision).

For an exhaustive discussion, see: laravel/ideas#918

Fix flaws in how search_path is set on connection

This commit seeks to address two fundamental issues: 1) incorrect PostgreSQL terminology and application of concepts; 2) ineffective "search_path" values set on connection due to incorrect prepared statement parameter bindings (entire comma-separated string of schemas is quoted when each schema must be quoted separately).

In the existing code, "schema" is conflated with "search_path", which has significant negative implications for PostgreSQL-backed applications.

This commit corrects the terminology within the source so that the code reflects more accurately its actual behavior, but more importantly, it changes the connection configuration variable name that is used to set the "search_path" from "schema" to "searchpath".

The change allows for "decoupling" in that "schema" can still be used for its intended purpose (to prefix object references for schemas not in the "search_path", or to disambiguate objects of the same name that exist in more than one schema in the "search_path"), while "searchpath" can now be used for *its* intended purpose. In other words, "schema" and "searchpath" are now two separate and unrelated directives, which can be configured independently.

Note that the "schema" connection property is no longer used when configuring the connection, but may still be accessed as needed via getConnection()->getConfig('schema').

The change is backward-compatible in that if the "searchpath" key is not defined on the connection, the "schema" key will be used to set the search_path instead. If "schema" is not defined, PostgreSQL's default schema ("public") will be used.

Note further that "searchpath" may be specified as a string (which allows for a single schema only), or an array (which allows for any number of schemas, excluding only dynamic schemas that have special meaning to PostgreSQL, such as '$user'; support for $-style variable notation may be added in a future revision).

For an exhaustive discussion, see: laravel/ideas#918
cbj4074 added a commit to cbj4074/framework that referenced this issue Dec 14, 2017
This commit seeks to address two fundamental issues: 1) incorrect PostgreSQL terminology and application of concepts; 2) ineffective "search_path" values set on connection due to incorrect prepared statement parameter bindings (entire comma-separated string of schemas is quoted when each schema must be quoted separately).

In the existing code, "schema" is conflated with "search_path", which has significant negative implications for PostgreSQL-backed applications.

This commit corrects the terminology within the source so that the code reflects more accurately its actual behavior, but more importantly, it changes the connection configuration variable name that is used to set the "search_path" from "schema" to "searchpath".

The change allows for "decoupling" in that "schema" can still be used for its intended purpose (to prefix object references for schemas not in the "search_path", or to disambiguate objects of the same name that exist in more than one schema in the "search_path"), while "searchpath" can now be used for *its* intended purpose. In other words, "schema" and "searchpath" are now two separate and unrelated directives, which can be configured independently.

Note that the "schema" connection property is no longer used when configuring the connection, but may still be accessed as needed via getConnection()->getConfig('schema').

The change is backward-compatible in that if the "searchpath" key is not defined on the connection, the "schema" key will be used to set the search_path instead. If "schema" is not defined, PostgreSQL's default schema ("public") will be used.

Note further that "searchpath" may be specified as a string (which allows for a single schema only), or an array (which allows for any number of schemas, excluding only dynamic schemas that have special meaning to PostgreSQL, such as '$user'; support for $-style variable notation may be added in a future revision).

For an exhaustive discussion, see: laravel/ideas#918

Fix flaws in how search_path is set on connection

This commit seeks to address two fundamental issues: 1) incorrect PostgreSQL terminology and application of concepts; 2) ineffective "search_path" values set on connection due to incorrect prepared statement parameter bindings (entire comma-separated string of schemas is quoted when each schema must be quoted separately).

In the existing code, "schema" is conflated with "search_path", which has significant negative implications for PostgreSQL-backed applications.

This commit corrects the terminology within the source so that the code reflects more accurately its actual behavior, but more importantly, it changes the connection configuration variable name that is used to set the "search_path" from "schema" to "searchpath".

The change allows for "decoupling" in that "schema" can still be used for its intended purpose (to prefix object references for schemas not in the "search_path", or to disambiguate objects of the same name that exist in more than one schema in the "search_path"), while "searchpath" can now be used for *its* intended purpose. In other words, "schema" and "searchpath" are now two separate and unrelated directives, which can be configured independently.

Note that the "schema" connection property is no longer used when configuring the connection, but may still be accessed as needed via getConnection()->getConfig('schema').

The change is backward-compatible in that if the "searchpath" key is not defined on the connection, the "schema" key will be used to set the search_path instead. If "schema" is not defined, PostgreSQL's default schema ("public") will be used.

Note further that "searchpath" may be specified as a string (which allows for a single schema only), or an array (which allows for any number of schemas, excluding only dynamic schemas that have special meaning to PostgreSQL, such as '$user'; support for $-style variable notation may be added in a future revision).

For an exhaustive discussion, see: laravel/ideas#918
cbj4074 added a commit to cbj4074/framework that referenced this issue Dec 14, 2017
This commit seeks to address two fundamental issues: 1) incorrect PostgreSQL terminology and application of concepts; 2) ineffective "search_path" values set on connection due to incorrect prepared statement parameter bindings (entire comma-separated string of schemas is quoted when each schema must be quoted separately).

In the existing code, "schema" is conflated with "search_path", which has significant negative implications for PostgreSQL-backed applications.

This commit corrects the terminology within the source so that the code reflects more accurately its actual behavior, but more importantly, it changes the connection configuration variable name that is used to set the "search_path" from "schema" to "searchpath".

The change allows for "decoupling" in that "schema" can still be used for its intended purpose (to prefix object references for schemas not in the "search_path", or to disambiguate objects of the same name that exist in more than one schema in the "search_path"), while "searchpath" can now be used for *its* intended purpose. In other words, "schema" and "searchpath" are now two separate and unrelated directives, which can be configured independently.

Note that the "schema" connection property is no longer used when configuring the connection, but may still be accessed as needed via getConnection()->getConfig('schema').

The change is backward-compatible in that if the "searchpath" key is not defined on the connection, the "schema" key will be used to set the search_path instead. If "schema" is not defined, PostgreSQL's default schema ("public") will be used.

Note further that "searchpath" may be specified as a string (which allows for a single schema only), or an array (which allows for any number of schemas, excluding only dynamic schemas that have special meaning to PostgreSQL, such as '$user'; support for $-style variable notation may be added in a future revision).

For an exhaustive discussion, see: laravel/ideas#918

Fix flaws in how search_path is set on connection

This commit seeks to address two fundamental issues: 1) incorrect PostgreSQL terminology and application of concepts; 2) ineffective "search_path" values set on connection due to incorrect prepared statement parameter bindings (entire comma-separated string of schemas is quoted when each schema must be quoted separately).

In the existing code, "schema" is conflated with "search_path", which has significant negative implications for PostgreSQL-backed applications.

This commit corrects the terminology within the source so that the code reflects more accurately its actual behavior, but more importantly, it changes the connection configuration variable name that is used to set the "search_path" from "schema" to "searchpath".

The change allows for "decoupling" in that "schema" can still be used for its intended purpose (to prefix object references for schemas not in the "search_path", or to disambiguate objects of the same name that exist in more than one schema in the "search_path"), while "searchpath" can now be used for *its* intended purpose. In other words, "schema" and "searchpath" are now two separate and unrelated directives, which can be configured independently.

Note that the "schema" connection property is no longer used when configuring the connection, but may still be accessed as needed via getConnection()->getConfig('schema').

The change is backward-compatible in that if the "searchpath" key is not defined on the connection, the "schema" key will be used to set the search_path instead. If "schema" is not defined, PostgreSQL's default schema ("public") will be used.

Note further that "searchpath" may be specified as a string (which allows for a single schema only), or an array (which allows for any number of schemas, excluding only dynamic schemas that have special meaning to PostgreSQL, such as '$user'; support for $-style variable notation may be added in a future revision).

For an exhaustive discussion, see: laravel/ideas#918
cbj4074 added a commit to cbj4074/framework that referenced this issue Dec 14, 2017
This commit seeks to address two fundamental issues: 1) incorrect PostgreSQL terminology and application of concepts; 2) ineffective "search_path" values set on connection due to incorrect prepared statement parameter bindings (entire comma-separated string of schemas is quoted when each schema must be quoted separately).

In the existing code, "schema" is conflated with "search_path", which has significant negative implications for PostgreSQL-backed applications.

This commit corrects the terminology within the source so that the code reflects more accurately its actual behavior, but more importantly, it changes the connection configuration variable name that is used to set the "search_path" from "schema" to "searchpath".

The change allows for "decoupling" in that "schema" can still be used for its intended purpose (to prefix object references for schemas not in the "search_path", or to disambiguate objects of the same name that exist in more than one schema in the "search_path"), while "searchpath" can now be used for *its* intended purpose. In other words, "schema" and "searchpath" are now two separate and unrelated directives, which can be configured independently.

Note that the "schema" connection property is no longer used when configuring the connection, but may still be accessed as needed via getConnection()->getConfig('schema').

The change is backward-compatible in that if the "searchpath" key is not defined on the connection, the "schema" key will be used to set the search_path instead. If "schema" is not defined, PostgreSQL's default schema ("public") will be used.

Note further that "searchpath" may be specified as a string (which allows for a single schema only), or an array (which allows for any number of schemas, excluding only dynamic schemas that have special meaning to PostgreSQL, such as '$user'; support for $-style variable notation may be added in a future revision).

For an exhaustive discussion, see: laravel/ideas#918

Fix flaws in how search_path is set on connection

This commit seeks to address two fundamental issues: 1) incorrect PostgreSQL terminology and application of concepts; 2) ineffective "search_path" values set on connection due to incorrect prepared statement parameter bindings (entire comma-separated string of schemas is quoted when each schema must be quoted separately).

In the existing code, "schema" is conflated with "search_path", which has significant negative implications for PostgreSQL-backed applications.

This commit corrects the terminology within the source so that the code reflects more accurately its actual behavior, but more importantly, it changes the connection configuration variable name that is used to set the "search_path" from "schema" to "searchpath".

The change allows for "decoupling" in that "schema" can still be used for its intended purpose (to prefix object references for schemas not in the "search_path", or to disambiguate objects of the same name that exist in more than one schema in the "search_path"), while "searchpath" can now be used for *its* intended purpose. In other words, "schema" and "searchpath" are now two separate and unrelated directives, which can be configured independently.

Note that the "schema" connection property is no longer used when configuring the connection, but may still be accessed as needed via getConnection()->getConfig('schema').

The change is backward-compatible in that if the "searchpath" key is not defined on the connection, the "schema" key will be used to set the search_path instead. If "schema" is not defined, PostgreSQL's default schema ("public") will be used.

Note further that "searchpath" may be specified as a string (which allows for a single schema only), or an array (which allows for any number of schemas, excluding only dynamic schemas that have special meaning to PostgreSQL, such as '$user'; support for $-style variable notation may be added in a future revision).

For an exhaustive discussion, see: laravel/ideas#918
cbj4074 added a commit to cbj4074/framework that referenced this issue Dec 14, 2017
This commit seeks to address two fundamental issues: 1) incorrect PostgreSQL terminology and application of concepts; 2) ineffective "search_path" values set on connection due to incorrect prepared statement parameter bindings (entire comma-separated string of schemas is quoted when each schema must be quoted separately).

In the existing code, "schema" is conflated with "search_path", which has significant negative implications for PostgreSQL-backed applications.

This commit corrects the terminology within the source so that the code reflects more accurately its actual behavior, but more importantly, it changes the connection configuration variable name that is used to set the "search_path" from "schema" to "searchpath".

The change allows for "decoupling" in that "schema" can still be used for its intended purpose (to prefix object references for schemas not in the "search_path", or to disambiguate objects of the same name that exist in more than one schema in the "search_path"), while "searchpath" can now be used for *its* intended purpose. In other words, "schema" and "searchpath" are now two separate and unrelated directives, which can be configured independently.

Note that the "schema" connection property is no longer used when configuring the connection, but may still be accessed as needed via getConnection()->getConfig('schema').

The change is backward-compatible in that if the "searchpath" key is not defined on the connection, the "schema" key will be used to set the search_path instead. If "schema" is not defined, PostgreSQL's default schema ("public") will be used.

Note further that "searchpath" may be specified as a string (which allows for a single schema only), or an array (which allows for any number of schemas, excluding only dynamic schemas that have special meaning to PostgreSQL, such as '$user'; support for $-style variable notation may be added in a future revision).

For an exhaustive discussion, see: laravel/ideas#918
cbj4074 added a commit to cbj4074/framework that referenced this issue Dec 14, 2017
This commit seeks to address two fundamental issues: 1) incorrect PostgreSQL terminology and application of concepts; 2) ineffective "search_path" values set on connection due to incorrect prepared statement parameter bindings (entire comma-separated string of schemas is quoted when each schema must be quoted separately).

In the existing code, "schema" is conflated with "search_path", which has significant negative implications for PostgreSQL-backed applications.

This commit corrects the terminology within the source so that the code reflects more accurately its actual behavior, but more importantly, it changes the connection configuration variable name that is used to set the "search_path" from "schema" to "searchpath".

The change allows for "decoupling" in that "schema" can still be used for its intended purpose (to prefix object references for schemas not in the "search_path", or to disambiguate objects of the same name that exist in more than one schema in the "search_path"), while "searchpath" can now be used for *its* intended purpose. In other words, "schema" and "searchpath" are now two separate and unrelated directives, which can be configured independently.

Note that the "schema" connection property is no longer used when configuring the connection, but may still be accessed as needed via getConnection()->getConfig('schema').

The change is backward-compatible in that if the "searchpath" key is not defined on the connection, the "schema" key will be used to set the search_path instead. If "schema" is not defined, PostgreSQL's default schema ("public") will be used.

Note further that "searchpath" may be specified as a string (which allows for a single schema only), or an array (which allows for any number of schemas, excluding only dynamic schemas that have special meaning to PostgreSQL, such as '$user'; support for $-style variable notation may be added in a future revision).

For an exhaustive discussion, see: laravel/ideas#918
veksen pushed a commit to cbj4074/framework that referenced this issue Dec 16, 2017
This commit seeks to address two fundamental issues: 1) incorrect PostgreSQL terminology and application of concepts; 2) ineffective "search_path" values set on connection due to incorrect prepared statement parameter bindings (entire comma-separated string of schemas is quoted when each schema must be quoted separately).

In the existing code, "schema" is conflated with "search_path", which has significant negative implications for PostgreSQL-backed applications.

This commit corrects the terminology within the source so that the code reflects more accurately its actual behavior, but more importantly, it changes the connection configuration variable name that is used to set the "search_path" from "schema" to "searchpath".

The change allows for "decoupling" in that "schema" can still be used for its intended purpose (to prefix object references for schemas not in the "search_path", or to disambiguate objects of the same name that exist in more than one schema in the "search_path"), while "searchpath" can now be used for *its* intended purpose. In other words, "schema" and "searchpath" are now two separate and unrelated directives, which can be configured independently.

Note that the "schema" connection property is no longer used when configuring the connection, but may still be accessed as needed via getConnection()->getConfig('schema').

The change is backward-compatible in that if the "searchpath" key is not defined on the connection, the "schema" key will be used to set the search_path instead. If "schema" is not defined, PostgreSQL's default schema ("public") will be used.

Note further that "searchpath" may be specified as a string (which allows for a single schema only), or an array (which allows for any number of schemas, excluding only dynamic schemas that have special meaning to PostgreSQL, such as '$user'; support for $-style variable notation may be added in a future revision).

For an exhaustive discussion, see: laravel/ideas#918
@cbj4074
Copy link
Author

cbj4074 commented Dec 19, 2017

The more I dig into the issues that the apparent confusion between "schema" and "search path" has caused, the more I realize just how much work will be required to "fix" this and implement PostgreSQL properly.

That is certainly not to disparage other contributors or anyone who has worked to improve the implementation. I'm grateful that PostgreSQL is supported at all.

That said, this is a mess. Many hands have "touched" the affected code, and nobody is "on the same page". Some people still don't realize that a "schema" is not a "search path", some people think that the first schema in the search path is somehow more significant than any other, and so on.

Laravel is playing way too "fast and loose" with PostgreSQL's search path. The ways in which the current implementation is broken are not at all obvious until the developer is a) working in a multi-schema PostgreSQL layout and b) using PostgreSQL extensions (which are enabled on a per-schema basis). This is so hugely problematic because if the search path is mishandled, extensions may simply be ignored in unexpected (and "incorrect") ways. In such cases, the developer will not receive any type of error, however, his citext column, for example, will be treated case-sensitively. If others don't see the problem with this type of unpredictability, I'm not even sure how to help.

A quick smattering of related issues (and I'm sure there are many more):

Issue: laravel/framework#8639
PR: laravel/framework#15535

Issue: laravel/framework#19496
PR: laravel/framework#19553

Issue: One of the above issues
PR: laravel/framework#22365

The lack of consensus in this respect has caused a lot of frustration for me and my team. My humble opinion is that we need to dig-in, document the problems and use-cases (which is what I have started to do here), and fix this "the right way", instead of bolting-on one PR after another, each of which undoes the previous one's work, breaks it somehow, or otherwise exacerbates the problem. (As @sisve notes in the Comments, the above-cited PRs were accepted without tests.)

Furthermore, none of this behavior is documented, which contributes significantly to the problem. There seems to be some overwhelming desire to keep anything driver-specific out of the documentation, which I believe to be a huge mistake.

In short, my take is that we need to "get back to basics", identify how Laravel should behave, have the reasoned debates, post links to documentation, etc., and untangle this ball of yarn.

[insert Jerry Maguire speech here]

@cbj4074 cbj4074 changed the title Proposal to clarify distinction between "schema" and "search_path" in PostgreSQL Connector Proposal to address confusion between PostgreSQL concepts "schema" and "search_path" Dec 19, 2017
@sisve
Copy link

sisve commented Dec 19, 2017

Note how all of the PR:s you link are accepted without any tests or anything to avoid a future regression where the next dev forgets about some functionality and accidentally removes/modifies it.

@cbj4074
Copy link
Author

cbj4074 commented Dec 19, 2017

@sisve Duly noted; an excellent point. Thank you!

@mfn
Copy link

mfn commented Dec 19, 2017

The more I dig into the issues

my take is that we need to "get back to basics", identify how Laravel should behave, have the reasoned debates, post links to documentation, etc., and untangle this ball of yarn.

Unless I miss-read something, your laravel/framework#22414 basically fixes the issue. Yep, there was some more feedback and a bit later I also became aware of laravel/framework#22414 (comment) but is there more to it, really?

The renaming should definitely happen but I wouldn't target 5.5 but rather master (and, probably unpopular, I would break this and remove schema for postgres and note it in the upgrade guides; I assume people working with postgres would understand this anyway and you need to have a grasp of this concepts).

@cbj4074
Copy link
Author

cbj4074 commented Dec 20, 2017

Thanks for taking another look here, @mfn .

While I am interested in seeing this resolved to the affected individuals' satisfaction, I have forked and implemented the changes for my own purposes, so the incentive for me to go too far out of my way here is limited.

I would feel differently were it not for the lack of even-handedness in how PRs have been handled historically and how mine was handled in this instance. I won't bore the audience with the details thereof.

At least Taylor closed my first PR because it lacked tests, so that's progress. But the fact that I resubmitted it, with all tests passing, and it was treated like a hot potato, without so much as a comment from the project staff, leaves me with a sour taste in my mouth.

This is another point of frustration for me; there isn't one word about testing, Travis-CI, or any of the relevant expectations on https://laravel.com/docs/5.5/contributions . How is anybody supposed to know what's expected of a PR? If I were only slightly more cynical, I would assume that this is by design and only "those in the know" are allowed to contribute. It feels like an intentional barrier-to-entry that keeps the riff-raff at bay and ensures that only the inner-circle knows how to jump through the necessary hoops. Again, I would document the process, but I'm tired of wasting my time on failed attempts to fix problems with this project.

It would be one thing to say, "Can somebody vet this, please? I don't use PostgreSQL and am not qualified to assess the veracity of the PR", but there was only silence. It seems that pride may have been an obstacle in this specific instance, especially given that Taylor merged recent changes, without tests, that made the problem worse, but who knows.

In any case, I won't close this, but I won't be surprised if somebody else does before I feel like coming back to it.

Thanks again for your help with this @mfn ; I do appreciate it sincerely.

@cbj4074
Copy link
Author

cbj4074 commented Mar 1, 2018

On March 1, 2018, the PostgreSQL authors pushed a release to address CVE-2018-1058.

I have yet to assess fully what implications, if any, this release has regarding the subject of this Issue, but cursory analysis suggests that the changes are to the PostgreSQL documentation, primarily, and not to the Search Path behavior.

The cited guide contains useful background information regarding Search Path behavior, in general, and is worth reading for anybody who is interested in addressing this Issue in Laravel.

@taylorotwell
Copy link
Member

Feel free to open any PRs if you think Laravel is handling this incorrectly.

@cbj4074
Copy link
Author

cbj4074 commented Dec 10, 2020

Implemented via laravel/framework#35463 and laravel/framework#35530 .

Finally, with regard to CVE-2018-1058, as noted in #918 (comment) , my contention is that securing the PostgreSQL environment to prevent the vulnerabilities described is outside the scope of Laravel's responsibilities, so no further action is required.

Thank you @taylorotwell for your willingness to accept these improvements to Laravel's PostgreSQL implementation.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants