[DO NOT SQUASH MERGE] Community Merge for 3_8 stable #498

roshan0708 · 2024-12-09T15:45:03Z

Description

Merge community commits till stamp 15.10.
Extension PR: babelfish-for-postgresql/babelfish_extensions#3223

Issues Resolved

Task: BABEL-5435

Check List

Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the PostgreSQL license, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Replace a static scratch buffer with a local variable, because a static buffer makes the function not thread-safe. This function is used in client-code in libpq, so it needs to be thread-safe. It was until commit b67b57a, which replaced the implementation with the one from pgcrypto. Backpatch to v14, where we switched to the new implementation. Reviewed-by: Robert Haas, Michael Paquier Discussion: https://www.postgresql.org/message-id/[email protected] (cherry picked from commit a38f5f880d1e86fedf1616de5391df5f0d5e4239) (cherry picked from commit 5d0eadb)

If the plancache entry for the CALL statement is already stale, it's possible for us to fetch an old procedure OID out of it, and then fail with "cache lookup failed for function NNN". In ordinary usage this never happens because make_callstmt_target is called just once immediately after building the plancache entry. It can be forced however by setting up an erroneous CALL (that causes make_callstmt_target itself to report an error), then dropping/recreating the target procedure, then repeating the erroneous CALL. To fix, use SPI_plan_get_cached_plan() to fetch the plancache's plan, rather than assuming we can use SPI_plan_get_plan_sources(). This shouldn't add any noticeable overhead in the normal case, and in the stale-plan case we'd have had to replan anyway a little further down. The other callers of SPI_plan_get_plan_sources() seem OK, because either they don't need up-to-date plans or they know that the query was just (re) planned. But add some commentary in hopes of not falling into this trap again. Per bug #18574 from Song Hongyu. Back-patch to v14 where this coding was introduced. (Older branches have comparable code, but it's run after any required replanning, so there's no issue.) Discussion: https://postgr.es/m/[email protected] (cherry picked from commit de35207015567da892d825a704933e6681347f19) (cherry picked from commit f119cd3)

Commit 0b9466f added a dependency on fe_memutils' pnstrdup() inside informix.c. This adds an exit() path in a library, which we don't want. (Unlike libpq, the ecpg libraries don't have an automated check for that, but it makes sense to keep them to a similar standard.) The ecpg code can already handle failure results from the *strdup() call by itself. Author: Jacob Champion <[email protected]> Discussion: https://www.postgresql.org/message-id/CAOYmi+=pg=W5L1h=3MEP_EB24jaBu2FyATrLXqQHGe7cpuvwyg@mail.gmail.com (cherry picked from commit 2de129b356bfde9a02269c174396add065ace260) (cherry picked from commit 25e4983)

getTimelineHistory() is called twice, to read the source and the target timeline history files. However, the loop to print the file with the --debug option used the wrong variable when dealing with the source. As a result, the source's history was always printed as empty. Spotted while debugging bug #18575, but this does not fix that bug, just the debugging output. Backpatch to all supported versions. Discussion: https://www.postgresql.org/message-id/[email protected] (cherry picked from commit b5a5027c9796e2958392f4682928d47b5b0d0e47) (cherry picked from commit 5f99767)

Trying to attach a table as a partition which is already on the referenced side of a foreign key on the partitioned table that it is being attached to, leads to strange behavior: we try to clone the foreign key from the parent to the partition, but this new FK points to the partition itself, and the mix of pg_constraint rows and triggers doesn't behave well. Rather than trying to untangle the mess (which might be possible given sufficient time), I opted to forbid the ATTACH. This doesn't seem a problematic restriction, given that we already fail to create the foreign key if you do it the other way around, that is, having the partition first and the FK second. Backpatch to all supported branches. Reported-by: Alexander Lakhin <[email protected]> Reviewed-by: Tender Wang <[email protected]> Discussion: https://postgr.es/m/[email protected] (cherry picked from commit 8c0944ac878d3d3e433da8feb2f7e134be8353f3) (cherry picked from commit 30c4133)

To deparse a reference to a field of a RECORD-type output of a subquery, EXPLAIN normally digs down into the subquery's plan to try to discover exactly which anonymous RECORD type is meant. However, this can fail if the subquery has been optimized out of the plan altogether on the grounds that no rows could pass the WHERE quals, which has been possible at least since 3fc6e2d. There isn't anything remaining in the plan tree that would help us, so fall back to printing the field name as "fN" for the N'th column of the record. (This will actually be the right thing some of the time, since it matches the column names we assign to RowExprs.) In passing, fix a comment typo in create_projection_plan, which I noticed while experimenting with an alternative fix for this. Per bug #18576 from Vasya B. Back-patch to all supported branches. Richard Guo and Tom Lane Discussion: https://postgr.es/m/[email protected] (cherry picked from commit 12010f4146156ae276e5e606400ed9ce39a83e0b) (cherry picked from commit 164de03)

This section claims we use CRC-32 for WAL records and two-phase state files, but we've actually used CRC-32C since v9.5 (commit 5028f22). Fix that. Reviewed-by: Robert Haas Discussion: https://postgr.es/m/ZrUFpLP-w2zTAHqq%40nathan Backpatch-through: 12 (cherry picked from commit 75bb3354e6da6cceae9756d2e3c473088e554b16) (cherry picked from commit f44e4ba)

The code intends to allow GUCs to be set within parallel workers via function SET clauses, but not otherwise. However, doing so fails for "session_authorization" and "role", because the assign hooks for those attempt to set the subsidiary "is_superuser" GUC, and that call falls foul of the "not otherwise" prohibition. We can't switch to using GUC_ACTION_SAVE for this, so instead add a new GUC variable flag GUC_ALLOW_IN_PARALLEL to mark is_superuser as being safe to set anyway. (This is okay because is_superuser has context PGC_INTERNAL and thus only hard-wired calls can change it. We'd need more thought before applying the flag to other GUCs; but maybe there are other use-cases.) This isn't the prettiest fix perhaps, but other alternatives we thought of would be much more invasive. While here, correct a thinko in commit 059de3ca4: when rejecting a GUC setting within a parallel worker, we should return 0 not -1 if the ereport doesn't longjmp. (This seems to have no consequences right now because no caller cares, but it's inconsistent.) Improve the comments to try to forestall future confusion of the same kind. Despite the lack of field complaints, this seems worth back-patching. Thanks to Nathan Bossart for the idea to invent a new flag, and for review. Discussion: https://postgr.es/m/[email protected] (cherry picked from commit 2f4e895be76044fc09bdf96617043c94a12079e7) (cherry picked from commit 6025dba)

Coverity thinks dpns->plan could be null at these points. That shouldn't really be possible, but it's easy enough to modify the Asserts so they'd not core-dump if it were true. These are new in b919a97a6. Back-patch to v13; the v12 version of the patch didn't have these Asserts. (cherry picked from commit 16e67bc5f98f2f5691fb5c01d5a8310de1c62b81) (cherry picked from commit ffdeeba)

If a partition undergoes DETACH CONCURRENTLY immediately followed by DROP, this could cause a problem for a concurrent transaction recomputing the partition descriptor when running a prepared statement, because it tries to dereference a pointer to a tuple that's not found in a catalog scan. The existing retry logic added in commit dbca3469ebf8 is sufficient to cope with the overall problem, provided we don't try to dereference a non-existant heap tuple. Arguably, the code in RelationBuildPartitionDesc() has been wrong all along, since no check was added in commit 898e5e3 against receiving a NULL tuple from the catalog scan; that bug has only become user-visible with DETACH CONCURRENTLY which was added in branch 14. Therefore, even though there's no known mechanism to cause a crash because of this, backpatch the addition of such a check to all supported branches. In branches prior to 14, this would cause the code to fail with a "missing relpartbound for relation XYZ" error instead of crashing; that's okay, because there are no reports of such behavior anyway. Author: Kuntal Ghosh <[email protected]> Reviewed-by: Junwang Zhao <[email protected]> Reviewed-by: Tender Wang <[email protected]> Discussion: https://postgr.es/m/[email protected] (cherry picked from commit 305db95434c9f68811f2489ec57d5a0435608c46) (cherry picked from commit ba3226a)

Change "$1" to "username". Reported-by: [email protected] Discussion: https://postgr.es/m/[email protected] Backpatch-through: 12 (cherry picked from commit e8f36414afac579f32c06573344c8ff4d854b59d) (cherry picked from commit 3e9fd85)

Commit c66a7d75e652 modified DROP DATABASE so that if interrupted, the database is known to be in an invalid state and can only be dropped. This is done by setting a flag using an in-place update, so that it's not lost in case of rollback. For databases with many ACLs, this may however fail like this: ERROR: wrong tuple length This happens because with many ACLs, the pg_database.datacl attribute gets TOASTed. The dropdb() code reads the tuple from the syscache, which means it's detoasted. But the in-place update expects the tuple length to match the on-disk tuple. Fixed by reading the tuple from the catalog directly, not from syscache. Report and fix by Ayush Tiwari. Backpatch to 12. The DROP DATABASE fix was backpatched to 11, but 11 is EOL at this point. Reported-by: Ayush Tiwari Author: Ayush Tiwari Reviewed-by: Tomas Vondra Backpatch-through: 12 Discussion: https://postgr.es/m/CAJTYsWWNkCt+-UnMhg=BiCD3Mh8c2JdHLofPxsW3m2dkDFw8RA@mail.gmail.com (cherry picked from commit df9c5fb5837fed76e2c80daa03a2b0443983614b) (cherry picked from commit 49da0d0)

MacPorts version 2.9.3 started failing in our ci_macports_packages.sh script, for reasons not fully determined, but plausibly linked to the release of 2.10.1. 2.10.1 seems to work, so let's switch to it. Back-patch to 15, where CI began. Reported-by: Peter Eisentraut <[email protected]> Discussion: https://postgr.es/m/81f104e8-f0a9-43c0-85bd-2bbbf590a5b8%40eisentraut.org (cherry picked from commit 4247575c637b4d049346b96514ac9ca3363276d5) (cherry picked from commit 7b9dd52)

Commit ca051d8 called newlocale(LC_COLLATE, ...) instead of newlocale(LC_COLLATE_MASK, ...), in code reached only on FreeBSD. They have the same value on that OS, explaining why it worked. Fix. Back-patch to 14, where ca051d8 landed. (cherry picked from commit c1bb534bae431f4bbc50abf471507457e1b24efc) (cherry picked from commit e481c91)

Commit 274bbced disabled session tickets for TLSv1.3 on top of the already disabled TLSv1.2 session tickets, but accidentally caused a regression where TLSv1.2 session tickets were incorrectly sent. Fix by unconditionally disabling TLSv1.2 session tickets and only disable TLSv1.3 tickets when the right version of OpenSSL is used. Backpatch to all supported branches. Reported-by: Cameron Vogt <[email protected]> Reported-by: Fire Emerald <[email protected]> Reviewed-by: Jacob Champion <[email protected]> Discussion: https://postgr.es/m/DM6PR16MB3145CF62857226F350C710D1AB852@DM6PR16MB3145.namprd16.prod.outlook.com Backpatch-through: v12 (cherry picked from commit 23c200940eae6e7d9cf5712c3514691bfdaf3904) (cherry picked from commit c0dfbe3)

Add a comment explaining dropdb() can't rely on syscache. The issue with flattened rows was fixed by commit 0f92b230f88b, but better to have a clear explanation why the systable scan is necessary. The other places doing in-place updates on pg_database have the same comment. Suggestion and patch by Yugo Nagata. Backpatch to 12, same as the fix. Author: Yugo Nagata Backpatch-through: 12 Discussion: https://postgr.es/m/CAJTYsWWNkCt+-UnMhg=BiCD3Mh8c2JdHLofPxsW3m2dkDFw8RA@mail.gmail.com (cherry picked from commit e498d22e21358e196c1db9b5f1300eae667ea2bd) (cherry picked from commit 4651b01)

When a partition is detached and immediately dropped, a prepared statement could try to compute a new partition descriptor that includes it. This leads to this kind of error: ERROR: could not open relation with OID 457639 Avoid this by skipping the partition in expand_partitioned_rtentry if it doesn't exist. Noted by me while investigating bug #18559. Kuntal Gosh helped to identify the exact failure. Backpatch to 14, where DETACH CONCURRENTLY was introduced. Author: Álvaro Herrera <[email protected]> Reviewed-by: Kuntal Ghosh <[email protected]> Reviewed-by: Junwang Zhao <[email protected]> Discussion: https://postgr.es/m/[email protected] (cherry picked from commit be73e7008558cef1f59b0f4a61cefb6ec53c45c5) (cherry picked from commit 320c5bc)

Document the hard limit stemming from the size of an OID, and also mention the perfomance impact that occurs before the hard limit is reached. Jakub Wartak and Robert Haas Backpatch to all supported versions Discussion: https://postgr.es/m/CAKZiRmwWhp2yxjqJLwbBjHdfbJBcUmmKMNAZyBjjtpgM9AMatQ%40mail.gmail.com (cherry picked from commit e51160fa0bab7203e37094fbb0349b0c324b09e7) (cherry picked from commit 1d3ffda)

The descriptions for ProcArrayGroupUpdate and XactGroupUpdate claim that these events mean we are waiting for the group leader "at end of a parallel operation," but neither pertains to parallel operations. This commit reverts these descriptions to their wording before commit 3048898, i.e., "end of a parallel operation" is changed to "transaction end." Author: Sameer Kumar Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CAGPeHmh6UMrKQHKCmX%2B5vV5TH9P%3DKw9en3k68qEem6J%3DyrZPUA%40mail.gmail.com Backpatch-through: 13 (cherry picked from commit 5d9170697e4e6dd282b015890bc29baf978c6cb5) (cherry picked from commit 8c47f98)

This does not make sense. It would write the output of the USING clause into the converted column, which would violate the generation expression. This adds a check to error out if this is specified. There was a test for this, but that test errored out for a different reason, so it was not effective. Reported-by: Jian He <[email protected]> Reviewed-by: Yugo NAGATA <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/c7083982-69f4-4b14-8315-f9ddb20b9834%40eisentraut.org (cherry picked from commit cf49a606c416634fd6af82eb559331fd6052835d) (cherry picked from commit 7a1f030)

This change improves the description of the restrict_nonsystem_relation_kind parameter in guc_table.c and the documentation for better clarity. Backpatch to 12, where this GUC parameter was introduced. Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/6a96f1af-22b4-4a80-8161-1f26606b9ee2%40eisentraut.org Backpatch-through: 12 (cherry picked from commit 886549fb105f0c3d6a24064a9ddaf4e8602e0382) (cherry picked from commit 2e5c0e9)

The first test was sensitive to the insert LSN after setting up the catalogs, which depended on environmental things like the locales on the OS and usernames. Switch to a new WAL file before the first test, as a simple way to put every computer into the same state. Back-patch to all supported releases. Reported-by: Anton Voloshin <[email protected]> Reported-by: Nathan Bossart <[email protected]> Reviewed-by: Tom Lane <[email protected]> Reviewed-by: Nathan Bossart <[email protected]> Discussion: https://postgr.es/m/b26aeac2-cb6d-4633-a7ea-945baae83dcf%40postgrespro.ru (cherry picked from commit 777f50b9b5822d1058c59fc2ee1d1a7e9c766bf8) (cherry picked from commit 19abfd7)

Since commit 2549f06, we reject an identifier immediately following a numeric literal (without separating whitespace), because that risks ambiguity with hex/octal/binary integers. However, that patch used token patterns like "{integer}{ident_start}", which is problematic because {ident_start} matches only a single byte. If the first character after the integer is a multibyte character, this ends up with flex reporting an error message that includes a partial multibyte character. That can cause assorted bad-encoding problems downstream, both in the report to the client and in the postmaster log file. To fix, use {identifier} not {ident_start} in the "junk" token patterns, so that they will match complete multibyte characters. This seems generally better user experience quite aside from the encoding problem: for "123abc" the error message will now say that the error appeared at or near "123abc" instead of "123a". While at it, add some commentary about why these patterns exist and how they work. Report and patch by Karina Litskevich; review by Pavel Borisov. Back-patch to v15 where the problem came in. Discussion: https://postgr.es/m/CACiT8iZ_diop=0zJ7zuY3BXegJpkKK1Av-PU7xh0EDYHsa5+=g@mail.gmail.com (cherry picked from commit f37ac613a835c8ff28a2f23abe14c88fbac8b039) (cherry picked from commit 8c1c2ba)

…essions As introduced by f9900df, a REINDEX CONCURRENTLY job done for an index with predicates or expressions would set PROC_IN_SAFE_IC in its MyProc->statusFlags, causing it to be ignored by other concurrent operations. Such concurrent index rebuilds should never be ignored, as a predicate or an expression could call a user-defined function that accesses a different table than the table where the index is rebuilt. A test that uses injection points is added, backpatched down to 17. Michail has proposed a different test, but I have added something simpler with more coverage. Oversight in f9900df. Author: Michail Nikolaev Discussion: https://postgr.es/m/CANtu0oj9A3kZVduFTG0vrmGnKB+DCHgEpzOp0qAyOgmks84j0w@mail.gmail.com Backpatch-through: 14 (cherry picked from commit 239837a708cc2c6f27d96ddb3faf89691f2a22cd) (cherry picked from commit e54e012)

check_agglevels_and_constraints() asserted that if we find an aggregate function in an EXPR_KIND_FROM_SUBSELECT expression, the expression must be in a LATERAL subquery. Alexander Lakhin found a case where that's not so: because of the odd scoping rules for NEW/OLD within a rule, a reference to NEW/OLD could cause an aggregate to be considered top-level even though it's in an unmarked sub-select. The error message that would be thrown seems sufficiently on-point, so just remove the Assert. (Hence, this is not a bug for production builds.) This Assert was added by me in commit eaccfde (9.3 era). It looks like I put it in to cross-check that the new logic for detecting misplaced aggregates (using agglevelsup) caught the same cases that a previous check on p_lateral_active did. So there might have been some related misbehavior before eaccfde ... but that's very ancient history by now, so I didn't dig any deeper. Per bug #18608 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/[email protected] (cherry picked from commit 78d0bd452c78b17a788d2f15dc4e28bd3d3159c0) (cherry picked from commit f8f7741)

Commit 4b82664 restricted a number of functions provided by contrib modules to only relations that use the "heap" table access method. Sequences always use this table access method, but they do not advertise as such in the pg_class system catalog, so the aforementioned commit also (presumably unintentionally) removed support for sequences from some of these functions. This commit reintroduces said support for sequences to these functions and adds a couple of relevant tests. Co-authored-by: Ayush Vatsa Reviewed-by: Robert Haas, Michael Paquier, Matthias van de Meent Discussion: https://postgr.es/m/CACX%2BKaP3i%2Bi9tdPLjF5JCHVv93xobEdcd_eB%2B638VDvZ3i%3DcQA%40mail.gmail.com Backpatch-through: 12 (cherry picked from commit e03042a7003dcf0e966f0c6ea3ecf128455bf2b3) (cherry picked from commit 9c0e0f9)

I managed to break this test in two different ways in commit 05036a3155. First, the output of the new call to tuple_data_split() on the test sequence is dependent on endianness. This is fixed by setting a special start value for the test sequence that produces the same output regardless of the endianness of the machine. Second, on versions older than v15, the new test case fails under "force_parallel_mode = regress" with the following error: ERROR: cannot access temporary tables during a parallel operation This is because pageinspect's disk-accessing functions are incorrectly marked PARALLEL SAFE on versions older than v15 (see commit aeaaf520f4 for details). This one is fixed by changing the test sequence to be permanent. The only reason it was previously marked temporary was to avoid needing a DROP SEQUENCE command at the end of the test. Unlike some other tests in this file, the use of a permanent sequence here shouldn't result in any test instability like what was fixed by commit e2933a6e11. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/ZuOKOut5hhDlf_bP%40nathan Backpatch-through: 12 (cherry picked from commit a63aef5e496c1db0325845892d946e8569e7f9d6) (cherry picked from commit 9909c62)

When we are building a hash index that is large enough to need pre-sorting (larger than either maintenance_work_mem or NBuffers), the initial sorting phase is interruptible, but the insertion phase wasn't. Add the missing CHECK_FOR_INTERRUPTS(). Per bug #18616 from Alexander Lakhin. Back-patch to all supported branches. Pavel Borisov Discussion: https://postgr.es/m/[email protected] (cherry picked from commit e0857898b87b745235e65c28feed6e23e121876d) (cherry picked from commit 6c4dad4)

Latest versions of Strawberry Perl define USE_THREAD_SAFE_LOCALE, and we therefore get a handshake error when building against such instances. The solution is to perform a test to see if USE_THREAD_SAFE_LOCALE is defined and only define NO_THREAD_SAFE_LOCALE if it isn't. Backpatch the meson.build fix back to release 16 and apply the same logic to Mkvcbuild.pm in releases 12 through 16. Original report of the issue from Muralikrishna Bandaru. (cherry picked from commit 17c35ab236980fce2989f7fac7cee42ca4d5ca04) (cherry picked from commit dcb5c51)

Historically we've used timezone "PST8PDT", but the recent release 2024b of tzdb changes the definition of that zone in a way that breaks many test cases concerned with dates before 1970. Although we've not yet adopted 2024b into our own tree, this is already problematic for people using --with-system-tzdata if their platform has already adopted 2024b. To work with both older and newer versions of tzdb, switch to using "America/Los_Angeles", accepting the ensuing changes in regression test results. Back-patch to all supported branches. Per report and patch from Wolfgang Walther. Discussion: https://postgr.es/m/[email protected] (cherry picked from commit 2b94ee58bf256e1b4682f21f13d0f2c73e1b0cc2) (cherry picked from commit 516a630)

This commit changes libpq so that errors reported by the backend during the protocol negotiation for SSL and GSS are discarded by the client, as these may include bytes that could be consumed by the client and write arbitrary bytes to a client's terminal. A failure with the SSL negotiation now leads to an error immediately reported, without a retry on any other methods allowed, like a fallback to a plaintext connection. A failure with GSS discards the error message received, and we allow a fallback as it may be possible that the error is caused by a connection attempt with a pre-11 server, GSS encryption having been introduced in v12. This was a problem only with v17 and newer versions; older versions discard the error message already in this case, assuming a failure caused by a lack of support for GSS encryption. Author: Jacob Champion Reviewed-by: Peter Eisentraut, Heikki Linnakangas, Michael Paquier Security: CVE-2024-10977 Backpatch-through: 12 (cherry picked from commit d2c3e31c13a6820980c2c6019f0b8f9f0b63ae6e) (cherry picked from commit 9186308)

Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: ecbca81dbf801f683e24897668cec8d1fb0f55a5 (cherry picked from commit 3f2c24e55b7ecfd06a4b65b9a52f74f27b580046) (cherry picked from commit e4000ed)

Many process environment variables (e.g. PATH), bypass the containment expected of a trusted PL. Hence, trusted PLs must not offer features that achieve setenv(). Otherwise, an attacker having USAGE privilege on the language often can achieve arbitrary code execution, even if the attacker lacks a database server operating system user. To fix PL/Perl, replace trusted PL/Perl %ENV with a tied hash that just replaces each modification attempt with a warning. Sites that reach these warnings should evaluate the application-specific implications of proceeding without the environment modification: Can the application reasonably proceed without the modification? If no, switch to plperlu or another approach. If yes, the application should change the code to stop attempting environment modifications. If that's too difficult, add "untie %main::ENV" in any code executed before the warning. For example, one might add it to the start of the affected function or even to the plperl.on_plperl_init setting. In passing, link to Perl's guidance about the Perl features behind the security posture of PL/Perl. Back-patch to v12 (all supported versions). Andrew Dunstan and Noah Misch Security: CVE-2024-10979 (cherry picked from commit e530835c6cc5b2dbf330ebe6b0a7fb9f19f5a54c) (cherry picked from commit 713085a)

If a CTE, subquery, sublink, security invoker view, or coercion projection references a table with row-level security policies, we neglected to mark the plan as potentially dependent on which role is executing it. This could lead to later executions in the same session returning or hiding rows that should have been hidden or returned instead. Reported-by: Wolfgang Walther Reviewed-by: Noah Misch Security: CVE-2024-10976 Backpatch-through: 12 (cherry picked from commit 6db5ea8de8ce15897b706009aaf701d23bd65b23) (cherry picked from commit 0de4f01)

The SQL spec mandates that SET SESSION AUTHORIZATION implies SET ROLE NONE. We tried to implement that within the lowest-level functions that manipulate these settings, but that was a bad idea. In particular, guc.c assumes that it doesn't matter in what order it applies GUC variable updates, but that was not the case for these two variables. This problem, compounded by some hackish attempts to work around it, led to some security-grade issues: * Rolling back a transaction that had done SET SESSION AUTHORIZATION would revert to SET ROLE NONE, even if that had not been the previous state, so that the effective user ID might now be different from what it had been. * The same for SET SESSION AUTHORIZATION in a function SET clause. * If a parallel worker inspected current_setting('role'), it saw "none" even when it should see something else. Also, although the parallel worker startup code intended to cope with the current role's pg_authid row having disappeared, its implementation of that was incomplete so it would still fail. Fix by fully separating the miscinit.c functions that assign session_authorization from those that assign role. To implement the spec's requirement, teach set_config_option itself to perform "SET ROLE NONE" when it sets session_authorization. (This is undoubtedly ugly, but the alternatives seem worse. In particular, there's no way to do it within assign_session_authorization without incompatible changes in the API for GUC assign hooks.) Also, improve ParallelWorkerMain to directly set all the relevant user-ID variables instead of relying on some of them to get set indirectly. That allows us to survive not finding the pg_authid row during worker startup. In v16 and earlier, this includes back-patching 9987a7bf3 which fixed a violation of GUC coding rules: SetSessionAuthorization is not an appropriate place to be throwing errors from. Security: CVE-2024-10978 (cherry picked from commit a5d2e6205f716c79ecfb15eb1aae75bae3f8daa9) (cherry picked from commit 97b12af)

v16 commit 8fe3e697a1a83a722b107c7cb9c31084e1f4d077 used REGRESS_OPTS in a way needing this. That broke "vcregress plcheck". Back-patch v16..v12; newer versions don't have this build system. (cherry picked from commit 16ed4f4d08d69f2aced8ece69ba1076b48c9ce06) (cherry picked from commit 2a92ae4)

TestUpgradeXversion knows how to make the main regression database's references to pg_regress.so be version-independent. But it doesn't do that for plperl's database, so that the C function added by commit b7e3a52a8 is causing cross-version upgrade test failures. Path of least resistance is to just drop the function at the end of the new test. In <= v14, also take the opportunity to clean up the generated test files. Security: CVE-2024-10979 (cherry picked from commit c834b375a6dc36ff92f9f738ef1d7af09d91165f) (cherry picked from commit 6fc3618)

…cks. Commit 5a2fed911 had an unexpected side-effect: the parallel worker launched for the new test case would fail if it couldn't use a superuser-reserved connection slot. The reason that test failed while all our pre-existing ones worked is that the connection privilege tests in InitPostgres had been based on the superuserness of the leader's AuthenticatedUserId, but after the rearrangements of 5a2fed911 we were testing the superuserness of CurrentUserId, which the new test case deliberately made to be a non-superuser. This all seems very accidental and probably not the behavior we really want, but a security patch is no time to be redesigning things. Pending some discussion about desirable semantics, hack it so that InitPostgres continues to pay attention to the superuserness of AuthenticatedUserId when starting a parallel worker. Nathan Bossart and Tom Lane, per buildfarm member sawshark. Security: CVE-2024-10978 (cherry picked from commit 109a323807d752f66699a9ce0762244f536e784f) (cherry picked from commit 0c0f2ec)

Security: CVE-2024-10976, CVE-2024-10977, CVE-2024-10978, CVE-2024-10979 (cherry picked from commit b83b358b1bac61ca46af995a3da83cc515a434c6) (cherry picked from commit 3b777bc)

(cherry picked from commit 0c53d54c812cea0d840490fd107910ed949e18c2) (cherry picked from commit f668fe1)

The current code calls array_eq() and does not provide FmgrInfo. This commit provides initialization of FmgrInfo and uses C collation as the safe option for text comparison because we don't know anything about the semantics of opclass options. Backpatch to 13, where opclass options were introduced. Reported-by: Nicolas Maus Discussion: https://postgr.es/m/18692-72ea398df3ec6712%40postgresql.org Backpatch-through: 13 (cherry picked from commit 713b8546aba66be102bdd8b320c06ea3b2813fd9) (cherry picked from commit a10058e)

Maintain the pg_stat_user_indexes.idx_scan pgstat counter during contrib/Bloom index scans. Oversight in commit 9ee014f, which added the Bloom index contrib module. Author: Masahiro Ikeda <[email protected]> Reviewed-By: Peter Geoghegan <[email protected]> Discussion: https://postgr.es/m/[email protected] Backpatch: 13- (all supported branches). (cherry picked from commit 16a2bb0793ac269943b0d676bac781e3691e28a0) (cherry picked from commit 7c29c29)

This fixes a set of race conditions with cumulative statistics where a shared stats entry could be dropped while it should still be valid in the event when it is reused: an entry may refer to a different object but requires the same hash key. This can happen with various stats kinds, like: - Replication slots that compute internally an index number, for different slot names. - Stats kinds that use an OID in the object key, where a wraparound causes the same key to be used if an OID is used for the same object. - As of PostgreSQL 18, custom pgstats kinds could also be an issue, depending on their implementation. This issue is fixed by introducing a counter called "generation" in the shared entries via PgStatShared_HashEntry, initialized at 0 when an entry is created and incremented when the same entry is reused, to avoid concurrent issues on drop because of other backends still holding a reference to it. This "generation" is copied to the local copy that a backend holds when looking at an object, then cross-checked with the shared entry to make sure that the entry is not dropped even if its "refcount" justifies that if it has been reused. This problem could show up when a backend shuts down and needs to discard any entries it still holds, causing statistics to be removed when they should not, or even an assertion failure. Another report involved a failure in a standby after an OID wraparound, where the startup process would FATAL on a "can only drop stats once", stopping recovery abruptly. The buildfarm has been sporadically complaining about the problem, as well, but the window is hard to reach with the in-core tests. Note that the issue can be reproduced easily by adding a sleep before dshash_find() in pgstat_release_entry_ref() to enlarge the problematic window while repeating test_decoding's isolation test oldest_xmin a couple of times, for example, as pointed out by Alexander Lakhin. Reported-by: Alexander Lakhin, Peter Smith Author: Kyotaro Horiguchi, Michael Paquier Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/CAA4eK1KxuMVyAryz_Vk5yq3ejgKYcL6F45Hj9ZnMNBS-g+PuZg@mail.gmail.com Discussion: https://postgr.es/m/[email protected] Backpatch-through: 15 (cherry picked from commit 154c5b42a3d80424f7b7beef33a69600245c147d) (cherry picked from commit 1fc9015)

Previously, in unlucky cases, it was possible for pg_rewind to remove certain WAL segments from the rewound demoted primary. In particular this happens if those files have been marked for archival (i.e., their .ready files were created) but not yet archived; the newly promoted node no longer has such files because of them having been recycled, but they are likely critical for recovery in the demoted node. If pg_rewind removes them, recovery is not possible anymore. Fix this by maintaining a hash table of files in this situation in the scan that looks for a checkpoint, which the decide_file_actions phase can consult so that it knows to preserve them. Backpatch to 14. The problem also exists in 13, but that branch was not blessed with commit eb00f1d, so this patch is difficult to apply there. Users of older releases will just have to continue to be extra careful when rewinding. Co-authored-by: Полина Бунгина (Polina Bungina) <[email protected]> Co-authored-by: Alexander Kukushkin <[email protected]> Reviewed-by: Kyotaro Horiguchi <[email protected]> Reviewed-by: Atsushi Torikoshi <[email protected]> Discussion: https://postgr.es/m/CAAtGL4AhzmBRsEsaDdz7065T+k+BscNadfTqP1NcPmsqwA5HBw@mail.gmail.com (cherry picked from commit e28cf2fbc222a607377813590e4bee448fcf0a29) (cherry picked from commit 97b814c)

In commit 08c0d6a which introduced "rainbow" arcs in regex NFAs, I didn't think terribly hard about what to do when creating the color complement of a rainbow arc. Clearly, the complement cannot match any characters, and I took the easy way out by just not building any arcs at all in the complement arc set. That mostly works, but Nikolay Shaplov found a case where it doesn't: if we decide to delete that sub-NFA later because it's inside a "{0}" quantifier, delsub() suffered an assertion failure. That's because delsub() relies on the target sub-NFA being fully connected. That was always true before, and the best fix seems to be to restore that property. Hence, invent a new arc type CANTMATCH that can be generated in place of an empty color complement, and drop it again later when we start NFA optimization. (At that point we don't need to do delsub() any more, and besides there are other cases where NFA optimization can lead to disconnected subgraphs.) It appears that this bug has no consequences in a non-assert-enabled build: there will be some transiently leaked NFA states/arcs, but they'll get cleaned up eventually. Still, we don't like assertion failures, so back-patch to v14 where rainbow arcs were introduced. Per bug #18708 from Nikolay Shaplov. Discussion: https://postgr.es/m/[email protected] (cherry picked from commit 2496c3f6f1bf5a735184d27d81527dfea7ad9e9b) (cherry picked from commit c6e2eeb)

…kwards. Previously LogicalIncreaseRestartDecodingForSlot() accidentally accepted any LSN as the candidate_lsn and candidate_valid after the restart_lsn of the replication slot was updated, so it potentially caused the restart_lsn to move backwards. A scenario where this could happen in logical replication is: after a logical replication restart, based on previous candidate_lsn and candidate_valid values in memory, the restart_lsn advances upon receiving a subscriber acknowledgment. Then, logical decoding restarts from an older point, setting candidate_lsn and candidate_valid based on an old RUNNING_XACTS record. Subsequent subscriber acknowledgments then update the restart_lsn to an LSN older than the current value. In the reported case, after WAL files were removed by a checkpoint, the retreated restart_lsn prevented logical replication from restarting due to missing WAL segments. This change essentially modifies the 'if' condition to 'else if' condition within the function. The previous code had an asymmetry in this regard compared to LogicalIncreaseXminForSlot(), which does almost the same thing for different fields. The WAL removal issue was reported by Hubert Depesz Lubaczewski. Backpatch to all supported versions, since the bug exists since 9.4 where logical decoding was introduced. Reviewed-by: Tomas Vondra, Ashutosh Bapat, Amit Kapila Discussion: https://postgr.es/m/Yz2hivgyjS1RfMKs%40depesz.com Discussion: https://postgr.es/m/85fff40e-148b-4e86-b921-b4b846289132%40vondra.me Backpatch-through: 13 (cherry picked from commit 91771b3fbbc33e066e9a28a7d85bde87f5a0c900) (cherry picked from commit 528310a)

After commit 5a2fed911a85ed6d8a015a6bafe3a0d9a69334ae, the catalog state resulting from these commands ceased to affect sessions. Restore the longstanding behavior, which is like beginning the session with a SET ROLE command. If cherry-picking the CVE-2024-10978 fixes, default to including this, too. (This fixes an unintended side effect of fixing CVE-2024-10978.) Back-patch to v12, like that commit. The release team decided to include v12, despite the original intent to halt v12 commits earlier this week. Tom Lane and Noah Misch. Reported by Etienne LAFARGE. Discussion: https://postgr.es/m/CADOZwSb0UsEr4_UTFXC5k7=fyyK8uKXekucd+-uuGjJsGBfxgw@mail.gmail.com (cherry picked from commit edf80895f6bda824403f843df91cbc83890e4b6c) (cherry picked from commit e06cd04)

Commits aac2c9b4f et al. added a bool field to struct ResultRelInfo. That's no problem in the master branch, but in released branches care must be taken when modifying publicly-visible structs to avoid an ABI break for extensions. Frequently we solve that by adding the new field at the end of the struct, and that's what was done here. But ResultRelInfo has stricter constraints than just about any other node type in Postgres. Some executor APIs require extensions to index into arrays of ResultRelInfo, which means that any change whatever in sizeof(ResultRelInfo) causes a fatal ABI break. Fortunately, this is easy to fix, because the new field can be squeezed into available padding space instead --- indeed, that's where it was put in master, so this fix also removes a cross-branch coding variation. Per report from Pavan Deolasee. Patch v14-v17 only; earlier versions did not gain the extra field, nor is there any problem in master. Discussion: https://postgr.es/m/CABOikdNmVBC1LL6pY26dyxAS2f+gLZvTsNt=2XbcyG7WxXVBBQ@mail.gmail.com (cherry picked from commit 17db248f318f09b143af208fdcc1f067b3b0b2cb) (cherry picked from commit b94f988)

(cherry picked from commit b57d9d2e5d5543cc0c4b2de70d65d7b7c4115da6) (cherry picked from commit c467927)

fixempties() counts the number of in-arcs in the regex NFA and then allocates an array of that many arc pointers. If the NFA contains no arcs, this amounts to malloc(0) for which some platforms return NULL. The code mistakenly treats that as indicating out-of-memory. Thus, we can get a bogus "out of memory" failure for some unsatisfiable regexes. This happens only in v15 and earlier, since bea3d7e38 switched to using palloc() rather than bare malloc(). And at least of the platforms in the buildfarm, only AIX seems to return NULL. So the impact is pretty narrow. But I don't especially want to ship code that is failing its own regression tests, so let's fix this for this week's releases. A quick code survey says that there is only the one place in src/backend/regex/ that is at risk of doing malloc(0), so we'll just band-aid that place. A more future-proof fix could be to install a malloc() wrapper similar to pg_malloc(). But this code seems unlikely to change much more in the affected branches, so that's probably overkill. The only known test case for this involves a complemented character class in a bracket expression, for example [^\s\S], and we didn't support that in v13. So it may be that the problem is unreachable in v13. But I'm not 100% sure of that, so patch v13 too. Discussion: https://postgr.es/m/[email protected] (cherry picked from commit 6ab39c02747c33173e5e33291e66cebbdbc75d82) (cherry picked from commit 79ae156)

(cherry picked from commit a4bd20b6d7f9d42750b797c450592f55d5374c1f) (cherry picked from commit 06df9a4)

Signed-off-by: Mihir Jadhav <[email protected]> (cherry picked from commit 5f36804)

This is due to changes from community commit 8590c942c1a6b861d0cf4fa5aa694ab3a65fa306 Fix data loss at inplace update after heap_update(). (cherry picked from commit 3d9aef7fbc9a8e2da60e8cc67b85a5d09cded629) (cherry picked from commit 7cbce07)

Commit 09f0820095ea8b7c6ca4269454a94695c8421628 introduced locking for individual tuples during certain operations. However, Babelfish introduced the concept of ENR-only relations, which store all catalog tuples in their own local cache and not in the physical catalogs. As these relations and their catalog tuples are completely session-local, there is no need to acquire locks on tuples for these relations (and it would just lead to a crash anyways since the tuples do not have an underlying TID to lock against). Signed-off-by: Jason Teng <[email protected]> (cherry picked from commit 60916a8f80f1164daade1fc707637ec93ff3ee68) (cherry picked from commit e905197)

…t 25d639ee Signed-off-by: Tanzeel Khan <[email protected]> (cherry picked from commit c260de9cbca52ce5acbbe2c2acbf67762024af58) (cherry picked from commit 10d3078)

hlinnaka and others added 30 commits December 9, 2024 13:07

docs: fix incorrect plpgsql error message

49abee4

Change "$1" to "username". Reported-by: [email protected] Discussion: https://postgr.es/m/[email protected] Backpatch-through: 12 (cherry picked from commit e8f36414afac579f32c06573344c8ff4d854b59d) (cherry picked from commit 3e9fd85)

michaelpq and others added 25 commits December 9, 2024 13:07

Translation updates

e99cc40

Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: ecbca81dbf801f683e24897668cec8d1fb0f55a5 (cherry picked from commit 3f2c24e55b7ecfd06a4b65b9a52f74f27b580046) (cherry picked from commit e4000ed)

Last-minute updates for release notes.

718b0f9

Security: CVE-2024-10976, CVE-2024-10977, CVE-2024-10978, CVE-2024-10979 (cherry picked from commit b83b358b1bac61ca46af995a3da83cc515a434c6) (cherry picked from commit 3b777bc)

Stamp 15.9.

3a4150a

(cherry picked from commit 0c53d54c812cea0d840490fd107910ed949e18c2) (cherry picked from commit f668fe1)

Release notes for 17.2, 16.6, 15.10, 14.15, 13.18, 12.22.

3ca0078

(cherry picked from commit b57d9d2e5d5543cc0c4b2de70d65d7b7c4115da6) (cherry picked from commit c467927)

Stamp 15.10.

593289c

(cherry picked from commit a4bd20b6d7f9d42750b797c450592f55d5374c1f) (cherry picked from commit 06df9a4)

BABEL: Fixes after community commit 7abe267

d16d879

Signed-off-by: Mihir Jadhav <[email protected]> (cherry picked from commit 5f36804)

BABEL: fix babelfish commit/rollback inside subtransaction post commi…

5f80c50

…t 25d639ee Signed-off-by: Tanzeel Khan <[email protected]> (cherry picked from commit c260de9cbca52ce5acbbe2c2acbf67762024af58) (cherry picked from commit 10d3078)

roshan0708 mentioned this pull request Dec 9, 2024

[DO NOT SQUASH MERGE] Fixes after community merge babelfish-for-postgresql/babelfish_extensions#3223

Merged

1 task

tanscorpio7 approved these changes Dec 11, 2024

View reviewed changes

rishabhtanwar29 approved these changes Dec 11, 2024

View reviewed changes

shardgupta merged commit 762ffc6 into babelfish-for-postgresql:BABEL_3_8_STABLE__PG_15_10 Dec 11, 2024
2 checks passed

shardgupta deleted the community_merge_3x branch December 11, 2024 14:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DO NOT SQUASH MERGE] Community Merge for 3_8 stable #498

[DO NOT SQUASH MERGE] Community Merge for 3_8 stable #498

roshan0708 commented Dec 9, 2024 •

edited

Loading

[DO NOT SQUASH MERGE] Community Merge for 3_8 stable #498

[DO NOT SQUASH MERGE] Community Merge for 3_8 stable #498

Conversation

roshan0708 commented Dec 9, 2024 • edited Loading

Description

Issues Resolved

Check List

roshan0708 commented Dec 9, 2024 •

edited

Loading