Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue resulting from PG 17.1 ResultRelInfo size change #7447

Closed
wants to merge 4 commits into from

Conversation

mkindahl
Copy link
Contributor

@mkindahl mkindahl commented Nov 15, 2024

Because a new member was added to ResulRelInfo in 17.1, our initialization of resultRelInfo in ts_catalog_open_indexes did not work correctly.

Switching to use the PostgreSQL functions CatalogOpenIndexes and CatalogCloseIndexes.


If calling InitResultRelInfo() from a 17.0 build of timescaledb on a 17.1 PostgreSQL server, it will attempt to zero out memory for the new member ri_needLockTagTuple and for that reason (potentially) result in writing to unallocated memory.

Since we are only calling InitResultRelInfo() from one location in the code---inside create_chunk_result_relation_info---we redefine makeNode for this file to allocate 2 * sizeof(ResultRelInfo). This will overallocate memory for the structure, hence allow InitResultRelInfo() to write the additional field.


Calling InitResultRelInfo() in a 17.1 PostgreSQL server from an extension built with a 17.0 server will try to initialize the new field ri_needLockTagTuple, which will attempt to write outsize the bounds of the structure.

To deal with this, we create a union of the ResultRelInfo structure and a buffer with 2 * sizeof(ResultRelInfo), thereby making sure that calls of InitResultRelInfo() will not write outside allocated memory.

Copy link

codecov bot commented Nov 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.15%. Comparing base (59f50f2) to head (bb80f44).
Report is 591 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7447      +/-   ##
==========================================
+ Coverage   80.06%   82.15%   +2.08%     
==========================================
  Files         190      229      +39     
  Lines       37181    42887    +5706     
  Branches     9450    10775    +1325     
==========================================
+ Hits        29770    35232    +5462     
- Misses       2997     3359     +362     
+ Partials     4414     4296     -118     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mkindahl mkindahl marked this pull request as ready for review November 15, 2024 07:19
@mkindahl mkindahl self-assigned this Nov 15, 2024
@erimatnor
Copy link
Contributor

erimatnor commented Nov 15, 2024

Why do we even have this functions when PG has https://github.com/postgres/postgres/blob/91f20bc2f7e4fcf5de5c65a6cb1190e0afa91c0b/src/backend/catalog/indexing.c#L43?

OTOH, that function also doesn't call InitResultRelInfo.

In fact, we also call CatalogOpenIndexes directly in our code so it is kind of confusing.

@mkindahl
Copy link
Contributor Author

Why do we even have this functions when PG has https://github.com/postgres/postgres/blob/91f20bc2f7e4fcf5de5c65a6cb1190e0afa91c0b/src/backend/catalog/indexing.c#L43?

OTOH, that function also doesn't call InitResultRelInfo.

In fact, we also call CatalogOpenIndexes directly in our code so it is kind of confusing.

Good point, we should probably use that function directly instead. Will investigate.

@mkindahl mkindahl changed the title Call InitResultRelInfo in ts_catalog_open_indexes Fix issue resulting from ResultRelInfo size change Nov 15, 2024
@mkindahl mkindahl changed the title Fix issue resulting from ResultRelInfo size change Fix issue resulting from PG 17.1 ResultRelInfo size change Nov 15, 2024
@afiskon afiskon requested review from afiskon and removed request for afiskon November 15, 2024 14:43
.unreleased/pr_7447 Outdated Show resolved Hide resolved
@@ -122,6 +122,9 @@ create_chunk_rri_constraint_expr(ResultRelInfo *rri, Relation rel)
}
}

#undef makeNode
#define makeNode(_type_) ((_type_ *) newNode(2 * sizeof(_type_), T_##_type_))
Copy link
Contributor

@erimatnor erimatnor Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yupp, this is indeed horrible, as you put it. 😄

Copy link
Contributor

@erimatnor erimatnor Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need it? The palloc code allocates 512 byte chunks, which should be enough as explained here: https://www.postgresql.org/message-id/1972733.1731612633%40sss.pgh.pa.us

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that I get warning about writing on unallocated memory without it and the warnings disappear with this patch, I would say that it is needed.

* added an extra field and when calling InitResultRelInfo() on a
* ResultRelInfo object it will attempt to write after the end of the
* structure allocated in the 17.0 server. */
union OverallocatedResultRelInfo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we even need this to be stack allocated? Isn't it better to just heap-allocate it as that would be safer given that palloc already over-allocates? Or, if we are not satisfied with that, we can just give palloc a bigger size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need it to be stack allocated, but since it were already, this is a straightforward fix. We can also use palloc, as you suggest. If you feel strongly about it, I can switch to use palloc.

Copy link
Contributor

@erimatnor erimatnor Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PG developers suggested it. See the thread on the mailing list.

makeNode()/palloc + initResultRelation. Neither is done here.

@mfreed
Copy link
Member

mfreed commented Nov 16, 2024

Let's hold off on any merge while we review the planned changes in the out-of-cycle Postgres release:

https://www.postgresql.org/about/news/out-of-cycle-release-scheduled-for-november-21-2024-2958/

@mkindahl
Copy link
Contributor Author

One issue remains to investigate, that of using arrays of ResultRelInfo structures that is present in ExecModifyTable.

@mkindahl
Copy link
Contributor Author

This was now fixed upstreams and a patch is not necessary to solve this immediate problem. However, these changes would protect us against similar problem in future versions.

Copy link
Member

@svenklemm svenklemm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are overshooting the target here, upstream will fix this and do 17.2 so we should just require 17.2 as minimum version (and equivalent for other majors). There is no reason to create overallocated resultrelinfo. Stuff compiled against 17.0 should even work with 17.2 if i understand correctly.

Because a new member was added to `ResulRelInfo` in 17.1, our
initialization of `resultRelInfo` in `ts_catalog_open_indexes` did not
work correctly.

Switching to use the PostgreSQL functions `CatalogOpenIndexes` and
`CatalogCloseIndexes`.
If calling `InitResultRelInfo()` from a 17.0 build of timescaledb on a
17.1 PostgreSQL server, it will attempt to zero out memory for the new
member `ri_needLockTagTuple` and for that reason (potentially) result
in writing to unallocated memory.

Since we are only calling `InitResultRelInfo()` from one location in
the code---inside `create_chunk_result_relation_info`---we redefine
`makeNode` for this file to allocate `2 * sizeof(ResultRelInfo)`. This
will overallocate memory for the structure, hence allow
`InitResultRelInfo()` to write the additional field.
Calling `InitResultRelInfo()` in a 17.1 PostgreSQL server from an
extension built with a 17.0 server will try to initialize the new field
`ri_needLockTagTuple`, which will attempt to write outsize the bounds
of the structure.

To deal with this, we create a union of the `ResultRelInfo` structure
and a buffer with `2 * sizeof(ResultRelInfo)`, thereby making sure that
calls of `InitResultRelInfo()` will not write outside allocated memory.
@mkindahl
Copy link
Contributor Author

Closing issue since we have a 17.2 release that will deal with the problems.

@mkindahl mkindahl closed this Nov 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants