This repository has been archived by the owner on Dec 1, 2022. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 8
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Initial version from PostgreSQL feature branch
- Loading branch information
1 parent
cf23e30
commit 9408753
Showing
30 changed files
with
1,962 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
*.o | ||
*.so | ||
*.swp | ||
.deps | ||
.vagrant |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
Copyright (c) 2016, Peter Geoghegan | ||
|
||
Permission to use, copy, modify, and distribute this software and its | ||
documentation for any purpose, without fee, and without a written agreement | ||
is hereby granted, provided that the above copyright notice and this | ||
paragraph and the following two paragraphs appear in all copies. | ||
|
||
IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR | ||
DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING | ||
LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS | ||
DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE | ||
POSSIBILITY OF SUCH DAMAGE. | ||
|
||
THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES, | ||
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY | ||
AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS | ||
ON AN "AS IS" BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO | ||
PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
short_ver = 0.1 | ||
long_ver = $(shell (git describe --tags --long '--match=v*' 2>/dev/null || echo $(short_ver)-0-unknown) | cut -c2-) | ||
|
||
MODULE_big = amcheck | ||
OBJS = amcheck.o $(WIN32RES) | ||
|
||
EXTENSION = amcheck | ||
DATA = amcheck--0.1.sql | ||
PGFILEDESC = "amcheck - verify the logical consistency of indexes" | ||
DOCS = README.md | ||
REGRESS = install_amcheck extern_sort_bytea \ | ||
extern_sort_collations extern_sort_numeric | ||
|
||
PG_CONFIG = pg_config | ||
PGXS = $(shell $(PG_CONFIG) --pgxs) | ||
include $(PGXS) | ||
|
||
DEBUILD_ROOT = /tmp/amcheck | ||
|
||
deb: | ||
mkdir -p $(DEBUILD_ROOT) && rm -rf $(DEBUILD_ROOT)/* | ||
rsync -Ca --exclude=build/* ./ $(DEBUILD_ROOT)/ | ||
cd $(DEBUILD_ROOT) && make -f debian/rules orig | ||
cd $(DEBUILD_ROOT) && debuild -us -uc -sa | ||
cp -a /tmp/amcheck_* /tmp/postgresql-9.* build/ |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,232 @@ | ||
# amcheck: Verify the logical consistency of PostgreSQL B-Tree indexes | ||
|
||
Current version: 0.1 | ||
|
||
Author: Peter Geoghegan [`<[email protected]>`](mailto:[email protected]) | ||
|
||
License: <a href="https://opensource.org/licenses/postgresql">PostgreSQL license</a> | ||
|
||
Minimum supported version: PostgreSQL 9.4 | ||
|
||
## Overview | ||
|
||
The `amcheck` module provides functions that allow you to verify the logical | ||
consistency of the structure of PostgreSQL indexes. If the structure appears | ||
to be valid, no error is raised. Currently, only B-Tree indexes are supported, | ||
although since in practice the majority of PostgreSQL indexes are B-Tree | ||
indexes, `amcheck` is likely to be effective as a general corruption smoke-test | ||
in production PostgreSQL installations. | ||
|
||
See "Using amcheck effectively" below for information about the kinds of | ||
real-world problems `amcheck` is intended to detect. | ||
|
||
### Invariants | ||
|
||
`amcheck` provides functions that specifically verify various *invariants* in | ||
the structure of the representation of particular indexes. The correctness of | ||
the access method functions behind index scans and other important operations | ||
is predicated on these invariants always holding. For example, certain | ||
functions verify, among other things, that all B-Tree pages have items in | ||
"logical" order; if that particular invariant somehow fails to hold, we can | ||
expect binary searches on any affected page to produce wrong answers. As a | ||
consequence of that, index scans may subtly produce wrong answers. | ||
|
||
Verification is performed using the same procedures as those used by index | ||
scans themselves, which may be user-defined operator class code. For example, | ||
B-Tree index verification relies on comparisons made with one or more B-Tree | ||
support function 1 routines, much like B-Tree index scans rely on the routines | ||
to guide the scan to a point in the underlying table; see | ||
http://www.postgresql.org/docs/current/static/xindex.html for details of | ||
operator class support functions. | ||
|
||
### Project background | ||
|
||
`amcheck` is proposed as a contrib extension for PostgreSQL 9.7. This | ||
externally maintained version of the extension exists to target earlier | ||
versions of PostgreSQL (PostgreSQL 9.4+). | ||
|
||
### Bugs | ||
|
||
Report bugs using the <a href="https://github.com/petergeoghegan/amcheck/issues">Github | ||
issue tracker</a>. Feature requests will not accepted, because `amcheck` is | ||
targeted for future inclusion into PostgreSQL as a contrib module. | ||
|
||
## Installation | ||
|
||
### Building using PGXS (generic) | ||
|
||
The module can be built using the standard PGXS infrastructure. For this to | ||
work, you will need to have the `pg_config` program available in your $PATH. | ||
|
||
If you are using a packaged PostgreSQL build and have `pg_config` available | ||
(and in your OS user's $PATH), the procedure is as follows: | ||
|
||
```shell | ||
tar xvzf amcheck-0.1.tar.gz | ||
cd amcheck-0.1 | ||
make | ||
make install | ||
``` | ||
|
||
Note that just because `pg_config` is located in one user's $PATH does not | ||
necessarily make it so for the root user. | ||
|
||
#### Building Debian/Ubuntu packages | ||
|
||
The Makefile also provides a target for building Debian packages. The target | ||
has a dependency on `debhelper`, `devscripts`, `postgresql-server-dev-all`, and | ||
the PostgreSQL source package itself (e.g. `postgresql-server-dev-9.4`). | ||
|
||
The packages can be created and installed from the amcheck directory as | ||
follows: | ||
|
||
```shell | ||
sudo aptitude install debhelper devscripts postgresql-server-dev-all | ||
make deb | ||
sudo dpkg -i ./build/postgresql-9.4-amcheck_*.deb | ||
``` | ||
|
||
### Setting up PostgreSQL | ||
|
||
Once `amcheck` is built and installed, it should be created as a PostgreSQL | ||
extension in every database that requires it: | ||
|
||
`mydb=# CREATE EXTENSION amcheck;` | ||
|
||
`amcheck` functions may be used only by superusers. | ||
|
||
## Interface | ||
|
||
The `amcheck` extension has a simple interface. `amcheck` consists of just a | ||
few functions that can be used for verification of a named B-Tree index. Note | ||
that currently, no function inspects the structure of the underlying heap | ||
representation (table). | ||
|
||
`regclass` function arguments are used by `amcheck` to identify particular | ||
index relations. See the <a href="http://www.postgresql.org/docs/current/static/datatype-oid.html">PostgreSQL | ||
documentation on Object identifier types</a> for more information. | ||
|
||
### `bt_index_check(index regclass) returns void` | ||
|
||
`bt_index_check` tests that its target, a B-Tree index, respects a variety of | ||
invariants. Example usage: | ||
|
||
```sql | ||
SELECT bt_index_check(c.oid), c.relname, c.relpages | ||
FROM pg_index i | ||
JOIN pg_opclass op ON i.indclass[0] = op.oid | ||
JOIN pg_am am ON op.opcmethod = am.oid | ||
JOIN pg_class c ON i.indexrelid = c.oid | ||
JOIN pg_namespace n ON c.relnamespace = n.oid | ||
WHERE am.amname = 'btree' AND n.nspname = 'pg_catalog' | ||
-- Don't check pg_class (bt_index_parent_check() requires this): | ||
AND c.relname NOT LIKE 'pg_class%' | ||
-- Function may throw an error when this is omitted: | ||
AND i.indisready AND i.indisvalid | ||
ORDER BY c.relpages DESC LIMIT 10; | ||
``` | ||
```shell | ||
bt_index_check | relname | relpages | ||
----------------+---------------------------------+---------- | ||
| pg_depend_reference_index | 43 | ||
| pg_depend_depender_index | 40 | ||
| pg_proc_proname_args_nsp_index | 31 | ||
| pg_description_o_c_o_index | 21 | ||
| pg_attribute_relid_attnam_index | 14 | ||
| pg_proc_oid_index | 10 | ||
| pg_attribute_relid_attnum_index | 9 | ||
| pg_amproc_fam_proc_index | 5 | ||
| pg_amop_opr_fam_index | 5 | ||
| pg_amop_fam_strat_index | 5 | ||
``` | ||
|
||
This example shows a session that performs verification of every catalog index | ||
in the database "test" (except those associated with the pg_class catalog). | ||
Details of just the 10 largest indexes verified are displayed. Since no error | ||
is raised, all indexes tested appear to be logically consistent. Naturally, | ||
this query could easily be changed to call `bt_index_check` for every index in | ||
the database where verification is supported. An `AccessShareLock` is acquired | ||
on the target index by `bt_index_check`. This lock mode is the same lock mode | ||
acquired on relations by simple `SELECT` statements. | ||
|
||
`bt_index_check` does not verify invariants that span child/parent | ||
relationships, nor does it verify that the target index is consistent with its | ||
heap relation. When a routine, lightweight test for corruption is required in | ||
a live production environment, using `bt_index_check` often provides the best | ||
trade-off between thoroughness of verification and limiting the impact on | ||
application performance and availability. | ||
|
||
### `bt_index_parent_check(index regclass) returns void` | ||
|
||
`bt_index_parent_check` tests that its target, a B-Tree index, respects a | ||
variety of invariants. The checks performed by `bt_index_parent_check` are a | ||
superset of the checks performed by `bt_index_check`. `bt_index_parent_check` | ||
can be thought of as a more thorough variant of `bt_index_check`: unlike | ||
`bt_index_check`, `bt_index_parent_check` also checks invariants that span | ||
parent/child relationships. However, it does not verify that the target index | ||
is consistent with its heap relation. `bt_index_parent_check` follows the | ||
general convention of raising an error if it finds a logical inconsistency or | ||
other problem. | ||
|
||
An `ExclusiveLock` is required on the target index by bt_index_parent_check (a | ||
`ShareLock` is also acquired on the heap relation). These locks prevent | ||
concurrent data modification from `INSERT`, `UPDATE`, and `DELETE` commands. | ||
The locks also prevent the underlying relation from being concurrently | ||
processed by `VACUUM` and certain other utility commands. Note that the | ||
function holds locks for as short a duration as possible, so there is no | ||
advantage to verifying each index individually in a series of transactions. | ||
|
||
`bt_index_parent_check`'s additional verification is more likely to detect | ||
various pathological cases. These cases may involve an incorrectly implemented | ||
B-Tree operator class used by the index that is checked, or, hypothetically, | ||
undiscovered bugs in the underlying B-Tree index access method code. Note that | ||
`bt_index_parent_check` requires locks in a way that is incompatible with Hot | ||
Standby, unlike `bt_index_check`. | ||
|
||
## Using amcheck effectively | ||
|
||
`amcheck` can be effective at detecting various types of failure modes that | ||
data page checksums will always fail to catch. These include: | ||
|
||
* Filesystem or storage subsystem faults where checksums happen to simply not | ||
be enabled. Note that `amcheck` examines a page as represented in some | ||
shared memory buffer at the time of verification if there is only a shared | ||
buffer hit when accessing the block. Consequently, `amcheck` does not | ||
necessarily examine data read from the filesystem at the time of | ||
verification. Note that when checksums are enabled, `amcheck` may raise an | ||
error due to a checksum failure when a corrupt block is read into a buffer. | ||
|
||
* Corruption caused by faulty RAM, and the broader memory subsystem and | ||
operating system. PostgreSQL does not protect against correctable memory | ||
errors and it is assumed you will operate using RAM that uses industry | ||
standard Error Correcting Codes (ECC) or better protection. However, ECC | ||
memory is typically only immune to single-bit errors, and should not be | ||
assumed to provide *absolute* protection against failures that result in | ||
memory corruption. | ||
|
||
* Structural inconsistencies caused by incorrect operator class | ||
implementations, including issues due to the comparison rules of operating | ||
system collations changing the ordering implied by comparisons of an | ||
underlying collatable type, such as text. Though rare, updates to operating | ||
system collation rules can cause these issues. More commonly, an | ||
inconsistency in the collation order between a master server and a standby | ||
server is implicated, possibly because the *major* operating system version | ||
in use is inconsistent. Such inconsistencies will generally only arise on | ||
standby servers, and so can generally only be detected on standby servers. | ||
If a problem like this arises, it may not affect each individual index that | ||
is ordered using an affected collation, simply because *indexed* values might | ||
happen to have the same absolute ordering regardless of the behavioral | ||
inconsistency. | ||
|
||
* Corruption caused by hypothetical undiscovered bugs in the underlying | ||
PostgreSQL access method code. Automatic verification of the structural | ||
integrity of indexes plays a role in the general testing of new or proposed | ||
PostgreSQL features that could plausibly allow a logical inconsistency to be | ||
introduced. One obvious testing strategy is to call `amcheck` functions | ||
continuously when running the standard regression tests. | ||
|
||
No error concerning corruption raised by `amcheck` should ever be a false | ||
positive. Users are strongly advised to perform a `REINDEX` for any index | ||
where corruption is indicated, although it is also important to identify and | ||
correct the underlying problem. In general, `amcheck` can only prove the | ||
presence of corruption; it cannot prove its absence. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
# -*- mode: ruby -*- | ||
# vi: set ft=ruby : | ||
|
||
# Vagrantfile API/syntax version. Don't touch unless you know what you're doing! | ||
VAGRANTFILE_API_VERSION = "2" | ||
|
||
Vagrant.configure("2") do |config| | ||
config.vm.box = "debian/jessie64" | ||
|
||
config.vm.provision :shell, :path => "bootstrap.sh" | ||
end |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
/* contrib/amcheck/amcheck--0.1.sql */ | ||
|
||
-- complain if script is sourced in psql, rather than via CREATE EXTENSION | ||
\echo Use "CREATE EXTENSION amcheck" to load this file. \quit | ||
|
||
-- | ||
-- bt_index_check() | ||
-- | ||
CREATE FUNCTION bt_index_check(index regclass) | ||
RETURNS VOID | ||
AS 'MODULE_PATHNAME', 'bt_index_check' | ||
LANGUAGE C STRICT; | ||
|
||
-- | ||
-- bt_index_parent_check() | ||
-- | ||
CREATE FUNCTION bt_index_parent_check(index regclass) | ||
RETURNS VOID | ||
AS 'MODULE_PATHNAME', 'bt_index_parent_check' | ||
LANGUAGE C STRICT; |
Oops, something went wrong.