
Add support for Postgres-compatible indexes #561

Merged — 1 commit merged into main from daylon/indexes on Aug 27, 2024
Conversation

@Hydrocharged (Collaborator) commented Aug 1, 2024

This implements a proof-of-concept for true Postgres indexes.

Current Implementation

The current index implementation relies on the methods used in GMS and Dolt, which are inherently MySQL-based. There are a lot of layers to make indexes as efficient as possible (while also allowing for different integrators to use their own implementations), but I'm going to focus on the inner "core" logic. There are two parts to the core: storage and iteration.

Storage

At a high level, we've hardcoded how values are stored in an index based on their observed behavior. NULL is treated as smaller than the smallest possible value and is always stored first. Integers are ordered from negative to positive. Strings are in the order defined by their collation, which is usually alphabetical with some casing differences. In Dolt the ordering is concrete, and in GMS this order is assumed for all indexable types.
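To make the hardcoded ordering concrete, here is a minimal sketch in MySQL syntax (the table and index names are made up for illustration):

CREATE TABLE t (c INT, INDEX idx_c (c));
INSERT INTO t VALUES (NULL), (-3), (0), (7);
-- An ascending scan over idx_c yields NULL first, then the integers in order:
SELECT c FROM t ORDER BY c ASC; -- NULL, -3, 0, 7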

Iteration

In GMS, we take an expression (filter or join) and create a range (or multiple ranges) that expresses the values that the index should return. That range contains the lower and upper bounds, and those are based on the value given. For example, the filter column >= 6 (where column is an integer) uses the value of 6 to construct a range of [6, ∞). This range is then passed to Dolt, which uses the inclusive 6 as its starting point and knows to iterate over the remaining index data until the end. If a value of a different type than the indexed column's is given, it is automatically cast to the index's type.
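Continuing the hypothetical table above, the same behavior can be sketched directly in SQL (illustrative only, not actual GMS plan output):

-- The filter becomes the range [6, ∞) on idx_c; Dolt starts at the inclusive 6
-- and iterates to the end of the index.
SELECT * FROM t WHERE c >= 6;
-- A value of a different type is cast to the index's type before the range is
-- built, so the string '6' here behaves like the integer 6.
SELECT * FROM t WHERE c >= '6';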

Postgres vs. MySQL

With storage and iteration covered for MySQL (GMS and Dolt), let's now look at some key differences with indexes in Postgres.

  • Storage Control: The layout of values is defined by a function. For the default types, Postgres ships with the function pre-defined. It is assumed that this function lays out values such that distinct (not equal using the = operator) values are not equivalent.
  • Iteration via Operators: Specifically for b-trees, iterating over the storage is handled by the comparison operators. As long as the 5 main comparison operators exist (=, >, >=, <, <=), any value may be used to iterate over storage. It is assumed that these operators map to some logical form of continuity, but that is not strictly required (the Postgres analyzer can actually catch some forms of discontinuity and apply additional filters, which is pretty cool). For example, it is possible that < and > could return true for the same input, but again it is assumed that this is not the case.
  • Null Ordering: Nulls can be viewed as either the smallest possible value or the largest possible value, changing where they're positioned in the index (see the SQL sketch after this list).
  • Null Distinction: For UNIQUE indexes, this controls whether we permit multiple NULL values. If NULLs are distinct, then multiple rows may use NULL. In MySQL, NULLs are always considered distinct.
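Both null behaviors are set directly in Postgres's index DDL. A minimal sketch (the table and index names are hypothetical; NULLS NOT DISTINCT requires Postgres 15+):

-- Null ordering: position NULLs at either end of a b-tree index.
CREATE INDEX idx_nulls_first ON t1 (col ASC NULLS FIRST);
CREATE INDEX idx_nulls_last ON t1 (col ASC NULLS LAST); -- the default for ASC
-- Null distinction: treat NULLs as equal, permitting at most one NULL value.
CREATE UNIQUE INDEX idx_uniq ON t1 (col) NULLS NOT DISTINCT;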

Indexed Joins

This is what originally kickstarted this small project. Now that I've covered how the current implementation works, and a few ways that Postgres differs, it should be much easier to show why proper indexed joins would not work in the current implementation. At its simplest, an indexed join has a form like SELECT * FROM t1 JOIN t2 ON t1.col = t2.col;. It may not be obvious, but this touches at least 2 of the 4 differences mentioned in the previous section.

  • Iteration via Operators: Postgres doesn't need to be able to convert between two types; it just needs the comparison operators. For some types, this actually leads to different results. For example, SELECT 0.99999994::float4 = 0.99999997::float8; returns false, as there is a defined = operator for the two types. SELECT 0.99999994::float4 = 0.99999997::float8::float4; returns true, as casting from float8 to float4 loses some precision (both queries are repeated runnable after this list). In this exact case, as long as we keep the filter expression it's okay, but it can lead to data corruption otherwise. There are many more examples than this, so don't take this as the only case, but it's an easier one to understand (compared to a more "realistic" example using the reg... types). If our index framework is built on casting (like the current GMS implementation), then we will always have cases where we are tracking down bugs due to invalid behavior.
  • Null Ordering: Differences in null ordering aren't handled at all. If NULL values sort differently between two indexes, then some analyzer step must take that into account. The current implementation does not do this, as it has never needed to.
  • Null Distinction: I'm actually not sure how this is supposed to be handled in Postgres in this case (I haven't done much research on it), so this could be something else that the analyzer needs to take into account, but it hasn't been verified.
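The float example from the first bullet, as runnable Postgres queries:

SELECT 0.99999994::float4 = 0.99999997::float8;          -- false: the cross-type = operator compares without casting
SELECT 0.99999994::float4 = 0.99999997::float8::float4;  -- true: casting float8 to float4 loses precision, making the values equal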

The Simplest Solution

Right now on main, we are implementing indexes by casting everything to the column type for filters. This "works" in that we are able to get some tests and performance metrics passing, but only for items that have a cast. As mentioned earlier, this casting logic is not correct, but our limited testing at least works with it. Once we leave that small bubble, we start to see all of the changes that would be required to get Postgres indexes "working", such as special-casing types and expressions to fit the assumptions made in GMS and Dolt, and even then it would be incorrect. Some of the special casing would be hard to even figure out in the first place, like how the reg... types mentioned earlier should interact with other types.

I propose, with this PR (and the attached Dolt PR), that the simplest solution is to just do what Postgres does. Postgres defines functions that control the layout of values, so we can implement those functions and simply pass them down to Dolt's storage layer, which uses them for ordering rather than the hardcoded versions. This PR doesn't yet implement that part, but it is what we are already doing with the introduction of sql.ExtendedType, which uses the comparisons defined on the type to control the layout. We just have to change which function is being used, which is relatively simple. This PR, instead, focuses on the filter and retrieval part (since it's the more involved portion).

Postgres simply passes the relevant operator functions down to its storage layer, and runs a tree search (using those operators on its internal b-tree) to find where the storage iterator should start. It then iterates until those operators are no longer fulfilled. This completely sidesteps the casting part and focuses strictly on the comparison operators, which is exactly what we want. And that's all this PR (and the Dolt one) does, in essence. It's a bit more complicated since I'm still trying to take advantage of as much existing infrastructure as possible, but at its core it's passing the operators down to Dolt's storage layer to find a start (and stop) point. Passing down functions not only gives us full support for everything, it even allows us to handle things like custom types without any additional code.

There are still additional things that need to be done, such as covering vs. non-covering indexes, composite indexes, etc., but those are mostly Doltgres-side changes to match how Postgres behaves. Also, the code layout is not final (everything is in one file), comments are missing, names are bad, things are stuffed into special structs rather than new fields, there are no GMS changes yet, etc. Look not at the code, but at the intention of the code, as none of this is final or production-quality.

Dolt PR

References

@Hydrocharged (Collaborator, Author) commented:

In addition to the references above, I created a custom type and played around with it to get even more details on how the analyzer works, confirm the documentation, etc. I named it test_float8, as it operates similarly to float8, except that it orders using the decimal portion first, then the integer portion. For example, [7.2, 1.2, 3.1, 3.4, 4.3] sorts to [3.1, 1.2, 7.2, 4.3, 3.4]. When the layout function test_float8_cmp does not match the behavior of the comparison operators, the analyzer will sometimes catch it and change the plan to account for it, which is very interesting (and seems very hard to implement). I'm assuming it can figure these things out since the functions are implemented in SQL rather than C. Anyway, if the analyzer doesn't catch it, then you can end up with some weird results that confirm quite a bit of the internals.

The commented-out code is the "normal" version that makes the type behave the same as float8.

DROP TYPE IF EXISTS test_float8 CASCADE;
CREATE TYPE test_float8 AS (value float8);

--CREATE FUNCTION test_float8_eq(test_float8, test_float8) RETURNS boolean AS $$
--	SELECT $1.value = $2.value;
--	$$ LANGUAGE SQL IMMUTABLE STRICT;
--CREATE FUNCTION test_float8_ne(test_float8, test_float8) RETURNS boolean AS $$
--	SELECT $1.value <> $2.value;
--	$$ LANGUAGE SQL IMMUTABLE STRICT;
--CREATE FUNCTION test_float8_lt(test_float8, test_float8) RETURNS boolean AS $$
--	SELECT $1.value < $2.value;
--	$$ LANGUAGE SQL IMMUTABLE STRICT;
--CREATE FUNCTION test_float8_le(test_float8, test_float8) RETURNS boolean AS $$
--	SELECT $1.value <= $2.value;
--	$$ LANGUAGE SQL IMMUTABLE STRICT;
--CREATE FUNCTION test_float8_gt(test_float8, test_float8) RETURNS boolean AS $$
--	SELECT $1.value > $2.value;
--	$$ LANGUAGE SQL IMMUTABLE STRICT;
--CREATE FUNCTION test_float8_ge(test_float8, test_float8) RETURNS boolean AS $$
--	SELECT $1.value >= $2.value;
--	$$ LANGUAGE SQL IMMUTABLE STRICT;

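-- Comparison functions that order by the decimal portion first, then the integer portion.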
CREATE FUNCTION test_float8_eq(test_float8, test_float8) RETURNS boolean AS $$
	SELECT (MOD($1.value::numeric, 1)::float8, FLOOR($1.value)::int8) = (MOD($2.value::numeric, 1)::float8, FLOOR($2.value)::int8);
	$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE FUNCTION test_float8_ne(test_float8, test_float8) RETURNS boolean AS $$
	SELECT (MOD($1.value::numeric, 1)::float8, FLOOR($1.value)::int8) <> (MOD($2.value::numeric, 1)::float8, FLOOR($2.value)::int8);
	$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE FUNCTION test_float8_lt(test_float8, test_float8) RETURNS boolean AS $$
	SELECT (MOD($1.value::numeric, 1)::float8, FLOOR($1.value)::int8) < (MOD($2.value::numeric, 1)::float8, FLOOR($2.value)::int8);
	$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE FUNCTION test_float8_le(test_float8, test_float8) RETURNS boolean AS $$
	SELECT (MOD($1.value::numeric, 1)::float8, FLOOR($1.value)::int8) <= (MOD($2.value::numeric, 1)::float8, FLOOR($2.value)::int8);
	$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE FUNCTION test_float8_gt(test_float8, test_float8) RETURNS boolean AS $$
	SELECT (MOD($1.value::numeric, 1)::float8, FLOOR($1.value)::int8) > (MOD($2.value::numeric, 1)::float8, FLOOR($2.value)::int8);
	$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE FUNCTION test_float8_ge(test_float8, test_float8) RETURNS boolean AS $$
	SELECT (MOD($1.value::numeric, 1)::float8, FLOOR($1.value)::int8) >= (MOD($2.value::numeric, 1)::float8, FLOOR($2.value)::int8);
	$$ LANGUAGE SQL IMMUTABLE STRICT;

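-- Casts between test_float8 and the built-in numeric types.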
CREATE FUNCTION numeric_to_test_float8(numeric) RETURNS test_float8 AS $$
	DECLARE
		result test_float8;
	BEGIN
		result.value := $1::float8;
		RETURN result;
	END;
	$$ LANGUAGE plpgsql IMMUTABLE STRICT;
CREATE FUNCTION test_float8_to_float8(test_float8) RETURNS float8 AS $$
	DECLARE
		result float8;
	BEGIN
		result := $1.value; -- extract the field directly; assigning $1 itself would re-invoke this cast and recurse
		RETURN result;
	END;
	$$ LANGUAGE plpgsql IMMUTABLE STRICT;
CREATE FUNCTION test_float8_to_numeric(test_float8) RETURNS numeric AS $$
	DECLARE
		result numeric;
	BEGIN
		result := $1::float8::numeric;
		RETURN result;
	END;
	$$ LANGUAGE plpgsql IMMUTABLE STRICT;
CREATE CAST (numeric AS test_float8) WITH FUNCTION numeric_to_test_float8(numeric) AS IMPLICIT;
CREATE CAST (test_float8 AS float8) WITH FUNCTION test_float8_to_float8(test_float8) AS IMPLICIT;
CREATE CAST (test_float8 AS numeric) WITH FUNCTION test_float8_to_numeric(test_float8) AS ASSIGNMENT;
--CREATE FUNCTION test_float8_cmp(test_float8, test_float8) RETURNS integer AS $$
--	BEGIN
--		IF $1.value < $2.value THEN
--			RETURN -1;
--		ELSIF $1.value > $2.value THEN
--			RETURN 1;
--		ELSE
--			RETURN 0;
--		END IF;
--	END;
--	$$ LANGUAGE plpgsql IMMUTABLE STRICT;
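-- B-tree support function: defines the index layout; it should agree with the comparison operators above.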
CREATE FUNCTION test_float8_cmp(test_float8, test_float8) RETURNS integer AS $$
	BEGIN
		IF (MOD($1.value::numeric, 1)::float8, FLOOR($1.value)::int8) < (MOD($2.value::numeric, 1)::float8, FLOOR($2.value)::int8) THEN
			RETURN -1;
		ELSIF (MOD($1.value::numeric, 1)::float8, FLOOR($1.value)::int8) > (MOD($2.value::numeric, 1)::float8, FLOOR($2.value)::int8) THEN
			RETURN 1;
		ELSE
			RETURN 0;
		END IF;
	END;
	$$ LANGUAGE plpgsql IMMUTABLE STRICT;
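-- Wire the functions up as operators, then register everything in a default b-tree operator class.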
CREATE OPERATOR = (
	LEFTARG = test_float8,
	RIGHTARG = test_float8,
	PROCEDURE = test_float8_eq);
CREATE OPERATOR <> (
	LEFTARG = test_float8,
	RIGHTARG = test_float8,
	PROCEDURE = test_float8_ne);
CREATE OPERATOR < (
	LEFTARG = test_float8,
	RIGHTARG = test_float8,
	PROCEDURE = test_float8_lt);
CREATE OPERATOR <= (
	LEFTARG = test_float8,
	RIGHTARG = test_float8,
	PROCEDURE = test_float8_le);
CREATE OPERATOR > (
	LEFTARG = test_float8,
	RIGHTARG = test_float8,
	PROCEDURE = test_float8_gt);
CREATE OPERATOR >= (
	LEFTARG = test_float8,
	RIGHTARG = test_float8,
	PROCEDURE = test_float8_ge);
CREATE OPERATOR FAMILY test_float8_ops USING btree;
CREATE OPERATOR CLASS test_float8_ops
	DEFAULT FOR TYPE test_float8 USING btree AS
		OPERATOR 1 < ,
		OPERATOR 2 <= ,
		OPERATOR 3 = ,
		OPERATOR 4 >= ,
		OPERATOR 5 > ,
		FUNCTION 1 test_float8_cmp(test_float8, test_float8);
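
A quick usage sketch (not part of the original comment) to exercise the ordering, assuming the definitions above have been applied; the implicit numeric cast allows plain literals in the INSERT:

CREATE TABLE tf (v test_float8);
INSERT INTO tf VALUES (7.2), (1.2), (3.1), (3.4), (4.3);
SELECT (v).value FROM tf ORDER BY v; -- 3.1, 1.2, 7.2, 4.3, 3.4: decimal portion first, then integer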

@max-hoffman (Contributor) commented:

So I might be overlooking some of the nuance in your write-up, but I would re-summarize your goals here as:

  1. PG has logical requirements for converting scalar expressions to logical ranges
  2. PG has physical requirements for converting logical ranges to execution ranges

The last time we were talking about PG indexes, we wanted to multiplex on new expressions that the MySQL side didn't have, so we added an interface that bridged the custom PG logic into the format index selection understands. I think that's related to goal 1 above. You get index behaviors that overlap with MySQL for free. You can add whatever operators you want and still get default index selection, bounded only by the performance of our current indexes, for free. You don't get arbitrarily complex range intervals, because we don't have storage-side metadata, so the overhead of that interval complexity would outweigh just reading more data from disk. Every customer we've had has been served by precise equality index matching and single-range matching; I'm not aware of any customers with complex OLAP interval queries.

The way I would have approached the current PR is similar. If we want the logical ranges represented differently, there are multiple points where you can interface PG logic to do slightly different translations. Ex: a PG RangeBuilder that adjusts the logical ranges, or a PG NewRangePartitionIter that converts logical ranges to PG ranges. MySQL sounds more pessimistic than PG, which would be a perf issue more than a correctness issue.

In summary, I wouldn't diverge from the index matching interfaces and lifecycle.

@Hydrocharged force-pushed the daylon/indexes branch 2 times, most recently from aa9c3b3 to e73640b on August 19, 2024 13:10
@Hydrocharged force-pushed the daylon/indexes branch 2 times, most recently from faaedaf to 567603e on August 22, 2024 14:04
@zachmu (Member) left a comment:

I'll take a closer look, but I'm not sure I see the big picture here. GMS and Dolt rely on the MySQL-specific range implementation, so what's the point of the abstractions? Are the abstractions just placeholders to get something started that will be filled in later?

Review threads (resolved):
  • server/analyzer/replace_indexed_tables.go (outdated)
  • server/analyzer/replace_indexed_tables.go
  • server/expression/join_comparator.go
@Hydrocharged changed the title from "POC for index changes" to "Add support for Postgres-compatible indexes" on Aug 27, 2024
@Hydrocharged Hydrocharged marked this pull request as ready for review August 27, 2024 11:17
@Hydrocharged Hydrocharged merged commit 0a3d267 into main Aug 27, 2024
9 checks passed
@Hydrocharged Hydrocharged deleted the daylon/indexes branch August 27, 2024 11:31