Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-join pushdown appends recursively _<number> to column names so that they exceed maximum alias length #18924

Conversation

dominikzalewski
Copy link
Member

@dominikzalewski dominikzalewski commented Sep 5, 2023

Acceptance Criteria

  • Solution fixes the bug for Oracle pushdown JOIN

Out of Scope

  • Integration test for other databases than Oracle
  • If Alias field restriction has same bug on aggregation, it won’t be fixed
  • If other databases have restriction on the alias field that result in same bug, this wont' be fixed

How to Reproduce

  • Start ‘OracleQueryRunner’ in debug mode.
  • Setup a breakpoint in DefaultJdbcMetadata.applyJoin() method where the code is calling jdbcClient.implementJoin().
  • Start ‘Trino’ console client.
trino> use oracle.trino_test;

The following enables join pushdown that otherwise would not happen [1], [2].

trino> set session oracle.join_pushdown_enabled=true;

The following query

trino> SELECT c.custkey, o.orderkey, n.nationkey
      FROM orders o JOIN customer c ON c.custkey = o.custkey
      JOIN nation n ON c.nationkey = n.nationkey;

should trigger the breakpoint.

Screenshot 2023-08-09 at 15 28 12

In ‘Threads & Variables’ view you should be able to observe the ‘… AS COLUMN_X_Y’ aliases for columns that have as many underscores with a follow up number, as the number of joins in the query.

Proposed solution

Divide the number of characters available for the alias (N) into two:

  • ‘Column Name Prefix’ of length P
  • ‘Uniqueness Magic Number’ of length M

where N - 1 = P + M. Use the additional character '_' to visually divide 'Original Column Name Prefix' from 'Uniqueness Magic Number'.

Example

Assume N = 8, P = 4, M = 3

Query:

SELECT c.custkey, o.orderkey
  FROM orders o JOIN customer c
  ON c.custkey = o.custkey

The aliases would then become:

  • custkey => cust_001
  • orderkey => orde_002

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Oracle
* Fix possible query failure when joins are pushed down to Oracle.

@cla-bot cla-bot bot added the cla-signed label Sep 5, 2023
@dominikzalewski dominikzalewski changed the title [WIPMulti-join pushdown appends recursively _<number> to column names so that they exceed maximum alias length [WiP] Multi-join pushdown appends recursively _<number> to column names so that they exceed maximum alias length Sep 5, 2023
@dominikzalewski dominikzalewski marked this pull request as draft September 5, 2023 09:11
@dominikzalewski dominikzalewski self-assigned this Sep 5, 2023
Comment on lines 23 to 24
public static final int DEFAULT_COLUMN_ALIAS_LENGTH = 30;
public static final int ORIGINAL_COLUMN_NAME_LENGTH = 24;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no change requested yet:

notice that the code lives in trino-base-jdbc but the constants are for Oracle database limitations.
This is why it was requested to accept the max length limit as an argument to the method (easiest to test and bind in Guice) or the class itself (still ok to test, harder to bind in Guice).

Comment on lines 94 to 144
@Test
public void testPushdownJoinWithLongNameSucceeds()
{
tryCleanupTemporaryTable();
try {
String baseColumnName = "test_pushdown_" + randomNameSuffix();
String validColumnName = baseColumnName + "z".repeat(MAX_CHARS_COLUMN_ALIAS - baseColumnName.length());

assertUpdate(format("""
CREATE TABLE orders_1 as
SELECT orderkey as %s,
custkey,
orderstatus,
totalprice,
orderdate,
orderpriority,
clerk,
shippriority,
comment
FROM orders
""", validColumnName), "VALUES 15000");

Session session = Session.builder(getSession())
.setCatalogSessionProperty("oracle", JOIN_PUSHDOWN_ENABLED, "true")
.build();
assertQuery(session,
format("""
SELECT c.custkey, o.%s, n.nationkey
FROM orders_1 o JOIN customer c ON c.custkey = o.custkey
JOIN nation n ON c.nationkey = n.nationkey
""", validColumnName),
"""
SELECT c.custkey, o.orderkey, n.nationkey
FROM orders o JOIN customer c ON c.custkey = o.custkey
JOIN nation n ON c.nationkey = n.nationkey
""");
}
finally {
tryCleanupTemporaryTable();
}
}

private void tryCleanupTemporaryTable()
{
try {
assertUpdate("DROP TABLE orders_1");
}
catch (Exception ex) {
ex.printStackTrace();
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@Test
public void testPushdownJoinWithLongNameSucceeds()
{
tryCleanupTemporaryTable();
try {
String baseColumnName = "test_pushdown_" + randomNameSuffix();
String validColumnName = baseColumnName + "z".repeat(MAX_CHARS_COLUMN_ALIAS - baseColumnName.length());
assertUpdate(format("""
CREATE TABLE orders_1 as
SELECT orderkey as %s,
custkey,
orderstatus,
totalprice,
orderdate,
orderpriority,
clerk,
shippriority,
comment
FROM orders
""", validColumnName), "VALUES 15000");
Session session = Session.builder(getSession())
.setCatalogSessionProperty("oracle", JOIN_PUSHDOWN_ENABLED, "true")
.build();
assertQuery(session,
format("""
SELECT c.custkey, o.%s, n.nationkey
FROM orders_1 o JOIN customer c ON c.custkey = o.custkey
JOIN nation n ON c.nationkey = n.nationkey
""", validColumnName),
"""
SELECT c.custkey, o.orderkey, n.nationkey
FROM orders o JOIN customer c ON c.custkey = o.custkey
JOIN nation n ON c.nationkey = n.nationkey
""");
}
finally {
tryCleanupTemporaryTable();
}
}
private void tryCleanupTemporaryTable()
{
try {
assertUpdate("DROP TABLE orders_1");
}
catch (Exception ex) {
ex.printStackTrace();
}
}
@Test
public void testPushdownJoinWithLongNameSucceeds()
{
String maximumLengthColumnIdentifier = "z".repeat(MAX_CHARS_COLUMN_ALIAS);
try (TestTable table = new TestTable(getQueryRunner()::execute, "long_identifier", "(%s bigint)".formatted(maximumLengthColumnIdentifier))) {
assertThat(query(joinPushdownEnabled(getSession()), """
SELECT r.name, t.%s, n.name
FROM %s t JOIN region r ON r.regionkey = t.%s
JOIN nation n ON r.regionkey = n.regionkey""".formatted(maximumLengthColumnIdentifier, table.getName(), maximumLengthColumnIdentifier)))
.isFullyPushedDown();
}
}

This seems to be sufficient to test.
Try to avoid creating data in tests since it can be slow, specially if using larger tables like orders. nation and region are best if your test doesn't care about size of tables.

Note: we need to verify that the query was indeed pushed down because otherwise it's possible that the connector would generate a plain SELECT with no synthetic ids and hence our test will pass.

Joins cannot always be pushed down even if join pushdown is enabled (e.g. if the join condition or some other operation in the query prevents pushdown).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also while this test by itself is sufficient to prove that the aliasing logic works it doesn't prove that we chose an optimal value for the max identifier length.

IMO we should have another test to verify the length we chose:

  • Try create a column with length = max_length - must succeed
  • Try create a column with length = max_length + 1 - must fail in expected manner

the reason why i believe this is useful is because for example we can use 30 as the max globally but it un-necessarily loses information in connectors which can actually support longer identifiers making reading explain plans harder than needed. Do you agree?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. That's a really neat API. Swapped the code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use 30 as the max globally but it un-necessarily loses information in connectors which can actually support longer identifiers making reading explain plans harder than needed

If that is the case, we can try to leave the old 'formatter', in case it's a non-Oracle implementation. Then the code will run exactly as was (possibly with a bug) for any jdbc that is non-Oracle. Not sure whether this is better (because we have the old logic that does not truncate, so it is easier to trace/debug), or worse (because there might still be a bug).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we know we'll follow up on this I don't think either choice makes a difference. End result after the follow up will be that each JDBC connector will use some formatter.

@dominikzalewski
Copy link
Member Author

@hashhar, I have stepped back one tiny bit and started thinking whether:

  • [1] we have a conceptual problem here with the should there be/should there be NOT a misconfiguration test
  • [2] a technical problem on how to bind the constant on what the limit is and be able to change it independently in tests/various connector (even if in the future).

I started analyzing the code with that in mind and here are my conclusions (I might be wrong, so please straighten this up if that is the case).

Normally, when I deal with test code where DI is used, especially when guice is used, I see guice modules being used via annotating classes with @guice top-level. In those modules there's usually a mixture of production bindings but also a fraction of test-only modules where one binds specific constants for specific tests, or mocks for tests.

Now, when browsing through TestOracleConnectorTest and its base class ie. BaseOracleConnectorTest, I noticed that all instantiations are made through 'new' constructor calls. Whenever there's a variation needed per Oracle/Postgres/etc., there's a method in the base class that is being overriden by the subsequent test class.

This feels awkward to me and counter intuitive. Normally I'd expect that for instance TestOracleConnectorTest has a @guice annotation with a module specified that would look like:

   public class OracleTestModule extends AbstractModule {
       @Override
       public void configure() {
          bind(TestingOracleServer.class).in(Singleton.class);
       }
       @Provides
       protected QueryRunner createQueryRunner(TestingOracleServer oracleServer) throws Exception
      {
        return OracleQueryRunner.createOracleQueryRunner(
                oracleServer,
                ImmutableMap.of(),
                ImmutableMap.<String, String>builder()
                        .put("connection-url", oracleServer.getJdbcUrl())
                        .put("connection-user", TEST_USER)
                        .put("connection-password", TEST_PASS)
                        .put("oracle.connection-pool.enabled", "false")
                        .put("oracle.remarks-reporting.enabled", "true")
                        .buildOrThrow(),
                REQUIRED_TPCH_TABLES);
      }
   }

and the testing class code would then be:

@Guice(modules = OracleTestModule.class)
public class TestOracleConnectorTest
        extends BaseOracleConnectorTest
{
    ...

    @Inject
    private TestingOracleServer oracleServer;
    ...
}

The fact that we use guice DI in production in the usual way through the bindings and modules, makes it harder to swap something out in the tests if we don't use guice in the tests at all.

So my conclussion/suggestion is that if we go back to the analysis of [1] and [2], I believe that the problem at hand is that we would be able to agree on the variation of [1], however because we have a technical problem [2], we have divided ourselves into two camps. It's easier to say the test is not needed, because we the don't have to admit [2] :)

Now, I understand that rewriting (or even getting an approval for rewriting) even a single test to use DI the way I was used to is probably out of scope/insane idea. At this point I believe the only reasonable thing we can do is to agree on the 'tactical' solution to deal with this problem.

I know what I have written does not resolve anything. Just throwing it in, so that I can cross check whether it improves our understanding of the problem.

@dominikzalewski dominikzalewski force-pushed the dominikzalewski/pushdown_alias_column_names_exceed_limit_new_review_framework branch 5 times, most recently from 3ed886a to a08b307 Compare September 5, 2023 16:12
@hashhar
Copy link
Member

hashhar commented Sep 6, 2023

we'll I don't know how to answer regarding DI in tests, consistency comes first since multiple people work on the project and familiarity is important for open-source contributors.

Rewriting the world is possible but there are complications which aren't visible at first sight and cause the rewrite to take longer than expected without bringing too much value. i.e. does using DI in tests solve a problem we have right now? See for example #9378 and #14027. It's been going on for 2 years now with not a lot of immediately visible benefits (even though I agree we should do it).

@hashhar
Copy link
Member

hashhar commented Sep 6, 2023

Also note that DI includes constructor injection so using Guice or a setup method doesn't make a practical difference.

And in this specific case the module won't save you repetition because if you look further you'll notice each test class needs a differently configured query runner so you'll have multiple modules anyway and the modules make the test setup lie farther away from the test code which doesn't seem beneficial.

TL; DR: I fail to see what practical problem does DI in tests solve.

@dominikzalewski dominikzalewski marked this pull request as ready for review September 6, 2023 09:51
@dominikzalewski dominikzalewski changed the title [WiP] Multi-join pushdown appends recursively _<number> to column names so that they exceed maximum alias length Multi-join pushdown appends recursively _<number> to column names so that they exceed maximum alias length Sep 6, 2023
@dominikzalewski dominikzalewski force-pushed the dominikzalewski/pushdown_alias_column_names_exceed_limit_new_review_framework branch from a08b307 to 30ae72b Compare September 6, 2023 11:07
Copy link
Member

@nineinchnick nineinchnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just some nitpicks

{
this.jdbcClient = requireNonNull(jdbcClient, "jdbcClient is null");
this.jdbcQueryEventListeners = ImmutableSet.copyOf(requireNonNull(jdbcQueryEventListeners, "queryEventListeners is null"));
this.builder = requireNonNull(builder, "aliasFormatter is null");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, match the message name and consider a better name for this property

@dominikzalewski dominikzalewski force-pushed the dominikzalewski/pushdown_alias_column_names_exceed_limit_new_review_framework branch from 30ae72b to d47d13a Compare September 7, 2023 11:43
Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - all comments are editorial comments now

@dominikzalewski dominikzalewski force-pushed the dominikzalewski/pushdown_alias_column_names_exceed_limit_new_review_framework branch from d47d13a to 75f893a Compare September 8, 2023 10:53
Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % some argument renames + one nitpick

@dominikzalewski dominikzalewski force-pushed the dominikzalewski/pushdown_alias_column_names_exceed_limit_new_review_framework branch from 75f893a to 8dcbe36 Compare September 8, 2023 12:06
@dominikzalewski dominikzalewski force-pushed the dominikzalewski/pushdown_alias_column_names_exceed_limit_new_review_framework branch from 8dcbe36 to 2ee0b52 Compare September 8, 2023 12:09
Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I'll merge after CI is done.

@hashhar
Copy link
Member

hashhar commented Sep 8, 2023

Only comment I have is that the commit history is less than ideal.
In this case since we've been able to make the solution a "single logical change" I'll just squash them during the merge.

Comment on lines +118 to +119
public DefaultJdbcMetadata(JdbcClient jdbcClient,
boolean precalculateStatisticsForPushdown,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We format arguments like below when putting arguments on separate lines:

    public DefaultJdbcMetadata(
            JdbcClient jdbcClient,
            boolean precalculateStatisticsForPushdown,
            Set<JdbcQueryEventListener> jdbcQueryEventListeners,
            SyntheticColumnHandleBuilder syntheticColumnBuilder)

Same for Ignite and Phoenix.

{
private final SyntheticColumnHandleBuilder syntheticColumnHandleBuilder = new SyntheticColumnHandleBuilder();

@DataProvider(name = "columns")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Helper method should be located after the caller.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate or provide an example. I don't think I understand your comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean replacing the method order:

    @Test(dataProvider = "columns")
    public void testColumnAliasTruncation(String columnName, int nextSynthenticId, String expectedSyntheticColumnName)
    {
        JdbcColumnHandle column = getDefaultColumnHandleBuilder()
                .setColumnName(columnName)
                .build();

        JdbcColumnHandle result = syntheticColumnHandleBuilder.get(column, nextSynthenticId);

        assertThat(result.getColumnName()).isEqualTo(expectedSyntheticColumnName);
    }

    @DataProvider(name = "columns")
    public static Object[][] testData()
    {
        return new Object[][] {
                {"column_0", 999, "column_0_999"},
                {"column_with_over_twenty_characters", 100, "column_with_over_twenty_ch_100"},
                {"column_with_over_twenty_characters", MAX_VALUE, "column_with_over_tw_2147483647"}
        };
    }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is that we are reading from the top to bottom. That way you read more abstract things, then if you care you go deeper and understand details. This follows Clean Code book.

private final SyntheticColumnHandleBuilder syntheticColumnHandleBuilder = new SyntheticColumnHandleBuilder();

@DataProvider(name = "columns")
public static Object[][] testData()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually use dataProvider suffix for @DataProvider. I would rename to columnDataProvider or something.

.setColumnName("column_0")
.build();

assertThatThrownBy(() -> syntheticColumnHandleBuilder.get(column, -2147483648)).isInstanceOf(VerifyException.class);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would better to verify the error message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise here. Can you explain in more detail?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

        assertThatThrownBy(() -> syntheticColumnHandleBuilder.get(column, -2147483648))
                .isInstanceOf(VerifyException.class)
                .hasMessage("nextSyntheticColumnId rolled over and is not monotonically increasing any more");

@dominikzalewski
Copy link
Member Author

@ebyhr , there is a follow up PR already opened by @hashhar. There is a chance he'll address those in that PR: #18984

@dominikzalewski dominikzalewski deleted the dominikzalewski/pushdown_alias_column_names_exceed_limit_new_review_framework branch September 15, 2023 10:33
{
private final SyntheticColumnHandleBuilder syntheticColumnHandleBuilder = new SyntheticColumnHandleBuilder();

@DataProvider(name = "columns")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is that we are reading from the top to bottom. That way you read more abstract things, then if you care you go deeper and understand details. This follows Clean Code book.

@@ -83,4 +88,16 @@ protected SqlExecutor onRemoteDatabase()
{
return oracleServer::execute;
}

@Test
public void testPushdownJoinWithLongNameSucceeds()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a test with multiple super long column names used in the same Trino query that are almost the same (zzz..zza, zzz..zzb, ...)? Such columns should represent different data. The thing I was thinking we could check is to validate that synthetic names do not generate conflicts.

Or is it an overkill?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, see the tests being added in #18984

fortunately the remote databases which silently truncate long identifiers (SQLServer, Postgres, Redshift) all prevent this bug from happening because they realise that the truncated alias is now ambiguous.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants