Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Improve Scope of Table Name Regex to Resolve Errors #188

Merged
merged 4 commits into from
Sep 6, 2023
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 21 additions & 5 deletions frontend/src/utils/indexerRunner.js
Original file line number Diff line number Diff line change
Expand Up @@ -57,16 +57,32 @@ export default class IndexerRunner {
const correctTableNameFormat = /^[a-zA-Z_][a-zA-Z0-9_]*$/;

tableNames.forEach(name => {
if (!correctTableNameFormat.test(name)) {
if (!name.includes("\"") && !correctTableNameFormat.test(name)) { // Only test if table name doesn't have quotes
throw new Error(`Table name ${name} is not formatted correctly. Table names must not start with a number and only contain alphanumerics or underscores.`);
}
});
}

getTableNames (schema) {
const tableRegex = /CREATE TABLE\s+(?:IF NOT EXISTS)?\s+"?(.+?)"?\s*\(/g;
const tableNames = Array.from(schema.matchAll(tableRegex), match => {
let tableName;
if (match[1].includes('.')) { // If expression after create has schemaName.tableName, return only tableName
tableName = match[1].split('.')[1];
tableName = tableName.startsWith('"') ? tableName.substring(1) : tableName;
} else {
tableName = match[1];
}
return /^\w+$/.test(tableName) ? tableName : `"${tableName}"`; // If table name has special characters, it must be inside double quotes
});
this.validateTableNames(tableNames);
console.log('Retrieved the following table names from schema: ', tableNames);
return tableNames;
}

async executeIndexerFunction(height, blockDetails, indexingCode, schema, schemaName) {
let innerCode = indexingCode.match(/getBlock\s*\([^)]*\)\s*{([\s\S]*)}/)[1];
let tableNames = Array.from(schema.matchAll(/CREATE TABLE\s+"(\w+)"/g), match => match[1]); // Get first capturing group of each match
this.validateTableNames(tableNames);
const tableNames = this.getTableNames(schema);

if (blockDetails) {
const block = Block.fromStreamerMessage(blockDetails);
Expand Down Expand Up @@ -172,8 +188,8 @@ export default class IndexerRunner {
try {
const result = tables.reduce((prev, tableName) => ({
...prev,
[`insert_${tableName}`]: async (objects) => await this.insert(blockHeight, schemaName, tableName, objects),
[`select_${tableName}`]: async (object, limit = 0) => await this.select(blockHeight, schemaName, tableName, object, limit),
[`insert_${tableName.replace(/[^a-zA-Z0-9_]/g, '_')}`]: async (objects) => await this.insert(blockHeight, schemaName, tableName, objects),
[`select_${tableName.replace(/[^a-zA-Z0-9_]/g, '_')}`]: async (object, limit = 0) => await this.select(blockHeight, schemaName, tableName, object, limit),
}), {});
return result;
} catch (error) {
Expand Down
118 changes: 115 additions & 3 deletions runner/src/indexer/indexer.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,99 @@ describe('Indexer unit tests', () => {
"block_timestamp" DECIMAL(20, 0) NOT NULL,
"receipt_id" VARCHAR NOT NULL,
CONSTRAINT "post_likes_pkey" PRIMARY KEY ("post_id", "account_id")
);'`;
);`;

const STRESS_TEST_SCHEMA = `
CREATE TABLE
creator_quest (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to test same line too?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea. I'll make this one one line to validate that. And I agree with the overall comment. I'll make it return an empty object if it detects a failure for now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a great catch. Turns out the regex would have \s+ twice. So, the space between TABLE and the name would only satisfy one and it would be skipped.

account_id VARCHAR PRIMARY KEY,
num_components_created INTEGER NOT NULL DEFAULT 0,
completed BOOLEAN NOT NULL DEFAULT FALSE
);

CREATE TABLE
composer_quest (
account_id VARCHAR PRIMARY KEY,
num_widgets_composed INTEGER NOT NULL DEFAULT 0,
completed BOOLEAN NOT NULL DEFAULT FALSE
);

CREATE TABLE
"contractor - quest" (
account_id VARCHAR PRIMARY KEY,
num_contracts_deployed INTEGER NOT NULL DEFAULT 0,
completed BOOLEAN NOT NULL DEFAULT FALSE
);

CREATE TABLE
"posts" (
"id" SERIAL NOT NULL,
"account_id" VARCHAR NOT NULL,
"block_height" DECIMAL(58, 0) NOT NULL,
"receipt_id" VARCHAR NOT NULL,
"content" TEXT NOT NULL,
"block_timestamp" DECIMAL(20, 0) NOT NULL,
"accounts_liked" JSONB NOT NULL DEFAULT '[]',
"last_comment_timestamp" DECIMAL(20, 0),
CONSTRAINT "posts_pkey" PRIMARY KEY ("id")
);

CREATE TABLE
"comments" (
"id" SERIAL NOT NULL,
"post_id" SERIAL NOT NULL,
"account_id" VARCHAR NOT NULL,
"block_height" DECIMAL(58, 0) NOT NULL,
"content" TEXT NOT NULL,
"block_timestamp" DECIMAL(20, 0) NOT NULL,
"receipt_id" VARCHAR NOT NULL,
CONSTRAINT "comments_pkey" PRIMARY KEY ("id")
);

CREATE TABLE
"post_likes" (
"post_id" SERIAL NOT NULL,
"account_id" VARCHAR NOT NULL,
"block_height" DECIMAL(58, 0),
"block_timestamp" DECIMAL(20, 0) NOT NULL,
"receipt_id" VARCHAR NOT NULL,
CONSTRAINT "post_likes_pkey" PRIMARY KEY ("post_id", "account_id")
);

CREATE UNIQUE INDEX "posts_account_id_block_height_key" ON "posts" ("account_id" ASC, "block_height" ASC);

CREATE UNIQUE INDEX "comments_post_id_account_id_block_height_key" ON "comments" (
"post_id" ASC,
"account_id" ASC,
"block_height" ASC
);

CREATE INDEX
"posts_last_comment_timestamp_idx" ON "posts" ("last_comment_timestamp" DESC);

ALTER TABLE
"comments"
ADD
CONSTRAINT "comments_post_id_fkey" FOREIGN KEY ("post_id") REFERENCES "posts" ("id") ON DELETE NO ACTION ON UPDATE NO ACTION;

ALTER TABLE
"post_likes"
ADD
CONSTRAINT "post_likes_post_id_fkey" FOREIGN KEY ("post_id") REFERENCES "posts" ("id") ON DELETE CASCADE ON UPDATE NO ACTION;

CREATE TABLE IF NOT EXISTS
"public"."My Table1" (id serial PRIMARY KEY);

CREATE TABLE
"Another-Table" (id serial PRIMARY KEY);

CREATE TABLE
IF NOT EXISTS
"Third-Table" (id serial PRIMARY KEY);

CREATE TABLE
yet_another_table (id serial PRIMARY KEY);
`;

beforeEach(() => {
process.env = {
Expand Down Expand Up @@ -425,10 +517,30 @@ describe('Indexer unit tests', () => {
});

const indexer = new Indexer('mainnet', { DmlHandler: mockDmlHandler });
const context = indexer.buildContext(SOCIAL_SCHEMA, 'morgs.near/social_feed1', 1, 'postgres');
const context = indexer.buildContext(STRESS_TEST_SCHEMA, 'morgs.near/social_feed1', 1, 'postgres');

// These calls would fail on a real database, but we are merely checking to ensure they exist
expect(Object.keys(context.db)).toStrictEqual(['insert_posts', 'select_posts', 'insert_comments', 'select_comments', 'insert_post_likes', 'select_post_likes']);
expect(Object.keys(context.db)).toStrictEqual(
['insert_creator_quest',
'select_creator_quest',
'insert_composer_quest',
'select_composer_quest',
'insert__contractor___quest_',
'select__contractor___quest_',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there a _ at the end?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an artifact of some edge case I had considered from an earlier implementation. The outside _ are where the double quote symbol was replaced. The current implementation adds quotes after the table names are collected. I had an edge case in mind where there might be "tableName" and tableName and we would want to distinguish between them but I think the current code will fetch only tableName as it does not require double quotes since its a valid name on its own. If someone wanted "tableName" as the actual name, it'd have to be ""tableName"" or something. In any case, I can strip the outer quotes before doing the replacement.

'insert_posts',
'select_posts',
'insert_comments',
'select_comments',
'insert_post_likes',
'select_post_likes',
'insert__My_Table1_',
'select__My_Table1_',
'insert__Another_Table_',
'select__Another_Table_',
'insert__Third_Table_',
'select__Third_Table_',
'insert_yet_another_table',
'select_yet_another_table']);
});

test('Indexer.runFunctions() allows imperative execution of GraphQL operations', async () => {
Expand Down
31 changes: 22 additions & 9 deletions runner/src/indexer/indexer.ts
Original file line number Diff line number Diff line change
Expand Up @@ -187,21 +187,30 @@ export default class Indexer {
}

validateTableNames (tableNames: string[]): void {
if (!(tableNames.length > 0)) {
if (!(Array.isArray(tableNames) && tableNames.length > 0)) {
throw new Error('Schema does not have any tables. There should be at least one table.');
}
const correctTableNameFormat = /^[a-zA-Z_][a-zA-Z0-9_]*$/;

tableNames.forEach((name: string) => {
if (!correctTableNameFormat.test(name)) {
tableNames.forEach(name => {
if (!name.includes('"') && !correctTableNameFormat.test(name)) { // Only test if table name doesn't have quotes
throw new Error(`Table name ${name} is not formatted correctly. Table names must not start with a number and only contain alphanumerics or underscores.`);
}
});
}

getTableNames (schema: string): string[] {
const tableNameMatcher = /CREATE TABLE\s+"(\w+)"/g;
const tableNames = Array.from(schema.matchAll(tableNameMatcher), match => match[1]); // Get first capturing group of each match
const tableRegex = /CREATE TABLE\s+(?:IF NOT EXISTS)?\s+"?(.+?)"?\s*\(/g;
const tableNames = Array.from(schema.matchAll(tableRegex), match => {
let tableName;
if (match[1].includes('.')) { // If expression after create has schemaName.tableName, return only tableName
tableName = match[1].split('.')[1];
tableName = tableName.startsWith('"') ? tableName.substring(1) : tableName;
} else {
tableName = match[1];
}
return /^\w+$/.test(tableName) ? tableName : `"${tableName}"`; // If table name has special characters, it must be inside double quotes
});
this.validateTableNames(tableNames);
console.log('Retrieved the following table names from schema: ', tableNames);
return tableNames;
Expand Down Expand Up @@ -245,13 +254,17 @@ export default class Indexer {
let dmlHandler: DmlHandler | null = null;
const result = tables.reduce((prev, tableName) => ({
...prev,
[`insert_${tableName}`]: async (objects: any) => {
await this.writeLog(`context.db.insert_${tableName}`, blockHeight, `Calling context.db.insert_${tableName}.`, `Inserting object ${JSON.stringify(objects)} into table ${tableName} on schema ${schemaName}`);
[`insert_${tableName.replace(/[^a-zA-Z0-9_]/g, '_')}`]: async (objects: any) => {
await this.writeLog(`context.db.insert_${tableName.replace(/[^a-zA-Z0-9_]/g, '_')}`, blockHeight,
`Calling context.db.insert_${tableName.replace(/[^a-zA-Z0-9_]/g, '_')}.`,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`Calling context.db.insert_${tableName.replace(/[^a-zA-Z0-9_]/g, '_')}.`,
`Calling context.db.insert_${this.sanitizeTableName(tableName)}.`,

This is used all over the place - maybe extract to its own method?

Copy link
Collaborator Author

@darunrs darunrs Aug 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do!

`Inserting object ${JSON.stringify(objects)} into table ${tableName} on schema ${schemaName}`);
dmlHandler = dmlHandler ?? new this.deps.DmlHandler(account);
return await dmlHandler.insert(schemaName, tableName, Array.isArray(objects) ? objects : [objects]);
},
[`select_${tableName}`]: async (object: any, limit = null) => {
await this.writeLog(`context.db.select_${tableName}`, blockHeight, `Calling context.db.select_${tableName}.`, `Selecting objects with values ${JSON.stringify(object)} from table ${tableName} on schema ${schemaName} with limit ${limit === null ? 'no' : limit}`);
[`select_${tableName.replace(/[^a-zA-Z0-9_]/g, '_')}`]: async (object: any, limit = null) => {
await this.writeLog(`context.db.select_${tableName.replace(/[^a-zA-Z0-9_]/g, '_')}`, blockHeight,
`Calling context.db.select_${tableName.replace(/[^a-zA-Z0-9_]/g, '_')}.`,
`Selecting objects with values ${JSON.stringify(object)} from table ${tableName} on schema ${schemaName} with limit ${limit === null ? 'no' : limit}`);
dmlHandler = dmlHandler ?? new this.deps.DmlHandler(account);
return await dmlHandler.select(schemaName, tableName, object, limit);
},
Expand Down