-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
api: allow filter EVM tokens by name #460
Conversation
|
||
CREATE EXTENSION pg_trgm; | ||
|
||
CREATE INDEX ix_evm_tokens ON chain.evm_tokens USING GIST (token_name gist_trgm_ops); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://www.postgresql.org/docs/current/pgtrgm.html
can someone check if this is right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh the index name was not good. fixed to be more descriptive
storage/client/queries/queries.go
Outdated
@@ -462,10 +462,11 @@ const ( | |||
WHERE | |||
(tokens.runtime = $1) AND | |||
($2::oasis_addr IS NULL OR tokens.token_address = $2::oasis_addr) AND | |||
($3::text IS NULL OR tokens.token_name ILIKE '%' || $3 || '%') AND |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name input is not escaped, searches with %
and _
will do special things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue also mentioned filtering by ticker (symbol
in the db).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh let me add that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added same stuff for symbol
|
||
BEGIN; | ||
|
||
CREATE EXTENSION pg_trgm; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this breaks our wipe db script 💀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added some code to drop extensions too 💀 💀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also just go with CREATE EXTENSION IF NOT EXISTS pg_trgm
, and avoid the wiping. I'd prefer that, actually, for simplicity. It's extremely unlikely that we'll want to redefine the extension at some point :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, I think it would also not be irresponsible to just omit the index. We have 2k EVM tokens in mainnet emerald. Even at 10k, scanning through their short names should be very fast.
Edit: I tried in prod mainnet now, with and without the index. The index does get used. Make of the time difference what you will:
[0/21466]
oasisindexer=> explain analyze select count(*) from chain.evm_tokens where token_name ilike '%eth%';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Aggregate (cost=68.17..68.18 rows=1 width=8) (actual time=1.491..1.493 rows=1 loops=1)
-> Seq Scan on evm_tokens (cost=0.00..68.04 rows=53 width=0) (actual time=0.016..1.481 rows=58 loops=1)
Filter: (token_name ~~* '%eth%'::text)
Rows Removed by Filter: 1865
Planning Time: 0.325 ms
Execution Time: 1.544 ms
(6 rows)
oasisindexer=> CREATE EXTENSION pg_trgm;
CREATE EXTENSION
oasisindexer=>
oasisindexer=> CREATE INDEX ix_evm_tokens_name ON chain.evm_tokens USING GIST (token_name gist_trgm_ops);
CREATE INDEX
oasisindexer=> CREATE INDEX ix_evm_tokens_symbol ON chain.evm_tokens USING GIST (symbol gist_trgm_ops);
CREATE INDEX
oasisindexer=> explain analyze select count(*) from chain.evm_tokens where token_name ilike '%eth%';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=51.69..51.70 rows=1 width=8) (actual time=0.455..0.456 rows=1 loops=1)
-> Bitmap Heap Scan on evm_tokens (cost=4.56..51.56 rows=53 width=0) (actual time=0.360..0.447 rows=58 loops=1)
Recheck Cond: (token_name ~~* '%eth%'::text)
Heap Blocks: exact=30
-> Bitmap Index Scan on ix_evm_tokens_name (cost=0.00..4.54 rows=53 width=0) (actual time=0.344..0.345 rows=58 loops=1)
Index Cond: (token_name ~~* '%eth%'::text)
Planning Time: 0.538 ms
Execution Time: 0.651 ms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on 58 entries?
well I'll move this to a separate PR, let's hold off until it becomes slower
this query is going to run ~on keystroke, so
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh but
{"args":[],"caller":"client.go:60","db":"indexer","err":"ERROR: cannot drop type gtrgm because extension pg_trgm requires it (SQLSTATE 2BP01)","level":"error","module":"postgres","msg":"Query","pid":188,"sql":"DROP TYPE public.gtrgm CASCADE;","time":"158.785µs","ts":"2023-06-27T18:11:29.493112027Z"}
with the current wipe code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can manually exclude gtrgm
from the list of types to be wiped? It's a little hairy but oh well, the whole Wipe() concept is vaguely a hack. I checked just now and it seems to be the only type created by the extension.
on 58 entries?
58 that match'%eth%'
, but ~2k entries total (Rows Removed by Filter: 1865
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then there's a bunch of functions that it depends on
7bda4f6
to
78e65b1
Compare
storage/postgres/client.go
Outdated
if err != nil { | ||
return fmt.Errorf("failed to list extensions: %w", err) | ||
} | ||
defer rows.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we use multiple defer rows.Close()
in this function, reassigning rows
💀 💀 💀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think defer
captures enough context that it's all OK? Example: https://go.dev/play/p/6g8xQemALDi
If not, please do fix 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
struct, pointer receiver
https://go.dev/play/p/fWui7UAvo7N -> 2 2 ❌
pointer to struct, pointer receiver
https://go.dev/play/p/FNsKXdCQ7Fs -> 2 1 ✔️
interface with pointer to struct, pointer receiver
https://go.dev/play/p/Nywd7MMYyvd -> 2 1 ✔️
interface with struct, struct receiver
https://go.dev/play/p/E4f6lT8zO5o -> 2 1 ✔️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this stuff is too... it's like the phrase "sensitive to initial conditions" that you might say of a physical system. I'm just gonna wrap these in iifes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
although I'm going to get rid of this and go with the create extension if not exists way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm there's a listIndexerTables in its own method, so I'll go with methods
storage/postgres/client.go
Outdated
return err | ||
} | ||
// This one was here when I got here. | ||
if extension == "plpgsql" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never heard of this extension before, so not touching it??? 💀 💀 💀 💀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised this is even an extension, technically. It's the de-facto default procedural language for writing functions in postgres. And yes, good call not to remove, I've used it in ad-hoc ways before, and we might use it later.
WDYT about avoiding this if
and just doing SELECT ... WHERE extname != 'plpgsql'
instead? It's how most other filtering in Wipe() is done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh ok
although let's go with the create extension if not exists
rebased on main |
name: name | ||
schema: | ||
type: string | ||
description: A filter on the name, the name must contain this value as a substring. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ugh this should be updated to say that the symbol can match too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I have some ideas for simplifying, see what you think.
storage/postgres/client.go
Outdated
return err | ||
} | ||
// This one was here when I got here. | ||
if extension == "plpgsql" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised this is even an extension, technically. It's the de-facto default procedural language for writing functions in postgres. And yes, good call not to remove, I've used it in ad-hoc ways before, and we might use it later.
WDYT about avoiding this if
and just doing SELECT ... WHERE extname != 'plpgsql'
instead? It's how most other filtering in Wipe() is done.
storage/postgres/client.go
Outdated
if err != nil { | ||
return fmt.Errorf("failed to list extensions: %w", err) | ||
} | ||
defer rows.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think defer
captures enough context that it's all OK? Example: https://go.dev/play/p/6g8xQemALDi
If not, please do fix 🙏
|
||
BEGIN; | ||
|
||
CREATE EXTENSION pg_trgm; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also just go with CREATE EXTENSION IF NOT EXISTS pg_trgm
, and avoid the wiping. I'd prefer that, actually, for simplicity. It's extremely unlikely that we'll want to redefine the extension at some point :)
|
||
BEGIN; | ||
|
||
CREATE EXTENSION pg_trgm; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, I think it would also not be irresponsible to just omit the index. We have 2k EVM tokens in mainnet emerald. Even at 10k, scanning through their short names should be very fast.
Edit: I tried in prod mainnet now, with and without the index. The index does get used. Make of the time difference what you will:
[0/21466]
oasisindexer=> explain analyze select count(*) from chain.evm_tokens where token_name ilike '%eth%';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Aggregate (cost=68.17..68.18 rows=1 width=8) (actual time=1.491..1.493 rows=1 loops=1)
-> Seq Scan on evm_tokens (cost=0.00..68.04 rows=53 width=0) (actual time=0.016..1.481 rows=58 loops=1)
Filter: (token_name ~~* '%eth%'::text)
Rows Removed by Filter: 1865
Planning Time: 0.325 ms
Execution Time: 1.544 ms
(6 rows)
oasisindexer=> CREATE EXTENSION pg_trgm;
CREATE EXTENSION
oasisindexer=>
oasisindexer=> CREATE INDEX ix_evm_tokens_name ON chain.evm_tokens USING GIST (token_name gist_trgm_ops);
CREATE INDEX
oasisindexer=> CREATE INDEX ix_evm_tokens_symbol ON chain.evm_tokens USING GIST (symbol gist_trgm_ops);
CREATE INDEX
oasisindexer=> explain analyze select count(*) from chain.evm_tokens where token_name ilike '%eth%';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=51.69..51.70 rows=1 width=8) (actual time=0.455..0.456 rows=1 loops=1)
-> Bitmap Heap Scan on evm_tokens (cost=4.56..51.56 rows=53 width=0) (actual time=0.360..0.447 rows=58 loops=1)
Recheck Cond: (token_name ~~* '%eth%'::text)
Heap Blocks: exact=30
-> Bitmap Index Scan on ix_evm_tokens_name (cost=0.00..4.54 rows=53 width=0) (actual time=0.344..0.345 rows=58 loops=1)
Index Cond: (token_name ~~* '%eth%'::text)
Planning Time: 0.538 ms
Execution Time: 0.651 ms
78e65b1
to
f29b2db
Compare
tested locally on ~500 blocks from Emerald. very few tokens though