Parser trips up on common snowflake query history #78
Yes, this is a known issue, and I'm happy to discuss how to approach parsing Snowflake queries better. Right now, this package uses pglast for parsing, so it parses Postgres syntax. It also does reasonably well with Redshift, since that dialect is very similar.
Both of these require substantial effort, and I am looking for opportunities to fund it.
OK. Is it fair to say that this package doesn't really support Snowflake yet? I see it in the marketing materials (https://tokern.io/data-lineage/), but in practice, when I try the examples, I never get any result in the graph because it trips up on so many queries. If there is an implementation that does use Snowflake effectively, or at least filters out the syntax that pglast doesn't support, I'd be happy to use it. Right now, though, I think I'm going to have to try a different package: I have accumulated so many query syntax exclusions that it's becoming infeasible, and even then, queries are failing without telling me why (I'm just getting a bunch of these now):
Yes. The open source package, as it currently stands, has poor Snowflake coverage. This is because there is no good open source Snowflake SQL parser; all other OSS packages have a similar problem.
Currently, the parser trips up on many common Snowflake query history entries, such as:

`select query_text from table(information_schema.query_history());`

It also fails on queries with the `@SNOWFLAKE_...` syntax, and on queries with the keyword `recluster`; in the latter case, the error is `syntax error at or near "recluster", at index 35`. I am systematically removing these from analysis prior to sending them to the analyzer, but just FYI that without doing this, the analyzer throws an exception.
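For what it's worth, the pre-filtering workaround described above can be sketched roughly as follows. This is a hypothetical illustration, not code from the package: the pattern list and the `postgres_parseable_only` helper are assumptions, seeded with the three failing constructs mentioned in this issue (`query_history()` table functions, `@SNOWFLAKE_` references, and `recluster`).

```python
import re

# Hypothetical exclusion list: Snowflake-only constructs that a Postgres
# parser such as pglast cannot handle. Patterns are illustrative only.
SNOWFLAKE_ONLY = [
    re.compile(r"\btable\s*\(\s*information_schema\.query_history", re.I),
    re.compile(r"@SNOWFLAKE_", re.I),
    re.compile(r"\brecluster\b", re.I),
]

def postgres_parseable_only(queries):
    """Yield only the queries that match none of the exclusion patterns."""
    for q in queries:
        if not any(p.search(q) for p in SNOWFLAKE_ONLY):
            yield q

queries = [
    "select query_text from table(information_schema.query_history());",
    "alter table t recluster",
    "select a from t",
]
print(list(postgres_parseable_only(queries)))  # only the last query survives
```

Of course, as noted above, a regex exclusion list doesn't scale: queries keep failing for new, unreported reasons, which is why a real Snowflake-aware parser is the only durable fix.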