-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add another flexible way for JDBC paging (manual mode) #95
Add another flexible way for JDBC paging (manual mode) #95
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -55,6 +55,9 @@ def setup_jdbc_config | |||
# Be aware that ordering is not guaranteed between queries. | |||
config :jdbc_paging_enabled, :validate => :boolean, :default => false | |||
|
|||
# Whether to use manual mode during the JDBC paging | |||
config :jdbc_paging_manual_mode, :validate => :boolean, :default => false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the late review, did you guys consider having better naming for the feature? e.g.
jdbc_paging_enabled => true jdbc_paging_mode => offset
any chance this reads better (than "manual"
)?
(jdbc_paging_mode => count
would de the default value)
should also keep things open for changing the paging strategy again (offset paging is a bit of an anti pattern although for mostly static data and LS' needs of at least once delivery it's sufficient), wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kares I like your suggestion, instead of using a boolean flag defined a string value for the strategy is more readable.
However this feature doesn't cover only static data but also use cases where the Sequel
automatic paging doesn't apply correctly, mostly:
- procedure calls with paging limits
- paging in inner queries in case of nested SQL queries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, maybe we need better naming there - haven't thought about it that much.
however the manual just read a bit weird - had no idea what that could mean SQL wise.
what I did is check Sequel's each_page
and it's using a COUNT(...)
initial query (the details on how the statement is transformed into a count is usually adapter specific), haven't checked any adapter specific overrides for each_page
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kares to me the naming you are proposing is good, better than the boolean flag. So I would use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kares @andsel @karenzone Thanks for your time. I also think this suggestion is good and will improve according to it.
I have a little question about the name.
Actually, this feature involves two options:
- Not to add paging condition automatically, but explicitly customized by user.
- Not to use count query first
The first one is the main purpose of this feature.
So, is it easy for users to understand the two values of 'count' and 'offset'?
If everyone don't have other suggested names, I will implement it according to current suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, we could use
automatic
fordbc_paging_mode => count
andexplicit
forjdbc_paging_mode => offset
@kares is it a better naming in your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those sound better instead of automatic maybe jdbc_paging_mode => auto
to choose the best available strategy based on the driver (Sequel does this). jdbc_paging_mode => explicit
or manual
both sounds okay to me.
# @yieldparam row [Hash{Symbol=>Object}] | ||
def perform_query(db, sql_last_value, jdbc_paging_enabled, jdbc_page_size) | ||
def perform_query(db, sql_last_value, jdbc_paging_enabled, jdbc_paging_manual_mode, jdbc_page_size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice if there was a new class as it's a different algorithm e.g. ExplicitStatementHandler
instead of hacking the algorithm into the existing one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice idea, I come up with some commits to switch this
@@ -55,6 +55,9 @@ def setup_jdbc_config | |||
# Be aware that ordering is not guaranteed between queries. | |||
config :jdbc_paging_enabled, :validate => :boolean, :default => false | |||
|
|||
# Whether to use manual mode during the JDBC paging | |||
config :jdbc_paging_manual_mode, :validate => :boolean, :default => false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those sound better instead of automatic maybe jdbc_paging_mode => auto
to choose the best available strategy based on the driver (Sequel does this). jdbc_paging_mode => explicit
or manual
both sounds okay to me.
@kares I've integrated your great latest suggest, I would ask another round of your eyes on this, please. |
* some changes about code specification
…criptive peging mode, with and values
2c21b3b
to
e456bf7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, well done!
Many thanks @dingx1 for all of this 🥇 it was published with https://rubygems.org/gems/logstash-integration-jdbc/versions/5.2.0 |
Wow, it's finally done. Great! ^_^ |
Release notes
Added
jdbc_paging_mode
option to choose if useexplicit
pagination in statements and avoid the initial count query or useauto
to delegate to the underlying library.What does this PR do?
This commit introduces a new configuration option to avoid the initial count select statement executed by Sequel in case of paginated queries.
The paginated queries could be done now with initial count or not, in case
jdbc_paging_mode
is valuesexplicit
the plugin executes paged queries till it reach a page with less rows than the expected, instead of relying on the total row count.In this case the SQL statement has to explicitly use the pagination keywords (like
LIMIT
andOFFSET
) receiving as:offset
and:size
as implicit parameters.When the paging mode
jdbc_paging_mode
is set toauto
then the pagination happens automatically without the intervention of the user to create paginated query.Why is it important/What is the impact to the user?
In some circumstances, like stored procedure call with pagination bounds, the initial count query is simply not meaningful.
In cases where the pages to retrieve are few the initial count query could be an useful overhead.
In some complex nested queries the count could generate extra work that's not usefull
Checklist
I have made corresponding change to the default configuration files (and/or docker env variables)Author's Checklist
How to test this PR locally
jdbc_paging_avoid_count
and check it retrieves all the expected rows. In logs should be present a line for each paged queryRelated issues
Use cases
Simple query with small result sets
In a simple query like:
the automatic pagination would generate the count query plus the paginated queries.
With small resultsets it issues 2 queries instead of one query.
A stored procedure that requires pagination information. Suppose you have a stored like
the automatic pagination would generate an invalid SQL statement:
which is not what's expected.
Nested queries with pagination on the inner query
Suppose the query you want to paginate is the inner on, like in:
Without this feature, the automatic pagination would generate the following SQL:
which is the pagination applied to the outer query instead of the inner one.