
JDBC Input set jdbc_default_timezone loses milliseconds #140

Closed
Rob--- opened this issue Jun 8, 2016 · 9 comments · Fixed by #260
Rob--- commented Jun 8, 2016

The bug:

When setting the timezone to UTC within the JDBC input plugin, the millisecond values of a datetime2 field ingested from SQL are lost.

For example: with no timezone set, 2016-06-07T10:20:05.223Z is converted via the machine's timezone with no data loss; if the computer were on New York time, it would convert by adding +4 to the hours and keep the rest of the value intact. However, if you specify the timezone by putting jdbc_default_timezone => "UTC" in your JDBC input plugin, it will convert the time to 2016-06-07T10:20:05.000Z.

Setup Info:
Logstash Version: 2.3.2 (current)
OS: Windows 7
Config File:

input {
  jdbc {
    jdbc_driver_library => "xxx\sqljdbc4.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "xxx"
    jdbc_user => "xxx"
    jdbc_password => "xxx"
    statement => "xxx"
    type => "xxx"
    jdbc_default_timezone => "UTC"
  }
}

filter {
}

output {
    stdout {
      codec => json_lines
    }
    elasticsearch {
        index => "alarms"
        document_type => "alarm"
    }
}

Sample Data:
SQL, pulling data from a datetime2 field.

Steps to reproduce: Ingest data into elasticsearch via the minimal logstash config shown above using the JDBC plugin with the JDBC driver sqljdbc4.jar from a SQL database table that contains a datetime2(7) field.

@artyombaranovskiytr

+1


mymasse commented Sep 13, 2016

+1

@mbelousov7

+1


ajakuszyk commented Mar 23, 2017

The truncation still happens on the latest Logstash (5.2.2) with jdbc_default_timezone => "Europe/Zurich".
As a workaround I've removed jdbc_default_timezone from the input and used this filter:

filter {
  # need to convert to string before changing the timezone
  mutate {
    convert => { "timestamp" => "string" }
  }

  date {
    match => [ "timestamp", "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'" ]
    timezone => "Europe/Zurich"
    remove_field => "timestamp"
  }
}

@jeslopalo

It's the same for me with 'Europe/Madrid'.


jbwl commented Sep 8, 2017

Please fix this; it leads to data duplication, which is obviously dangerous.


bzier commented Oct 10, 2017

I specifically encounter this (or something similar) when using a datetime2 field as the tracking_column. The timestamp stored in the last_run_metadata_path has the full precision, but when it is used in the statement as :sql_last_value, the value sent is truncated to a precision of .fff. In this way it is a little different from what the OP posted, because milliseconds aren't lost completely; the precision is off, which results in duplicated data. If this should be logged as a separate issue, please let me know.
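To make the precision mismatch concrete, here is a minimal Ruby sketch (illustrative only, with a made-up timestamp; not the plugin's actual code) of a datetime2(7) value formatted at full precision versus truncated to .fff:

require 'time'

t = Time.parse("2017-10-10 12:00:00.1234567")

# Full precision, as recorded in last_run_metadata_path:
t.strftime("%Y-%m-%d %H:%M:%S.%9N")  # => "2017-10-10 12:00:00.123456700"

# Truncated to milliseconds when substituted as :sql_last_value, so a
# query like WHERE ts > :sql_last_value can re-read rows in the gap:
t.strftime("%Y-%m-%d %H:%M:%S.%3N")  # => "2017-10-10 12:00:00.123"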

guyboertje (Contributor) commented Feb 8, 2018

OK I now know what the problem is here.

This line converts a DateTime into a Time but drops the sub-seconds in the process.
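For illustration, a minimal Ruby sketch (not the actual plugin line) of how round-tripping a DateTime through its default string form drops the sub-seconds, while a direct conversion keeps them:

require 'date'
require 'time'

dt = DateTime.parse("2016-06-07T10:20:05.223Z")

# DateTime#to_s omits the fractional seconds entirely...
dt.to_s                   # => "2016-06-07T10:20:05+00:00"

# ...so re-parsing that string as a Time yields .000:
Time.parse(dt.to_s).usec  # => 0

# whereas converting directly preserves the fraction:
dt.to_time.usec           # => 223000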

guyboertje self-assigned this Feb 8, 2018
guyboertje pushed a commit that referenced this issue Feb 27, 2018: …gs (#260)

When :sql_last_value is sent to the database, it is converted to the JVM default timezone. This change honors jdbc_default_timezone for the :sql_last_value timezone conversion.

Assume America/Chicago is the default JVM timezone, but the configuration declares UTC as the database timezone (jdbc_default_timezone => "UTC"). The last_run is correctly recorded as 2018-02-23 22:30:34.054592000 Z, but when :sql_last_value gets sent to the database, it is incorrectly converted to the JVM default timezone ('2018-02-23 16:30:34').

The same holds true in reverse: Logstash has UTC as the JVM timezone and your database records datetimes in a local timezone.

This change fixes the conversion when sending :sql_last_value to the database so that it honors the jdbc_default_timezone setting. Specifically, it leverages the fact that the Sequel library handles timezone conversions properly if passed a DateTime object, and won't do any timezone conversion if passed a Time object.
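As a rough sketch of the Sequel behavior the fix leans on (assumed configuration for illustration, not the plugin's actual code):

require 'sequel'
require 'date'

# Keep timestamps as DateTime so Sequel applies timezone conversions;
# Time values would be passed through unconverted.
Sequel.datetime_class       = DateTime
Sequel.application_timezone = :local  # the process (JVM default) zone
Sequel.database_timezone    = :utc    # what jdbc_default_timezone => "UTC" implies

# With this in place, a DateTime bound into a query is converted from
# :local to :utc before it reaches the database, so :sql_last_value
# lines up with UTC-stored datetime2 columns.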

This change also refactors parts of the code for better readability and adds more tests.

Fixes #140

mtonpe commented Apr 29, 2020

> I specifically encounter this (or something similar) when using a datetime2 field as the tracking_column. The timestamp stored in the last_run_metadata_path has the full precision, but when it is used in the statement as :sql_last_value, the value sent is truncated to a precision of .fff. In this way it is a little different from what the OP posted, because milliseconds aren't lost completely; the precision is off, which results in duplicated data. If this should be logged as a separate issue, please let me know.

Has this been fixed in any version of logstash-jdbc?
