Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cmd/influx/write): add new processing options and enhancements #18779

Merged
merged 15 commits into from
Sep 12, 2020

Conversation

sranka
Copy link
Contributor

@sranka sranka commented Jun 29, 2020

Closes #18744 : indicate a loss of precision when parsing long/unsignedLong data types

When parsing long or unsignedLong value from a string value with fraction digits, the whole CSV row fails when in a strict mode, which is turned on using a column dataFormat that starts with strict, such as long:strict. A warning is printed when not in a strict mode, saying line x: column y: '1.2' truncated to '1' to fit into long data type. Package documentation explains strict parsing.

Closes #18742 : add the ability to write rejected records to a specific file

There is a new --error-file option in influx write

   --errors-file string   The path to the file to write rejected rows to

The file then contains rows than cannot be imported because of data quality issues, CSV comments are also written to let the user know of what was wrong, for example:

# error : line 3: column 'a': '1.1' cannot fit into long data type
cpu,1.1

Closes #18741: add the ability to construct a timestamp from multiple columns

A new #concat annotation adds a new column that is concatenated from existing columns according to bash-like string interpolation literal with variables referencing existing column labels. For example: #concat,string,fullName,${firstName} ${lastName}

@sranka sranka changed the title 18744/csv2lp feat(pkg/csv2lp): indicate a loss of precision when parsing long/unsignedLong data types Jun 29, 2020
@sranka sranka changed the title feat(pkg/csv2lp): indicate a loss of precision when parsing long/unsignedLong data types feat(cmd/influx/write): new processing options and enhancements Jun 30, 2020
@sranka sranka changed the title feat(cmd/influx/write): new processing options and enhancements feat(cmd/influx/write): add new processing options and enhancements Jun 30, 2020
@sranka sranka marked this pull request as ready for review July 2, 2020 08:41
pkg/csv2lp/csv2lp_test.go Outdated Show resolved Hide resolved
@GeorgeMac
Copy link
Contributor

Thanks for addressing my concern. It looks much better.
I would defer to the other reviewers who may be more familiar with this part. As I don't have much context on this part of the codebase.
But my static pass of the code looks good from my end.

@russorat russorat requested review from glinton and removed request for jsteenb2, sebito91 and GeorgeMac September 1, 2020 16:19
Comment on lines 119 to 121
columnLabel := placeholder[2 : len(placeholder)-1] // ${columnLabel}
if columnLabel == "$" {
return nil // ${$} is a way to print $
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stray comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 there is no need to do so

func normalizeNumberString(value string, format string, removeFraction bool) string {
if len(format) == 0 {
func normalizeNumberString(value string, format string, removeFraction bool) (normalized string, truncated bool) {
if format == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if format == "" {
if len(format) == 0 {

the former method is technically faster

@@ -110,20 +112,20 @@ func normalizeNumberString(value string, format string, removeFraction bool) str
}
if c == fractionRune {
if removeFraction {
break ForAllCharacters
return retVal.String(), true
Copy link
Contributor

@glinton glinton Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i assume this new behavior is intended (not looping through values again)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is intended, the function was changed to return also a truncated indicator, which is true only in this case

@sranka sranka merged commit be8b2a9 into master Sep 12, 2020
@sranka sranka deleted the 18744/csv2lp branch September 12, 2020 09:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants