Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More Converting operations #133

Merged
merged 7 commits into from
Aug 9, 2022
Merged

More Converting operations #133

merged 7 commits into from
Aug 9, 2022

Conversation

Kopilov
Copy link
Contributor

@Kopilov Kopilov commented Aug 1, 2022

When working with specific local settings, we still can get data in international format. And it should be parsed rather than rejected. Here we have fallback on parsing doubles with POSIX number format. (This was made for convertTo function but also works on primary parsing. For example, readCsvWithFloats test now passes on my computer without switching to empty locale manually.) May be, we should have similar logic in date, time and some other parsers.

Also some new converters (to Boolean and LocalTime) have been added.

@koperagen
Copy link
Collaborator

Hi! Thank you for the PR. Parsing numbers is quite an old problem here :) #9

@Kopilov
Copy link
Contributor Author

Kopilov commented Aug 4, 2022

Review current version, please.

Also do you think we should make rendering test locale-independent as was suggested here?

Locale.setDefault(Locale.forLanguageTag("ru-RU"))

columnDot.convertTo<Double>().shouldBe(columnOf(12.345, 67.89))
columnComma.convertTo<Double>().shouldBe(columnOf(12345.0, 67890.0))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be 12.345, 67.890? Looks suspicious that default convertTo works identically for en_US and ru_RU.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks! But I have not found a better way than calling DataColumn<String>.convertToDouble directly from universal convertTo<>.

val options = if (locale != null) ParserOptions(
locale = locale
) else null
return parserToDoubleWithOptions.toConverter(options)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use parserToDoubleWithOptions.applyOptions here, i think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Originally I used toConverter for getting TypeConversionException to be thrown. Now it is thrown in .convertToDouble function and this is renamed to getDoubleParser.

*/
@JvmName("convertToDoubleFromStringNullable")
public fun DataColumn<String?>.convertToDouble(locale: Locale? = null): DataColumn<Double?> {
if (locale is Locale) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think?

val converter = if (locale != null) {
    Parsers.getDoubleConverter(locale)
} else {
    Parsers.getDoubleConverter()
}

Copy link
Contributor Author

@Kopilov Kopilov Aug 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then simply call map { converter(it) }? This doesn't seem to work because we wanted the scalar converter not to use fallback logic. And it is implemented here now…

@koperagen
Copy link
Collaborator

Review current version, please.

Also do you think we should make rendering test locale-independent as was suggested here?

Sure. I guess right now it's impossible to run all tests on a PC with locale that uses comma? Would be nice to fix it

@Kopilov
Copy link
Contributor Author

Kopilov commented Aug 7, 2022

Sure. I guess right now it's impossible to run all tests on a PC with locale that uses comma?

Until now I used export LANG=C.UTF-8 command in bash session to avoid problems.

Would be nice to fix it

Done. Double printing is unified by DecimalFormatSymbols.getInstance().decimalSeparator and LocalDate parsing is forced by using en_US explicitly.

@koperagen koperagen merged commit e8ce4b1 into Kotlin:master Aug 9, 2022
@Kopilov Kopilov deleted the converting branch August 9, 2022 13:28
@Jolanrensen Jolanrensen added this to the 0.9.0 milestone Dec 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants