Data analysts should be able to use Text.to_case to change the case of Text values (#3302)

* Move to_upper_case and to_lower_case into to_case

* Add an export, not sure about it

* Implement title case

TODO: some more tests would be good

* Add more tests

* explain title case

* fix todo

* changelog
radeusgd authored Feb 28, 2022
1 parent a3914f3 commit 0d96f59
Showing 7 changed files with 71 additions and 48 deletions.
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -55,6 +55,8 @@
functions][3287]
- [Implemented new `Text.starts_with` and `Text.ends_with` functions, replacing
existing functions][3292]
+- [Implemented `Text.to_case`, replacing `Text.to_lower_case` and
+`Text.to_upper_case`][3302]

[debug-shortcuts]:
https://github.com/enso-org/enso/blob/develop/app/gui/docs/product/shortcuts.md#debug
@@ -85,6 +87,7 @@
[3285]: https://github.com/enso-org/enso/pull/3285
[3287]: https://github.com/enso-org/enso/pull/3287
[3292]: https://github.com/enso-org/enso/pull/3292
+[3302]: https://github.com/enso-org/enso/pull/3302

#### Enso Compiler

10 changes: 10 additions & 0 deletions distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Case.enso
@@ -0,0 +1,10 @@
+## Specifies the casing options for text conversion.
+type Case
+## All letters in lower case.
+type Lower
+
+## All letters in upper case.
+type Upper
+
+## First letter of each word in upper case, rest in lower case.
+type Title
@@ -5,6 +5,7 @@ from Standard.Builtins import Text, Prim_Text_Helpers

import Standard.Base.Data.Text.Regex
import Standard.Base.Data.Text.Regex.Mode
+import Standard.Base.Data.Text.Case
import Standard.Base.Data.Text.Line_Ending_Style
import Standard.Base.Data.Text.Split_Kind
import Standard.Base.Data.Text.Text_Sub_Range
@@ -13,6 +14,7 @@ import Standard.Base.Meta

from Standard.Builtins export Text

+export Standard.Base.Data.Text.Case
export Standard.Base.Data.Text.Split_Kind
export Standard.Base.Data.Text.Line_Ending_Style

@@ -551,7 +553,7 @@ Text.equals_ignore_case that locale=Locale.default =
used to perform case-insensitive comparisons.
Text.to_case_insensitive_key : Locale -> Text
Text.to_case_insensitive_key locale=Locale.default =
-this.to_lower_case locale . to_upper_case locale
+this.to_case Case.Lower locale . to_case Case.Upper locale

## Compare two texts to discover their ordering.
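
The hunk above keeps the lower-then-upper normalisation of `to_case_insensitive_key` and only routes it through `to_case`. As a rough illustration of why the two-step mapping is useful, here is a minimal Java sketch. It assumes the standard library's `String.toLowerCase`/`toUpperCase` in place of the ICU4J calls used by the Enso code, and `caseInsensitiveKey` is a hypothetical helper name, not part of the commit.

```java
import java.util.Locale;

public class CaseKeySketch {
    // Hypothetical helper (not from the commit): lower-case first, then
    // upper-case, mirroring the order used by to_case_insensitive_key.
    static String caseInsensitiveKey(String text, Locale locale) {
        return text.toLowerCase(locale).toUpperCase(locale);
    }

    public static void main(String[] args) {
        // "Stra\u00dfe" is "Strasse" written with the sharp s (U+00DF).
        String a = caseInsensitiveKey("Stra\u00dfe", Locale.ROOT);
        String b = caseInsensitiveKey("STRASSE", Locale.ROOT);
        // Both keys end up as "STRASSE", so the texts compare as equal.
        System.out.println(a.equals(b));
    }
}
```

Lower-casing alone would leave the sharp s in one key and "ss" in the other; the final upper-casing step expands it so both spellings share a key.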

@@ -984,40 +986,11 @@ Text.drop range =
if char_range.end == (Text_Utils.char_length this) then prefix else
prefix + Text_Utils.drop_first this char_range.end

-## ALIAS Lower Case
-
-Converts each character in `this` to lower case.
-
-Arguments:
-- locale: specifies the locale for character case mapping. Defaults to the
-`Locale.default` locale.
-
-! What is a Character?
-A character is defined as an Extended Grapheme Cluster, see Unicode
-Standard Annex 29. This is the smallest unit that still has semantic
-meaning in most text-processing applications.
-
-> Example
-Converting a text to lower case in the default locale:
-
-"My TeXt!".to_lower_case == "my text!"
-
-> Example
-Converting a text to lower case in a specified locale (here, Turkey):
-
-from Standard.Base import all
-import Standard.Base.Data.Locale
-
-example_case_with_locale = "I".to_lower_case (Locale.new "tr") == "ı"
-Text.to_lower_case : Locale.Locale -> Text
-Text.to_lower_case locale=Locale.default =
-UCharacter.toLowerCase locale.java_locale this
-
-## ALIAS Upper Case
-
-Converts each character in `this` to upper case.
+## ALIAS lower, upper, title, proper
+Converts each character in `this` to the specified case.

Arguments:
+- case_option: specifies how to convert the characters.
- locale: specifies the locale for character case mapping. Defaults to
`Locale.default`.

@@ -1026,18 +999,26 @@ Text.to_lower_case locale=Locale.default =
Standard Annex 29. This is the smallest unit that still has semantic
meaning in most text-processing applications.

+! What is title case?
+Title case capitalizes the first letter of every word and ensures that all
+the remaining letters are in lower case. Some definitions of title case
+avoid capitalizing minor words (like the article "the" in English) but this
+implementation treats all words in the same way.
+
> Example
-Converting a text to upper case in the default locale:
+Converting a text to lower case in the default locale:

-"My TeXt!".to_upper_case == "MY TEXT!"
+"My TeXt!".to_case == "my text!"

> Example
Converting a text to upper case in a specified locale:

from Standard.Base import all
import Standard.Base.Data.Locale

-example_case_with_locale = "i".to_upper_case (Locale.new "tr") == "İ"
-Text.to_upper_case : Locale.Locale -> Text
-Text.to_upper_case locale=Locale.default =
-UCharacter.toUpperCase locale.java_locale this
+example_case_with_locale = "i".to_case Case.Upper (Locale.new "tr") == "İ"
+Text.to_case : Case -> Locale -> Text
+Text.to_case case_option=Case.Lower locale=Locale.default = case case_option of
+Case.Lower -> UCharacter.toLowerCase locale.java_locale this
+Case.Upper -> UCharacter.toUpperCase locale.java_locale this
+Case.Title -> UCharacter.toTitleCase locale.java_locale this Nothing
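
The new `Text.to_case` dispatches on a case option and delegates to ICU4J's `UCharacter`. The following Java sketch shows the same three-way dispatch under stated assumptions: it uses only the Java standard library (`String.toLowerCase`/`toUpperCase` and `java.text.BreakIterator`) rather than ICU4J, the names `CaseSketch`, `Case`, `toCase` and `toTitleCase` are hypothetical, and the title-case branch is a simplification that may split words differently from ICU's `toTitleCase`.

```java
import java.text.BreakIterator;
import java.util.Locale;

public class CaseSketch {
    // Hypothetical counterpart of the Enso `Case` type.
    enum Case { LOWER, UPPER, TITLE }

    // Dispatch on the requested case, passing the locale through.
    static String toCase(String text, Case option, Locale locale) {
        switch (option) {
            case LOWER: return text.toLowerCase(locale);
            case UPPER: return text.toUpperCase(locale);
            default:    return toTitleCase(text, locale);
        }
    }

    // Simplified title casing: upper-case the first code point of every
    // segment reported by the word BreakIterator, lower-case the rest.
    // Assumption: this only approximates ICU's title-casing rules.
    static String toTitleCase(String text, Locale locale) {
        BreakIterator words = BreakIterator.getWordInstance(locale);
        words.setText(text);
        StringBuilder out = new StringBuilder();
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String segment = text.substring(start, end);
            int firstLen = Character.charCount(segment.codePointAt(0));
            out.append(segment.substring(0, firstLen).toUpperCase(locale));
            out.append(segment.substring(firstLen).toLowerCase(locale));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        System.out.println(toCase("FooBar Baz", Case.LOWER, Locale.ROOT));
        System.out.println(toCase("foo bar baz", Case.TITLE, Locale.ROOT));
        // In the Turkish locale "i" upper-cases to the dotted capital U+0130.
        System.out.println(toCase("i", Case.UPPER, turkish).equals("\u0130"));
    }
}
```

As in the Enso implementation, every branch receives the locale, which is what makes the Turkish dotted and dotless "i" mappings come out differently from the default ones.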
9 changes: 8 additions & 1 deletion distribution/lib/Standard/Base/0.0.0-dev/src/Main.enso
@@ -47,7 +47,14 @@ from project.Data.Number.Extensions export all hiding Math, String, Double
from project.Data.Noise export all hiding Noise
from project.Data.Pair export Pair
from project.Data.Range export Range
-from project.Data.Text.Extensions export Text, Split_Kind, Line_Ending_Style
+## TODO [RW] Once autoscoping is implemented or automatic imports for ADTs are
+fixed in the IDE, we should revisit if we want to export ADTs like `Case` by
+default. It may be unnecessary pollution of scope, but until the issues are
+fixed, common standard library functions are almost unusable in the GUI.
+Relevant issues:
+https://www.pivotaltracker.com/story/show/181403340
+https://www.pivotaltracker.com/story/show/181309938
+from project.Data.Text.Extensions export Text, Split_Kind, Line_Ending_Style, Case
from project.Data.Text.Matching export Case_Insensitive, Text_Matcher, Regex_Matcher
from project.Error.Common export all
from project.Error.Extensions export all
@@ -218,7 +218,7 @@ type Table
representation for this table.
default_visualization : Visualization.Id.Id
default_visualization =
-cols = this.columns.map .name . map .to_lower_case
+cols = this.columns.map .name . map name-> name.to_case Case.Lower
if cols.contains "latitude" && cols.contains "longitude" then Visualization.Id.geo_map else
if cols.contains "x" && cols.contains "y" then Visualization.Id.scatter_plot else
Visualization.Id.table
@@ -64,7 +64,7 @@ type PointData

## PRIVATE
name : Text
-name = this.to_text.to_lower_case
+name = this.to_text.to_case Case.Lower

## PRIVATE
fallback_column : Table -> Column ! No_Fallback_Column
34 changes: 28 additions & 6 deletions test/Tests/src/Data/Text_Spec.enso
@@ -300,12 +300,34 @@ spec =
'✨🚀🚧😍😃😍😎😙😉☺'.drop (Range -3 -1) . should_equal '✨🚀🚧😍😃😍😎☺'

Test.specify "should correctly convert character case" <|
-"FooBar Baz".to_lower_case.should_equal "foobar baz"
-"FooBar Baz".to_upper_case.should_equal "FOOBAR BAZ"
-"i".to_upper_case . should_equal "I"
-"I".to_lower_case . should_equal "i"
-"i".to_upper_case (Locale.new "tr") . should_equal "İ"
-"I".to_lower_case (Locale.new "tr") . should_equal "ı"
+"FooBar Baz".to_case Case.Lower . should_equal "foobar baz"
+"FooBar Baz".to_case Case.Upper . should_equal "FOOBAR BAZ"
+
+"foo bar baz".to_case Case.Title . should_equal "Foo Bar Baz"
+"foo-bar, baz.baz foo_foo".to_case Case.Title . should_equal "Foo-Bar, Baz.baz Foo_foo"
+"jAck the rippER".to_case Case.Title (Locale.uk) . should_equal "Jack The Ripper"
+
+"i".to_case Case.Upper . should_equal "I"
+"I".to_case Case.Lower . should_equal "i"
+"i".to_case Case.Upper (Locale.new "tr") . should_equal "İ"
+"I".to_case Case.Lower (Locale.new "tr") . should_equal "ı"
+"İ".to_case Case.Lower . should_equal "i̇"
+"ı".to_case Case.Upper . should_equal "I"
+
+"Straße".to_case Case.Upper . should_equal "STRASSE"
+"STRASSE".to_case Case.Lower . should_equal "strasse"
+"et cætera".to_case Case.Upper . should_equal "ET CÆTERA"
+("β".to_case Case.Upper == "B") . should_be_false
+"δλφξ".to_case Case.Upper . should_equal "ΔΛΦΞ"
+"ΔΛΦΞ".to_case Case.Lower . should_equal "δλφξ"
+"δλ φξ".to_case Case.Title . should_equal "Δλ Φξ"
+
+'✨🚀🚧😍😃😎😙😉☺'.to_case Case.Upper . should_equal '✨🚀🚧😍😃😎😙😉☺'
+'✨🚀🚧😍😃😎😙😉☺'.to_case Case.Lower . should_equal '✨🚀🚧😍😃😎😙😉☺'
+'✨🚀🚧😍😃😎😙😉☺'.to_case Case.Title . should_equal '✨🚀🚧😍😃😎😙😉☺'
+
+"123".to_case Case.Upper . should_equal "123"
+"abc123".to_case Case.Upper . should_equal "ABC123"

Test.specify "should dump utf-16 characters to a vector" <|
kshi_chars = kshi.utf_16
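
Several of the new test cases show that case conversion is not a lossless round trip: "Straße" upper-cases to "STRASSE", which lower-cases to "strasse", and the dotted capital "İ" lower-cases to "i" followed by a combining dot. A small Java sketch of the same behaviour follows. It assumes the standard library's full case mappings rather than the ICU4J calls behind `Text.to_case`, and `upperThenLower` is a hypothetical helper; non-ASCII characters are written as Unicode escapes so the source does not depend on file encoding.

```java
import java.util.Locale;

public class CaseRoundTripSketch {
    // Hypothetical helper: upper-case, then lower-case again.
    static String upperThenLower(String text) {
        return text.toUpperCase(Locale.ROOT).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "Stra\u00dfe" (sharp s) does not survive the round trip: the sharp s
        // expands to "SS" when upper-cased and comes back as "ss".
        System.out.println(upperThenLower("Stra\u00dfe"));

        // U+0130 (dotted capital I) lower-cases to "i" plus U+0307
        // (combining dot above) outside the Turkish locale.
        String dotted = "\u0130".toLowerCase(Locale.ROOT);
        System.out.println(dotted.equals("i\u0307"));

        // Greek small beta (U+03B2) upper-cases to Greek capital beta
        // (U+0392), which is a different character from Latin "B".
        System.out.println("\u03b2".toUpperCase(Locale.ROOT).equals("B"));
    }
}
```

This is why the tests compare against the exact expected strings instead of asserting that lower-casing undoes upper-casing.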