Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Renaming and refactoring DataKey, DataEntry, and other traits; adding ErasedDataProvider #431

Merged
merged 19 commits into from
Jan 5, 2021

Conversation

sffc
Copy link
Member

@sffc sffc commented Dec 23, 2020

Fixes #243
Fixes #244
Fixes #422
Fixes #433

This PR makes a number of changes to the icu_provider crate.

  • The DataProvider crate now has a type parameter, allowing it to be used with generic types and zero-copy deserialization. Types that accept a DataProvider should accept the generic version, which is auto-implemented for the type-erased version.
  • The type-erased version is now named ErasedDataProvider, and it lives in the erased module.
  • New names for DataKey and friends: a DataRequest encapsulates a ResourcePath. A ResourcePath encapsulates a ResourceKey (compile time) and a ResourceOptions (runtime). The ResourceKey is const and copyable; the ResourceOptions is neither.
  • Organized the symbols better in the icu_provider crate, with spot improvements to API.
  • Added a new built-in provider named HelloWorldProvider.
  • Made the LanguageIdentifier field in ResourceOptions (formerly DataEntry) optional.
  • Added benchmarks to provider_fs.

@sffc sffc requested review from zbraniecki and a team as code owners December 23, 2020 08:29
@codecov-io
Copy link

codecov-io commented Dec 23, 2020

Codecov Report

Merging #431 (0a1ee06) into master (90f7381) will decrease coverage by 0.72%.
The diff coverage is 63.51%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #431      +/-   ##
==========================================
- Coverage   75.56%   74.83%   -0.73%     
==========================================
  Files          92       90       -2     
  Lines        4534     4641     +107     
==========================================
+ Hits         3426     3473      +47     
- Misses       1108     1168      +60     
Impacted Files Coverage Δ
components/datetime/src/format.rs 100.00% <ø> (ø)
components/datetime/src/lib.rs 100.00% <ø> (ø)
components/plurals/src/data.rs 79.54% <ø> (ø)
components/plurals/src/lib.rs 66.66% <ø> (ø)
components/provider/src/error.rs 0.00% <0.00%> (ø)
components/provider/src/lib.rs 100.00% <ø> (ø)
components/provider/src/structs/dates.rs 0.00% <ø> (-44.69%) ⬇️
.../provider_cldr/src/download/cldr_paths_download.rs 34.37% <ø> (ø)
components/provider_cldr/src/lib.rs 100.00% <ø> (ø)
components/provider_cldr/src/support.rs 11.76% <0.00%> (ø)
... and 24 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 90f7381...0a1ee06. Read the comment docs.

@coveralls
Copy link

coveralls commented Dec 23, 2020

Pull Request Test Coverage Report for Build d8325f0e02aa3eb26a2accdd518dc1c82d781e69-PR-431

  • 329 of 520 (63.27%) changed or added relevant lines in 18 files are covered.
  • 26 unchanged lines in 8 files lost coverage.
  • Overall coverage decreased (-0.7%) to 74.02%

Changes Missing Coverage Covered Lines Changed/Added Lines %
components/provider/src/inv.rs 24 25 96.0%
components/provider_fs/src/fs_data_provider.rs 24 25 96.0%
components/provider/src/structs/icu4x.rs 5 7 71.43%
components/provider_fs/src/bin/icu4x-cldr-export.rs 0 2 0.0%
resources/testdata/src/bin/icu4x-gen-testdata.rs 0 2 0.0%
components/provider_fs/src/export/fs_exporter.rs 5 8 62.5%
components/provider/src/hello_world.rs 47 51 92.16%
components/provider/src/structs/plurals.rs 1 6 16.67%
components/provider_cldr/src/support.rs 0 5 0.0%
components/provider/src/data_provider.rs 13 19 68.42%
Files with Coverage Reduction New Missed Lines %
components/provider_cldr/src/transform/dates.rs 1 69.03%
components/provider_cldr/src/transform/mod.rs 1 12.5%
components/provider_cldr/src/transform/plurals.rs 1 56.64%
components/provider_fs/src/deserializer.rs 1 41.67%
components/provider/src/data_provider.rs 1 61.9%
components/provider_cldr/src/support.rs 2 11.76%
components/provider/src/error.rs 3 0%
components/provider/src/structs/dates.rs 16 0%
Totals Coverage Status
Change from base Build 90f7381e0cbea53b38d7b6d53279903e4f19efe5: -0.7%
Covered Lines: 3587
Relevant Lines: 4846

💛 - Coveralls

@sffc sffc requested a review from filmil as a code owner December 24, 2020 07:01
@sffc sffc changed the title Renaming and refactoring DataKey, DataEntry, and other traits Renaming and refactoring DataKey, DataEntry, and other traits; adding ErasedDataProvider Dec 24, 2020
@sffc
Copy link
Member Author

sffc commented Dec 24, 2020

I updated the PR to also add ErasedDataProvider. There is a failing test, which is pending an answer to this question:

I have a trait with several methods. Some of the methods have default implementations, and I want to override those implementations if a generic type parameter satisfies a trait bound. For example, I have a method set_to_default(&mut self), which returns an error by default, but if the struct implements Default, then I want to have a custom implementation of that method.

Here's what I tried, but I am getting an error: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=d3ace0821bbc3527c8586e45f9ef1200

trait Foo<T> {
    fn method1(&self) -> T;
    fn method2(&self) -> Option<T> {
        None
    }
}

struct Bar<T> where T: Clone {
    data: T,
}

impl<T> Foo<T> for Bar<T> where T: Clone {
    fn method1(&self) {
        self.data.clone()
    }
}

impl<T> Foo<T> for Bar<T> where T: Default {
    fn method2(&self) {
        Some(T::default())
    }
}

@sffc
Copy link
Member Author

sffc commented Dec 24, 2020

I also included Criterion benchmarks comparing the generic version of DataProvider to the dynamic ErasedDataProvider that goes through Any. The results:


     Running /home/sffc/projects/omnicu/target/release/deps/provider_fs-410d506882edbd70
json/overview           time:   [19.138 us 19.361 us 19.586 us]                           
                        change: [-2.5016% -0.9742% +0.6057%] (p = 0.22 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild

json/generic            time:   [12.444 us 12.620 us 12.797 us]                          
                        change: [-4.5120% -2.4169% -0.5153%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild

json/erased             time:   [13.270 us 13.425 us 13.580 us]                         
                        change: [-1.1545% +0.2018% +1.6537%] (p = 0.78 > 0.05)
                        No change in performance detected.

bincode/generic         time:   [10.014 us 10.173 us 10.330 us]                             
                        change: [-6.3577% -4.8819% -3.4220%] (p = 0.00 < 0.05)
                        Performance has improved.

bincode/erased          time:   [10.523 us 10.647 us 10.774 us]                            
                        change: [-0.9207% +0.3940% +1.7870%] (p = 0.57 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) low mild
  2 (2.00%) high mild

In table form:

Data Format Runtime, Generic Runtime, Erased
JSON 12.620 us 13.425 us
Bincode 10.173 us 10.647 us

So, Generic is faster than Erased, and Bincode is faster than JSON, but the differences are not earth-shattering.

@sffc sffc marked this pull request as draft December 29, 2020 03:51
@sffc
Copy link
Member Author

sffc commented Dec 29, 2020

I plan to resolve this problem by requiring Default to be implemented when ErasedDataProvider is used, but not when plain DataProvider is used.

@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • components/provider/src/data_receiver.rs is no longer changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@Manishearth
Copy link
Member

Got the test compiling at https://github.com/Manishearth/omnicu/tree/works

@sffc
Copy link
Member Author

sffc commented Dec 31, 2020

Okay, this is ready for review. When reviewing, you can focus on (1) the API docs in the icu_provider crate and (2) code changes outside of that crate. There are a lot of code changes inside the crate, but I feel pretty confident about them.

@sffc sffc marked this pull request as ready for review December 31, 2020 09:10
zbraniecki
zbraniecki previously approved these changes Jan 4, 2021
Copy link
Member

@zbraniecki zbraniecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! Minor nits and suggestions inline

components/provider/Cargo.toml Outdated Show resolved Hide resolved
components/provider/src/data_provider.rs Show resolved Hide resolved
components/provider/src/hello_world.rs Show resolved Hide resolved
components/provider/src/resource.rs Outdated Show resolved Hide resolved
components/provider/src/structs/icu4x.rs Show resolved Hide resolved
@sffc sffc requested a review from zbraniecki January 5, 2021 22:37
@sffc sffc merged commit 8a8893b into unicode-org:master Jan 5, 2021
@sffc sffc deleted the resource branch January 5, 2021 23:01
@sffc sffc mentioned this pull request Mar 16, 2021
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants