Add lower-level data API to PluralRules #575

Merged: 3 commits into unicode-org:main, Apr 8, 2021

Manishearth
Member

Progress on #560

This is a draft of how the lower-level data API will look. Want to get confirmation on this before adding FFI versions and also perhaps adding the same API to other types.

Manishearth requested a review from sffc on March 25, 2021, 23:05

/// Lower level helper that allows extracting PluralRules-relevant data from a data provider
/// without constructing PluralRules
pub fn get_plural_data<'d, D: DataProvider<'d, provider::PluralRuleStringsV1<'d>> + ?Sized>(
Member Author

These are on PluralRules instead of DataProvider because we can still use slightly higher level types here.

I could instead move this to DataProvider and have it return PluralRuleStringsV1 if people prefer.

Member

I have a slight preference toward putting this on the Provider data struct (that's what I do in DTF), but that may be because I don't yet see the value you describe in keeping it here.

Member Author

@zbraniecki if it's on Provider, then I have to return PluralRuleStringsV1 instead of the converted type.

Member

Member Author

Ah. Should I be pulling out a PluralRulesV1 or a PluralRuleList?

I kinda prefer it on the type itself because it's all in one place then

Member

I'd advocate for having a function in the icu_plurals::provider module (probably in a sub module) that returns PluralRuleStringsV1.

Why PluralRuleStringsV1?

  1. It's public
  2. The same architecture (returning data structs) can be used in other crates
  3. No dependencies outside of the provider module
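The shape being proposed can be sketched roughly as below. This is only an illustration under stated assumptions: `RuleStringsProvider`, `StaticProvider`, and `resolve_plural_data` are hypothetical names, the struct is a cut-down stand-in for the real PluralRuleStringsV1, and the real DataProvider trait has a different signature.

```rust
use std::borrow::Cow;

/// Cut-down stand-in for the public data struct (the real one has more fields).
#[derive(Debug, Clone, PartialEq)]
pub struct PluralRuleStringsV1<'s> {
    pub one: Option<Cow<'s, str>>,
    pub many: Option<Cow<'s, str>>,
}

/// Hypothetical, minimal provider trait used only for this sketch.
pub trait RuleStringsProvider<'d> {
    fn load(&self, langid: &str) -> Option<PluralRuleStringsV1<'d>>;
}

/// Free function that would live in a submodule of `provider`.
/// It returns the public data struct and depends on nothing outside the provider module.
pub fn resolve_plural_data<'d, D: RuleStringsProvider<'d> + ?Sized>(
    provider: &D,
    langid: &str,
) -> Option<PluralRuleStringsV1<'d>> {
    provider.load(langid)
}

/// Toy provider with hard-coded English data, for illustration only.
pub struct StaticProvider;

impl RuleStringsProvider<'static> for StaticProvider {
    fn load(&self, langid: &str) -> Option<PluralRuleStringsV1<'static>> {
        match langid {
            "en" => Some(PluralRuleStringsV1 {
                one: Some(Cow::Borrowed("i = 1 and v = 0")),
                many: None,
            }),
            _ => None,
        }
    }
}
```

Because the function only mentions the provider trait and the data struct, the same pattern could in principle be repeated in other crates without pulling in their higher-level types.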

@codecov-io

Codecov Report

Merging #575 (fe25b4f) into main (30a3909) will increase coverage by 0.00%.
The diff coverage is 95.65%.

@@           Coverage Diff           @@
##             main     #575   +/-   ##
=======================================
  Coverage   74.22%   74.22%           
=======================================
  Files         128      128           
  Lines        7840     7845    +5     
=======================================
+ Hits         5819     5823    +4     
- Misses       2021     2022    +1     
Impacted Files | Coverage Δ
components/plurals/src/lib.rs | 91.17% <95.65%> (+4.96%) ⬆️
components/plurals/src/data.rs | 72.72% <0.00%> (-6.82%) ⬇️
components/provider_ppucd/src/parse_ppucd.rs | 93.13% <0.00%> (+0.13%) ⬆️

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 30a3909...fe25b4f. Read the comment docs.

@coveralls

coveralls commented Mar 25, 2021

Pull Request Test Coverage Report for Build 180c8335e579ba3eaba9ffc5fa7e06a217d74a2b-PR-575

  • 23 of 24 (95.83%) changed or added relevant lines in 2 files are covered.
  • 4 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.02%) to 72.697%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
components/plurals/src/lib.rs | 9 | 10 | 90.0%

Files with Coverage Reduction | New Missed Lines | %
components/provider_ppucd/src/parse_ppucd.rs | 1 | 93.0%
components/plurals/src/data.rs | 3 | 72.73%

Totals
Change from base Build 653c3ec23b76715caf9e50bc248828fac2049ee2: -0.02%
Covered Lines: 6928
Relevant Lines: 9530

💛 - Coveralls

@Manishearth
Member Author

Updated to use PluralRuleStringsV1.

Unfortunately, this means we need to return a Cow, since the alternative is being forced to clone the data before returning it.

Either way this will be boxed on the FFI side I guess, so it doesn't matter all that much.
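The trade-off can be illustrated with a small sketch. Everything here is assumed for illustration: `Cache` and `RuleStrings` are hypothetical types, not the actual provider or data struct. The point is only that a Cow return type lets the function hand back either a borrow of data it already holds or freshly built data, without cloning in the first case.

```rust
use std::borrow::Cow;

/// Hypothetical simplified data struct.
#[derive(Debug, Clone, PartialEq)]
pub struct RuleStrings {
    pub one: Option<String>,
}

/// Hypothetical source that stores some data and builds other data on demand.
pub struct Cache {
    en: RuleStrings,
}

impl Cache {
    pub fn new() -> Self {
        Cache {
            en: RuleStrings { one: Some("i = 1 and v = 0".to_string()) },
        }
    }

    /// Returns a borrow when the data is already stored,
    /// and an owned value when it has to be constructed.
    pub fn get(&self, langid: &str) -> Option<Cow<'_, RuleStrings>> {
        match langid {
            // Already stored: lend it out, no clone needed.
            "en" => Some(Cow::Borrowed(&self.en)),
            // Pretend this was deserialized on the fly: the caller gets ownership.
            "fr" => Some(Cow::Owned(RuleStrings { one: Some("i = 0,1".to_string()) })),
            _ => None,
        }
    }
}
```

If `get` returned a plain `RuleStrings` instead, the "en" branch would have to clone the stored value on every call.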

use std::borrow::Cow;

impl<'s> super::PluralRuleStringsV1<'s> {
    pub fn new_from_provider<D: DataProvider<'s, super::PluralRuleStringsV1<'s>> + ?Sized>(
Member

I'm torn on whether to put this function on the data struct itself, or whether to make it a function within the module. I think I tend to prefer "pure functions" with well-defined inputs and outputs. Another reason to prefer standalone functions is that in other, more complicated components, these functions might want to do things like return multiple data structs or return an enum with one of several structs depending on the input. So, associating the function with a single data struct might not be the most generalizable choice.
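To make the generalization argument concrete, here is a minimal sketch of a standalone function that returns one of several data structs depending on its input. All names (`ShortNamesV1`, `LongNamesV1`, `ResolvedNames`, `Width`, `resolve_names`) are hypothetical and do not correspond to any existing component.

```rust
/// Hypothetical data struct for abbreviated names.
#[derive(Debug, PartialEq)]
pub struct ShortNamesV1 {
    pub names: Vec<&'static str>,
}

/// Hypothetical data struct for full names.
#[derive(Debug, PartialEq)]
pub struct LongNamesV1 {
    pub names: Vec<&'static str>,
}

/// The function's output: one of several data structs.
#[derive(Debug, PartialEq)]
pub enum ResolvedNames {
    Short(ShortNamesV1),
    Long(LongNamesV1),
}

/// Hypothetical option that selects which struct is needed.
#[derive(Debug, Clone, Copy)]
pub enum Width {
    Short,
    Long,
}

/// A free function is not tied to a single struct,
/// so it can return whichever variant the input calls for.
pub fn resolve_names(width: Width) -> ResolvedNames {
    match width {
        Width::Short => ResolvedNames::Short(ShortNamesV1 { names: vec!["Mon", "Tue"] }),
        Width::Long => ResolvedNames::Long(LongNamesV1 { names: vec!["Monday", "Tuesday"] }),
    }
}
```

An associated function on `ShortNamesV1` would have no natural way to return `LongNamesV1`, which is the limitation the comment above points at.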

@@ -46,3 +46,35 @@ pub struct PluralRuleStringsV1<'s> {
    )]
    pub many: Option<Cow<'s, str>>,
}

mod convert {
Member

Hmm. Basically what we want is a way to map "locale + options => data struct". Module name brainstorm:

  • resolver
  • resolution
  • mapping
  • mapper
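As a rough illustration of the "locale + options => data struct" mapping, the sketch below resolves a language identifier plus a rule type to a data struct. It is a toy under stated assumptions: `resolve`, `RuleStrings`, and the hard-coded table are hypothetical, the language identifier is a plain string, and only the "one" category is modeled.

```rust
/// The option that, together with the locale, selects the data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PluralRuleType {
    Cardinal,
    Ordinal,
}

/// Hypothetical simplified data struct holding only the "one" rule.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RuleStrings {
    pub one: Option<&'static str>,
}

/// Maps (locale, options) to a data struct; returns None for unknown locales.
pub fn resolve(langid: &str, rule_type: PluralRuleType) -> Option<RuleStrings> {
    let one = match (langid, rule_type) {
        ("en", PluralRuleType::Cardinal) => Some("i = 1 and v = 0"),
        ("en", PluralRuleType::Ordinal) => Some("n % 10 = 1 and n % 100 != 11"),
        // Japanese only uses the "other" category, so there is no "one" rule.
        ("ja", _) => None,
        // Locale not present in this toy table.
        _ => return None,
    };
    Some(RuleStrings { one })
}
```

Whichever module name is chosen, the function's inputs and output stay the same: a locale, the options, and the resulting data struct.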

Member Author

Yeah, since it was a method, the module was private and I didn't care about the name, but if I'm making it a function I will.

Member Author

Addressed in the latest commit.

(I also rebased over master but did not change the commit structure, unsure if GitHub's diffs handle that well)

/// data obtained from a provider
pub fn new_from_data<'d>(
    langid: LanguageIdentifier,
    data: &PluralRuleStringsV1<'d>,
Member

I think this should be taken by value. In most cases we should take ownership of the data struct to avoid unnecessary clones; the caller has no use for the data struct after calling new_from_data.

Member Author

There's no unnecessary clone happening here, is there? We're parsing the data struct.

Member

If the data struct owns its own strings, then passing by reference (&) means the strings need to be cloned in the general case. For PluralRuleStringsV1 in particular, maybe this doesn't matter.

I guess the question is whether we should make these methods "always take by value", "always take by reference", or "case-by-case basis"?

Member Author

Bear in mind, we usually get a Cow out, so if we wanted to take ownership we would be forced to clone
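The two calling conventions discussed in this thread can be compared with a small sketch. The types and the "parsing" are hypothetical stand-ins (`RuleStrings`, `ParsedRules`, and a toy count of conditions), not the real PluralRules construction.

```rust
use std::borrow::Cow;

/// Hypothetical simplified data struct.
#[derive(Debug, Clone, PartialEq)]
pub struct RuleStrings {
    pub one: Option<String>,
}

/// Hypothetical parse result: just the number of "and"-joined conditions.
#[derive(Debug, PartialEq)]
pub struct ParsedRules {
    pub one_conditions: usize,
}

/// Parsing only reads the strings, so a reference is enough and nothing is cloned.
pub fn parse_by_ref(data: &RuleStrings) -> ParsedRules {
    let one_conditions = data
        .one
        .as_deref()
        .map_or(0, |rule| rule.split(" and ").count());
    ParsedRules { one_conditions }
}

/// Taking ownership gives the same result, but a caller holding
/// a borrowed Cow must clone (via into_owned) to call it.
pub fn parse_by_value(data: RuleStrings) -> ParsedRules {
    parse_by_ref(&data)
}

/// Demonstrates both call styles starting from a borrowed Cow.
pub fn demo() -> (ParsedRules, ParsedRules) {
    let stored = RuleStrings { one: Some("i = 1 and v = 0".to_string()) };
    let cow: Cow<'_, RuleStrings> = Cow::Borrowed(&stored);
    let by_ref = parse_by_ref(&cow); // deref coercion, no clone
    let by_value = parse_by_value(cow.into_owned()); // clones the borrowed data
    (by_ref, by_value)
}
```

When the constructor needs to keep the strings rather than parse them, the balance shifts toward taking the struct by value, which is the general case raised earlier in the thread.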

Member Author

Filed #615, once that is resolved we can revisit this since the data model will change.

@jira-pull-request-webhook

Notice: the branch changed across the force-push!

  • components/plurals/src/lib.rs is different
  • components/plurals/src/provider.rs is different
  • components/plurals/src/provider/resolver.rs is now changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@jira-pull-request-webhook

Notice: the branch changed across the force-push!

  • components/plurals/src/lib.rs is different
  • components/plurals/src/provider/resolver.rs is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

Manishearth marked this pull request as ready for review on April 7, 2021, 18:19
sffc self-requested a review on April 7, 2021, 20:14
Manishearth merged commit 03be3a1 into unicode-org:main on Apr 8, 2021
Manishearth deleted the lower-level branch on April 8, 2021, 02:11
5 participants