-
Notifications
You must be signed in to change notification settings - Fork 445
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parsing of mixed-cased locale ID. #361
Changes from all commits
6844ccc
3450a67
78eea4c
a5cb831
1a6731e
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,6 +15,7 @@ | |
import os | ||
import threading | ||
from collections import MutableMapping | ||
from itertools import chain | ||
|
||
from babel._compat import pickle | ||
|
||
|
@@ -24,15 +25,29 @@ | |
_dirname = os.path.join(os.path.dirname(__file__), 'locale-data') | ||
|
||
|
||
def normalize_locale(name): | ||
"""Normalize a locale ID by stripping spaces and apply proper casing. | ||
|
||
Returns the normalized locale ID string or `None` if the ID is not | ||
recognized. | ||
""" | ||
name = name.strip().lower() | ||
for locale_id in chain.from_iterable([_cache, locale_identifiers()]): | ||
if name == locale_id.lower(): | ||
return locale_id | ||
|
||
|
||
def exists(name): | ||
"""Check whether locale data is available for the given locale. Ther | ||
return value is `True` if it exists, `False` otherwise. | ||
"""Check whether locale data is available for the given locale. | ||
|
||
Returns `True` if it exists, `False` otherwise. | ||
|
||
:param name: the locale identifier string | ||
""" | ||
if name in _cache: | ||
return True | ||
return os.path.exists(os.path.join(_dirname, '%s.dat' % name)) | ||
file_found = os.path.exists(os.path.join(_dirname, '%s.dat' % name)) | ||
return True if file_found else bool(normalize_locale(name)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would iteration over in-mem objects would be faster than disk access? If so, please consider performing the in-memory check first. :) (super duper minor optimization) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That'd complicate the whole deal unnecessarily, imho. It'd become something like
which would kinda interleave normalization with this "simple" existence checking... |
||
|
||
|
||
def locale_identifiers(): | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,6 +13,8 @@ | |
|
||
import doctest | ||
import unittest | ||
import random | ||
from operator import methodcaller | ||
|
||
from babel import localedata | ||
|
||
|
@@ -73,6 +75,23 @@ def test_merge(): | |
localedata.merge(d, {1: 'Foo', 2: 'Bar'}) | ||
assert d == {1: 'Foo', 2: 'Bar', 3: 'baz'} | ||
|
||
|
||
def test_locale_identification(): | ||
for l in localedata.locale_identifiers(): | ||
assert localedata.exists(l) | ||
|
||
|
||
def test_unique_ids(): | ||
# Check all locale IDs are uniques. | ||
all_ids = localedata.locale_identifiers() | ||
assert len(all_ids) == len(set(all_ids)) | ||
# Check locale IDs don't collide after lower-case normalization. | ||
lower_case_ids = list(map(methodcaller('lower'), all_ids)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IMO
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, it would be, and more Pythonic, but that's kinda minor, and can be trivially corrected later if need be. |
||
assert len(lower_case_ids) == len(set(lower_case_ids)) | ||
|
||
|
||
def test_mixedcased_locale(): | ||
for l in localedata.locale_identifiers(): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is it necessary to iterate through all ids? Would it be sufficient to spot check some deliberately selected hardcoded values? Eg. 'Fi', 'FI_fi', 'fi_FI', 'fI_fI', 'ZH_hanT_HK' There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's another optimization that could be done (via There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. True that. I just get the heebie jeebies when I see |
||
locale_id = ''.join([ | ||
methodcaller(random.choice(['lower', 'upper']))(c) for c in l]) | ||
assert localedata.exists(locale_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, greedy method.
I'm really looking forward to #305 !
(just an observation, please ignore :)