-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
uncompact #41
uncompact #41
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this could be useful, in particular if we decide not to widen the reference type by default. It's always good to be able to give a simple solution to the user from error messages.
""" | ||
uncompact{T, N}(A::CatArray{T, N}) = | ||
convert(arraytype(typeof(A)){T, N, DefaultRefType}, A) | ||
uncompact{T}(P::CategoricalPool{T}) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should go into pool.jl. But it is really needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really, no. I'll remove it
uncompact(A::NullableCategoricalArray) | ||
|
||
Return a copy of categorical array `A` using the default reference type. If `A` is using | ||
a small reference type (such as UInt8 or UInt16) the uncompact array will have room for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Backticks around type names. Also, "uncompacted" would be better.
@@ -182,5 +182,5 @@ ordered!(pool::CategoricalPool, ordered) = (pool.ordered = ordered; pool) | |||
# LevelsException | |||
function Base.showerror{T, R}(io::IO, err::LevelsException{T, R}) | |||
levs = join(repr.(err.levels), ", ", " and ") | |||
print(io, "cannot store level(s) $levs since reference type $R can only hold $(typemax(R)) levels. Convert categorical array to a larger reference type to add more levels.") | |||
print(io, "cannot store level(s) $levs since reference type $R can only hold $(typemax(R)) levels. Convert categorical array to a larger reference type to add more levels (see uncompact).") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would simply say "Use the uncompact function to add more levels".
uncompact(A::CategoricalArray) | ||
uncompact(A::NullableCategoricalArray) | ||
|
||
Return a copy of categorical array `A` using the default reference type. If `A` is using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"default reference type ($DefaultRefType)" should work.
a small reference type (such as UInt8 or UInt16) the uncompact array will have room for | ||
more levels. | ||
|
||
Avoid using compact to avoid having to call uncompact. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"compact" should be a link, see how @ref
is used elsewhere. Though I find this phrasing a bit weird. Maybe something along the lines of "To avoid the need to call uncompact
, ensure compact
is not called during creation or importation of data as categorical arrays."?
@test x == A(string.(Char.(65:318))) | ||
lev = copy(levels(x)) | ||
levels!(x, vcat(lev, "az")) | ||
@test levels(x) == vcat(lev, "az") | ||
x2 = uncompact(x) | ||
levels!(x2, vcat(levels(x2), "bz", "cz")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put this test in a different block. You don't actually need to test calling levels!
: just ensure the type matches what is expected (the rest is tested elsewhere).
Current coverage is 86.75% (diff: 100%)@@ master #41 diff @@
==========================================
Files 8 8
Lines 437 438 +1
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 379 380 +1
Misses 58 58
Partials 0 0
|
Thanks! I think I'll rename |
Return a copy of categorical array
A
using the default reference type. This is needed when adding levels beyond what a compact pool has room for. Added a suggestion to useuncompact
when hitting aLevelsException
.Related: JuliaData/DataFrames.jl#990
Example:
Using
uncompact
in the first place would probably be an anti-pattern so I'm not even sure if it's a good idea to add this function. None-the-less I did run into this myself when I usedcompact
to keep the array as small as possible in the inner loop yet wanted to merge my output with other results later. I also believe it fits well with the description of what to do when you hit aLevelsException
.