usefulness of a colwise! method? #36
Comments
Are you imagining something like this? Example:
julia> dt
10×2 DataTables.DataTable
│ Row │ A │ B │
├─────┼────┼────┤
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 2 │
│ 3 │ 3 │ 3 │
│ 4 │ 4 │ 4 │
│ 5 │ 5 │ 5 │
│ 6 │ 6 │ 6 │
│ 7 │ 7 │ 7 │
│ 8 │ 8 │ 8 │
│ 9 │ 9 │ 9 │
│ 10 │ 10 │ 10 │
julia> colwise!(normalize, dt)
10×2 DataTables.DataTable
│ Row │ A │ B │
├─────┼───────────┼───────────┤
│ 1 │ 0.0509647 │ 0.0509647 │
│ 2 │ 0.101929 │ 0.101929 │
│ 3 │ 0.152894 │ 0.152894 │
│ 4 │ 0.203859 │ 0.203859 │
│ 5 │ 0.254824 │ 0.254824 │
│ 6 │ 0.305788 │ 0.305788 │
│ 7 │ 0.356753 │ 0.356753 │
│ 8 │ 0.407718 │ 0.407718 │
│ 9 │ 0.458682 │ 0.458682 │
│ 10 │ 0.509647 │ 0.509647 │
# compared to
julia> aggregate(dt, normalize)
10×2 DataTables.DataTable
│ Row │ A_normalize │ B_normalize │
├─────┼─────────────┼─────────────┤
│ 1 │ 0.0509647 │ 0.0509647 │
│ 2 │ 0.101929 │ 0.101929 │
│ 3 │ 0.152894 │ 0.152894 │
│ 4 │ 0.203859 │ 0.203859 │
│ 5 │ 0.254824 │ 0.254824 │
│ 6 │ 0.305788 │ 0.305788 │
│ 7 │ 0.356753 │ 0.356753 │
│ 8 │ 0.407718 │ 0.407718 │
│ 9 │ 0.458682 │ 0.458682 │
│ 10 │ 0.509647 │ 0.509647 │
That should be pretty straightforward to do. What would be the preferable behavior for functions that reduce to scalars, like sum, length, or extrema? Fail on a dimension mismatch, or resize the DataTable to a single row? (I'm not sure if
Example:
julia> colwise!(length, dt)
1×2 DataTables.DataTable
│ Row │ A │ B │
├─────┼────┼────┤
│ 1 │ 10 │ 10 │
# or
Error: DimensionMismatch: columns cannot be resized in place, use aggregate(dt, function) instead
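To make the two options above concrete, here is a minimal sketch of the "fail on a dimension mismatch" behavior. It is not the DataTables API: `colwise_sketch!` is a hypothetical name, and a `Dict` of `Float64` vectors stands in for a DataTable so the example runs on its own. It also assumes a current Julia, where `normalize` lives in `LinearAlgebra`.

```julia
using LinearAlgebra  # provides `normalize` on current Julia versions

# Hypothetical stand-in for an in-place colwise!: the "table" is a Dict
# mapping column names to Float64 vectors, not a real DataTable.
function colwise_sketch!(f, cols::Dict{Symbol,Vector{Float64}})
    # Compute every result first so a failure leaves all columns untouched.
    results = Dict(name => f(col) for (name, col) in cols)
    for (name, r) in results
        # Reducers such as `length`, `sum`, or `extrema` do not return a
        # vector of the original length, so they are rejected here.
        if !(r isa AbstractVector) || length(r) != length(cols[name])
            throw(DimensionMismatch("column $name cannot be resized in place"))
        end
    end
    for (name, r) in results
        copyto!(cols[name], r)  # overwrite the existing column storage
    end
    return cols
end

dt = Dict(:A => collect(1.0:10.0), :B => collect(1.0:10.0))
colwise_sketch!(normalize, dt)   # columns now have unit Euclidean norm
```

Calling `colwise_sketch!(length, dt)` on this stand-in throws a `DimensionMismatch` and leaves the columns unchanged. The alternative, shrinking the table to a single row, would need to replace the column vectors rather than overwrite them, which is arguably no longer an in-place operation.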
Though thinking about it more, I'm not sure how useful such a function would be. The other issue with such a function is that
Reading your concerns here and thinking about it more, I am also not sure this functionality would be broadly useful. It came out of a natural impulse to work with a DataTable in place because this is so idiomatic in Julia.
Yes, that would make sense. Maybe a bit surprising that you would be able to modify an iterator, but doesn't sound like a big issue.
Leaving a breadcrumb to https://stackoverflow.com/questions/44235201/shortcut-to-transforming-a-dataframe, which provides an example of a use case for this.
DataTables implement a colwise method, but no colwise!. There aren't a massive number of use cases for this, but I can think of e.g. data centering and normalization. If there are no major technical obstacles to making such a function, I think it'd make a nice addition.
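For reference, the centering and normalization use case can be sketched without any table package. The example below assumes the columns are available as plain `Float64` vectors (the `columns` variable is purely illustrative) and uses only `Statistics` and `LinearAlgebra` from the standard library.

```julia
using Statistics      # mean
using LinearAlgebra   # normalize!, norm

# Illustrative stand-in for a table's columns: a vector of Float64 vectors.
columns = [collect(1.0:10.0), collect(1.0:10.0)]

for col in columns
    col .-= mean(col)   # center the column in place
    normalize!(col)     # scale to unit Euclidean norm in place
end
```

Both steps keep the column length and element type, which is what makes them candidates for an in-place method. A function that changes either, such as a scalar reduction, would not fit the same pattern.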