clean_names does not produce the expected names for some columns #25

Closed
durraniu opened this issue Aug 9, 2023 · 6 comments

Comments

@durraniu

durraniu commented Aug 9, 2023

Three column names in my dataframe are Vehicle_ID, Frame_ID, and Lane_ID. When I use @clean_names, other columns are formatted just like R's janitor::clean_names(), but the aforementioned columns are formatted as vehicle_i_d, frame_i_d, and lane_i_d. The data I read is in a parquet file. I used the following code:

using ParquetFiles, DataFrames, Tidier

# Read the Parquet file into a DataFrame
df = DataFrame(load("data/df_raw.parquet"))

# Standardize the column names
df = @chain df begin
    @clean_names
end

julia> first(df)
DataFrameRow
 Row │ vehicle_i_d  frame_i_d  total_frames  global_time    local_x  local_y  global_x   globa ⋯
     │ Int64        Int64      Int64         Int64          Float64  Float64  Float64    Float ⋯
─────┼──────────────────────────────────────────────────────────────────────────────────────────
   1 │           1         12           884  1113433136100   16.884  52.7478  6.04284e6  2.133 ⋯
                                                                              11 columns omitted
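
For readers following along, the symptom above is what a converter produces if it breaks before every uppercase letter. The sketch below is only an illustration of that failure mode and of one possible acronym-aware alternative. Both `naive_snake` and `acronym_aware_snake` are hypothetical helpers written for this example; they are not taken from Cleaner.jl or TidierData.jl, whose actual implementation may differ.

```julia
# Hypothetical helpers for illustration only; NOT Cleaner.jl's implementation.

# Naive rule: insert an underscore before every uppercase letter
# that is not the first character of the name.
function naive_snake(name::AbstractString)
    s = replace(name, r"(?<=.)([A-Z])" => s"_\1")
    s = replace(s, r"_+" => "_")   # collapse repeated underscores
    return lowercase(s)
end

# Acronym-aware rule: treat a run of uppercase letters as one word.
function acronym_aware_snake(name::AbstractString)
    # Break where a lowercase letter or digit is followed by an uppercase letter.
    s = replace(name, r"([a-z0-9])([A-Z])" => s"\1_\2")
    # Break where an uppercase run is followed by a capitalized word.
    s = replace(s, r"([A-Z]+)([A-Z][a-z])" => s"\1_\2")
    s = replace(s, r"_+" => "_")   # collapse repeated underscores
    return lowercase(s)
end

naive_snake("Vehicle_ID")          # "vehicle_i_d"
acronym_aware_snake("Vehicle_ID")  # "vehicle_id"
acronym_aware_snake("Frame_ID")    # "frame_id"
```

With the naive rule, `ID` is split into `I` and `D`, which reproduces the `vehicle_i_d` output; the second rule keeps the uppercase run together.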
@kdpsingh
Member

kdpsingh commented Aug 9, 2023

Thanks for catching this. We are wrapping the polish_names() function from the Cleaner.jl package. I would suggest filing an issue there: https://github.com/TheRoniOne/Cleaner.jl/tree/master

We could certainly re-implement this from scratch or modify that code, but I would first check whether they are interested in fixing it. Thanks.

@durraniu
Author

Thank you.

@kdpsingh
Member

I asked on the Cleaner.jl repository. We may even be able to help fix this but want to make sure the fix lives in Cleaner.jl if possible.

@kdpsingh
Member

Per the author, the Cleaner.jl issue is fixed! There's a patch released on the registry.

I just need to update the dependency version for Cleaner within TidierData for this to work. Will get that done shortly.

I'll leave this issue open until that's done on our end.

@kdpsingh
Member

kdpsingh commented Aug 14, 2023

I realized there is a dependency mismatch between DataFrames.jl v1.5+ (which TidierData.jl depends on) and the latest version of Cleaner.jl.

I need to wait for this issue to be resolved (TheRoniOne/Cleaner.jl#6) before I can update TidierData.jl to take advantage of this.

@kdpsingh
Member

This is fixed in Cleaner.jl v1.0.3, which is now on the registry. Closing the issue.

If you want to see it take effect, feel free to update the Cleaner package (and make sure TidierData is also up-to-date). I'm going to leave TidierData compatible with older versions of Cleaner to give users flexibility.
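
One way to do that update from the Julia REPL is sketched below. It assumes Cleaner is installed only as an indirect dependency (pulled in through TidierData or Tidier), which is why it updates the whole active environment and then inspects the manifest; adjust if your project lists Cleaner directly.

```julia
using Pkg

# Update every package in the active environment, within its compat bounds.
Pkg.update()

# Show the resolved Cleaner version, including when it is only an indirect
# dependency; the fix discussed in this thread is in v1.0.3.
Pkg.status("Cleaner"; mode = Pkg.PKGMODE_MANIFEST)
```

If the reported version is still older than v1.0.3, another package in the environment may be holding it back.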
