This repository has been archived by the owner on Dec 11, 2022. It is now read-only.

DataFrames cannot be written to CSV #52

Closed
tpoisot opened this issue Feb 19, 2021 · 4 comments

@tpoisot
Member

tpoisot commented Feb 19, 2021

CSV.write refuses to write nothing values to a file. I think it would be acceptable to remove all rows with a nothing in values, right?

@gabrieldansereau
Member

Are you talking about the DataFrame() overload?

It's really just returning all the values from the layer grid in a DataFrame, so the nothing values are the same as in the grid, yes. You can remove them with filter! if you want, then export with CSV.write.

using SimpleSDMLayers, DataFrames, CSV

temperature = worldclim(1)
temperature_df = DataFrame(temperature)
# Drop the grid cells without data before writing
filter!(x -> !isnothing(x.values), temperature_df)
CSV.write("test1.csv", temperature_df)

@gabrieldansereau
Member

gabrieldansereau commented Feb 19, 2021

Do you mean we should instead modify the DataFrame() overload so it doesn't return the nothing values?

I like the behaviour as it is. To me it's more intuitive like this, with the overload returning a DataFrame with the values for all grid cells, which we can then filter or not. It's similar to the raster package in R.

@tpoisot
Member Author

tpoisot commented Feb 20, 2021

I agree with the general idea. The only point of friction I can see is that missing values in DataFrames should be missing, not nothing. That being said, you have used the package more than I have, so if the behavior makes sense to you, let's keep it. The ASCII read/write methods in #54 are also going to offer another way to export data.
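
To illustrate the missing versus nothing distinction, here is a minimal sketch in plain Julia. It uses no package code, and the vector `cells` and its values are made up for the example; they only stand in for a layer grid.

```julia
# Hypothetical grid values; `nothing` marks cells without data.
cells = Union{Nothing, Float64}[12.5, nothing, 9.0]

# Swap the sentinel: every `nothing` becomes `missing`.
values_with_missing = replace(cells, nothing => missing)

# `skipmissing` and `coalesce` act on `missing` only; they leave `nothing` alone.
observed = collect(skipmissing(values_with_missing))
filled = coalesce.(values_with_missing, 0.0)
```

Most of the missing-data tooling in Base and DataFrames is built around missing, which is why the choice of sentinel matters for downstream code.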

@tpoisot tpoisot closed this as completed Feb 20, 2021
@gabrieldansereau
Member

Reopening this.

After working with the DataFrames overload for a while, I agree it would be simpler to use missing, not nothing. missing has better support in the DataFrames functions, and I find converting from nothing to missing unintuitive and a bit of a pain (see below), especially when the goal is to remove the missing values.

Since #101 and v0.7.0 already make for a breaking release, I'll change this at the same time so that DataFrame(layer) returns missing for values which are nothing in the layer.

using SimpleSDMLayers
using DataFrames

layer = SimpleSDMPredictor(WorldClim, BioClim, 1)

df = DataFrame([layer, layer])
allowmissing!(df)
for col in [:x1, :x2]
    replace!(df[!, col], nothing => missing)
end
dropmissing(df, [:x1, :x2])
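
For comparison, a slightly shorter variant of the conversion above, as a sketch only. The DataFrame here is built by hand with made-up values so the snippet runs without downloading a layer; the `longitude` and `latitude` column names are an assumption, while `x1` and `x2` follow the example above. Assigning the result of the non-mutating `replace` widens the column type on its own, so `allowmissing!` is not needed.

```julia
using DataFrames

# Stand-in for DataFrame([layer, layer]); all values are hypothetical.
df = DataFrame(
    longitude = [-75.0, -74.5, -74.0],
    latitude = [45.0, 45.0, 45.0],
    x1 = Union{Nothing, Float32}[1.5f0, nothing, 3.0f0],
    x2 = Union{Nothing, Float32}[nothing, 2.0f0, 4.0f0],
)

# Replace each layer column with a copy in which `nothing` is `missing`.
for col in [:x1, :x2]
    df[!, col] = replace(df[!, col], nothing => missing)
end

# Keep only the rows where both layers have a value.
complete = dropmissing(df, [:x1, :x2])
```

This is still two steps per export, which is the friction the proposed change to DataFrame(layer) would remove.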
