Add svyby functionality for svymean #50

Merged on Sep 3, 2022 (2 commits)
2 changes: 2 additions & 0 deletions src/Survey.jl
@@ -20,11 +20,13 @@ include("svyhist.jl")
include("svyplot.jl")
include("dimnames.jl")
include("svyboxplot.jl")
include("svyby.jl")

export load_data
export AbstractSurveyDesign, SimpleRandomSample, StratifiedSample
export svydesign
export svyglm
export svyby
export dim, colnames, dimnames
export svymean, svytotal, svyquantile
export @formula
38 changes: 38 additions & 0 deletions src/svyby.jl
@@ -0,0 +1,38 @@
"""
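The `svyby` function applies a survey summary function, such as `svymean` or `svytotal`, to a variable within each subgroup of a survey design. Any extra arguments in `params` are passed on to that function.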

```jldoctest
julia> using Survey

julia> apisrs = load_data("apisrs");

julia> srs = SimpleRandomSample(apisrs);

julia> svyby(:api00, :cname, srs, svytotal)
38×2 DataFrame
 Row │ cname            total
     │ String15         Float64
─────┼──────────────────────────
   1 │ Kern              5736.0
   2 │ Los Angeles      29617.0
   3 │ Orange            6744.0
   4 │ San Luis Obispo    739.0
   5 │ San Francisco     1675.0
   6 │ Modoc              671.0
   7 │ Alameda           7437.0
   8 │ Solano            1869.0
  ⋮  │        ⋮            ⋮
  32 │ Kings              939.0
  33 │ Shasta            1508.0
  34 │ Yolo               475.0
  35 │ Calaveras          790.0
  36 │ Napa              1454.0
  37 │ Lake               804.0
  38 │ Merced             595.0
                 23 rows omitted
```
"""
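function svyby(formula::Symbol, by::Symbol, design::AbstractSurveyDesign, func::Function, params = [])
    # Split the design's data by the grouping column.
    gdf = groupby(design.data, by)
    # Apply `func` to each group's column vector; AsTable spreads the result into output columns.
    return combine(gdf, [formula] => ((a) -> func(a, design, params...)) => AsTable)
end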
12 changes: 12 additions & 0 deletions src/svymean.jl
@@ -18,10 +18,22 @@ function var_of_mean(x::Symbol, design::SimpleRandomSample)
    return design.fpc / design.sampsize * var(design.data[!, x])
end

function var_of_mean(x::AbstractVector, design::SimpleRandomSample)
    return design.fpc / design.sampsize * var(x)
end

function sem(x, design::SimpleRandomSample)
    return sqrt(var_of_mean(x, design))
end

function sem(x::AbstractVector, design::SimpleRandomSample)
    return sqrt(var_of_mean(x, design))
end

function svymean(x, design::SimpleRandomSample)
    return DataFrame(mean = mean(design.data[!, x]), sem = sem(x, design::SimpleRandomSample))
end

function svymean(x::AbstractVector, design::SimpleRandomSample)
    return DataFrame(mean = mean(x), sem = sem(x, design::SimpleRandomSample))
end
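
The vector methods added above are what let `svyby` hand each group's column to `svymean`. As a rough illustration of that split-apply pattern, here is a minimal sketch that does not depend on Survey.jl or DataFrames. `ToySRS`, `toy_mean`, and `toy_by` are hypothetical names invented for this example, and the design values are made up. As in the diff, each group's standard error uses the design-level `fpc` and `sampsize`, not the group's own size.

```julia
using Statistics

# Hypothetical stand-in for SimpleRandomSample: only the two fields the diff reads.
struct ToySRS
    fpc::Float64
    sampsize::Int
end

# Same formulas as the AbstractVector methods in the diff.
var_of_mean(x::AbstractVector, d::ToySRS) = d.fpc / d.sampsize * var(x)
sem(x::AbstractVector, d::ToySRS) = sqrt(var_of_mean(x, d))
toy_mean(x::AbstractVector, d::ToySRS) = (mean = mean(x), sem = sem(x, d))

# Collect values per label, then apply `func` to each group's vector.
function toy_by(values::AbstractVector, labels::AbstractVector, d::ToySRS, func::Function)
    groups = Dict{eltype(labels), Vector{eltype(values)}}()
    for (k, v) in zip(labels, values)
        push!(get!(groups, k, Vector{eltype(values)}()), v)
    end
    return Dict(k => func(v, d) for (k, v) in groups)
end

design = ToySRS(0.5, 4)  # made-up fpc and sample size
result = toy_by([1.0, 3.0, 10.0, 14.0], ["a", "a", "b", "b"], design, toy_mean)
```

With the package itself, the equivalent call would presumably be `svyby(:api00, :cname, srs, svymean)`, following the docstring example for `svytotal`.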