Merge branch 'design_update' into design_update
Showing 20 changed files with 305 additions and 317 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,62 @@ | ||
# Examples | ||
|
||
The following examples use the Academic Performance Index (API) dataset for Californian schools. | ||
The following examples use the | ||
[Academic Performance Index](https://r-survey.r-forge.r-project.org/survey/html/api.html) | ||
(API) dataset for Californian schools. The data sets contain information for all schools | ||
with at least 100 students and for various probability samples of the data. | ||
|
||
```@docs | ||
svyby(formula::Symbol, by, design::svydesign, func::Function, params = []) | ||
The API program was discontinued at the end of 2018. Information is archived at | ||
[https://www.cde.ca.gov/re/pr/api.asp](https://www.cde.ca.gov/re/pr/api.asp) | ||
|
||
## Simple Random Sample | ||
|
||
Firstly, a survey design needs a dataset from which to gather information. A dataset | ||
can be loaded as a `DataFrame` using the `load_data` function: | ||
|
||
```julia | ||
julia> apisrs = load_data("apisrs"); | ||
``` | ||
|
||
Next, we can build a design. The most basic survey design is a simple random sample design. | ||
A [`SimpleRandomSample`](@ref) can be instantiated by calling the constructor: | ||
|
||
```julia | ||
julia> srs = SimpleRandomSample(apisrs; weights = :pw) | ||
SimpleRandomSample: | ||
data: 200x42 DataFrame | ||
weights: 31.0, 31.0, 31.0, ..., 31.0 | ||
probs: 0.0323, 0.0323, 0.0323, ..., 0.0323 | ||
fpc: 6194, 6194, 6194, ..., 6194 | ||
popsize: 6194 | ||
sampsize: 200 | ||
sampfraction: 0.0323 | ||
ignorefpc: false | ||
``` | ||
|
||
With a `SimpleRandomSample` (as well as with any subtype of [`AbstractSurveyDesign`](@ref)) | ||
it is possible to calculate estimates of the mean or population total for a given variable, | ||
along with the corresponding standard errors. | ||
|
||
```julia | ||
julia> svymean(:api00, srs) | ||
1×2 DataFrame | ||
Row │ mean sem | ||
│ Float64 Float64 | ||
─────┼────────────────── | ||
1 │ 656.585 9.24972 | ||
|
||
julia> svytotal(:api00, srs) | ||
1×2 DataFrame | ||
Row │ total se_total | ||
│ Float64 Float64 | ||
─────┼───────────────────── | ||
1 │ 4.06689e6 57292.8 | ||
``` | ||
|
||
The design can be adjusted by specifying the population size, the sample size, or | ||
whether to apply the finite population correction (fpc). By default the weights | ||
are equal to one, the sample size is equal to the number of rows in `data` and the | ||
fpc is not ignored. The population size is calculated from the weights. | ||
|
||
When `ignorefpc` is set to `false`, the `fpc` is calculated from the sample and population | ||
sizes. When `ignorefpc` is set to `true`, the `fpc` is set to 1. |
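
The figures printed in the outputs above can be cross-checked with a few lines of plain Julia. This is only a sketch: it assumes that, under simple random sampling, the inclusion probability is the sampling fraction and the weight is its reciprocal, and that the total and its standard error are the mean and its standard error scaled by the population size. The variable names are illustrative and are not part of the package.

```julia
# Values copied from the `SimpleRandomSample` output shown above.
popsize = 6194
sampsize = 200

# Assumption: every unit has the same inclusion probability,
# and the weight is the reciprocal of that probability.
prob = sampsize / popsize      # sampling fraction
weight = 1 / prob              # same as popsize / sampsize

# Estimates copied from the `svymean(:api00, srs)` output above.
mean_api00 = 656.585
sem_api00 = 9.24972

# Assumption: the total and its standard error are the mean and its
# standard error multiplied by the population size.
total_api00 = popsize * mean_api00
se_total_api00 = popsize * sem_api00

println("prob     = ", round(prob; digits = 4))
println("weight   = ", round(weight; digits = 1))
println("total    = ", total_api00)
println("se_total = ", se_total_api00)
```

Under these assumptions the results agree, to the printed precision, with the `probs`, `weights`, `total` and `se_total` values shown above.
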
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
const PKG_DIR = joinpath(pathof(Survey), "..", "..") |> normpath | ||
asset_path(args...) = joinpath(PKG_DIR, "assets", args...) | ||
|
||
""" | ||
load_data(name) | ||
Load a dataset as a `DataFrame`. | ||
All available datasets can be found in the [`assets/`](https://github.com/xKDR/Survey.jl/tree/main/assets) | ||
directory. | ||
```jldoctest | ||
julia> apisrs = load_data("apisrs") | ||
200×40 DataFrame | ||
Row │ Column1 cds stype name sname ⋯ | ||
│ Int64 Int64 String1 String15 String ⋯ | ||
─────┼────────────────────────────────────────────────────────────────────────── | ||
1 │ 1039 15739081534155 H McFarland High McFarland High ⋯ | ||
2 │ 1124 19642126066716 E Stowers (Cecil Stowers (Cecil B.) E | ||
3 │ 2868 30664493030640 H Brea-Olinda Hig Brea-Olinda High | ||
4 │ 1273 19644516012744 E Alameda Element Alameda Elementary | ||
5 │ 4926 40688096043293 E Sunnyside Eleme Sunnyside Elementary ⋯ | ||
6 │ 2463 19734456014278 E Los Molinos Ele Los Molinos Elementa | ||
7 │ 2031 19647336058200 M Northridge Midd Northridge Middle | ||
8 │ 1736 19647336017271 E Glassell Park E Glassell Park Elemen | ||
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱ | ||
194 │ 4880 39686766042782 E Tyler Skills El Tyler Skills Element ⋯ | ||
195 │ 993 15636851531987 H Desert Junior/S Desert Junior/Senior | ||
196 │ 969 15635291534775 H North High North High | ||
197 │ 1752 19647336017446 E Hammel Street E Hammel Street Elemen | ||
198 │ 4480 37683386039143 E Audubon Element Audubon Elementary ⋯ | ||
199 │ 4062 36678196036222 E Edison Elementa Edison Elementary | ||
200 │ 2683 24657716025621 E Franklin Elemen Franklin Elementary | ||
36 columns and 185 rows omitted | ||
``` | ||
""" | ||
function load_data(name) | ||
name = name * ".csv" | ||
@assert name ∈ readdir(asset_path()) | ||
|
||
CSV.read(asset_path(name), DataFrame, missingstring="NA") | ||
end |
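
`load_data` checks the dataset name with `@assert`, which the Julia documentation notes may be disabled at some optimization levels and which reports only the failed expression. One possible alternative is sketched below. `dataset_path` is a hypothetical helper, not part of Survey.jl, and it takes the directory as an argument so that it runs without the package installed.

```julia
# Hypothetical helper, not part of Survey.jl: resolve a dataset name to a
# CSV path inside `dir`, throwing a descriptive error for unknown names.
function dataset_path(name::AbstractString, dir::AbstractString)
    file = name * ".csv"
    files = readdir(dir)
    if file ∉ files
        # List the dataset names that do exist, without the ".csv" suffix.
        known = sort([first(splitext(f)) for f in files if endswith(f, ".csv")])
        throw(ArgumentError("unknown dataset \"$name\"; available: " * join(known, ", ")))
    end
    return joinpath(dir, file)
end

# Usage with a temporary directory standing in for `assets/`.
mktempdir() do dir
    touch(joinpath(dir, "apisrs.csv"))
    println(dataset_path("apisrs", dir))
end
```

The returned path could then be handed to `CSV.read` exactly as in `load_data` above.
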