Merge branch 'xKDR:design_update' into design_update
sayantikaSSG authored Nov 2, 2022
2 parents 32f4860 + bc6b242 commit 437393d
Showing 21 changed files with 332 additions and 492 deletions.
168 changes: 0 additions & 168 deletions clean_examples.jl

This file was deleted.

Binary file modified docs/src/assets/hist.png
Binary file modified docs/src/assets/scatter.png
61 changes: 58 additions & 3 deletions docs/src/examples.md
@@ -1,7 +1,62 @@
# Examples

The following examples use the Academic Performance Index (API) dataset for Californian schools.
The following examples use the
[Academic Performance Index](https://r-survey.r-forge.r-project.org/survey/html/api.html)
(API) dataset for Californian schools. The data sets contain information for all schools
with at least 100 students and for various probability samples of the data.

```@docs
svyby(formula::Symbol, by, design::svydesign, func::Function, params = [])
The API program was discontinued at the end of 2018. Information is archived at
[https://www.cde.ca.gov/re/pr/api.asp](https://www.cde.ca.gov/re/pr/api.asp).

## Simple Random Sample

Firstly, a survey design needs a dataset from which to gather information. A dataset
can be loaded as a `DataFrame` using the `load_data` function:

```julia
julia> apisrs = load_data("apisrs");
```

Next, we can build a design. The most basic survey design is a simple random sample design.
A [`SimpleRandomSample`](@ref) can be instantiated by calling the constructor:

```julia
julia> srs = SimpleRandomSample(apisrs; weights = :pw)
SimpleRandomSample:
data: 200x42 DataFrame
weights: 31.0, 31.0, 31.0, ..., 31.0
probs: 0.0323, 0.0323, 0.0323, ..., 0.0323
fpc: 6194, 6194, 6194, ..., 6194
popsize: 6194
sampsize: 200
sampfraction: 0.0323
ignorefpc: false
```

With a `SimpleRandomSample` (as well as with any subtype of [`AbstractSurveyDesign`](@ref))
it is possible to calculate estimates of the mean or population total for a given variable,
along with the corresponding standard errors.

```julia
julia> svymean(:api00, srs)
1×2 DataFrame
 Row │ mean     sem
     │ Float64  Float64
─────┼──────────────────
   1 │ 656.585  9.24972

julia> svytotal(:api00, srs)
1×2 DataFrame
 Row │ total      se_total
     │ Float64    Float64
─────┼─────────────────────
   1 │ 4.06689e6   57292.8
```

The design can be tweaked by specifying the population or sample size or whether
or not to account for finite population correction (fpc). By default the weights
are equal to one, the sample size is equal to the number of rows in `data` and the
fpc is not ignored. The population size is calculated from the weights.

When `ignorefpc` is set to `false`, the `fpc` is calculated from the sample and population
sizes. When `ignorefpc` is set to `true`, the `fpc` is set to 1.
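
As a rough sanity check, the quantities printed for `srs` can be reproduced with
plain arithmetic. The sketch below does not call the package. It assumes the
textbook definitions (sampling fraction `n / N`, weight `N / n`, correction factor
`1 - n / N`) and takes the `api00` mean of 656.585 and its standard error from
row 1 of the `svymean` output. The variable names are illustrative only, and
whether the package computes `fpc` in exactly this form is an assumption.

```julia
# Sizes reported by the `srs` design shown earlier.
popsize = 6194    # N
sampsize = 200    # n

sampfraction = sampsize / popsize   # n / N, printed as 0.0323
weight = popsize / sampsize         # N / n, printed (rounded) as 31.0

# Textbook finite population correction factor (assumed form, not
# necessarily what the package stores in its `fpc` field).
fpc_factor = 1 - sampfraction

# Under simple random sampling the estimated total is N times the mean,
# and its standard error is N times the standard error of the mean.
mean_api00 = 656.585
sem_api00 = 9.24972
total_api00 = popsize * mean_api00
se_total_api00 = popsize * sem_api00

println((sampfraction, weight, fpc_factor, total_api00, se_total_api00))
```

The last two values agree, up to display rounding, with the `total` and `se_total`
columns returned by `svytotal(:api00, srs)`.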
30 changes: 27 additions & 3 deletions docs/src/index.md
@@ -10,8 +10,32 @@ This package is the Julia implementation of the [Survey package in R](https://cr

At [xKDR](https://xkdr.org/) we processed millions of records from household surveys using the survey package in R. This process took hours of computing time. By implementing the code in Julia, we are able to do the processing in seconds. In this package we have implemented the functions `svymean`, `svyquantile` and `svysum`. We have kept the syntax between the two packages similar so that we can easily move our existing code to the new language.

Documentation for [Survey](https://github.com/Survey.jl).
## Index

```@autodocs
Modules = [Survey]
```@index
Module = [Survey]
Private = false
```

## API
```@docs
load_data
AbstractSurveyDesign
SimpleRandomSample
StratifiedSample
ClusterSample
dim(design::AbstractSurveyDesign)
colnames(design::AbstractSurveyDesign)
dimnames(design::AbstractSurveyDesign)
svymean(x::Symbol, design::SimpleRandomSample)
svytotal(x::Symbol, design::SimpleRandomSample)
svyby
svyglm
svyplot(design::AbstractSurveyDesign, x::Symbol, y::Symbol; kwargs...)
svyhist(design::AbstractSurveyDesign, var::Symbol,
    bins::Union{Integer, AbstractVector} = freedman_diaconis(design, var);
    normalization = :density,
    kwargs...
)
svyboxplot(design::AbstractSurveyDesign, x::Symbol, y::Symbol; kwargs...)
```
101 changes: 0 additions & 101 deletions shikharTests.jl

This file was deleted.

4 changes: 2 additions & 2 deletions src/Survey.jl
@@ -11,19 +11,19 @@ using AlgebraOfGraphics
using CategoricalArrays

include("SurveyDesign.jl")
include("show.jl")
include("svydesign.jl")
include("svymean.jl")
include("svyquantile.jl")
include("svytotal.jl")
include("example.jl")
include("load_data.jl")
include("svyglm.jl")
include("svyhist.jl")
include("svyplot.jl")
include("dimnames.jl")
include("svyboxplot.jl")
include("svyby.jl")
include("ht.jl")
include("show.jl")

export load_data
export AbstractSurveyDesign, SimpleRandomSample, StratifiedSample