Exhaustively testing SimpleRandomSample, software eng improvements #94

Merged
merged 27 commits into from
Nov 26, 2022
Commits
a34aa52
Merge branch 'xKDR:design_update' into design_update
smishr Nov 10, 2022
4c7bbf6
Merge pull request #3 from smishr/domain_stratified_sample
smishr Nov 22, 2022
e6a90de
Merge branch 'xKDR:design_update' into design_update
smishr Nov 22, 2022
191f84e
Merge branch 'xKDR:design_update' into design_update
smishr Nov 22, 2022
f2dd1d2
Update SRS struct, Add type checking, reorder
smishr Nov 23, 2022
a3936b0
Remove accidental twice fn declaration
smishr Nov 23, 2022
6bc4094
Editing sayantika SRS tests
smishr Nov 23, 2022
fa8a01a
Add and reorder type checks on weights
smishr Nov 23, 2022
3cdf066
Edited weights and probs checking
smishr Nov 23, 2022
94fd34f
Add ErrorException to @test_throws lines
smishr Nov 23, 2022
36b35c0
popsize sampsize Unsigned not Integer
smishr Nov 23, 2022
e1149da
Add testthrows for type checking keywords
smishr Nov 23, 2022
39c060a
ingorefpc improvements still not 100%, revert T,S parametric types
smishr Nov 25, 2022
79af6ab
line by line adding tests SRS
smishr Nov 25, 2022
9331edd
Ran Julia Formatter
smishr Nov 25, 2022
9aca851
Change Float64 to <:Real as sometime Int64
smishr Nov 25, 2022
9a42e84
Add popsize Symbol StratifiedSample
smishr Nov 25, 2022
94586d1
Fixed Strat and SurveyDesign tests for time being
smishr Nov 25, 2022
0eb9f6e
Fix testing suite with ignorefpc=true
smishr Nov 25, 2022
c0c36c2
Update src/SurveyDesign.jl
smishr Nov 26, 2022
8eefd75
Update src/SurveyDesign.jl
smishr Nov 26, 2022
60be750
Update src/SurveyDesign.jl
smishr Nov 26, 2022
d92c63d
Update src/SurveyDesign.jl
smishr Nov 26, 2022
2751ae3
Update src/SurveyDesign.jl
smishr Nov 26, 2022
6a2fff3
Update src/SurveyDesign.jl
smishr Nov 26, 2022
b9f09a1
Update src/SurveyDesign.jl
smishr Nov 26, 2022
3ea908c
Update src/SurveyDesign.jl
smishr Nov 26, 2022
ingorefpc improvements still not 100%, revert T,S parametric types
smishr committed Nov 25, 2022
commit 39c060a7b4b530dfd54ce0e5f1f3be2bc3548ae5
71 changes: 40 additions & 31 deletions src/SurveyDesign.jl
@@ -34,25 +34,25 @@ abstract type AbstractSurveyDesign end
 If `popsize` not given, `weights` or `probs` must be given, so that in combination
 with `sampsize`, `popsize` can be calculated.
 """
-struct SimpleRandomSample{T<:Real, S<:Unsigned} <: AbstractSurveyDesign
+struct SimpleRandomSample <: AbstractSurveyDesign
     data::AbstractDataFrame
-    sampsize::Union{S,Nothing}
-    popsize::Union{S,Nothing}
-    sampfraction::T
-    fpc::T
+    sampsize::Union{Unsigned,Nothing}
+    popsize::Union{Unsigned,Nothing}
+    sampfraction::Float64
+    fpc::Float64
     ignorefpc::Bool
     function SimpleRandomSample(data::AbstractDataFrame;
         popsize=nothing,
-        sampsize=nrow(data),
+        sampsize=nrow(data) |> UInt,
         weights=nothing,
         probs=nothing,
         ignorefpc=false
     )
         # Only valid argument types given to constructor
-        argtypes_weights = Union{Nothing,Symbol,Vector{<:R} where R<:Real}
-        argtypes_probs = Union{Nothing,Symbol,Vector{<:R} where R<:Real}
-        argtypes_popsize = Union{Nothing,Symbol,<:Unsigned,Vector{<:R} where R<:Real}
-        argtypes_sampsize = Union{Nothing,Symbol,<:Unsigned,Vector{<:R} where R<:Real}
+        argtypes_weights = Union{Nothing,Symbol,Vector{Float64}}
+        argtypes_probs = Union{Nothing,Symbol,Vector{Float64}}
+        argtypes_popsize = Union{Nothing,Symbol,<:Unsigned,Vector{Float64}}
+        argtypes_sampsize = Union{Nothing,Symbol,<:Unsigned,Vector{Float64}}
         # If any invalid type raise error
         if !(isa(weights,argtypes_weights))
             error("Invalid type of argument given for `weights` argument")
@@ -72,9 +72,9 @@ struct SimpleRandomSample{T<:Real, S<:Unsigned} <: AbstractSurveyDesign
             probs = data[!, probs]
         end
         # If weights/probs vector not numeric/real, ie. string column passed for weights, then raise error
-        if !isa(weights, Union{Nothing,Vector{<:Real}})
+        if !isa(weights, Union{Nothing,Vector{Float64}})
             error("Weights should be Vector{<:Real}. You passed $(typeof(weights))")
-        elseif !isa(probs, Union{Nothing,Vector{<:Real}})
+        elseif !isa(probs, Union{Nothing,Vector{Float64}})
             error("Sampling probabilities should be Vector{<:Real}. You passed $(typeof(probs))")
         end
         # If popsize given as Symbol or Vector, check all records equal
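Note on the hunk above: narrowing the check from `Vector{<:Real}` to `Vector{Float64}` changes which inputs are accepted, because Julia's parametric types are invariant. The following is a minimal sketch in plain Julia (no DataFrames); `is_valid_weights` and the sample vectors are hypothetical names used only for illustration, not part of the package.

```julia
# Parametric types in Julia are invariant: a Vector{Int64} is not a
# Vector{Float64}, even though both element types are subtypes of Real.
int_weights = [1, 2, 3]            # Vector{Int64}
float_weights = [1.0, 2.0, 3.0]    # Vector{Float64}

@assert !(int_weights isa Vector{Float64})
@assert float_weights isa Vector{Float64}

# The wider check accepts any real-valued element type...
@assert int_weights isa Vector{<:Real}
@assert float_weights isa Vector{<:Real}
# ...but still rejects non-numeric columns such as strings.
@assert !(["a", "b"] isa Vector{<:Real})

# Hypothetical helper mirroring the narrowed check used in this commit.
is_valid_weights(w) = isa(w, Union{Nothing,Vector{Float64}})
```

Under the narrowed check an integer-valued weights column would be rejected. A later commit in this PR (9aca851, "Change Float64 to <:Real as sometime Int64") appears to address this case.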
@@ -83,7 +83,7 @@ struct SimpleRandomSample{T<:Real, S<:Unsigned} <: AbstractSurveyDesign
                 error("popsize must be same for all observations in Simple Random Sample")
             end
             popsize = first(data[!,popsize]) |> UInt
-        elseif isa(popsize , Vector{<:Real})
+        elseif isa(popsize , Vector{Float64})
             if !all(w -> w == first(popsize), popsize)
                 error("popsize must be same for all observations in Simple Random Sample")
             end
@@ -95,7 +95,7 @@ struct SimpleRandomSample{T<:Real, S<:Unsigned} <: AbstractSurveyDesign
                 error("sampsize must be same for all observations in Simple Random Sample")
             end
             sampsize = first(data[!,sampsize]) |> UInt
-        elseif isa(sampsize , Vector{<:Real})
+        elseif isa(sampsize , Vector{Float64})
             if !all(w -> w == first(sampsize), sampsize)
                 error("sampsize must be same for all observations in Simple Random Sample")
             end
@@ -106,22 +106,16 @@ struct SimpleRandomSample{T<:Real, S<:Unsigned} <: AbstractSurveyDesign
             probs = 1 ./ weights
             data[!, :probs] = probs
         end
-        # If ignorefpc then set weights to 1 ??
-        # TODO: This works under some cases, but should find better way to process ignoring fpc correction
-        if ignorefpc
-            @warn "assuming all weights are equal to 1.0"
-            weights = ones(nrow(data))
-        end
         # popsize must be nothing or <:Integer by now
         if isnothing(popsize)
             # If popsize not given, fallback to weights, probs and sampsize to estimate `popsize`
             @warn "Using weights/probs and sampsize to estimate `popsize`"
             # Check that all weights (or probs if weights not given) are equal, as SRS is by definition equi-weighted
-            if typeof(weights) <: Vector{<:Real}
+            if typeof(weights) <: Vector{Float64}
                 if !all(w -> w == first(weights), weights)
                     error("all frequency weights must be equal for Simple Random Sample")
                 end
-            elseif typeof(probs) <: Vector{<:Real}
+            elseif typeof(probs) <: Vector{Float64}
                 if !all(p -> p == first(probs), probs)
                     error("all probability weights must be equal for Simple Random Sample")
                 end
@@ -134,20 +128,35 @@ struct SimpleRandomSample{T<:Real, S<:Unsigned} <: AbstractSurveyDesign
             if sampsize > popsize
                 error("population size was estimated to be greater than given sampsize. Please check input arguments.")
             end
-        elseif typeof(popsize) <: Integer
+        elseif typeof(popsize) <: Unsigned
             weights = fill(popsize / sampsize, nrow(data)) # If popsize is given, weights vector is made concordant with popsize and sampsize, regardless of given weights argument
         else
-            error("something went wrong")
+            error("Something went wrong. Please check validity of inputs.")
         end
+        # If ignorefpc then set weights to 1 ??
+        # TODO: This works under some cases, but should find better way to process ignoring fpc
+        if ignorefpc
+            @warn "assuming all weights are equal to 1.0"
+            weights = ones(nrow(data))
+            probs = 1 ./ weights
+        end
         # sum of weights must equal to `popsize` for SRS
-        if !ignorefpc && !isnothing(weights) && !(isapprox(sum(weights), popsize; atol = 1e-4))
-            @show sum(1 ./ weights)
-            error("Sum of inverse of sampling weights must be equal to `popsize` for Simple Random Sample")
+        if !isnothing(weights) && !(isapprox(sum(weights), popsize; atol = 1e-4))
+            if ignorefpc && !(isapprox(sum(weights), sampsize; atol = 1e-4)) # Change if ignorefpc functionality changes
+                error("Sum of sampling weights should be equal to `sampsize` for Simple Random Sample with ignorefpc")
+            elseif !ignorefpc
+                @show sum(weights)
+                error("Sum of sampling weights must be equal to `popsize` for Simple Random Sample")
+            end
         end
         # sum of probs must equal popsize for SRS
-        if !ignorefpc && !isnothing(probs) && !(isapprox(sum(1 ./ probs), popsize; atol = 1e-4))
-            @show sum(probs)
-            error("Sum of probability weights must be equal to `popsize` for Simple Random Sample")
+        if !isnothing(probs) && !(isapprox(sum(1 ./ probs), popsize; atol = 1e-4))
+            if ignorefpc && !(isapprox(sum(1 ./ probs), sampsize; atol = 1e-4)) # Change if ignorefpc functionality changes
+                error("Sum of inverse sampling probabilities should be equal to `sampsize` for Simple Random Sample with ignorefpc")
+            elseif !ignorefpc
+                @show sum(1 ./ probs)
+                error("Sum of inverse of sampling probabilities must be equal to `popsize` for Simple Random Sample")
+            end
         end
         ## Set remaining parts of data structure
         # set sampling fraction
@@ -161,7 +170,7 @@ struct SimpleRandomSample{T<:Real, S<:Unsigned} <: AbstractSurveyDesign
         end
         data[!, :probs] = probs
         # Initialise the structure
-        new{Float64,Unsigned}(data, sampsize, popsize, sampfraction, fpc, ignorefpc)
+        new(data, sampsize, popsize, sampfraction, fpc, ignorefpc)
     end
 end
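
The weight-sum checks introduced in this commit can be read in isolation from the constructor. The sketch below is one standalone interpretation of that logic in plain Julia; `check_srs_weights` is a hypothetical helper, not part of the package, and it returns a `Bool` instead of raising an error. It assumes `popsize` and `sampsize` are already known scalars.

```julia
# Hypothetical helper: returns true when the weights are consistent with the
# design, following the branch structure of the checks added in this commit.
function check_srs_weights(weights, popsize, sampsize; ignorefpc=false)
    # No weights supplied: nothing to validate.
    isnothing(weights) && return true
    # Weights that sum to popsize pass regardless of ignorefpc.
    isapprox(sum(weights), popsize; atol=1e-4) && return true
    # Sum differs from popsize: without ignorefpc this is an error case.
    ignorefpc || return false
    # With ignorefpc the weights are all 1.0, so they should sum to sampsize.
    return isapprox(sum(weights), sampsize; atol=1e-4)
end

sampsize, popsize = UInt(4), UInt(20)

# Weights built as in the diff: popsize / sampsize repeated for each row.
w = fill(popsize / sampsize, 4)
@assert check_srs_weights(w, popsize, sampsize)

# Unit weights only pass when ignorefpc is set.
@assert check_srs_weights(ones(4), popsize, sampsize; ignorefpc=true)
@assert !check_srs_weights(ones(4), popsize, sampsize)
```

The `atol = 1e-4` tolerance is taken from the diff; whether that is the right tolerance for large populations is a design question this sketch does not settle.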