Commit
* Add PiecewiseConstantHazardDistribution
* Documentation improvements
* Add tests
* Add LogLogistics
Showing 14 changed files with 249 additions and 54 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
# LogLogistic | ||
|
||
## Definition | ||
|
||
The [LogLogistic](https://en.wikipedia.org/wiki/Log-logistic_distribution) distribution is the probability distribution of a random variable whose logarithm has a logistic distribution. It is similar in shape to the log-normal distribution but has heavier tails. | ||
|
||
It is used in survival analysis as a parametric model for events whose rate increases initially and decreases later, as, for example, mortality rate from cancer following diagnosis or treatment. It has also been used in hydrology to model stream flow and precipitation, in economics as a simple model of the distribution of wealth or income, and in networking to model the transmission times of data considering both the network and the software. | ||
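For reference, the standard closed forms of the log-logistic cdf and hazard can be written as below. This is a sketch that assumes the parametrization ``α = e^μ`` and ``β = 1/σ`` stated in the `LogLogistic` docstring of this commit; the symbols ``F`` and ``h`` are notation introduced here.

```math
F(x) = \frac{1}{1 + (x/α)^{-β}}, \qquad h(x) = \frac{(β/α)\,(x/α)^{β-1}}{1 + (x/α)^{β}}.
```

The hazard is unimodal when ``β > 1`` (that is, ``σ < 1``) and monotonically decreasing when ``β ≤ 1``, which is why the "increases initially and decreases later" shape only appears for part of the parameter range.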
|
||
## Examples | ||
|
||
Let us sample a dataset from a LogLogistic distribution: | |
|
||
```@example 1 | ||
using SurvivalDistributions, Distributions, Random, Plots, StatsBase | ||
Random.seed!(123) | ||
D = LogLogistic(1,2) | ||
sim = rand(D,1000); | ||
``` | ||
|
||
First, let's have a look at the hazard function: | ||
```@example 1 | ||
plot(t -> hazard(D,t), ylabel = "Hazard", xlims = (0,10)) | ||
``` | ||
|
||
Then, we can verify the coherence of our code by comparing the obtained sample and the true pdf: | ||
```@example 1 | ||
histogram(sim, normalize=:pdf, bins = range(0, 5, length=30)) | ||
plot!(t -> pdf(D,t), ylabel = "Density", xlims = (0,5)) | ||
``` | ||
|
||
We could also compare the empirical and theoretical cdfs: | |
```@example 1 | ||
ecdfsim = ecdf(sim) | ||
plot(x -> ecdfsim(x), 0, 5, label = "ECDF", linecolor = "gray", linewidth=3) | ||
plot!(t -> cdf(D,t), xlabel = "x", ylabel = "CDF vs. ECDF", xlims = (0,5)) | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
# Piecewise constant hazard distributions | ||
|
||
## Definition | ||
|
||
The `PiecewiseConstantHazardDistribution` is one of the simplest and yet most useful distributions provided in this package. These distributions are defined by their hazard functions, which are assumed to be piecewise constant (hence the name). | |
|
||
When dealing with census data and rate tables, a survival model defined by a piecewise constant hazard is very common. In particular, random lifetimes extracted from `RateTable`s from [`RateTables.jl`](https://github.com/JuliaSurv/RateTables.jl) follow this pattern. | |
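As a sketch of what "piecewise constant hazard" means here, read from the implementation in this commit: given interval lengths ``∂t_1, …, ∂t_n`` and rates ``λ_1, …, λ_n``, define the breakpoints ``t_0 = 0`` and ``t_i = ∂t_1 + … + ∂t_i`` (the ``t_i`` notation is introduced here for illustration). The last rate is assumed to extend beyond ``t_n``, giving an exponential tail.

```math
h(t) = \begin{cases} λ_i & t_{i-1} \le t < t_i, \\ λ_n & t \ge t_n, \end{cases}
\qquad
Λ(t) = \sum_{j < i} λ_j \, ∂t_j + λ_i \, (t - t_{i-1}) \quad \text{for } t_{i-1} \le t < t_i,
```

and the survival function follows as ``S(t) = e^{-Λ(t)}``.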
|
||
|
||
## Examples | ||
|
||
```@example 1 | ||
using SurvivalDistributions, Distributions, Random, Plots, StatsBase | ||
Random.seed!(123) | ||
∂t = rand(20) | ||
λ = rand(20) | ||
D = PiecewiseConstantHazardDistribution(∂t,λ) | ||
sim = rand(D,1000); | ||
``` | ||
|
||
First, let's have a look at the hazard function: | ||
```@example 1 | ||
plot(t -> hazard(D,t), ylabel = "Hazard", xlims = (0,10)) | ||
``` | ||
|
||
As expected, it is quite irregular, since the rates were drawn at random. | |
|
||
Then, we can verify the coherence of our code by comparing the obtained sample and the true pdf: | ||
```@example 1 | ||
histogram(sim, normalize=:pdf, bins = range(0, 5, length=30)) | ||
plot!(t -> pdf(D,t), ylabel = "Density", xlims = (0,5)) | ||
``` | ||
|
||
The comparison is not too bad! We could also compare the empirical and theoretical cdfs: | |
```@example 1 | ||
ecdfsim = ecdf(sim) | ||
plot(x -> ecdfsim(x), 0, 5, label = "ECDF", linecolor = "gray", linewidth=3) | ||
plot!(t -> cdf(D,t), xlabel = "x", ylabel = "CDF vs. ECDF", xlims = (0,5)) | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
abstract type AbstractHazardDistribution <: ContinuousUnivariateDistribution end | ||
@distr_support AbstractHazardDistribution 0.0 Inf | ||
loghazard(X::AbstractHazardDistribution, t::Real) = log(hazard(X,t)) | ||
cumhazard(X::AbstractHazardDistribution, t::Real) = quadgk(u -> hazard(X,u), 0, t)[1] | ||
logccdf( X::AbstractHazardDistribution, t::Real) = -cumhazard(X,t) | ||
ccdf( X::AbstractHazardDistribution, t::Real) = exp(-cumhazard(X,t)) | ||
cdf( X::AbstractHazardDistribution, t::Real) = -expm1(-cumhazard(X,t)) | ||
logcdf( X::AbstractHazardDistribution, t::Real) = log1mexp(-cumhazard(X,t)) | ||
pdf( X::AbstractHazardDistribution, t::Real) = hazard(X,t)*ccdf(X,t) | ||
logpdf( X::AbstractHazardDistribution, t::Real) = loghazard(X,t) - cumhazard(X,t) | ||
function quantile( X::AbstractHazardDistribution, p::Real) | |
u = log1p(-p) | |
return find_zero(x -> u + cumhazard(X,x), (0.0, Inf)) | ||
end | ||
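The generic fallbacks above all derive from the hazard. Written out, with ``h`` the hazard and ``Λ`` the cumulative hazard (the symbols ``S``, ``F``, ``f`` and ``Q`` are notation introduced here), the relations they encode are:

```math
Λ(t) = \int_0^t h(u)\,du, \qquad S(t) = e^{-Λ(t)}, \qquad F(t) = 1 - e^{-Λ(t)}, \qquad f(t) = h(t)\,S(t),
```

and the quantile ``Q(p)`` is the root in ``x`` of ``\log(1-p) + Λ(x) = 0``, which is the equation handed to `find_zero`.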
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,18 +1,13 @@ | ||
""" | ||
ExponentiatedWeibull(sigma,nu,gamma) | ||
ExponentiatedWeibull(α,θ,γ) | ||
The *ExponentiatedWeibull distribution* with scale `sigma`, shape `nu` and second shape `gamma` has probability density function | ||
The [Exponentiated Weibull distribution](https://en.wikipedia.org/wiki/Exponentiated_Weibull_distribution) is obtained by exponentiating the cdf of the [Weibull distribution](https://en.wikipedia.org/wiki/Weibull_distribution). This simple transformation adds a second shape parameter that gives the hazard function a lot of flexibility. The hazard function of the Exponentiated Weibull distribution can capture the basic shapes (constant, increasing, decreasing, bathtub, and unimodal), making it appealing for survival models. | |
```math | ||
f(x; parameters) = ... | ||
``` | ||
More details and examples of usage could be provided in this docstring. | ||
Maybe this distribution could simply be constructed from a transformation of the original Weibull ? | ||
A random variable X follows an `ExponentiatedWeibull(α,θ,γ)` distribution when it has cumulative distribution function ``F_X = F_W^{γ}`` where ``F_W`` is the cumulative distribution function of a `Weibull(α,θ)`. | ||
References: | ||
* [Link to my reference so that people understand what it is](https://myref.com) | ||
* [Exponentiated Weibull distribution](https://en.wikipedia.org/wiki/Exponentiated_Weibull_distribution) | ||
* [Weibull distribution](https://en.wikipedia.org/wiki/Weibull_distribution) | ||
""" | ||
const ExponentiatedWeibull{T} = ExpoDist{Weibull{T}} | ||
ExponentiatedWeibull(sigma,nu,gamma) = ExpoDist(gamma, Weibull(nu,sigma)) | ||
ExponentiatedWeibull(α,θ,γ) = ExpoDist(γ, Weibull(α,θ)) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,34 +1,38 @@ | ||
""" | ||
LogLogistic(mu,sigma) | ||
LogLogistic(μ,σ) | ||
To be described... | ||
According to [its Wikipedia page](https://en.wikipedia.org/wiki/Log-logistic_distribution), the log-logistic distribution (known as the Fisk distribution in economics) is a continuous probability distribution for a non-negative random variable. It is used in survival analysis as a parametric model for events whose rate increases initially and decreases later, as, for example, mortality rate from cancer following diagnosis or treatment. It has also been used in hydrology to model stream flow and precipitation, in economics as a simple model of the distribution of wealth or income, and in networking to model the transmission times of data considering both the network and the software. | |
The log-logistic distribution is the probability distribution of a random variable whose logarithm has a logistic distribution. It is similar in shape to the log-normal distribution but has heavier tails. Unlike the log-normal, its cumulative distribution function can be written in closed form. | ||
It is characterized by its density function as | ||
```math | ||
f(x; parameters) = ... | ||
f(x) = \\frac{(\\frac{β}{α})(\\frac{x}{α})^{β-1} }{(1 + (\\frac{x}{α})^{β})^2}, | ||
``` | ||
where α = e^μ and β = 1/σ. | ||
""" | ||
struct LogLogistic{T<:Real} <: ContinuousUnivariateDistribution | ||
mu::T | ||
sigma::T | ||
function LogLogistic(mu,sigma) | ||
T = promote_type(Float64, eltype.((mu,sigma))...) | ||
return new{T}(T(mu), T(sigma)) | ||
X::Logistic{T} | ||
function LogLogistic(μ,σ) | ||
X = Logistic(μ, σ) | ||
return new{eltype(X)}(X) | ||
end | ||
end | ||
LogLogistic() = LogLogistic(1,1) | ||
params(d::LogLogistic) = (d.mu,d.sigma) | ||
params(d::LogLogistic) = (d.X.μ,d.X.θ) | ||
@distr_support LogLogistic 0.0 Inf | ||
function loghazard(d::LogLogistic, t::Real) | ||
lt = log.(t) | ||
lpdf0 = logpdf.(Logistic(d.mu, d.sigma), lt) .- lt | ||
ls0 = logccdf.(Logistic(d.mu, d.sigma), lt) | ||
return lpdf0 .- ls0 | ||
lt = log(t) | ||
lpdf0 = logpdf(Logistic(d.X.μ, d.X.θ), lt) | ||
ls0 = logccdf(Logistic(d.X.μ, d.X.θ), lt) | ||
return lpdf0 - ls0 - lt | ||
end | ||
function cumhazard(d::LogLogistic,t::Real) | ||
lt = log.(t) | ||
return -logccdf.(Logistic(d.mu, d.sigma), lt) | ||
return -logccdf.(Logistic(d.X.μ, d.X.θ), lt) | ||
end | ||
logpdf(d::LogLogistic, t::Real) = loghazard(d,t) - cumhazard(d,t) | ||
cdf(d::LogLogistic, t::Real) = -expm1(-cumhazard(d,t)) | ||
cdf(d::LogLogistic, t::Real) = -expm1(-cumhazard(d,t)) | ||
rand(rng::AbstractRNG, d::LogLogistic) = exp(rand(rng,d.X)) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
struct PiecewiseConstantHazardDistribution <: AbstractHazardDistribution | ||
∂t::Vector{Float64} | ||
λ::Vector{Float64} | ||
end | ||
# The three following functions should be enough to sample efficiently from piecewise constant hazard distributions. | |
function hazard(D::PiecewiseConstantHazardDistribution, t::Real) | ||
u = 0.0 | ||
for i in 1:length(D.∂t) | ||
u += D.∂t[i] | ||
if t < u | ||
return D.λ[i] | ||
end | ||
end | ||
return D.λ[end] | ||
end | ||
function cumhazard(D::PiecewiseConstantHazardDistribution, t::Real) | ||
Λ = 0.0 | ||
u = 0.0 | ||
for j in eachindex(D.∂t) | ||
u += D.∂t[j] | ||
if t > u | ||
Λ += D.λ[j]*D.∂t[j] | ||
else | ||
Λ += D.λ[j]*(t-(u-D.∂t[j])) | ||
return Λ | ||
end | ||
end | ||
# We consider that the last box is in fact infinitely wide (exponential tail) | ||
return Λ + (t-u)*D.λ[end] | |
end | ||
function quantile(D::PiecewiseConstantHazardDistribution, p::Real) | ||
Λ_target = -log(1-p) | ||
Λ = 0.0 | ||
u = 0.0 | ||
for j in eachindex(D.∂t) | ||
Λ += D.λ[j]*D.∂t[j] | ||
u += D.∂t[j] | ||
if Λ_target < Λ | ||
u -= (Λ - Λ_target) / D.λ[j] | ||
return u | ||
end | ||
end | ||
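# Target lies beyond the last breakpoint: continue along the exponential tail, consistent with cumhazard. | |
return u + (Λ_target - Λ) / D.λ[end] | |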
end | ||
function expectation(D::PiecewiseConstantHazardDistribution) | |
S = 1.0 | ||
E = 0.0 | ||
for j in eachindex(D.∂t) | ||
if D.λ[j] > 0 | ||
S_inc = exp(-D.λ[j]*D.∂t[j]) | ||
E += S * (1 - S_inc) / D.λ[j] | ||
S *= S_inc | ||
else | ||
E += S * D.∂t[j] | ||
end | ||
end | ||
# This remainder assumes an exponential lifetime after the maximum age. | |
R = ifelse(D.λ[end] == 0.0, 0.0, S / D.λ[end]) | ||
return E + R | ||
end |
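To check that a piecewise constant `quantile` really inverts `cumhazard`, including past the last breakpoint, a standalone sketch such as the following may help. It does not use the package: `pch_cumhazard`, `pch_quantile`, `durations` and `rates` are hypothetical names standing in for the methods and the `∂t` and `λ` fields above, and it assumes the quantile should follow the same exponential-tail convention as the cumulative hazard.

```julia
# Standalone sketch: piecewise constant hazard with an exponential tail.
# `durations` and `rates` are hypothetical stand-ins for the ∂t and λ fields.

function pch_cumhazard(durations::Vector{Float64}, rates::Vector{Float64}, t::Real)
    Λ = 0.0   # cumulative hazard accumulated so far
    u = 0.0   # right edge of the current interval
    for j in eachindex(durations)
        u += durations[j]
        if t > u
            Λ += rates[j] * durations[j]                    # whole interval lies before t
        else
            return Λ + rates[j] * (t - (u - durations[j]))  # t falls inside interval j
        end
    end
    # Past the last breakpoint: the last rate is assumed to continue forever.
    return Λ + (t - u) * rates[end]
end

function pch_quantile(durations::Vector{Float64}, rates::Vector{Float64}, p::Real)
    target = -log1p(-p)   # cumulative hazard reached at the p-quantile
    Λ = 0.0
    u = 0.0
    for j in eachindex(durations)
        Λ += rates[j] * durations[j]
        u += durations[j]
        if target < Λ
            return u - (Λ - target) / rates[j]   # step back inside interval j
        end
    end
    # Target lies in the exponential tail (assumes rates[end] > 0).
    return u + (target - Λ) / rates[end]
end

durations = [1.0, 2.0, 0.5]
rates     = [0.1, 0.3, 0.2]

for p in (0.05, 0.5, 0.9)
    q = pch_quantile(durations, rates, p)
    # The cdf evaluated at the p-quantile should give back p.
    println(p, " => ", q, " (cdf at quantile: ", 1 - exp(-pch_cumhazard(durations, rates, q)), ")")
end
```

With these values the total cumulative hazard at the last breakpoint is 0.8, so `p = 0.9` exercises the tail branch while the two smaller probabilities fall inside the intervals.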
7ce0e7d
@JuliaRegistrator register
7ce0e7d
Registration pull request created: JuliaRegistries/General/106966
Tip: Release Notes
Did you know you can add release notes too? Just add markdown formatted text underneath the comment after the text
"Release notes:" and it will be added to the registry PR, and if TagBot is installed it will also be added to the
release that TagBot creates. i.e.
To add them here just re-invoke and the PR will be updated.
Tagging
After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.
This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via: