Skip to content
This repository has been archived by the owner on Dec 11, 2022. It is now read-only.

Commit

Permalink
Merge pull request #128 from EcoJulia/tpoisot/issue127
Browse files Browse the repository at this point in the history
Fix the BRT vignette
  • Loading branch information
tpoisot authored Oct 13, 2021
2 parents eb7156d + e51e606 commit 4e54da1
Show file tree
Hide file tree
Showing 2 changed files with 48 additions and 31 deletions.
4 changes: 2 additions & 2 deletions Project.toml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
name = "SimpleSDMLayers"
uuid = "2c645270-77db-11e9-22c3-0f302a89c64c"
authors = ["Timothée Poisot <[email protected]>", "Gabriel Dansereau <[email protected]>"]
version = "0.8.0"
version = "0.8.1"

[deps]
ArchGDAL = "c9ce4bd3-c3d5-55b8-8973-c0e20141b8c3"
Expand All @@ -19,8 +19,8 @@ ZipFile = "a5390f91-8eb1-5f08-bee0-b1d1ffed6cea"

[compat]
ArchGDAL = "0.7"
Colors = "0.12"
ColorBlendModes = "0.2"
Colors = "0.12"
Distances = "0.10"
Downloads = "1.4"
GeometryBasics = "0.4"
Expand Down
75 changes: 46 additions & 29 deletions docs/src/sdm/brt.jl
Original file line number Diff line number Diff line change
Expand Up @@ -108,8 +108,9 @@ distribution

top5_var = importance(model, collect(layernames(WorldClim, BioClim)))[1:5]

# This is an interesting alternative to VIF for variable selection, but also
# shows the very strong impact of the mean temperature in the wettest quarter:
# This is an interesting alternative to VIF for variable selection. Let's
# examine how the most important variable relates to the predicted distribution
# score:

most_important_layer = findfirst(isequal(top5_var[1].first), collect(layernames(WorldClim, BioClim)))
histogram(
Expand Down Expand Up @@ -156,41 +157,57 @@ xaxis!("Prediction score")
# that optimizes Youden's J (Cohen's κ is also a suitable alternative):

cutoff = LinRange(extrema(distribution)..., 500);
J = similar(cutoff);

# The loop to measure everything is fairly simple, as we already know the
# correct positions of presences and absences:
obs = y .> 0

tp = zeros(Float64, length(cutoff));
fp = zeros(Float64, length(cutoff));
tn = zeros(Float64, length(cutoff));
fn = zeros(Float64, length(cutoff));

for (i, c) in enumerate(cutoff)
p_presence = distribution[xy_presence] .>= c
p_absence = distribution[xy_absence] .>= c
tp = sum(p_presence)
fp = length(p_presence) - tp
fn = sum(p_absence)
tn = length(p_absence) - fn
J[i] = tp / (tp + fn) + tn / (tn + fp) - 1
prd = distribution[xy] .>= c
tp[i] = sum(prd .& obs)
tn[i] = sum(.!(prd) .& (.!obs))
fp[i] = sum(prd .& (.!obs))
fn[i] = sum(.!(prd) .& obs)
end

# We can finally replace the `NaN` values by the random estimate, and look at
# the plot:
# From this, we can calculate a number of validation measures:

tpr = tp ./ (tp .+ fn);
fpr = fp ./ (fp .+ tn);
J = (tp ./ (tp .+ fn)) + (tn ./ (tn .+ fp)) .- 1.0;
ppv = tp ./ (tp .+ fp);

# The ROC-AUC is an overall measure of how good the fit is:

dx = [reverse(fpr)[i] - reverse(fpr)[i - 1] for i in 2:length(fpr)]
dy = [reverse(tpr)[i] + reverse(tpr)[i - 1] for i in 2:length(tpr)]
AUC = sum(dx .* (dy ./ 2.0))

# We can pick the value of the cutoff that maximizes J:

thr_index = last(findmax(J))
τ = cutoff[thr_index]

# Let's have a look at the ROC curve:

J[isnan.(J)] .= 0.5
plot(cutoff, J; lab="", fill=(0, 0.5, :grey), c=:grey)
xaxis!(extrema(distribution), "Threshold")
yaxis!((0.5, 1), "Informedness")
plot(fpr, tpr; aspectratio=1, frame=:box, lab="", dpi=600, size=(400, 400))
scatter!([fpr[thr_index]], [tpr[thr_index]]; lab="", c=:black)
plot!([0, 1], [0, 1]; c=:grey, ls=:dash, lab="")
xaxis!("False positive rate", (0, 1))
yaxis!("True positive rate", (0, 1))

# It's fairly noteworthy that the region around 0.5 is relatively smooth, which
# was also clear from the histogram of predicted values. This is a good sign
# that the exact cutoff may vary as a function of, for example, small changes in
# the pseudo-absence locations. This can be an interesting exercise: what
# happens to this curve if there are more pseudo-absences than presences? What
# happens to the cutoff if there are multiple runs of the model on different
# test/train set, or different pseudo-absences? For now, we will assume that the
# correct cutoff τ for a presence is readable directly at the peak of the curve:
# And the precision-recall as well:

τ = cutoff[last(findmax(J))]
plot(tpr, ppv; aspectratio=1, frame=:box, lab="", dpi=600, size=(400, 400))
scatter!([tpr[thr_index]], [ppv[thr_index]]; lab="", c=:black)
plot!([0, 1], [1, 0]; c=:grey, ls=:dash, lab="")
xaxis!("True positive rate", (0, 1))
yaxis!("Positive predictive value", (0, 1))

# We can now map the result using the `distribution` data:
# We can now map the result using τ as a cutoff for the `distribution` data:

range_mask = broadcast(v -> v >= τ, distribution)

Expand Down Expand Up @@ -245,7 +262,7 @@ future_distribution
# The values in `future_distribution` are in the scale of what the BRT returns,
# so we can compare them with the values of `distribution`:

plot(future_distribution - distribution; clim=(-1.1, 1.1), c=:lisbon)
plot(future_distribution - distribution; clim=(-1, 1), c=:broc, frame=:box)

# This shows the area of predicted gain and loss of presence. Because we have
# thresholded our current distribution, we can look at the predicted ranges of
Expand Down

2 comments on commit 4e54da1

@tpoisot
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JuliaRegistrator register()

@JuliaRegistrator
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Registration pull request created: JuliaRegistries/General/46639

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.8.1 -m "<description of version>" 4e54da10da7b9509fc4dc7382274445d1d57747e
git push origin v0.8.1

Please sign in to comment.