Bitinformation of masked arrays #30

milankl · 2022-03-21T16:54:10Z

No description provided.

aaronspring · 2022-03-22T12:06:21Z

Great. I can test this PR next week.

milankl · 2022-03-22T21:19:40Z

The last commit 5a7b6fb adds a simple test which checks that, for a given array and a mask with all entries unmasked, the bitwise real information is identical to the result without a mask. Before I merge this, @aaronspring could you install this branch via ] add https://github.com/milankl/BitInformation.jl#mk/masked and report back in #29 how your results change with and without a mask provided?

aaronspring · 2022-03-23T16:18:21Z

src/mutual_information.jl

+in adjacent entries in A along dimension `dim` (optional keyword). Array `A` is masked through
+trues in entries of the mask `mask`. Masked elements are ignored in the bitwise information calculation."""
+function bitinformation(A::AbstractArray{T},
+                        mask::BitArray;


@milankl Would it be possible for bitinformation to guess the mask based on a masking value, such as -9.e+33?

Or I'd appreciate a short hint on how to create the mask. What a new Julia user tries, and what fails, is mask = ncfile.vars[varnames[1]][:,:,:,:] == -9.e+33

milankl · 2022-03-23

While it would be possible to guess the mask, I don't want to do that, because at the moment the number format T can be any bitstype. Hence the assumption that -9e33 is a mask value may work well in one format (e.g. Float64) but not necessarily in others (integers, posits, etc.; likewise, a NaN in Float16 isn't necessarily a NaN in BFloat16). We could define bitinformation(A::AbstractArray{T},mask::T), which creates a mask based on the bit pattern of the scalar mask, if that would be helpful. In general, Julia creates a BitArray for any broadcasted comparison, e.g.

julia> A = rand(3,3)
3×3 Matrix{Float64}:
 0.119336  0.127636  0.808542
 0.439229  0.388266  0.52312
 0.899243  0.992992  0.549393

julia> A .< 0.5
3×3 BitMatrix:
 1  1  0
 1  1  0
 0  0  0

What you are missing is the dot, i.e. .== instead of ==. Broadcasting in Julia is a bit more conservative than in Python.
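As a minimal sketch of the difference between == and .== (the array A and the fill value below are made up for illustration and are not taken from the NetCDF file discussed above):

```julia
# Illustrative data: -9e33 stands in for the masking value.
A = [1.0 -9e33; -9e33 4.0]

# Without the dot, the whole array is compared to the scalar: a single Bool.
whole_array_equal = (A == -9e33)

# With the dot, the comparison is broadcast elementwise and returns a BitMatrix,
# which is the type the `mask::BitArray` argument expects.
mask = A .== -9e33
```

Here whole_array_equal is a single false, while mask is a 2×2 BitMatrix that is true wherever A holds the masking value.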

milankl · 2022-03-28

bitinformation(::Array{T},masked_value::T) is now defined in #33, such that you can do bitinformation(A,-9f33) and a mask is created internally for all values that are floating-point identical to the second argument.
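One way to picture a mask built from a bit pattern, as opposed to a numerical comparison, is the sketch below. The helper name bitpattern_mask is hypothetical and this is not the code from #33; it only illustrates, for Float32 and with Base alone, why comparing bits can differ from comparing values.

```julia
# Hypothetical helper, not part of BitInformation.jl: mark every element of A
# whose 32 bits are identical to those of the masking value v.
bitpattern_mask(A::AbstractArray{Float32}, v::Float32) =
    reinterpret(UInt32, A) .== reinterpret(UInt32, v)

# Illustrative data with one fill value and one NaN.
A = Float32[1.5, -9f33, 2.25, NaN32]

by_value = A .== -9f33                # numerical comparison
by_bits  = bitpattern_mask(A, -9f33)  # bit-pattern comparison, same result here

# NaN never compares equal by value, but its bit pattern can still be matched.
nan_by_value = A .== NaN32
nan_by_bits  = bitpattern_mask(A, NaN32)
```

For an ordinary fill value such as -9f33 both approaches mark the same elements. They diverge for NaN, where only the bit-pattern comparison finds the entry.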

aaronspring · 2022-03-28T11:33:34Z

@milankl I find differences depending on whether a mask is used, especially in the first bits and in the 99-100% information range:

dim: 4 time
from https://gist.github.com/aaronspring/5de0bc6be5a8d547f3503ff8b1aef8c6

dim: 1 x
from https://gist.github.com/aaronspring/7b0675e36467e5c647f2fe3f546d4bf5

dim analysis:
from https://gist.github.com/aaronspring/52662dd885eebfd6ce4b88939be016c6

milankl · 2022-03-28T12:14:50Z

Thanks Aaron, that looks awesome. Good to see that all these information patches in the exponent bits disappear! For dissicos, could you convert your arrays to signed exponent bits? It looks like there's a bunch of exponent bits that flip over simultaneously (which happens when your data covers a range across floating-point 2):

julia> A = rand(Float32,3,3)
3×3 Matrix{Float32}:
 0.408687   0.922863  0.0622634
 0.838222   0.947521  0.141393
 0.0944574  0.991588  0.576514

julia> signed_exponent(A)
3×3 Matrix{Float32}:
 13.078    7.3829   127.515
  6.70577  7.58017   18.0983
 48.3622   7.93271    4.61211

It looks wrong because Julia still interprets the exponent bits as biased (the information that the exponent bits are now to be interpreted differently is not stored), but you can always check that nothing went wrong by applying the inverse, biased_exponent. If you use signed_exponent as a preprocessing step, note that your mask value will change too:

julia> signed_exponent([-9e-33])
1-element Vector{Float64}:
 -4.739053125085073e32
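To make the re-encoding concrete, here is a minimal sketch for Float32 using only Base. The names signed_exponent_sketch and biased_exponent_sketch are hypothetical and this is not the package's implementation. It assumes normal, finite, nonzero inputs and stores the exponent as a sign bit followed by a 7-bit magnitude.

```julia
const SIGN_AND_SIGNIFICAND = 0x807fffff  # sign bit and 23 significand bits
const EXPONENT_MASK        = 0x7f800000  # the 8 exponent bits

# Re-encode the biased exponent as sign-and-magnitude (hypothetical sketch).
function signed_exponent_sketch(x::Float32)
    ui = reinterpret(UInt32, x)
    e  = Int32((ui & EXPONENT_MASK) >> 23) - Int32(127)  # unbiased exponent
    esign = e < 0 ? UInt32(0x80) : UInt32(0)             # top exponent bit = sign
    ebits = esign | UInt32(abs(e))                       # lower 7 bits = magnitude
    return reinterpret(Float32, (ui & SIGN_AND_SIGNIFICAND) | (ebits << 23))
end

# Inverse: turn sign-and-magnitude exponent bits back into a biased exponent.
function biased_exponent_sketch(x::Float32)
    ui    = reinterpret(UInt32, x)
    ebits = (ui & EXPONENT_MASK) >> 23
    mag   = Int32(ebits & 0x7f)
    e     = (ebits & 0x80) == 0 ? mag : -mag
    return reinterpret(Float32, (ui & SIGN_AND_SIGNIFICAND) | (UInt32(e + Int32(127)) << 23))
end

x = 0.408687f0
y = signed_exponent_sketch(x)      # printed value looks unrelated to x
z = biased_exponent_sketch(y)      # round trip recovers x bit for bit
```

Under these assumptions, 0.408687f0 has unbiased exponent -2, so its exponent bits become 10000010, which Julia reads as a biased exponent of +3. That is why the value prints as roughly 13.078, matching the first entry of the signed_exponent(A) output above, and why the round trip through the inverse is the reliable check.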

aaronspring · 2022-03-28T13:22:35Z

Adding signed_exponent makes the black bar in dissicos disappear.

milankl · 2022-03-28T13:24:25Z

Can we move this discussion to #29 ?

permutation and insignificant outsourced

4b77558

milankl mentioned this pull request Mar 21, 2022

Bitinformation of masked data #29

Open

docstrings with """

bb21eb0

milankl added 5 commits March 22, 2022 13:13

permutations.jl included

24ce428

permutations with tests

e04fa81

typo

edcb378

bitpair_count, bitinformation with mask

6401689

masked bitinformation simple test added

5a7b6fb

milankl added 2 commits March 23, 2022 12:24

More tests, check number of adjacent entries >0

a639b29

some docstrings added

1ac3c6f

milankl merged commit 05bd9ef into main Mar 23, 2022

milankl deleted the mk/masked branch March 23, 2022 12:55

aaronspring reviewed Mar 23, 2022
