use of bitinformation(dim) #31

aaronspring · 2022-03-24T11:21:11Z

I don't quite understand the dim argument in bitinformation and its implications. Can I just ignore it and use the default dim=1?

BitInformation.jl/test/information.jl

Lines 37 to 39 in 05bd9ef

    
    @test bi1[i] ≈ bi2[i] atol=1e-3
    @test bi2[i] ≈ bi3[i] atol=1e-3
    @test bi1[i] ≈ bi3[i] atol=1e-3

It seems like dim only matters for sorted dimensions, i.e. dim doesn't matter on raw data.

Your example plots in https://doi.org/10.24433/CO.8682392.v1 use dim=1, meaning longitude. I have data along the dimensions longitude, latitude and time, and intuitively I would run the analysis along time.


milankl · 2022-03-24T11:31:12Z

That test is indeed confusing. As the array A is not sorted, every entry is independent of the next, so all those tests just check that the information is zero in every dimension.

julia> using BitInformation
julia> A = rand(Float32,30,40,50);
julia> bi1 = bitinformation(A,dim=1);
julia> bi2 = bitinformation(A,dim=2);
julia> bi3 = bitinformation(A,dim=3);
julia> hcat(bi1,bi2,bi3)
32×3 Matrix{Float64}:
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 ⋮

However, if you sort the array along a given dimension, you artificially introduce some information, which is highest along that dimension:

julia> sort!(A,dims=1);
julia> bi1 = bitinformation(A,dim=1);
julia> bi2 = bitinformation(A,dim=2);
julia> bi3 = bitinformation(A,dim=3);
julia> hcat(bi1,bi2,bi3)
32×3 Matrix{Float64}:
 0.0          0.0          0.0
 0.0          0.0          0.0
 0.0          0.0          0.0
 0.0          0.0          0.0
 0.0          0.0          0.0
 0.0067747    0.00508132   0.00538892
 0.292094     0.182393     0.187531
 0.550684     0.265361     0.271625
 0.371526     0.114251     0.118072
 0.237596     0.0441321    0.0441709
 ⋮                         
 0.0          0.0          9.3149e-5
 0.0          0.0          0.0
 0.0          0.0          0.0
 0.0          0.0          0.0
 0.0          0.0          0.000280003
 0.000749177  0.000946589  0.000850585
 0.00515332   0.00430802   0.00508684
 0.0233246    0.0177343    0.0185884
 0.061388     0.0458432    0.0484664

bi1 will have the highest information in the exponent/mantissa bits, but sorting along one dimension also influences the others (with less information, though). The information in the last mantissa bits is due to the poor sampling of rand (see the randfloat function in JuliaRandom/RandomNumbers.jl as an alternative).
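As a rough illustration of what `dim` selects, here is a minimal sketch in plain Julia that estimates, for each of the 32 bit positions of a Float32, the mutual information between an element and its neighbour along `dim`. The helper name `bitwise_mutual_information` is hypothetical and this is not BitInformation.jl's implementation. In particular, it makes no attempt to set statistically insignificant values to zero, so values for raw data come out tiny rather than exactly zero.

```julia
using Random

# Hypothetical helper for illustration only (not part of BitInformation.jl).
# Returns 32 values: the mutual information (in bits) between bit position k
# of each element and the same bit position of its neighbour along `dim`.
function bitwise_mutual_information(A::Array{Float32}, dim::Int)
    U = reinterpret.(UInt32, A)          # bit patterns of the floats
    n = size(U, dim)
    left  = selectdim(U, dim, 1:n-1)     # every element except the last along dim
    right = selectdim(U, dim, 2:n)       # its successor along dim
    info = zeros(Float64, 32)
    for bit in 1:32                      # bit 1 = sign bit, bit 32 = last mantissa bit
        shift = 32 - bit
        counts = zeros(Int, 2, 2)        # joint histogram of the two bits
        for (x, y) in zip(left, right)
            i = Int((x >> shift) & 0x1) + 1
            j = Int((y >> shift) & 0x1) + 1
            counts[i, j] += 1
        end
        p  = counts ./ sum(counts)       # joint probabilities
        px = sum(p, dims=2)              # marginal of the left bit
        py = sum(p, dims=1)              # marginal of the right bit
        for i in 1:2, j in 1:2
            if p[i, j] > 0
                info[bit] += p[i, j] * log2(p[i, j] / (px[i] * py[j]))
            end
        end
    end
    return info
end

Random.seed!(1)
A = rand(Float32, 30, 40, 50)
raw = bitwise_mutual_information(A, 1)       # unsorted: neighbours are independent

sort!(A, dims=1)
sorted1 = bitwise_mutual_information(A, 1)   # along the sorted dimension
sorted2 = bitwise_mutual_information(A, 2)   # along an unsorted dimension
```

Comparing `maximum(raw)`, `maximum(sorted1)` and `maximum(sorted2)` should reproduce the qualitative pattern above: essentially no information in the raw data, most along the sorted dimension, and a smaller amount along the others.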

milankl · 2022-03-24T11:37:14Z

"I have data along dimensions longitude, latitude and time and somehow intuitively would run the analysis along time."

You can run the analysis along any dimension you like. You can also add up the information from several dimensions. The first dimension is usually just the default because that's also how the data is laid out in memory/on disk. Things can change along different dimensions, depending on the resolution. Check the supplement of our paper for some examples.
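To make the memory-layout remark concrete, here is a small sketch in plain Julia. The array shape and the longitude × latitude × time ordering are assumptions for illustration. Julia arrays are column-major, so neighbours along the first dimension sit next to each other in memory, while neighbours along the last dimension are far apart.

```julia
# Hypothetical lon × lat × time array (the shape is an assumption)
A = zeros(Float32, 30, 40, 50)

# Distance in memory (in elements) between neighbours along each dimension
s = strides(A)                      # (1, 30, 1200)

# Neighbours along time (dim 3) are A[i, j, k] and A[i, j, k+1],
# which are s[3] = 1200 elements apart in memory.
At = permutedims(A, (3, 1, 2))      # copy with time as the first, contiguous dimension
size(At)                            # (50, 30, 40)
```

Which dimension is analysed is a separate question from which one is fastest to traverse, so `dim=3` on the original array remains a valid choice for the time dimension.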

aaronspring · 2022-03-24T12:42:52Z

Is it also possible to run bitinformation on all dimensions, and does that make sense?

milankl · 2022-03-24T14:02:39Z

Yes, that's the same as running it in all dimensions separately and averaging the information. As it's an arithmetic mean, if the information is high in one dimension but low in another, you may end up cutting off too many bits for the high-information dimension. So I often just used longitude alone. A rule of thumb I found in our data is that information is highest in longitude/time, then latitude, then vertical, then ensemble. But that obviously depends on the spatio-temporal resolution.
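The averaging pitfall described above can be sketched with a toy example in plain Julia. All numbers are made up, the 10-bit layout is a toy format, and the helper `bits_to_keep` with its 99% cumulative-information threshold is an assumption for illustration, not something taken from this thread.

```julia
# Made-up per-bit information for a toy 10-bit format
info_lon = [0.0, 0.5, 0.3, 0.15, 0.03, 0.02, 0.0, 0.0, 0.0, 0.0]
info_lat = [0.0, 0.9, 0.9, 0.9,  0.0,  0.0,  0.0, 0.0, 0.0, 0.0]
info_mean = (info_lon .+ info_lat) ./ 2      # arithmetic mean over dimensions

# Hypothetical helper: smallest number of leading bits whose cumulative
# information reaches `level` of the total information.
function bits_to_keep(info; level=0.99)
    c = cumsum(info) ./ sum(info)
    return findfirst(>=(level), c)
end

keep_lon  = bits_to_keep(info_lon)    # 6
keep_lat  = bits_to_keep(info_lat)    # 4
keep_mean = bits_to_keep(info_mean)   # 5
```

With these numbers the averaged information suggests keeping 5 bits, which retains only 98% of the information along longitude, whereas longitude on its own asks for 6 bits.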

aaronspring · 2022-03-24T14:25:41Z

thank you

aaronspring closed this as completed Mar 24, 2022
