Formatting for Simplicial Complex #15

Open
chaxor opened this issue Oct 11, 2019 · 5 comments
chaxor commented Oct 11, 2019

I have been attempting to compute persistent homology (PH) for certain types of dynamic networks and have run into some difficulty producing the input for Eirene.
More specifically, I am interested in a type of filtration referred to as a 'node-ordered filtration complex'.
I would eventually like to use the sparse complex formats ('dv'/'ev'); however, I have not yet been able to get even the "simple" format option 'sp' to work.

Using Eirene with the 'vr' or 'pc' options does work without error, so this question is specific to understanding the complex file format.

Here is a minimal, workable example, taken directly from Figure 2 of [1], which has associated Betti curves such that the result can be verified.

using DelimitedFiles
using LightGraphs
using Eirene

function simple_format_file(Badj, node_times, file_name)
    # Badj is a binary adjacency matrix of a network
    # node_times is a vector to store the time in which each node appears within the network
    #     (network is dynamic and growing)
    
    # Construct a graph structure from binary adjacency matrix
    G = SimpleGraph(Badj)

    # Get the cliques of the network at the last time step
    X = maximal_cliques(G)
    
    
    # This may be separate from a simple counter of the nodes,
    # as several nodes may show up simultaneously
    nodes = collect(1:size(Badj)[1])
    
    # Construct the "Simple Format" for a complex
    # as per instructions in Eirene's documentation
    
    # List of each line to add to a file for the specific formatting:
    fmt = [] 
    # Grow the network from the first node,
    # considering only a portion of the adjacency matrix at each step
    for (n, t) in zip(nodes, node_times)
        # TODO:
        # groupby times such that nodes can appear simultaneously
        
        # Adjacency matrix up to that node (time)
        B = Badj[1:n,1:n]
        G = SimpleGraph(B)
        X = maximal_cliques(G)

        # For each clique in the clique complex, X,
        # add a line as per the instruction in Eirene's documentation
        for x in X
            cell_dim = length(x) - 1
            time = t
            complex = x
            this_line = append!([cell_dim, time], complex)
            append!(fmt, [this_line])
        end
    end
    
    # Write the file
    writedlm(file_name, fmt, ',')
    
    return file_name
end

##############
# Example Data
##############

# Example binary adjacency matrix 
# Copied directly from Figure 2 of Reference 1
Badj = convert(Array{Bool, 2},
    [0 1 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0;
     1 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0;
     1 1 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0;
     0 0 0 0 1 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0;
     1 1 0 1 0 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0;
     0 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0;
     0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0;
     0 1 1 0 1 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0;
     0 0 0 0 0 1 0 0 0 1 1 1 1 1 1 0 0 0 0 0;
     1 0 1 0 1 1 0 1 1 0 1 0 1 0 0 0 0 0 0 0;
     0 0 0 0 0 0 0 0 1 1 0 1 1 0 1 0 0 0 0 0;
     0 0 0 0 0 0 0 1 1 0 1 0 0 1 1 0 0 0 0 0;
     0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 0 0 0;
     0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0;
     0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0;
     0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0;
     0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0;
     1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 1;
     1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0;
     0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0])

# A vector to store the time in which each node appears within the network
# (network is dynamic and growing)
node_times = collect(1:size(Badj)[1])

NOFC_file_name = "node_ordered_filter_complex.csv"


#############################################
# Compute Persistent Homology / Betti Curves
#############################################

# Create the file
simple_format_file(Badj, node_times, NOFC_file_name)

C = eirene(NOFC_file_name,
            model = "complex",
            entryformat = "sp",
            maxdim = 3)

The error I am receiving is:

    BoundsError: attempt to access 2-element Array{Int64,1} at index [3]

References:

  1. https://arxiv.org/pdf/1709.00133.pdf
@henselman-petrusek
Owner

Thanks for the issue report! Just tried running the code. One issue is that rows need to appear in ascending order, by dimension. In practice this means that the entries of the first column should appear in sorted order (low to high). You can find the reference to this in the instructions where it says to number the cells 1, ..., N in ascending order, according to dimension. However, this is easy to miss. I've added a note to the documentation to help make it more visible. If this or any other problem persists, please let us know. Thanks!
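
As a minimal sketch of the ordering requirement described above, the rows can be sorted on their first column before the file is written. The `rows` data here is hypothetical, and it assumes each row starts with the cell's dimension, as in the code from the original post. Note that this only reorders rows; any entries that refer to cells by row number would still have to be renumbered to match.

```julia
# Hypothetical rows; the first entry of each row is the cell dimension.
rows = [
    [1, 2, 1, 2],
    [0, 1],
    [0, 2],
]

# Sort on the first column only. MergeSort is stable, so rows of equal
# dimension keep their original relative order.
sorted_rows = sort(rows, by = first, alg = MergeSort)

# The first column should now be non-decreasing.
@assert issorted(first.(sorted_rows))
```

Sorting with `by = first` rather than a plain `sort(rows)` avoids reordering rows of the same dimension by their remaining entries.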

@chaxor
Author

chaxor commented Nov 23, 2019

I sorted the dimensions by placing the line fmt = sort(fmt) right before writing the data to a file ( writedlm(file_name, fmt, ',')), but this, unfortunately, didn't resolve the issue for me.

@henselman-petrusek
Owner

hi @chaxor, thanks for the update. question: if you add keyword argument <maxdim = 4> to the function call, do you still get an error?

@chaxor
Author

chaxor commented Dec 11, 2019

I have tried this and it still does not work. Does that work for you?

@henselman-petrusek
Owner

hi @chaxor
sorry for the long delay. there are several things to consider. (1) the convention is to refer to a cell by its row number in the input file. after sorting the rows, it's therefore necessary to change the face numbers accordingly. (2) if you relabel after sorting, the first row of your new file will be 0, 1, 1. this cannot be, as it means that cell #1 is a codimension-1 face of itself (in general, a cell of dimension d can only be a codimension-1 face of a cell of dimension d+1). so i suspect there may be an issue with the underlying construction. once those things are sorted, it will be much easier to troubleshoot. thanks again for the report, and good luck.
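
The two points above can be sketched as follows. This is an illustration under stated assumptions, not a verified Eirene input: it assumes each row holds the cell's dimension, its birth time, and then the row numbers of its codimension-1 faces, and that a simplex is born when its last vertex appears (the node-ordered rule from the original post). The helpers `face_closure`, `birth`, and `sp_rows` are hypothetical names introduced here. The key differences from the posted code are that every face of every clique gets its own row, not only the maximal cliques, and that the trailing entries are row numbers rather than vertex labels.

```julia
# All nonempty subsets of each maximal simplex, so the complex is closed
# under taking faces. Duplicates are removed through the Set.
function face_closure(maximal)
    out = Set{Vector{Int}}()
    for m in maximal
        s = sort(m)
        for mask in 1:(2^length(s) - 1)
            push!(out, [s[k] for k in eachindex(s) if (mask >> (k - 1)) & 1 == 1])
        end
    end
    return sort(collect(out))  # lexicographic order, for reproducibility
end

# Assumed node-ordered birth time: vertex i appears at time i, and a simplex
# appears together with its largest vertex.
birth(s) = maximum(s)

function sp_rows(simplices)
    # Number the cells 1..N in ascending order of dimension.
    cells = sort(simplices, by = length, alg = MergeSort)
    index = Dict(c => i for (i, c) in enumerate(cells))
    rows = Vector{Vector{Int}}()
    for c in cells
        dim = length(c) - 1
        # Codimension-1 faces: drop one vertex at a time and look up the row
        # number of the result. Vertices (dimension 0) list no faces.
        faces = dim == 0 ? Int[] : [index[deleteat!(copy(c), k)] for k in eachindex(c)]
        push!(rows, vcat([dim, birth(c)], sort(faces)))
    end
    return rows
end

# Example: a single filled triangle on vertices 1, 2, 3.
rows = sp_rows(face_closure([[1, 2, 3]]))
```

In this example the vertex rows contain only a dimension and a birth time, each edge row lists two vertex rows, and the triangle row lists the three edge rows, so no cell is ever listed as a face of itself. The resulting `rows` could then be written with `writedlm(file_name, rows, ',')` as in the original post.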
