Suggestion to improve efficiency in the unicorn notebook #2

abouelkhair5 · 2024-04-09T16:52:29Z

In the unicorn notebook, specifically in the prepare_graph function, list(nodes.keys()).index(...) is called twice per edge. Each call rebuilds the key list and scans it linearly, so these calls are expensive, and prepare_graph ends up taking over 2 hours on a very strong machine:

edge_index = [[], []]
for src, dst in edges:
    src_index = list(nodes.keys()).index(src)
    dst_index = list(nodes.keys()).index(dst)
    edge_index[0].append(src_index)
    edge_index[1].append(dst_index)

An alternative would be to precompute all the indices once, store them in a hashmap, and build the graph in a few seconds.
For example:

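from tqdm import tqdm

# Build the node -> position lookup once instead of scanning the key list per edge.
node_index_map = {node: i for i, node in enumerate(nodes.keys())}
edge_index = [[], []]
for src, dst in tqdm(edges):
    src_index = node_index_map[src]
    dst_index = node_index_map[dst]
    edge_index[0].append(src_index)
    edge_index[1].append(dst_index)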


abouelkhair5 · 2024-04-17T18:38:42Z

I validated that the proposed change to the prepare_graph function doesn't change the output: the result matches the expected output for unicorn95.txt.
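
For anyone who wants to repeat that kind of check without the full dataset, here is a minimal sketch on toy data. It assumes nodes is a dict keyed by node id and edges is a list of (src, dst) pairs, as in the snippets above; the two function names and the toy graph are made up for illustration and are not taken from the notebook.

```python
def build_edge_index_list_scan(nodes, edges):
    """Original approach: rebuild the key list and scan it for every endpoint."""
    edge_index = [[], []]
    for src, dst in edges:
        edge_index[0].append(list(nodes.keys()).index(src))
        edge_index[1].append(list(nodes.keys()).index(dst))
    return edge_index


def build_edge_index_dict(nodes, edges):
    """Proposed approach: one precomputed node -> position lookup."""
    node_index_map = {node: i for i, node in enumerate(nodes.keys())}
    edge_index = [[], []]
    for src, dst in edges:
        edge_index[0].append(node_index_map[src])
        edge_index[1].append(node_index_map[dst])
    return edge_index


# Hypothetical toy graph, not data from unicorn95.txt.
nodes = {"a": {}, "b": {}, "c": {}, "d": {}}
edges = [("a", "b"), ("b", "c"), ("d", "a"), ("c", "c")]

# Both approaches should yield the same edge_index.
assert build_edge_index_list_scan(nodes, edges) == build_edge_index_dict(nodes, edges)
```

Both versions rely on the dict keeping its insertion order (guaranteed since Python 3.7), so the position of a key in list(nodes.keys()) and its value in node_index_map agree. The difference is only in cost: the list scan is linear in the number of nodes for every lookup, while the dict lookup takes constant time on average.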
