Improve Universe serialization performance #3721

Closed
yuxuanzhuang opened this issue Jun 19, 2022 · 0 comments · Fixed by #3722 or #3724

Is your feature request related to a problem?

The speed and memory footprint of Universe serialization strongly influence the performance of parallel analysis, especially when each analysis block is small. This issue serves as an (ongoing) collection of possible performance improvements.

Benchmark

import pickle
import time

import MDAnalysis as mda
import numpy as np

# 560k atoms, 125880 residues
u = mda.Universe('test.pdb')

dump_times = np.zeros(10)
load_times = np.zeros(10)

for rep in range(10):
    start_time = time.time()

    # serialize the whole Universe (topology + trajectory reader)
    u_dumped = pickle.dumps(u)
    dump_finish_time = time.time()
    dump_times[rep] = dump_finish_time - start_time

    # rebuild a Universe from the pickled bytes
    u_new = pickle.loads(u_dumped)
    load_finish_time = time.time()
    load_times[rep] = load_finish_time - dump_finish_time

print(f"dump time: {dump_times.mean():.2f} \u00B1 {dump_times.std():.2f} s")
print(f"load time: {load_times.mean():.2f} \u00B1 {load_times.std():.2f} s")

> dump time: 0.45 ± 0.00 s
> load time: 1.48 ± 0.06 s

> dump only trajectory: 0.00 ± 0.00 s
> load only trajectory: 1.14 ± 0.07 s

> dump only topology: 0.46 ± 0.03 s
> load only topology: 0.19 ± 0.01 s
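
The per-component numbers above are reported without the code that produced them. A minimal sketch of how such a split could be timed is below; time_pickle and report are hypothetical helpers, not part of MDAnalysis, and the stand-in dictionary only exists so the snippet runs without a structure file. With a real Universe one would presumably pass u.trajectory and u._topology instead.

```python
import pickle
import statistics
import time


def time_pickle(obj, repeats=10):
    """Return (dump_times, load_times) in seconds for pickling obj."""
    dump_times, load_times = [], []
    for _ in range(repeats):
        start = time.perf_counter()
        payload = pickle.dumps(obj)
        middle = time.perf_counter()
        pickle.loads(payload)
        end = time.perf_counter()
        dump_times.append(middle - start)
        load_times.append(end - middle)
    return dump_times, load_times


def report(label, obj, repeats=10):
    """Print mean and population std, matching the format used above."""
    dumps, loads = time_pickle(obj, repeats)
    print(f"dump only {label}: {statistics.mean(dumps):.2f} "
          f"\u00B1 {statistics.pstdev(dumps):.2f} s")
    print(f"load only {label}: {statistics.mean(loads):.2f} "
          f"\u00B1 {statistics.pstdev(loads):.2f} s")


# Stand-in object (assumption): replace with u.trajectory or u._topology.
report("stand-in", {"names": ["CA"] * 100_000})
```

Timing each component separately makes it easier to see which side (dump or load) dominates for the trajectory and for the topology.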

# size of the pickled Universe on disk
with open('universe.pkl', 'wb') as f:
    pickle.dump(u, f)
!du -sh ./universe.pkl
> 90M	./universe.pkl

> 11M: trajectory
> 79M: topology

<MDAnalysis.core.topologyattrs.Atomnames object at 0x7fd41e2e2770>
5,4M	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.AltLocs object at 0x7fd41e2e2b30>
5,4M	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.ChainIDs object at 0x7fd41e2e2a10>
5,4M	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.RecordTypes object at 0x7fd41e2e3520>
5,4M	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Atomids object at 0x7fd41e2e3160>
4,3M	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Tempfactors object at 0x7fd41e2e3e20>
4,3M	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Occupancies object at 0x7fd41b705c00>
4,3M	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Atomtypes object at 0x7fd48888f4f0>
5,4M	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Elements object at 0x7fd41b706440>
5,4M	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Masses object at 0x7fd41b705b40>
4,3M	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Resnums object at 0x7fd41b706410>
984K	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Resids object at 0x7fd41b705ba0>
984K	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Resnums object at 0x7fd41b705bd0>
984K	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.ICodes object at 0x7fd41b7064d0>
1,3M	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Resnames object at 0x7fd41b705c30>
1,3M	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Segids object at 0x7fd41b706500>
4,0K	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Atomindices object at 0x7fd41b706770>
4,0K	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Resindices object at 0x7fd41b705ea0>
4,0K	./dump_attr.pkl
<MDAnalysis.core.topologyattrs.Segindices object at 0x7fd41b705e70>
4,0K	./dump_attr.pkl
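
The listing above looks like each topology attribute pickled on its own and measured with du. A rough sketch of an in-memory equivalent follows; pickled_sizes and human are hypothetical helpers, and the stand-in dictionary replaces the real attributes (which could presumably be gathered from u._topology.attrs, an assumption not confirmed in this issue). Note that du reports allocated disk blocks, so very small pickles show up as 4,0K there, whereas len(pickle.dumps(...)) gives the exact byte count.

```python
import pickle


def pickled_sizes(named_objects):
    """Map each name to the byte length of its pickle."""
    return {name: len(pickle.dumps(obj)) for name, obj in named_objects.items()}


def human(nbytes):
    """Format a byte count roughly like du -h (B, K, M, G)."""
    for unit in ("B", "K", "M", "G"):
        if nbytes < 1024 or unit == "G":
            return f"{nbytes:.1f}{unit}"
        nbytes /= 1024


# Stand-in attributes (assumption): per-atom data is much larger than
# per-segment data, mirroring the pattern in the listing above.
attrs = {
    "Atomnames": ["CA"] * 100_000,
    "Resnames": ["ALA"] * 20_000,
    "Segids": ["A"],
}

for name, size in pickled_sizes(attrs).items():
    print(f"{name:<12} {human(size)}")
```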

Avoid re-reading the original frame via self[self.ts.frame] during deserialization (PR #3722)

# with PR #3722
> dump time: 0.47 ± 0.03 s
> load time: 0.22 ± 0.01 s

90M	./universe.pkl
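
To illustrate the idea, here is a toy sketch and not the MDAnalysis reader code: ToyReader, ReseekingReader and TrustingReader are hypothetical classes. The first variant re-reads the current frame in __setstate__, the second relies on the frame data already being in the pickled state, which is my reading of what the PR above is after.

```python
import pickle


class ToyReader:
    """Hypothetical stand-in for a trajectory reader."""

    def __init__(self, n_frames):
        self.n_frames = n_frames
        self.reads = 0      # counts expensive frame reads
        self.frame = None
        self.data = None
        self.read_frame(0)

    def read_frame(self, i):
        # Pretend this is an expensive seek and parse.
        self.reads += 1
        self.frame = i
        self.data = [float(i)] * 3
        return self.data


class ReseekingReader(ToyReader):
    def __setstate__(self, state):
        self.__dict__.update(state)
        # Re-read the current frame after unpickling (the costly step).
        self.read_frame(self.frame)


class TrustingReader(ToyReader):
    # Default pickling restores frame and data from the pickled state,
    # so no extra read happens on load.
    pass


for cls in (ReseekingReader, TrustingReader):
    reader = cls(10)
    reader.read_frame(5)
    restored = pickle.loads(pickle.dumps(reader))
    extra = restored.reads - reader.reads
    print(f"{cls.__name__}: extra reads on load = {extra}")
```

The trade-off in this toy is that the second variant carries the frame data inside the pickle instead of recomputing it on load.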

Lazily build Universe._topology.tt._RA and ._SR, and refactor make_downshift_arrays (PR #3724)

  • Universe._topology.tt._RA and ._SR are huge nested arrays that take a long time to serialize.
  • For a 560k-atom PDB topology, the serialization/deserialization time further decreases from 0.7 s to 0.2 s.
# with PR #3724
> dump time: 0.11 ± 0.00 s
> load time: 1.37 ± 0.06 s

80M	./universe.pkl
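
A minimal sketch of the lazy-building pattern, assuming _RA is a residue-to-atoms lookup that can be derived from the atom-to-residue mapping. ToyTransTable is a hypothetical class, not the MDAnalysis TransTable: the derived table is built on first access and dropped in __getstate__, so it never ends up in the pickle.

```python
import pickle


class ToyTransTable:
    """Hypothetical stand-in for a topology translation table."""

    def __init__(self, atom_to_residue):
        self.atom_to_residue = list(atom_to_residue)
        self._residue_to_atoms = None  # derived table, built on demand

    @property
    def residue_to_atoms(self):
        if self._residue_to_atoms is None:
            n_residues = max(self.atom_to_residue, default=-1) + 1
            table = [[] for _ in range(n_residues)]
            for atom, residue in enumerate(self.atom_to_residue):
                table[residue].append(atom)
            self._residue_to_atoms = table
        return self._residue_to_atoms

    def __getstate__(self):
        state = self.__dict__.copy()
        # Drop the derived table; it is rebuilt lazily after unpickling.
        state["_residue_to_atoms"] = None
        return state


tt = ToyTransTable([0, 0, 1, 1, 1, 2])
tt.residue_to_atoms  # force the table to be built
restored = pickle.loads(pickle.dumps(tt))
print(restored._residue_to_atoms)   # None until first access
print(restored.residue_to_atoms)    # rebuilt from atom_to_residue
```

In this sketch the pickle only carries the flat atom-to-residue list, which is why both the dump time and the file size can drop when the nested tables are large.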

Decrease topology memory footprint

  • nmidx and values are mapped pairwise. Is there any reason to keep both in memory? (I know it is useful when one tries to modify the topology.)
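
A small sketch of why keeping both looks redundant, assuming values can be rebuilt from the unique names plus the integer index array. intern_names and expand are hypothetical helpers, and name_lookup is an illustrative name for the table of unique strings; only nmidx and values are named in this issue.

```python
def intern_names(values):
    """Return (name_lookup, nmidx) with values[i] == name_lookup[nmidx[i]]."""
    namedict = {}      # name -> index into name_lookup
    name_lookup = []   # unique names in order of first appearance
    nmidx = []         # one integer index per original value
    for value in values:
        idx = namedict.get(value)
        if idx is None:
            idx = len(name_lookup)
            namedict[value] = idx
            name_lookup.append(value)
        nmidx.append(idx)
    return name_lookup, nmidx


def expand(name_lookup, nmidx):
    """Rebuild the full per-atom values from the compact form."""
    return [name_lookup[i] for i in nmidx]


values = ["CA", "CB", "CA", "N"]
name_lookup, nmidx = intern_names(values)
print(name_lookup, nmidx)
print(expand(name_lookup, nmidx) == values)
```

Storing or pickling only the compact pair and expanding on demand would trade a small rebuild cost for a smaller footprint; whether that is acceptable depends on how often the topology is modified.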