-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introducing Rust code into Biotite #688
Comments
I am not worried about the increased complexity of another programming language. Speed is a major selling point of Biotite - therefore moving from Cython to C++/Rust bindings seems to be a logical next step. When deciding between C++ and Rust, given we do not have preexisting legacy code in either langugage, Rust, due to its memory safety, seems to be the logical choice. I think keeping the build complexity for all user groups low is also quite important. However, PyO3/maturin#2177 seems to be quite active. So given that it is merged, I would vote for adding Rust code to Biotite. |
Hi, I haven't contributed anything on the Rust side so far, but the arguments Do you plan to still accept Cython in new contributions in the long-term? |
Yes, at least for the foreseeable future. However, I think @JacobAnter and me were the only ones contributing Cython code to the project so far. So realistically, if at some point most Cython code would be migrated to Rust code, we could discuss again, whether we discourage new Cython modules. |
To achieve high performance where vectorization with NumPy is not possible, Biotite currently uses Cython code. However, there are some limitations in Cython:
Hence, this issue should initiate the discussion if Rust code using PyO3 should be allowed in Biotite, as it has become quite mature in recent years. This would address all the issues mentioned above. More specifically, these are the places in the code base where the limitations become quite clear:
PDBFile
: There already isfastpdb
written in Rust, but it needs to be maintained separately in the moment.connect_via_residue_names()
: The Python string operations in this Cython function makes it quite slow and it is actually the bottleneck inpdbx.get_structure()
, when the input is aBinaryCIFFile
.CIFFile
: It has been addressed multiple times now to make it faster (Handle embedded quote in mmcif #619, Update _split_one_line and remove whitespace parameter #686), but this short function is still the bottleneck when reading CIF files.Probably even more places in Biotite would benefit from routines written in Rust.
However there would also be a few disadvantages:
I lean towards accepting Rust in Biotite (otherwise I would not have opened this issue 😉), but I really like to hear your opinion about this @t0mdavid-m @JHKru @MaxGreil and other contributors/users with an opinion on this topic.
The text was updated successfully, but these errors were encountered: