Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intent to rewrite in Rust #110

Open
indygreg opened this issue Jun 13, 2020 · 7 comments
Open

Intent to rewrite in Rust #110

indygreg opened this issue Jun 13, 2020 · 7 comments

Comments

@indygreg
Copy link
Owner

After learning Rust and seeing its obvious benefits for safety/security and ease of maintenance over C, I would prefer to not author any new C code wherever possible.

I believe implementing this Python extension in Rust instead of C has long-term benefits to the maintainability of the project.

Once #109 lands and Python 2.7 is no longer supported, my intent is to start rewriting C code in this project in Rust. (Getting Rust to work with Python 2.7 on Windows is a massive plain and effectively blocks writing Python extensions in Rust.)

I'm filing this issue toprovide a sounding board for people to raise concerns/support for this initiative.

@mgorny
Copy link
Contributor

mgorny commented Jun 14, 2020

I hereby raise concern that Rust is PITA to many people, starting with the fact that it isn't supported on many architectures (coverage is much smaller than zstandard use on Gentoo right now), and ending with the fact that its upstream still haven't managed to reach an agreement with LLVM upstream and instead keep using outdated forks of LLVM.

@indygreg
Copy link
Owner Author

I'm working under the assumption that most people will install this Python package via pip install zstandard. And the overwhelming majority of those people will install a pre-built binary wheel published in PyPI (https://pypi.org/project/zstandard/#files). People installing Python binary wheels will have a pre-built shared library / extension and Rust doesn't enter the picture at all.

The only people that need to be burdened by the existence of Rust are:

  1. Developers of this project
  2. Packagers of this project
  3. People who can't install a pre-built binary wheel (perhaps they are running a more esoteric OS/platform)

1 is basically just me.

I'm very empathetic towards end-users who can't install pre-built binary wheels. For them, I'll need to make the code in setup.py as solid as possible. Or maybe we give up on Rust and fall back to the CFFI backend if Rust isn't "obviously" available.

I have mixed feelings towards packagers. I want to support them fully. My assumption is the state of Rust packaging is solid enough in 2020 that this shouldn't be a problem. (Firefox is using Rust heavily and nearly everyone can package Firefox...) But packagers irk me in that they often make their job 10x harder by imposing arbitrary rules on themselves. So I anticipate packagers potentially taking issue with this change because it makes their lives harder. I dunno.

Do you have a specific objection, @mgorny?

@mgorny
Copy link
Contributor

mgorny commented Jun 15, 2020

You're entirely missing group 4., users of source-based distributions that get all their packages built locally.

As for binary wheels, I have very bad experience with them on non-x86 architectures. I can't imagine anyone using binary wheels, say, on ARM where you have to support a lot of different processors. Yeah, building C without a second machine to cross-compile is not easy but it's possible. I've seen pip install packages (I think it was on SPARC) that segfaulted right away — and I honestly doubt somebody built binary packages for that architecture, so either pip installed something wrong entirely or failed to build it properly from source.

@mgorny
Copy link
Contributor

mgorny commented Jun 15, 2020

Oh, and CFFI would certainly be a better solution. Probably fast enough too.

@ghost
Copy link

ghost commented Aug 7, 2020

IMHO this project has a high degree of completion, if make some changes, maybe it can be included into Python's stdlib, which is written by C.

Some people hope that Python's stdlib includes a zstd module, see issue37095.
(I'm not CPython's core developer)

its obvious benefits for safety/security and ease of maintenance over C

IMHO, In the long run, this project will be stable at some point, then these will not be a problem.

@indygreg
Copy link
Owner Author

The main branch now has a Rust backend passing all tests! You can build + test it by doing something like:

python setup.py --rust-backend build_ext -i
PYTHON_ZSTANDARD_IMPORT_POLICY=rust pytest

There are currently a number of caveats.

The Rust backend was mostly implemented using the CFFI backend as a source, so certain patterns from the C backend may not be preserved. This includes releasing of the GIL.

The Rust backend doesn't yet perform 0-copy in many call sites, so performance is likely slower than the C and even CFFI backends.

There are intermittent fuzzing test failures in CI due to timeouts (possibly due to current performance overhead of Rust backend).

The multi_compress_to_buffer and multi_decompress_to_buffer implementations are grossly inefficient and likely much slower than the C implementation.

We do not yet include the Rust backend in built wheels.

There hasn't been an extensive auditing of the code: there are likely bugs lingering.

Before I call the Rust backend stable, I plan to extensively audit the Rust code against the C code and fix any performance/correctness gaps.

@mgorny
Copy link
Contributor

mgorny commented Feb 17, 2021

So we've been revisiting Rust recently because of cryptography, and it's still a horrible choice. It's still far from mature, officially supports only x86 and arm64. A few of the platforms that python-zstandard currently runs on are not supported at all (and would require tons of work which the few volunteers simply can't afford to do), the majority that are unofficially supported are very hard to bootstrap and upstream simply ignores requests to help with that.

So we'd really appreciate some formal backup plan for us. Something better than 'remove python-zstandard from most of your supported platforms' or 'keep patching the old version yourself'.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants