
cross-testing #691

Open
real-or-random opened this issue Nov 9, 2019 · 15 comments


@real-or-random
Contributor

In #635 I was convinced that cross-testing is a good thing. It would be nice to have cross tests for

@elichai
Contributor

elichai commented Nov 9, 2019

Just linking to #641 (which I stopped because there was no more discussion around whether it's wanted and how to vendor the code)

@real-or-random
Contributor Author

@gmaxwell suggests micro-ecc in #716. I think that's by far the simplest alternative (simple, mostly portable C), see #716 for a detailed discussion. I'd prefer this over the other mentioned projects. What do people think?

@real-or-random
Contributor Author

Now that the OpenSSL tests are gone, we should make this a priority.

Here are some possibilities:

I think micro-ecc is a nice option and a good first choice. @elichai would you still be interested in getting this in?

#641 got stuck because there was no clear commitment to using Rust. I wouldn't be opposed to languages other than C in the tests, but here's a suggestion that I find very attractive: we could compile the Rust down to (ugly) C using mrustc and vendor this C code. This would give us the best of both worlds: one can compare to an independent simple implementation with the same API, and one does not need a Rust compiler to run the tests. The same is true for secp256kfun, but I don't know how independent that implementation is. (cc @LLFourn)

But before we decide on the way forward, here's another very interesting direction: A much more sophisticated approach to this kind of testing is cryptofuzz by @guidovranken, which does differential testings of all kinds of algorithms and libraries. For example, it's used to test libsecp256k1 against Botan on oss-fuzz (google/oss-fuzz#5717).

@guidovranken Were you aware of our OpenSSL cross tests (removed in #983)? I really wonder how cryptofuzz compares against these. I can imagine it would be superior to simply use cryptofuzz and maybe integrate more libraries such as some of the ones mentioned above.

@guidovranken

Apart from testing sign/verify/recover, it is also a good idea to test point validation, addition and multiplication.
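As a rough illustration of what cross-testing these point operations can look like, here is a hedged Python sketch that compares two independently written scalar-multiplication routines over a toy curve (y² = x³ + 7 mod 17, a miniature stand-in with the secp256k1 curve shape; none of this is libsecp256k1 or micro-ecc code):

```python
# Toy differential test: two independent scalar-multiplication
# implementations over y^2 = x^3 + 7 mod 17. The curve, prime, and
# generator are illustrative only.
P = 17
G = (1, 5)  # a point on the curve: 5^2 = 8 = 1^3 + 7 (mod 17)

def add(p, q):
    """Affine point addition; None represents the point at infinity."""
    if p is None: return q
    if q is None: return p
    if p[0] == q[0] and (p[1] + q[1]) % P == 0:
        return None  # p + (-p) = infinity (also covers doubling with y = 0)
    if p == q:
        lam = (3 * p[0] * p[0]) * pow(2 * p[1], -1, P) % P
    else:
        lam = (q[1] - p[1]) * pow(q[0] - p[0], -1, P) % P
    x = (lam * lam - p[0] - q[0]) % P
    return (x, (lam * (p[0] - x) - p[1]) % P)

def mul_double_and_add(k, p):
    """Implementation A: binary double-and-add."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, p)
        p = add(p, p)
        k >>= 1
    return acc

def mul_naive(k, p):
    """Implementation B: repeated addition, deliberately different."""
    acc = None
    for _ in range(k):
        acc = add(acc, p)
    return acc

# Cross-test: any disagreement indicates a bug in one implementation.
for k in range(40):
    assert mul_double_and_add(k, G) == mul_naive(k, G), k
```

In a real cross-test the two sides would be libsecp256k1 and an independent library rather than two routines sharing a field implementation, but the comparison loop has the same shape.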

@real-or-random
Contributor Author

Thanks, that's a helpful overview! Unfortunately the Wycheproof tests do not match our semantics entirely, see C2SP/wycheproof#70 .

@guidovranken Can you also comment on the high-level ideas here? For example, when it comes to functionality, do you think cryptofuzz is superior in every aspect, or do you think it would still make sense for us to maintain our own cross tests?

Of course there are also other considerations besides functionality, e.g., build integration. For example, if we stick to maintaining our own tests because they're easier to ship to users, I imagine we could still have a short run of cryptofuzz on CI.

@LLFourn

LLFourn commented Oct 18, 2021

The same is true for secp256kfun but I don't know how independent that implementation is. (cc @LLFourn)

The best rust impl to try this approach for would be k256 since it is a completely independent code base and even uses different algorithms for ECC (it uses projective addition formula from https://eprint.iacr.org/2015/1060.pdf). I was going to move secp256kfun over to using it as a backend eventually anyway.

@guidovranken

@guidovranken Can you also comment on the high-level ideas here? For example, when it comes to functionality, do you think cryptofuzz is superior in every aspect, or do you think it would still make sense for us to maintain our own cross tests?

A notable shortcoming of CF is that it only compares the results produced by various crypto libraries, but not the absence of results. For example it doesn't confirm the absence of a result in these cases:

  • Point multiplication by 0 is an illegal operation and should not produce a result.
  • Point addition/multiplication involving invalid points should not produce a result.
  • ECDSA recovery that yields a point at infinity pubkey should not produce a result.
  • ECDSA verification involving a pubkey or signature which are not valid points may return false or it may not produce a result at all.

Technically, each operation run in a crypto library returns a std::optional<Result>. It returns std::nullopt if it cannot compute a result for any reason, and it returns a Result if it can. Once the operation has been run in every crypto library, CF filters out all results which are std::nullopt. The remaining set is then compared for differences.

So in Python-pseudocode:

results = []

# Run operation in each crypto library
for module in modules:
    results += [ module.Run(operation) ]

# Filter out nullopt results
results = [r for r in results if r is not None]

# Detect discrepancies
if len(set(results)) != 1:
    crash()

This logic exists for a good reason in CF but the corollary is that it can miss certain types of bugs. I do want to address this at some point in the future but for now having a separate test suite is still valuable.
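A minimal sketch of that blind spot (hypothetical values, not actual cryptofuzz code): when the correct behavior is "no result", a single buggy module that does return a value survives the filtering with nothing left to contradict it.

```python
# Sketch of the blind spot described above: after None results are
# filtered out, a lone remaining result is never contradicted, so the
# comparison cannot fail.
def cf_compare(results):
    """CF-style check: drop None, then require agreement."""
    results = [r for r in results if r is not None]
    return len(set(results)) <= 1  # True = no discrepancy detected

# Point multiplication by 0: correct modules produce no result (None),
# while a hypothetical buggy module returns a value anyway.
correct_a, correct_b, buggy = None, None, (0, 0)
assert cf_compare([correct_a, correct_b, buggy])  # bug goes undetected
```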

@elichai
Contributor

elichai commented Oct 19, 2021

The best rust impl to try this approach for would be k256 since it is a completely independent code base and even uses different algorithms for ECC (it uses projective addition formula from https://eprint.iacr.org/2015/1060.pdf).

I'm not sure how true this is; see, for example: RustCrypto/elliptic-curves#59, RustCrypto/elliptic-curves#82

@LLFourn

LLFourn commented Oct 19, 2021

The best rust impl to try this approach for would be k256 since it is a completely independent code base and even uses different algorithms for ECC (it uses projective addition formula from https://eprint.iacr.org/2015/1060.pdf).

I'm not sure how true this is; see, for example: RustCrypto/elliptic-curves#59, RustCrypto/elliptic-curves#82

Right. To be clear, I think they use different ECC addition/doubling formulas; I guess the field/scalar arithmetic could be pretty similar. I wouldn't be the best judge of that. Still, I think it's the best candidate if you want to try cross-testing against a Rust codebase.

@real-or-random
Contributor Author

real-or-random commented Feb 11, 2022

This logic exists for a good reason in CF but the corollary is that it can miss certain types of bugs. I do want to address this at some point in the future but for now having a separate test suite is still valuable.

Can you elaborate what these good reasons are? Maybe we could help.

edit: Let me clarify that even if cryptofuzz were perfect, I still believe it's useful to have simple cross tests in our test suite. cryptofuzz is not something we will integrate into make check (it naturally has a lot of dependencies, etc.). I think it's good to have something easy enough to be run not only by us but also by our users, packagers, etc. It should ideally just work out of the box without any dependencies.

@guidovranken

The "good reason" is that I allow every library to fail executing an operation for any or no reason. Reasons include:

  • In some libraries I induce random allocation failures, to test its resiliency. If a malloc() fails, then the operation should be allowed to fail.
  • In some libraries I hook the PRNG that a library is using (for example for ECDSA signing). For particular outputs of the PRNG, it should be allowed to fail.
  • Some libraries have fixed constraints, such as a predefined bignum size. Therefore it is normal that some operations, like multiplication, will fail.
  • A library might be implementing only a subset of a primitive in a way that is adequate for its intended usage domain.

In some cases it is dubious whether a certain function failing is signalling some specific information.
For example, if a function which computes a modular inverse fails, is it due to an internal error (e.g. malloc failure) or is it signalling that the inverse does not exist?
I need to review these on a case-by-case basis, so I cannot simply interpret failure as a bug by default.
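For illustration of the ambiguity, Python's built-in pow (Python 3.8+) makes "no inverse exists" a distinct, meaningful signal via a specific exception, whereas a C API that only returns a generic failure code conflates it with internal errors:

```python
# "Inverse does not exist" is a mathematical answer, not an internal
# error. Python distinguishes it with ValueError; a bare error code
# from a C library cannot make that distinction.
def modinv(a, m):
    try:
        return pow(a, -1, m)   # a^-1 mod m, Python 3.8+
    except ValueError:
        return None            # gcd(a, m) != 1: no inverse exists

assert modinv(3, 7) == 5       # 3 * 5 = 15 = 1 (mod 7)
assert modinv(6, 9) is None    # gcd(6, 9) = 3, inverse undefined
```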

Currently, the ECC point operations all return a std::optional<component::ECC_Point>. The component::ECC_Point struct is something like this:

struct {
	std::string X;
	std::string Y;
};

In hindsight it would have been better if this struct also included a field bool is_infinity. I could change this, but I would have to do it for all modules, otherwise the whole thing won't compile anymore, so that will take some time.
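Modeled in Python for brevity (the names here are illustrative, not the actual cryptofuzz types), the suggested extension might look like:

```python
# A model of the extended result type suggested above: an explicit
# is_infinity flag makes "point at infinity" a comparable result
# rather than forcing a module to return no result at all.
from dataclasses import dataclass

@dataclass(frozen=True)
class ECCPoint:
    X: str = "0"
    Y: str = "0"
    is_infinity: bool = False

INFINITY = ECCPoint(is_infinity=True)

# Two modules that both report infinity now compare equal, so the
# differential check can cover this case instead of skipping it.
assert INFINITY == ECCPoint(is_infinity=True)
assert INFINITY != ECCPoint("0", "0")
```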

As a workaround I've started to add asserts to various libraries to ensure certain operations conform to certain expectations.

For example, in secp256k1, I now have:

I want to add similar asserts for point addition and point doubling, where the logic will be:

  • Addition of two valid points must produce a valid point UNLESS the points are negations of each other, in which case it must produce the point at infinity
  • Point doubling of a valid point must always succeed
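The two invariants above can be expressed as exhaustive property checks; here is a hedged sketch over a toy curve (y² = x³ + 7 mod 17, illustrative only, not secp256k1):

```python
# Property checks for the addition/doubling invariants above, run
# over every point of a tiny curve y^2 = x^3 + 7 mod 17.
P = 17

def neg(p):
    return None if p is None else (p[0], (-p[1]) % P)

def add(p, q):
    """Affine addition; None is the point at infinity."""
    if p is None: return q
    if q is None: return p
    if p[0] == q[0] and (p[1] + q[1]) % P == 0:
        return None  # additive inverses sum to infinity
    if p == q:
        lam = (3 * p[0] ** 2) * pow(2 * p[1], -1, P) % P
    else:
        lam = (q[1] - p[1]) * pow(q[0] - p[0], -1, P) % P
    x = (lam ** 2 - p[0] - q[0]) % P
    return (x, (lam * (p[0] - x) - p[1]) % P)

# Enumerate all finite points on the curve.
points = [(x, y) for x in range(P) for y in range(P)
          if (y * y - x ** 3 - 7) % P == 0]

for a in points:
    # Doubling a valid point must always succeed (never raise).
    assert add(a, a) is None or add(a, a) in points
    for b in points:
        s = add(a, b)
        if b == neg(a):
            assert s is None     # negations sum to infinity
        else:
            assert s in points   # otherwise: a valid point
```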

Feel free to specify additional invariants for any secp256k1 function/operation and I will implement it in the fuzzer.

@real-or-random
Contributor Author

@guidovranken Makes sense. Here's another random thought (that may be stupid): Did you ever consider adding an old version of libsecp256k1? We have done so many changes that this may be interesting. On the other hand, I don't know if this adds much to the diversity of implementations. (Maybe it's useful to cross-test new libsecp256k1 against old libsecp256k1 but not if you're anyway testing in parallel against all the other libraries?)

@guidovranken

@real-or-random

Testing two versions of the same library in the same fuzzer binary is difficult due to symbol collisions, though it is possible by launching an external process.

Following your suggestion, I made some changes to the harness to allow compiling it with older versions of libsecp256k1 (specifically at the last commit of 2019 and the last commit of 2018 -- these were arbitrary choices): https://github.com/guidovranken/cryptofuzz/blob/master/modules/secp256k1/README.md

I ran the fuzzer on those and did not find anything amiss.

Are there any specific older versions that you think are worthwhile to test? Maybe specific releases of the library that are still in widespread use (for example by Ethereum)? I suppose it would be valuable to discover bugs in older versions even if they are fixed in the current version. Sometimes software projects fix bugs while addressing something else, without ever knowing there was a bug, so this can be meaningful. I can even add an additional fuzzer for an older libsecp256k1 version to the OSS-Fuzz project.

@apoelstra
Contributor

@guidovranken one potential source of old versions might be the bundled copies of libsecp256k1 in rust-secp256k1. These are in the directory secp256k1-sys/depend/secp256k1; the revision used is recorded in the file secp256k1-sys/depend/secp256k1-HEAD-revision.txt; and all of the externally linked symbols have been conveniently renamed to include version numbers.

Not commenting on whether this is a useful direction to go in or not, but if you did want to do it, here is a corpus.

@real-or-random
Contributor Author

Are there any specific older versions that you think are worthwhile to test?

I didn't have anything specific in mind. My suggestion was simply based on the observation that a lot of code has been changed over the years, including replacement of entire internal algorithms (safegcd for modinv, or 557b31f for a very recent example, ...)

I suppose it would be valuable to discover bugs in older versions even if they are fixed in the current version. Sometimes software projects fix bugs while addressing something else, without ever knowing there was a bug, so this can be meaningful.

Indeed, this was my thinking.

Maybe specific releases of the library that are still in widespread use (for example by Ethereum)?

Yeah, those are an obvious choice. A lot of projects vendor our code including (old releases of) Bitcoin Core, and lightning implementations, many many altcoins. I think these are good choices for secp256k1, as are the versions bundled by rust-secp256k1 or other bindings as mentioned by Andrew.

In the end, there are so many valid choices that I would just pick one or two more or less arbitrarily. Covering as much "time" as possible is useful, and your approach of taking the last commit in a year does the job. I think a mix of old and more recent versions is interesting: a very old version still in use is a good target for finding bugs, and more recent versions are good as references because they tend to have the same API (and support schnorrsig etc.).

I think if you run cryptofuzz with libsecp256k1 plus 10 other libraries, then I'd suppose adding an old version does not add much if you want to find bugs in the current version. But if you want to find bugs in older versions of libsecp256k1, then of course it makes a lot of sense to run the old version.

I can even add an additional fuzzer for an older libsecp256k1 version to the OSS-Fuzz project.

(With my limited knowledge of the available resources there etc.,) this sounds interesting at least.
