Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fuzz on maxBlockSize #3418

Merged
merged 8 commits into from
Jan 19, 2023

Conversation

daniellerozenblit
Copy link
Contributor

This PR extends our fuzz testing in response to a recently found bug in our repcode update logic (#3127). This bug occurred in scenarios where both offset_1 and offset_2 were invalid going into a block, and no matches were found. This bug was latent and luckily could not cause corruption, but we want to ensure that our fuzzer tests scenarios where both repcodes are invalidated to protect against these sorts of edge cases.

In order to test these scenarios, we want to generate cases with several blocks where block-size < window-size. It is currently not possible to do this in our non-streaming tests without generating large inputs. Testing with random flushing in streaming mode (which we currently do) should generate these scenarios, but we want to extend this to our non-streaming tests.

This PR includes a new cctx parameter, maxBlockSize which can be used to override the default ZSTD_BLOCKSIZE_MAX (128KB). This parameter is currently only meant to be used for testing purposes and is bounded by [1KB, ZSTD_BLOCKSIZE_MAX].

@daniellerozenblit daniellerozenblit marked this pull request as ready for review January 11, 2023 20:07
@Cyan4973
Copy link
Contributor

Can you imagine one (or several) test which would be able to check that maxBlockSize works as intended ?

ZSTD_freeCCtx(cctx);
free(checkBuf);
}
DISPLAYLEVEL(3, "OK \n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Great !

One more test I would like to see is what happens when maxBlockSize > windowSize.
In such case, we want to be sure that the code will not produce blocks larger than windowSize.
I think your code is fine and will respect this condition.
The purpose of the test is to ensure it remains this way, a future modification could break this condition.

@@ -964,6 +982,11 @@ size_t ZSTD_CCtxParams_setParameter(ZSTD_CCtx_params* CCtxParams,
CCtxParams->enableMatchFinderFallback = value;
return CCtxParams->enableMatchFinderFallback;

case ZSTD_c_maxBlockSize:
BOUNDCHECK(ZSTD_c_maxBlockSize, value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if you set the value to 0?

I think we can leave the bounds as-is, but also allow setting to zero to use the default maxBlockSize. And we should also call that out in the docs.

lib/zstd.h Outdated
Comment on lines 2074 to 2076
*
* Default is ZSTD_BLOCKSIZE_MAX.
*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a little more detail here. Say what this parameter controls, what the min/max are (and why), and explain what setting to 0 means.

Copy link
Contributor

@terrelln terrelln left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one minor nit in the zstd.h header, otherwise looks good!

lib/zstd.h Outdated
@@ -141,6 +141,7 @@ ZSTDLIB_API const char* ZSTD_versionString(void);

#define ZSTD_BLOCKSIZELOG_MAX 17
#define ZSTD_BLOCKSIZE_MAX (1<<ZSTD_BLOCKSIZELOG_MAX)
#define ZSTD_BLOCKSIZE_MAX_MIN (1 << 10) /* The minimum valid max blocksize. Maximum blocksizes smaller than this make compressBound() inaccurate. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this down below into the static linking only section

Putting this constant here means that it will always be present, and we don't like to do that until it has baked in the unstable section.

Copy link
Contributor

@terrelln terrelln left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed some bugs we'll have to fix. They're not your fault, our CCtxParams have grown significantly, and now a lot of them need to be "resolved" from some default value. And this resolution has to happen in many different places. We need to refactor the cctx params, so we can centralize all the logic for resolving & copying parameters.

Can you add a test case for the estimation functions that sets the maxBlockSize to 0? I think that we don't actually have any tests for these functions, because they should be failing right now.

You can add them to the test cases for the other static cctx functions here: https://github.com/facebook/zstd/blob/dev/tests/fuzzer.c#L1646

Basically you'd:

  • Create some ZSTD_CCtxParams, using the default values
  • Call ZSTD_estimateCCtxSize_usingCCtxParams()
  • Pass a buffer of that size to ZSTD_initStaticCCtx()
  • Do a compression & decompress the data to make sure it round trips
  • Do the same thing but with ZSTD_estimateCStreamSize_usingCCtxParams()
  • Do the same tests, but with an explicitly set maxBlockSize

You should write the test case before making the fix, so you can make sure that the test case exposes the bugs.

@@ -1944,7 +1972,7 @@ static size_t ZSTD_resetCCtx_internal(ZSTD_CCtx* zc,
}

{ size_t const windowSize = MAX(1, (size_t)MIN(((U64)1 << params->cParams.windowLog), pledgedSrcSize));
size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, windowSize);
size_t const blockSize = MIN(params->maxBlockSize, windowSize);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an assert that params->maxBlockSize != 0 up by the group of asserts on lines 1964-1966?

@@ -1651,7 +1679,7 @@ size_t ZSTD_estimateCStreamSize_usingCCtxParams(const ZSTD_CCtx_params* params)
RETURN_ERROR_IF(params->nbWorkers > 0, GENERIC, "Estimate CCtx size is supported for single-threaded compression only.");
{ ZSTD_compressionParameters const cParams =
ZSTD_getCParamsFromCCtxParams(params, ZSTD_CONTENTSIZE_UNKNOWN, 0, ZSTD_cpm_noAttachDict);
size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, (size_t)1 << cParams.windowLog);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think params->maxBlockSize could be equal to zero here, you have to resolve it.

{
size_t const windowSize = (size_t) BOUNDED(1ULL, 1ULL << cParams->windowLog, pledgedSrcSize);
size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, windowSize);
size_t const blockSize = MIN(maxBlockSize, windowSize);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think maxBlockSize could be equal to zero here, you have to resolve it.

@daniellerozenblit
Copy link
Contributor Author

I've added some tests for ZSTD_estimateCCtxSize_usingCCtxParams() and ZSTD_estimateCStreamSize_usingCCtxParams() and have confirmed that they fail before resolving maxBlockSize in the respective methods.

CHECK_Z(ZSTD_decompress(checkBuf, srcSize, dst1, size1));
CHECK(memcmp(src, checkBuf, srcSize) != 0, "Corruption!");

/* maxBlockSize = 3KB, windowLog = 10 */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@terrelln terrelln left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

@daniellerozenblit daniellerozenblit merged commit dc1c6cc into facebook:dev Jan 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants