Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRC32 calculation for upstream subrequests #90

Open
mdineen opened this issue Jun 1, 2022 · 10 comments
Open

CRC32 calculation for upstream subrequests #90

mdineen opened this issue Jun 1, 2022 · 10 comments
Labels

Comments

@mdineen
Copy link
Contributor

mdineen commented Jun 1, 2022

Adding files from block storage with - as checksum causes the module to calculate the crc-32. Adding the same file from an upstream results in 000000 for the checksum, which causes validation errors in some Windows file browsers.

Can we have the module calculate the crc-32 for upstream files?

@mdineen
Copy link
Contributor Author

mdineen commented Jun 1, 2022

Solved by updating

sr_ctx = ngx_http_get_module_ctx(r, ngx_http_zip_module);

366 -    sr_ctx = ngx_http_get_module_ctx(r, ngx_http_zip_module);
366 +    sr_ctx = ngx_http_get_module_ctx(r->main, ngx_http_zip_module);

I'll send a pr

@mdineen
Copy link
Contributor Author

mdineen commented Jun 3, 2022

The fix in #91 didn't work. By using s3fs the module calculates the crc32 properly.

@evanmiller
Copy link
Owner

The subrequest context is supposed to be set here:

sr_ctx->requesting_file = piece->file;
ngx_http_set_ctx(sr, sr_ctx, ngx_http_zip_module);

I wonder if that's not happening for some reason.

@mecampbellsoup
Copy link

+1 having same issue

@arsenetar
Copy link

When serving the files going into the zip directly from nginx (and not pre-calculating the crc32) it seems the crc32 does not get calculated correctly. Upon looking into this, it seems it is due to in when the call:

ngx_http_zip_subrequest_update_crc32(in, sr_ctx->requesting_file);

happens both p and len values are zero here, which causes the ngx_crc32_update() to run with length 0.
p = cl->buf->pos;
len = cl->buf->last - p;

The value of sr_ctx does appear to be correct.

Bit more background nginx set up as reverse proxy with upstream serving requests for the zip listings so the upstream will return items such as

- 19 %2Fcontent%2Fa.txt a.txt
- 15 %2Fcontent%2Fb.txt b.txt

Those files are then served by the reverse proxy itself (/contents, is mapped to path on the reverse proxy instance). All the crc32s are correctly identified as missing by mod_zip, however due to the previously mentioned issue they are sent as 0.

If instead of serving the /contents directly with a location section in the reverse proxy, an additional server section is added to the reverse proxy nginx instance /contents is served from that "local upstream" instead then the crc32 values are calculated correctly. Which means the following does not work:

    location /content {
        internal;
        root /mnt/data/content;
    }

However this does:

    location /content {
        internal;
        rewrite ^/content/(.*)$ /$1 break;
        proxy_pass http://127.0.0.1:8081;
    }
...
server {
    listen 8081;

    location /content {
        root /mnt/data/content;
    }
}

@evanmiller evanmiller added the bug label Dec 17, 2022
@mdineen
Copy link
Contributor Author

mdineen commented Mar 2, 2023

The subrequest context is supposed to be set here:

sr_ctx->requesting_file = piece->file;
ngx_http_set_ctx(sr, sr_ctx, ngx_http_zip_module);

I wonder if that's not happening for some reason.

When I debug that out, the pointer is 0000000000000000. If I set up debugs in ngx_http_zip_send_file_piece:

    ngx_http_set_ctx(sr, sr_ctx, ngx_http_zip_module);
    ngx_http_zip_sr_ctx_t *my_ctx;
    my_ctx = ngx_http_get_module_ctx(sr, ngx_http_zip_module);
    ngx_log_debug3(NGX_LOG_DEBUG_HTTP, r->connection->log, 0, "mod_zip: 1 setting sr, sr_ctx, myctx: %p %p %p", sr, sr_ctx, my_ctx);

and ngx_http_zip_subrequest_body_filter

    ngx_http_zip_sr_ctx_t *sr_ctx;
    sr_ctx = ngx_http_get_module_ctx(r, ngx_http_zip_module);
    if (shown == 0) {
        ngx_log_debug2(NGX_LOG_DEBUG_HTTP, r->connection->log, 0, "mod_zip: 3 getting r, sr_ctx : %p %p", r, sr_ctx);
        shown = 1;
    }

I get this output:

2023/03/21 16:45:41 [debug] 17630#0: *1 mod_zip: 1 setting sr, sr_ctx, myctx: 0000000000E9D7D8 0000000000E86518 0000000000E86518
2023/03/21 16:45:41 [debug] 17630#0: *1 mod_zip: 3 getting r, sr_ctx : 0000000000E9D7D8 0000000000000000
2023/03/21 16:45:41 [debug] 17630#0: *1 mod_zip: 1 setting sr, sr_ctx, myctx: 0000000000EA7CF0 0000000000E99618 0000000000E99618
2023/03/21 16:45:42 [debug] 17630#0: *1 mod_zip: 3 getting r, sr_ctx : 0000000000EA7CF0 0000000000000000
2023/03/21 16:45:42 [debug] 17630#0: *1 mod_zip: 1 setting sr, sr_ctx, myctx: 0000000000EAA730 0000000000EA26A8 0000000000EA26A8
2023/03/21 16:45:42 [debug] 17630#0: *1 mod_zip: 3 getting r, sr_ctx : 0000000000EAA730 0000000000000000

Which shows that the request setting and retrieing the context is the same, and that the pointer to the sr_ctx is retrievable right after it's set, but that trying to retrieve the same context in the body filter points to zeros.

To me this suggests the context is being nullified elsewhere, before we get to the body filter.

@mdineen
Copy link
Contributor Author

mdineen commented Mar 15, 2023

When I create sr_ctx with a global variable and set it directly the problem goes away. Something is clearing the module context between the calls to ngx_http_zip_send_file_piece and ngx_http_zip_subrequest_body_filter.

diff --git a/ngx_http_zip_module.c b/ngx_http_zip_module.c
index cbd448e..fcfd4fc 100644
--- a/ngx_http_zip_module.c
+++ b/ngx_http_zip_module.c
@@ -11,6 +11,7 @@
 #include "ngx_http_zip_file.h"
 #include "ngx_http_zip_headers.h"
 
+static ngx_http_zip_sr_ctx_t *global_sr_ctx;
 static ngx_chain_t *ngx_chain_last_link(ngx_chain_t *chain_link);
 static ngx_int_t ngx_http_zip_discard_chain(ngx_http_request_t *r,
         ngx_chain_t *in);
@@ -334,7 +335,7 @@ ngx_http_zip_subrequest_body_filter(ngx_http_request_t *r, ngx_chain_t *in)
 {
     ngx_http_zip_sr_ctx_t *sr_ctx;
 
-    sr_ctx = ngx_http_get_module_ctx(r, ngx_http_zip_module);
+    sr_ctx = global_sr_ctx;
 
     if (in && sr_ctx && sr_ctx->requesting_file->missing_crc32) {
         uint32_t old_crc32 = sr_ctx->requesting_file->crc32;
@@ -542,7 +543,8 @@ ngx_http_zip_send_file_piece(ngx_http_request_t *r, ngx_http_zip_ctx_t *ctx,
 
     sr_ctx->requesting_file = piece->file;
 
-    ngx_http_set_ctx(sr, sr_ctx, ngx_http_zip_module);
+    global_sr_ctx = sr_ctx;
+    
     if (ctx->wait) {
         ngx_log_error(NGX_LOG_ERR, r->connection->log, 0,
                 "mod_zip : only one subrequest may be waited at the same time; ");

I don't think this is the right solution, but it does work.

This fails when there are >1 requests with subrequests happening, causing all subrequests' chunks to add on to a single running crc32 value.

@dleonov
Copy link

dleonov commented May 9, 2024

I discovered that disabling buffering for files that are to be zipped resolves the incorrect CRC32 checksum issue. To fix this, add proxy_buffering off; to the NGINX location block that serves the files specified in the mod_zip manifest.

@mdineen
Copy link
Contributor Author

mdineen commented May 9, 2024

#111 fixes the problem at the source.

@adamantike
Copy link

@mdineen, even with that commit applied, I am still getting zero values for CRC-32.

@arsenetar's approach to add a different server instead of serving from an internal location kind of worked, but it generates a wrong CRC-32 value for the first file in the response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants