Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Etag middleware #692

Closed
alextes opened this issue Sep 17, 2022 · 8 comments
Closed

Etag middleware #692

alextes opened this issue Sep 17, 2022 · 8 comments

Comments

@alextes
Copy link

alextes commented Sep 17, 2022

I was quite surprised to find nothing about etag middleware anywhere. No blog posts, no crates, no builtin middleware. Etags, although far from necessary, are quite important to help apps not download data over and over, but I'm sure folks in this repo know more about that than me 😄 .

So instead of saying Etag middleware is important! and then complaining no one has written it yet, I did the nicer thing and tried to write it myself. I'm new to Rust so not the ideal situation but that's life.

requirements:

  • works with any body. i.e. for strong etags, we should hash the byte-for-byte body, so we can't run before compression middleware.

This is the (monster) I came up with 😅 .

async fn etag_middleware<B: std::fmt::Debug>(
    req: Request<B>,
    next: Next<B>,
) -> Result<Response, StatusCode> {
    let if_none_match_header = req.headers().get(header::IF_NONE_MATCH).cloned();
    let path = req.uri().path().to_owned();
    let res = next.run(req).await;
    let (mut parts, mut body) = res.into_parts();

    let bytes = {
        let mut body_bytes = vec![];

        while let Some(inner) = body.data().await {
            let bytes = inner.unwrap();
            body_bytes.put(bytes);
        }

        body_bytes
    };

    match bytes.len() == 0 {
        true => {
            event!(Level::TRACE, path, "response without body, skipping etag");
            Ok(parts.into_response())
        }
        false => match if_none_match_header {
            None => {
                let etag = EntityTag::from_data(&bytes);

                parts.headers.insert(
                    header::ETAG,
                    HeaderValue::from_str(&etag.to_string()).unwrap(),
                );

                event!(
                    Level::TRACE,
                    path,
                    etag = etag.to_string(),
                    "no if-none-match header"
                );

                Ok((parts, bytes).into_response())
            }
            Some(if_none_match) => {
                let if_none_match_etag = if_none_match.to_str().unwrap().parse::<EntityTag>();
                match if_none_match_etag {
                    Err(ref err) => {
                        error!("{} - {:?}", err, &if_none_match_etag);
                        let etag = EntityTag::from_data(&bytes);
                        parts.headers.insert(
                            header::ETAG,
                            HeaderValue::from_str(&etag.to_string()).unwrap(),
                        );
                        Ok((parts, bytes).into_response())
                    }
                    Ok(if_none_match_etag) => {
                        let etag = EntityTag::from_data(&bytes);

                        parts.headers.insert(
                            header::ETAG,
                            HeaderValue::from_str(&etag.to_string()).unwrap(),
                        );

                        let some_match = etag.strong_eq(&if_none_match_etag);

                        event!(
                            Level::TRACE,
                            path,
                            etag = etag.to_string(),
                            some_match,
                            "if-none-match" = if_none_match_etag.to_string()
                        );

                        if some_match {
                            Ok((StatusCode::NOT_MODIFIED, parts).into_response())
                        } else {
                            Ok((parts, bytes).into_response())
                        }
                    }
                }
            }
        },
    }
}

Having to write this came at the worst possible time and ended up with the predictable late-hour debugging of caching issues due to this piece of code (who knew you need to send the hash even on a 304 🙈 ). Luckily, bugs were fixed and our big day was still successful 🥳 .

I'd like to save others from writing their own, but I could really use some help.

The blob above works with axum's middleware::from_fn. Would the first step be rewriting it to a tower Service?

Thanks for the great foundation Axum appears to be built on top of 💛 !

@alextes
Copy link
Author

alextes commented Sep 17, 2022

Maybe good to note, I'm aware that in the code above we:

  • buffer the response for everything
  • are manually pulling the bytes out of the body
  • make several unsafe unwraps
  • littered a bunch of logs

All these should be resolved but I thought I'd start by checking with more experienced people what the first steps would be while at the same time posting something that has handled a few million requests successfully (:

@LucioFranco
Copy link
Member

Hi! I think this type of middleware probably could exist in tower-http or in its own crate if that is too controversial.

@davidpdrsn
Copy link
Member

I think realistically it would be easier to iterate outside of tower-http and then merge it when it has matured a bit.

@alextes
Copy link
Author

alextes commented Sep 28, 2022

Works for me, I can close this. Where could I get help on tower? It's the main reason I opened here. I know a bunch about etag middleware now, but writing tower middleware / Rust is still hard for me. Especially if we're talking making zero allocations etc. But first step would be stuff like not buffering the bytes in a clumsy loop and I'm not sure where to learn about that. Been writing Rust for a couple months only and am solo on my team.

@LucioFranco
Copy link
Member

@alextes we have a #tower channel on the https://discord.gg/tokio that we can help you out in for general tower stuff.

@alextes alextes closed this as completed Sep 29, 2022
@billythedummy
Copy link

was development of this ETag middleware moved to some other repo?

@davidpdrsn
Copy link
Member

I don’t know of any.

@billythedummy
Copy link

billythedummy commented Oct 3, 2023

I've started on an implementation at https://github.com/billythedummy/tower-etag-cache/tree/master/tower-etag-cache. The implemented ConstLruProvider probably won't work well for streaming responses due to tower-rs/tower-http#324 (comment) but the generic traits exported by the crate (hopefully) allows for better implementations

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants