-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
actix-multipart: Feature: Add typed multipart form extractor #2883
Conversation
Looks good! What is the purpose of using |
@JSH32 yep that is a good question. We want to allow reading into not just String itself, but also things like integers, floats, enums, i.e. Trait ConflictsBut we can't simultaneously implement Choice 1 (what I have implemented) #[derive(MultipartForm)]
struct Upload {
numbers: Vec<Text<i64>>,
} Choice 2 (we would have to use Vec and Option wrapper types) #[derive(MultipartForm)]
struct Upload {
numbers: VecWrapper<i64>,
} This is because of conflicting trait implementations (although one day this might be solved by specialization)
(Technically we could implement directly for String itself (rather than generic T), but it seems pointless since we would still need Deserialization AmbiguityThe multipart standard doesn't define any serialization format for the data within the fields themselves. So even though we may want to use Instead by using the For example: #[derive(MultipartForm)]
struct DeserializationMethods {
json: Json<HashMap<String, String>>,
plain: Text<String>,
}
async fn send() {
let mut form = multipart::Form::default();
// We can send the exact same input, but it is up to the server to choose how to deserialize it
form.add_text("json", "{\"key1\": \"value1\", \"key2\": \"value2\"}");
form.add_text("plain", "{\"key1\": \"value1\", \"key2\": \"value2\"}");
...
} Compatibility
I'm not 100% sure what you mean by this - but the idea is that If you wanted to use a more complex type e.g. arbitrary structs & complex enums you could use If you wanted to implement your own field type, then you just need to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly nitpicks form me. great work, I'm excited to use this! ❤️
@robjtede when you have time, please can you review this PR - and let me know what you think? |
actix-multipart/src/form/mod.rs
Outdated
.or_insert_with(|| T::limit(field.name())); | ||
limits.field_limit_remaining = entry.to_owned(); | ||
|
||
T::handle_field(&req, field, &mut limits, &mut state).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize this might be difficult with the &mut limits and &mut state in here, but would there be any way to race handle_field
against payload.try_next()
In my crate, actix-form-data, I have a FuturesUnordered that I push the field handler futures into. This allows field handlers to drop their Field stream upon completion and make the remaining work concurrent with the next field handlers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm... that is interesting I never even thought about that. I think to do that we would have to synchronise on the limits & state map, which is do-able.
I do wonder though if it actually brings any noticable performance improvement - since the multipart as a whole is being received as a single stream, presumably the next part can't be received until all of the previous part has been buffered - I guess it would depend on how much buffering actix-multipart does internally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if a handler does something like
let bytes = Vec::new();
while let Some(res) = field.next().await {
bytes.extend_from_slice(&res?);
}
drop(field);
let result = send_these_bytes_somewhere(bytes).await;
then the send_these_bytes_somewhere
could be run concurrently with the next field handler, since the field was completely read and dropped before that is executed.
I use this in my application pict-rs to better allow concurrently uploading files to object storage: https://git.asonix.dog/asonix/pict-rs/src/branch/main/src/store/object_store.rs#L246
This allows making additional requests to object storage without waiting for the existing in-flight requests to complete
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I see - you're effectively adding your own internal buffer to the handler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this mean your server memory usage is only bounded by the overall form limit, e.g. if send_these_bytes_somewhere
/ your object storage backend is running quite slowly, whilst the multipart upload is very fast, then nearly all of the contents could be be read into memory before getting uploaded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is true. actix-form-data has per-field size limits and total field count limits, so the maximum possible memory usage is max_field_size * max_field_count
, and in theory that limit could be hit by slow object storage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done some thinking on this and I think I can get away with using spawn
to make this concurrent without changing the design here.
Sorry for the mega delay; finally getting around to reviewing this. Excited by what I've gone through so far :) |
Thanks for looking at this @robjtede - I'm glad you figured out some improvements to the trait name 😆 ! |
What is the status of this PR? This is excellent work and I'd really love to have an "official" way to handle forms that have text fields as well as binary files. I am currently evaluating actix-easy-multipart v3.0.0 but I'd rather use actix-multipart with the same features. Thanks @jacob-pro for the hard work! |
/// Unknown field | ||
#[display(fmt = "Unsupported field `{}`", _0)] | ||
#[from(ignore)] | ||
UnsupportedField(#[error(not(source))] String), | ||
} | ||
|
||
/// Return `BadRequest` for `MultipartError` | ||
impl ResponseError for MultipartError { | ||
fn status_code(&self) -> StatusCode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@robjtede you may also want to add an override to the error_response()
function as well
Any status update? |
Thanks for the nudge @DuckyBlender. Added some trybuild tests. Once CI passes we'll get this merged and released 🎉 |
PR Type
Feature
PR Checklist
Overview
This introduces a new multipart form/data extractor for recieving a multipart upload into a struct.
I believe this should be enough to cover all the various features of the existing similar crates as per the discussion here: #2849
An example form looks like:
The key feature is that each field just needs to implement the
FieldReader
trait. This trait allows the user to provide their own abitarary async handler for processing a field, for example they may want to stream the data to S3.I look forward to your feedback! @robjtede @asonix @JSH32 @e-rhodes
List of features:
Update: If anyone wants to use these features right now I have backported them to my existing libary actix-easy-multipart v3.0.0