Add checksum algorithm to ListObjectsV2 response #1086

Merged on Oct 29, 2024 (7 commits)
8 changes: 8 additions & 0 deletions mountpoint-s3-client/CHANGELOG.md
Expand Up @@ -5,6 +5,11 @@
* Add parameter to request checksum information as part of a `HeadObject` request.
If specified, the result should contain the checksum for the object if available in the S3 response.
([#1083](https://github.com/awslabs/mountpoint-s3/pull/1083))
* Expose checksum algorithm in `ListObjectsResult`'s `ObjectInfo` struct.
([#1086](https://github.com/awslabs/mountpoint-s3/pull/1086))
* `ChecksumAlgorithm` has a new variant `Unknown(String)`,
to accommodate algorithms not recognized by the client should they be added in future.
([#1086](https://github.com/awslabs/mountpoint-s3/pull/1086))

### Breaking changes

Expand All @@ -17,6 +22,9 @@
* `head_object` method now requires a `HeadObjectParams` parameter.
The structure itself is not required to specify anything to achieve the existing behavior.
([#1083](https://github.com/awslabs/mountpoint-s3/pull/1083))
* The `ObjectInfo` struct and the `ChecksumAlgorithm` enum are now marked `non_exhaustive`, to indicate that new fields or variants may be added in the future.
`ChecksumAlgorithm` no longer implements `Copy`.
([#1086](https://github.com/awslabs/mountpoint-s3/pull/1086))

## v0.11.0 (October 17, 2024)

Expand Down
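The changelog entries above have a practical consequence for code that matches on `ChecksumAlgorithm`: a `non_exhaustive` enum needs a wildcard arm in downstream crates, and values must be cloned now that `Copy` is gone. The following is a minimal sketch under stated assumptions. The enum is a local stand-in that mirrors the variants named in this PR, and `describe` is a hypothetical helper, not part of the client's API.

```rust
// Local stand-in for the client's enum (assumption: variant names follow the
// diff in this PR; the real type lives in the mountpoint-s3 crates).
#[derive(Debug, Clone, PartialEq, Eq)]
#[non_exhaustive]
pub enum ChecksumAlgorithm {
    Crc32,
    Crc32c,
    Sha1,
    Sha256,
    Unknown(String),
}

/// Hypothetical helper: a human-readable label, e.g. for log messages.
// The wildcard arm is required in downstream crates because of
// `#[non_exhaustive]`; inside the defining crate it is unreachable,
// so the lint is silenced for this sketch.
#[allow(unreachable_patterns)]
pub fn describe(algorithm: &ChecksumAlgorithm) -> String {
    match algorithm {
        ChecksumAlgorithm::Crc32 => "CRC32".to_string(),
        ChecksumAlgorithm::Crc32c => "CRC32C".to_string(),
        ChecksumAlgorithm::Sha1 => "SHA1".to_string(),
        ChecksumAlgorithm::Sha256 => "SHA256".to_string(),
        // Algorithms the client did not recognize keep their original name.
        ChecksumAlgorithm::Unknown(name) => format!("unrecognized ({})", name),
        // Variants added by a future release of the crate end up here.
        _ => "unsupported".to_string(),
    }
}

/// Hypothetical helper: since the enum is no longer `Copy`, keeping an owned
/// value from a borrowed one takes an explicit `clone()`.
pub fn remember(algorithm: &ChecksumAlgorithm) -> ChecksumAlgorithm {
    algorithm.clone()
}
```

Code that previously relied on implicit copies of `ChecksumAlgorithm` would need a `.clone()` at those sites, or could borrow instead.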
27 changes: 27 additions & 0 deletions mountpoint-s3-client/src/mock_client.rs
Expand Up @@ -256,6 +256,7 @@ impl MockClient {
etag: object.etag.as_str().to_string(),
storage_class: object.storage_class.clone(),
restore_status: object.restore_status,
checksum_algorithm: object.checksum.algorithm(),
});
}
}
Expand Down Expand Up @@ -317,6 +318,7 @@ impl MockClient {
etag: object.etag.as_str().to_string(),
storage_class: object.storage_class.clone(),
restore_status: object.restore_status,
checksum_algorithm: object.checksum.algorithm(),
});
}
next_continuation_token += 1;
Expand Down Expand Up @@ -1585,6 +1587,31 @@ mod tests {
assert_eq!(objects, expected_objects);
}

#[tokio::test]
async fn list_objects_checksum() {
let client = MockClient::new(MockClientConfig {
bucket: "test_bucket".to_string(),
..Default::default()
});

client.add_object("a.txt", MockObject::constant(0u8, 5, ETag::for_tests()));
let mut object_b = MockObject::constant(1u8, 5, ETag::for_tests());
object_b.set_checksum(Checksum {
checksum_crc32: None,
checksum_crc32c: None,
checksum_sha1: Some(String::from("QwzjTQIHJO11oZbfwq1nx3dy0Wk=")),
checksum_sha256: None,
});
client.add_object("b.txt", object_b);

let result = client
.list_objects("test_bucket", None, "/", 1000, "")
.await
.expect("should not fail");
assert_eq!(result.objects[0].checksum_algorithm, None);
assert_eq!(result.objects[1].checksum_algorithm, Some(ChecksumAlgorithm::Sha1));
}

#[tokio::test]
async fn test_put_object() {
let mut rng = ChaChaRng::seed_from_u64(0x12345678);
Expand Down
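The mock client test above checks that `checksum_algorithm` is `None` for an object uploaded without a checksum and `Some(..)` otherwise. A caller could use that distinction to filter a listing. This is a sketch only: `ObjectInfo` here is a cut-down local stand-in with just the two fields used, and `keys_without_checksum` is a hypothetical helper.

```rust
// Local stand-ins (assumption: simplified from the types shown in this PR).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ChecksumAlgorithm {
    Crc32,
    Crc32c,
    Sha1,
    Sha256,
    Unknown(String),
}

#[derive(Debug, Clone)]
pub struct ObjectInfo {
    pub key: String,
    pub checksum_algorithm: Option<ChecksumAlgorithm>,
}

/// Hypothetical helper: keys of listed objects that report no checksum
/// algorithm, in listing order.
pub fn keys_without_checksum(objects: &[ObjectInfo]) -> Vec<&str> {
    objects
        .iter()
        .filter(|object| object.checksum_algorithm.is_none())
        .map(|object| object.key.as_str())
        .collect()
}
```

An object whose algorithm is `Unknown(..)` still counts as having a checksum in this sketch, since the field is `Some`.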
67 changes: 67 additions & 0 deletions mountpoint-s3-client/src/object_client.rs
Expand Up @@ -622,6 +622,7 @@ pub enum RestoreStatus {
/// See [Object](https://docs.aws.amazon.com/AmazonS3/latest/API/API_Object.html) in the *Amazon S3
/// API Reference* for more details.
#[derive(Debug, Clone)]
#[non_exhaustive]
pub struct ObjectInfo {
/// Key for this object.
pub key: String,
Expand All @@ -643,6 +644,9 @@ pub struct ObjectInfo {

/// Entity tag of this object.
pub etag: String,

/// The algorithm that was used to create a checksum of the object.
pub checksum_algorithm: Option<ChecksumAlgorithm>,
}

/// All possible object attributes that can be retrieved from [ObjectClient::get_object_attributes].
Expand Down Expand Up @@ -707,6 +711,26 @@ impl Checksum {
checksum_sha256: None,
}
}

/// Provide [ChecksumAlgorithm] for the [Checksum], if set and recognized.
///
/// This method assumes that at most one checksum will be set and will return the first matched.
pub fn algorithm(&self) -> Option<ChecksumAlgorithm> {
let Self {
checksum_crc32,
checksum_crc32c,
checksum_sha1,
checksum_sha256,
} = &self;

match (checksum_crc32, checksum_crc32c, checksum_sha1, checksum_sha256) {
(Some(_), _, _, _) => Some(ChecksumAlgorithm::Crc32),
(_, Some(_), _, _) => Some(ChecksumAlgorithm::Crc32c),
(_, _, Some(_), _) => Some(ChecksumAlgorithm::Sha1),
(_, _, _, Some(_)) => Some(ChecksumAlgorithm::Sha256),
(None, None, None, None) => None,
}
}
}

/// Metadata about object parts from GetObjectAttributes API.
Expand Down Expand Up @@ -749,3 +773,46 @@ pub struct ObjectPart {
/// Size of the part in bytes
pub size: usize,
}

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn test_checksum_algorithm_one_set() {
let checksum = Checksum {
checksum_crc32: None,
checksum_crc32c: None,
checksum_sha1: Some("checksum_sha1".to_string()),
checksum_sha256: None,
};
assert_eq!(checksum.algorithm(), Some(ChecksumAlgorithm::Sha1));
}

#[test]
fn test_checksum_algorithm_none_set() {
let checksum = Checksum {
checksum_crc32: None,
checksum_crc32c: None,
checksum_sha1: None,
checksum_sha256: None,
};
assert_eq!(checksum.algorithm(), None);
}

#[test]
fn test_checksum_algorithm_many_set() {
// Amazon S3 doesn't support more than one algorithm, but just in case... let's show we don't panic.
let checksum = Checksum {
checksum_crc32: None,
checksum_crc32c: Some("checksum_crc32c".to_string()),
checksum_sha1: Some("checksum_sha1".to_string()),
checksum_sha256: None,
};
let algorithm = checksum.algorithm().expect("checksum algorithm must be present");
assert!(
[ChecksumAlgorithm::Crc32c, ChecksumAlgorithm::Sha1].contains(&algorithm),
"algorithm should match one of the algorithms present in the struct",
);
}
}
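`Checksum::algorithm` returns the first populated field in declaration order, which is why `test_checksum_algorithm_many_set` only asserts that the result is one of the algorithms present. The sketch below makes that precedence explicit with an `or_else` chain instead of the tuple `match` used in the PR. The types are local stand-ins, and deriving `Default` is an assumption made for brevity here.

```rust
// Local stand-ins (assumption: field and variant names follow the diff above).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ChecksumAlgorithm {
    Crc32,
    Crc32c,
    Sha1,
    Sha256,
}

#[derive(Debug, Clone, Default)]
pub struct Checksum {
    pub checksum_crc32: Option<String>,
    pub checksum_crc32c: Option<String>,
    pub checksum_sha1: Option<String>,
    pub checksum_sha256: Option<String>,
}

impl Checksum {
    /// Returns the algorithm of the first populated field, checked in the
    /// order CRC32, CRC32C, SHA1, SHA256, or `None` if no field is set.
    pub fn algorithm(&self) -> Option<ChecksumAlgorithm> {
        self.checksum_crc32
            .as_ref()
            .map(|_| ChecksumAlgorithm::Crc32)
            .or_else(|| self.checksum_crc32c.as_ref().map(|_| ChecksumAlgorithm::Crc32c))
            .or_else(|| self.checksum_sha1.as_ref().map(|_| ChecksumAlgorithm::Sha1))
            .or_else(|| self.checksum_sha256.as_ref().map(|_| ChecksumAlgorithm::Sha256))
    }
}
```

With both CRC32C and SHA1 set, this sketch yields `Crc32c`, the same result the PR's `match` would produce for that input.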
22 changes: 21 additions & 1 deletion mountpoint-s3-client/src/s3_crt_client/list_objects.rs
Expand Up @@ -10,7 +10,8 @@ use time::OffsetDateTime;
use tracing::error;

use crate::object_client::{
ListObjectsError, ListObjectsResult, ObjectClientError, ObjectClientResult, ObjectInfo, RestoreStatus,
ChecksumAlgorithm, ListObjectsError, ListObjectsResult, ObjectClientError, ObjectClientResult, ObjectInfo,
RestoreStatus,
};
use crate::s3_crt_client::{S3CrtClient, S3Operation, S3RequestError};

Expand Down Expand Up @@ -114,6 +115,22 @@ fn parse_restore_status(element: &xmltree::Element) -> Result<Option<RestoreStat
}))
}

fn parse_checksum_algorithm(element: &xmltree::Element) -> Result<Option<ChecksumAlgorithm>, ParseError> {
let Some(checksum_algorithm) = get_field(element, "ChecksumAlgorithm").ok() else {
return Ok(None);
};

let checksum_algorithm = match checksum_algorithm.as_str() {
"CRC32" => ChecksumAlgorithm::Crc32,
"CRC32C" => ChecksumAlgorithm::Crc32c,
"SHA1" => ChecksumAlgorithm::Sha1,
"SHA256" => ChecksumAlgorithm::Sha256,
_ => ChecksumAlgorithm::Unknown(checksum_algorithm.clone()),
};

Ok(Some(checksum_algorithm))
}

fn parse_object_info_from_xml(element: &xmltree::Element) -> Result<ObjectInfo, ParseError> {
let key = get_field(element, "Key")?;

Expand All @@ -134,13 +151,16 @@ fn parse_object_info_from_xml(element: &xmltree::Element) -> Result<ObjectInfo,

let etag = get_field(element, "ETag")?;

let checksum_algorithm = parse_checksum_algorithm(element)?;

Ok(ObjectInfo {
key,
size,
last_modified,
storage_class,
restore_status,
etag,
checksum_algorithm,
})
}

Expand Down
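The parsing logic in `parse_checksum_algorithm` boils down to an exact, case-sensitive string match with an `Unknown` fallback, plus treating a missing element as "no algorithm". The sketch below isolates that mapping from the XML handling. It is an illustration under assumptions: the enum is a local stand-in, and both function names are hypothetical.

```rust
// Local stand-in (assumption: variant names follow the diff above).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ChecksumAlgorithm {
    Crc32,
    Crc32c,
    Sha1,
    Sha256,
    Unknown(String),
}

/// Hypothetical helper: maps the text of a `ChecksumAlgorithm` element to the
/// enum. The match is exact, so any other spelling falls through to `Unknown`.
pub fn checksum_algorithm_from_str(value: &str) -> ChecksumAlgorithm {
    match value {
        "CRC32" => ChecksumAlgorithm::Crc32,
        "CRC32C" => ChecksumAlgorithm::Crc32c,
        "SHA1" => ChecksumAlgorithm::Sha1,
        "SHA256" => ChecksumAlgorithm::Sha256,
        other => ChecksumAlgorithm::Unknown(other.to_string()),
    }
}

/// Hypothetical helper: a missing element (`None`) stays `None`, mirroring
/// the `Ok(None)` early return in the PR's parser.
pub fn parse_optional(value: Option<&str>) -> Option<ChecksumAlgorithm> {
    value.map(checksum_algorithm_from_str)
}
```

Keeping the unrecognized name inside `Unknown` means a listing does not fail merely because the service reports an algorithm this client version has not seen.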
54 changes: 54 additions & 0 deletions mountpoint-s3-client/tests/list_objects.rs
@@ -1,5 +1,10 @@
#![cfg(feature = "s3_tests")]

use aws_sdk_s3::primitives::ByteStream;
use aws_sdk_s3::types::ChecksumAlgorithm;
use bytes::Bytes;
use test_case::test_case;

pub mod common;

use common::*;
Expand Down Expand Up @@ -166,3 +171,52 @@ async fn test_interesting_keys() {
assert_eq!(result.objects.len(), 2);
assert!(result.next_continuation_token.is_none());
}

#[test_case(ChecksumAlgorithm::Crc32)]
#[test_case(ChecksumAlgorithm::Crc32C)]
#[test_case(ChecksumAlgorithm::Sha1)]
#[test_case(ChecksumAlgorithm::Sha256)]
#[tokio::test]
async fn test_checksum_attribute(upload_checksum_algorithm: ChecksumAlgorithm) {
let sdk_client = get_test_sdk_client().await;
let (bucket, prefix) = get_test_bucket_and_prefix("test_checksum_attribute");

let key = format!("{prefix}hello.txt");
let body = b"hello world!";
sdk_client
.put_object()
.bucket(&bucket)
.key(&key)
.body(ByteStream::from(Bytes::from_static(body)))
.checksum_algorithm(upload_checksum_algorithm.clone())
.send()
.await
.unwrap();

let client: S3CrtClient = get_test_client();

let result = client
.list_objects(&bucket, None, "/", 1000, &prefix)
.await
.expect("ListObjectsV2 should succeed");

assert!(
result.next_continuation_token.is_none(),
"there should be no continuation token",
);
assert_eq!(result.objects.len(), 1, "there should be exactly one object");
assert_eq!(result.common_prefixes.len(), 0, "there should be no common prefixes");

let object = &result.objects[0];
assert_eq!(object.key, format!("{}{}", prefix, "hello.txt"));

let expected_checksum_algorithm = match upload_checksum_algorithm {
ChecksumAlgorithm::Crc32 => mountpoint_s3_client::types::ChecksumAlgorithm::Crc32,
ChecksumAlgorithm::Crc32C => mountpoint_s3_client::types::ChecksumAlgorithm::Crc32c,
ChecksumAlgorithm::Sha1 => mountpoint_s3_client::types::ChecksumAlgorithm::Sha1,
ChecksumAlgorithm::Sha256 => mountpoint_s3_client::types::ChecksumAlgorithm::Sha256,
_ => todo!("update with new checksum algorithm should one become available"),
};

assert_eq!(Some(expected_checksum_algorithm), object.checksum_algorithm);
}
8 changes: 7 additions & 1 deletion mountpoint-s3-crt/src/s3/client.rs
Expand Up @@ -1445,7 +1445,8 @@ impl ChecksumConfig {
}

/// Checksum algorithm.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
#[non_exhaustive]
pub enum ChecksumAlgorithm {
/// Crc32c checksum.
Crc32c,
Expand All @@ -1455,6 +1456,11 @@ pub enum ChecksumAlgorithm {
Sha1,
/// Sha256 checksum.
Sha256,
/// Checksum of a type unknown to this S3 client.
///
/// This type will be used if Mountpoint ever encounters a checksum algorithm it doesn't recognize.
/// This should allow Mountpoint to continue with most file operations which don't depend on the checksum algorithm.
Unknown(String),
}

impl ChecksumAlgorithm {
Expand Down