From 41a813557393d4f03e82537c1fc6434daf24bd69 Mon Sep 17 00:00:00 2001 From: Daniel Carl Jones Date: Mon, 30 Oct 2023 13:52:17 +0000 Subject: [PATCH] Add initial caching documentation Signed-off-by: Daniel Carl Jones --- doc/CONFIGURATION.md | 13 +++++++++++++ doc/SEMANTICS.md | 9 +++++++-- 2 files changed, 20 insertions(+), 2 deletions(-) diff --git a/doc/CONFIGURATION.md b/doc/CONFIGURATION.md index c390f11b1..3e8f12c24 100644 --- a/doc/CONFIGURATION.md +++ b/doc/CONFIGURATION.md @@ -164,6 +164,19 @@ By default, Mountpoint allows creating new files, and does not allow deleting ex You cannot currently use Mountpoint to overwrite existing objects. However, if you use the `--allow-delete` flag, you can first delete the object and then create it again. +### Caching + +Mountpoint now offers caching of metadata and object content allowing for reduced requests when reading files. +This is particularly useful when reading the same files many times for the same Mountpoint filesystem. + +To enable caching, use the `--caching` command-line flag. +This alone will enable caching of metadata using a default time-to-live (TTL) of 60 minutes. +Caching of object/file content on disk can also be enabled +by providing a caching location using the `--data-cache-directory ` command-line flag. + +Enabling caching relaxes the strong read-after-write consistency offered by Mountpoint with default configuration. +Please read more in the [consistency and concurrency section of the semantics documentation](./SEMANTICS.md#consistency-and-concurrency). + ### S3 storage classes Amazon S3 offers a [range of storage classes](https://aws.amazon.com/s3/storage-classes/) that you can choose from based on the data access, resiliency, and cost requirements of your workloads. When creating new files with Mountpoint, you can control which storage class the corresponding objects are stored in. 
By default, Mountpoint uses the S3 Standard storage class, which is appropriate for a wide variety of use cases. To store new objects in a different storage class, use the `--storage-class` command-line flag. Possible values for this argument include: diff --git a/doc/SEMANTICS.md b/doc/SEMANTICS.md index 6a2e7d583..7113a77ad 100644 --- a/doc/SEMANTICS.md +++ b/doc/SEMANTICS.md @@ -63,9 +63,14 @@ Mountpoint has limited support for other file and directory metadata, including ## Consistency and concurrency -Amazon S3 provides [strong read-after-write consistency](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html#ConsistencyModel) for PUT and DELETE requests of objects in your S3 bucket. Mountpoint provides strong read-after-write consistency for file writes, directory listing operations, and new object creation. For example, if you create a new object using another S3 client, it will be immediately accessible with Mountpoint. Mountpoint also ensures that new file uploads to a single key are atomic. If you modify an existing object in your bucket with another client while also reading that object through Mountpoint, the reads will return either the old data or the new data, but never partial or corrupt data. To guarantee your reads see the newest object data, you can re-open the file after modifying the object. +Amazon S3 provides [strong read-after-write consistency](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html#ConsistencyModel) for PUT and DELETE requests of objects in your S3 bucket. +By default, Mountpoint provides strong read-after-write consistency for file writes, directory listing operations, and new object creation. For example, if you create a new object using another S3 client, it will be immediately accessible with Mountpoint. Mountpoint also ensures that new file uploads to a single key are atomic. 
If you modify an existing object in your bucket with another client while also reading that object through Mountpoint, the reads will return either the old data or the new data, but never partial or corrupt data. To guarantee your reads see the newest object data, you can re-open the file after modifying the object. -However, Mountpoint may return stale metadata for an existing object within 1 second of the object being modified or deleted in your S3 bucket by another client. This occurs only if the object was accessed through Mountpoint immediately before being modified or deleted in your S3 bucket. The stale metadata will only be visible through metadata operations such as `stat` on individual files. Directory listings will never be stale and always reflect the current metadata. These cases do not apply to newly created objects, which are always immediately visible through Mountpoint. Stale metadata can be refreshed by either opening the file or listing its parent directory. +Mountpoint also offers metadata and object content caching which can be enabled using CLI flags: see the [caching section of the configuration document](./CONFIGURATION.md#caching) for more information. +When opting into caching, the consistency model is relaxed and you may see stale entries until they have expired. +Stale entries may live as long as the cache's metadata time-to-live (TTL). + +However, even with caching off, Mountpoint may return stale metadata for an existing object within 1 second of the object being modified or deleted in your S3 bucket by another client. This occurs only if the object was accessed through Mountpoint immediately before being modified or deleted in your S3 bucket. The stale metadata will only be visible through metadata operations such as `stat` on individual files. With caching mode off, directory listings will never be stale and always reflect the current metadata. 
These cases do not apply to newly created objects, which are always immediately visible through Mountpoint. Stale metadata can be refreshed by either opening the file or listing its parent directory. Mountpoint allows multiple readers to access the same object at the same time. However, a new file can only be written to sequentially and by one writer at a time. New files that are being written are not available for reading until the writing application closes the file and Mountpoint finishes uploading it to S3. If you have multiple Mountpoint mounts for the same bucket, on the same or different hosts, there is no coordination between writes to the same object. We recommend that your application does not write to the same object from multiple instances at the same time.