Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docs | Bucket Types #1448

Merged
merged 1 commit into from
Dec 6, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
45 changes: 45 additions & 0 deletions doc/bucket-types.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
[NooBaa Operator](../README.md) /

# Bucket Types in NooBaa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is great contribution, and I think it can be even better in noobaa-core

NooBaa supports two types of buckets - most commonly referred to as `namespace` and `data`. While all NooBaa buckets are accessible via the S3 API and appear the same to the user, they have different uses and implementations under the hood.
This document will explaing the meaning of each, as well as the differences between the two.

## Data Buckets
Data buckets are the 'classic' type of bucket in NooBaa. In a classic NooBaa deployment, the product creates several resources by default - a backingstore (`noobaa-default-backing-store`), a bucketclass (`noobaa-default-bucket-class`) and a bucket (`first.bucket`) as part of the product's initialization process. When data is written to data buckets, it is first processed by NooBaa - the data is compressed, deduplicated, encrypted, and split into chunks. These chunks are then stored in the backingstore that the bucket is connected to.
The only way to access the objects on data buckets is through NooBaa - the chunks cannot be used or deciphered without the system.
Neon-White marked this conversation as resolved.
Show resolved Hide resolved

## Namespace Buckets
Namespace buckets work differently than data buckets, since they try¹ to not apply any processing on the objects that are uploaded to them, and act as more of a 'passthrough'. In cases where a non-S3 storage provider API is used, NooBaa takes care of bridging potential gaps between S3 and the provider's API - no user action is required. This also means that as long as NooBaa supports a certain provider's API, users can use the S3 API to manage it (while NooBaa takes care of the 'translation').

The objects can be accessed both from within NooBaa, as well as from whatever storage providers hosts them.
Namespace buckets also support having multiple 'read' sources - for example, a single namespace bucket can show the objects from several Azure Blob containers, AWS S3 buckets, Google Cloud Platform buckets, all at the same time.
Another feature of namespace buckets is cache, which can be configured to keep frequently-accessed objects in the system's memory, effectively reducing access time as well as egress and access costs.
shirady marked this conversation as resolved.
Show resolved Hide resolved

_1. In some cases, object metadata (such as tags) might have to be modified in order to comply with a cloud provider's limits_

## Supported Storage Providers by Bucket Type
| Service | Data Buckets | Namespace Buckets |
|---------------------------------------------------| :---: | :---: |
| Amazon Web Services S3 (AWS) | ✅ | ✅
| S3 Compatible Services | ✅ | ✅
| IBM Cloud Object Storage (COS) | ✅ | ✅
| Azure Blob Storage | ✅ | ✅
| Google Cloud Platform Object Storage (GCP) | ✅ | ❌
| Filesystem (NSFS) | ❌ | ✅
| Persistent Volume Pool (PV) | ✅ | ❌

For specific API calls (including non-S3 calls, such as IAM), you can check [AWS API Compatibility](https://github.com/noobaa/noobaa-core/blob/master/docs/design/AWS_API_Compatibility.md)

_Please note that the table above might be out of date, or have additional support that is not present in older versions - the best way to check if a storage provider is supported is to run the NooBaa CLI tool (that fits the NooBaa version installed on the cluster) with a command such as `noobaa namespacestore create` or `noobaa backingstore create` and see which providers are listed in the help message._

## Buckets in Relation to Bucketclasses and Stores
In order to connect NooBaa to a storage provider (regardless of whether it's a cloud provider or an on-premises storage system), a store must be created - either a [backingstore](backing-store-crd.md), or a [namespacestore](namespace-store-crd.md). A store signifies a connection to a storage provider - it requires credentials, and sometimes additional configuartion such as a custom endpoint (when utilizing S3 compatible services that use the S3 API but aren't AWS. For example, Google's S3-compatible Cloud Storage URI is https://storage.googleapis.com).
The type of store chosen determines what type of buckets will be created on top of it -
- Namespacestores are used to create namespace buckets
- Backingstores are used to create data buckets

Once a store is created, it's necessary to create a [bucketclass](bucket-class-crd.md) resource in order to utilize it. Bucketclasses allow users to define bucket policies relating to data placement and replication, and are used as a middle layer between buckets and stores.

After the bucketclass has been created, it's now finally possible to create an [object bucket claim](obc-provisioner.md), which is then reconciled by NooBaa and creates a bucket in the system. The created bucket can then be interacted with via the S3 API.

For further reading on S3 API compatibility, see the [S3 Compatibility](s3-compatibility.md) document.
Loading