From 69db717feb2a0c7d8686a7b1fb66bec4e0de7ab2 Mon Sep 17 00:00:00 2001 From: Premkumar Date: Mon, 16 Sep 2024 14:02:26 -0700 Subject: [PATCH] Update faq page (#23704) * update faq page * fix a typo * add preview/stable * edits and fixes * changing to design goals style format * format * making link shortcode take named params * format * edits to the faq page * add trade-offs * fixes from code review * format --------- Co-authored-by: Dwight Hodge Co-authored-by: aishwarya24 --- .../Yugabyte/spelling-exceptions.txt | 1 + .../preview/architecture/design-goals.md | 12 +- .../preview/architecture/key-concepts.md | 48 +-- .../contribute/docs/widgets-and-shortcodes.md | 6 +- docs/content/preview/faq/general.md | 305 ++++++------------ .../data-migration/migrate-from-postgres.md | 8 +- .../stable/architecture/design-goals.md | 12 +- .../stable/architecture/key-concepts.md | 48 +-- .../data-migration/migrate-from-postgres.md | 8 +- docs/layouts/shortcodes/link.html | 10 +- docs/layouts/shortcodes/release.html | 8 + 11 files changed, 182 insertions(+), 284 deletions(-) diff --git a/.github/vale-styles/Yugabyte/spelling-exceptions.txt b/.github/vale-styles/Yugabyte/spelling-exceptions.txt index 593e5d940f07..81fa77244429 100644 --- a/.github/vale-styles/Yugabyte/spelling-exceptions.txt +++ b/.github/vale-styles/Yugabyte/spelling-exceptions.txt @@ -321,6 +321,7 @@ Javafuzz JavaScript Jenkins Jenkinsfile +Jepsen Jira jq jQuery diff --git a/docs/content/preview/architecture/design-goals.md b/docs/content/preview/architecture/design-goals.md index 2477b434d6e0..bb36ecbacb81 100644 --- a/docs/content/preview/architecture/design-goals.md +++ b/docs/content/preview/architecture/design-goals.md @@ -17,15 +17,15 @@ type: docs ## Scalability -YugabyteDB scales out horizontally by adding more nodes to handle increasing data volumes and higher workloads. With YugabyteDB, you can also opt for vertical scaling choosing more powerful infrastructure components. {{}} +YugabyteDB scales out horizontally by adding more nodes to handle increasing data volumes and higher workloads. With YugabyteDB, you can also opt for vertical scaling choosing more powerful infrastructure components. {{}} ## High Availability -YugabyteDB ensures continuous availability, even in the face of individual node failures or network partitions. YugabyteDB achieves this by replicating data across multiple nodes and implementing failover mechanisms via leader election. {{}} +YugabyteDB ensures continuous availability, even in the face of individual node failures or network partitions. YugabyteDB achieves this by replicating data across multiple nodes and implementing failover mechanisms via leader election. {{}} ## Fault Tolerance -YugabyteDB is resilient to various types of failures, such as node crashes, network partitions, disk failures, and other hardware or software faults and failure of various fault domains. It can automatically recover from these failures without data loss or corruption. {{}} +YugabyteDB is resilient to various types of failures, such as node crashes, network partitions, disk failures, and other hardware or software faults and failure of various fault domains. It can automatically recover from these failures without data loss or corruption. {{}} ## Consistency @@ -66,7 +66,7 @@ YugabyteDB monitors and automatically re-balances the number of tablet leaders a ## Data locality -YugabyteDB supports colocated tables and databases which enables related data to be kept together on the same node for performance reasons. 
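For illustration, the following is a minimal YSQL sketch of opting a database and its tables into colocation. It assumes the COLOCATION clause, and the database and table names are hypothetical.

```sql
-- Tables created in this database share a single tablet by default,
-- keeping small, frequently joined tables together on one node.
CREATE DATABASE orders_db WITH COLOCATION = true;

-- (Connect to orders_db before creating its tables.)

-- Colocated by default because the database is colocated.
CREATE TABLE order_status (code INT PRIMARY KEY, label TEXT);

-- A large or write-heavy table can opt out and be sharded across nodes.
CREATE TABLE orders (
    id BIGINT PRIMARY KEY,
    status_code INT REFERENCES order_status (code),
    total NUMERIC
) WITH (COLOCATION = false);
```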
{{}} +YugabyteDB supports colocated tables and databases which enables related data to be kept together on the same node for performance reasons. {{}} ## Security @@ -113,7 +113,7 @@ In addition: ## Cassandra compatibility -[YCQL](../../api/ycql/) is a [semi-relational CQL API](../../explore/ycql-language/) that is best suited for internet-scale OLTP and HTAP applications needing massive write scalability and fast queries. YCQL supports distributed transactions, strongly-consistent secondary indexes, and a native JSON column type. YCQL has its roots in the Cassandra Query Language. {{}} +[YCQL](../../api/ycql/) is a [semi-relational CQL API](../../explore/ycql-language/) that is best suited for internet-scale OLTP and HTAP applications needing massive write scalability and fast queries. YCQL supports distributed transactions, strongly-consistent secondary indexes, and a native JSON column type. YCQL has its roots in the Cassandra Query Language. {{}} ## Performance @@ -143,7 +143,7 @@ YugabyteDB has been designed with several cloud-native principles in mind. ## Kubernetes-ready -YugabyteDB works natively in Kubernetes and other containerized environments as a stateful application. {{}} +YugabyteDB works natively in Kubernetes and other containerized environments as a stateful application. {{}} ## Open source diff --git a/docs/content/preview/architecture/key-concepts.md b/docs/content/preview/architecture/key-concepts.md index 04d6328f24fc..503d22a16e2c 100644 --- a/docs/content/preview/architecture/key-concepts.md +++ b/docs/content/preview/architecture/key-concepts.md @@ -31,7 +31,7 @@ YugabyteDB provides ACID guarantees for all [transactions](#transaction). ## CDC - Change data capture -CDC is a software design pattern used in database systems to capture and propagate data changes from one database to another in real-time or near real-time. YugabyteDB supports transactional CDC guaranteeing changes across tables are captured together. This enables use cases like real-time analytics, data warehousing, operational data replication, and event-driven architectures. {{}} +CDC is a software design pattern used in database systems to capture and propagate data changes from one database to another in real-time or near real-time. YugabyteDB supports transactional CDC guaranteeing changes across tables are captured together. This enables use cases like real-time analytics, data warehousing, operational data replication, and event-driven architectures. {{}} ## Cluster @@ -43,11 +43,11 @@ Sometimes the term *cluster* is used interchangeably with the term *universe*. H ## DocDB -DocDB is the underlying document storage engine of YugabyteDB and is built on top of a highly customized and optimized verison of [RocksDB](http://rocksdb.org/). {{}} +DocDB is the underlying document storage engine of YugabyteDB and is built on top of a highly customized and optimized verison of [RocksDB](http://rocksdb.org/). {{}} ## Fault domain -A fault domain is a potential point of failure. Examples of fault domains would be nodes, racks, zones, or entire regions. {{}} +A fault domain is a potential point of failure. Examples of fault domains would be nodes, racks, zones, or entire regions. {{}} ## Fault tolerance @@ -59,15 +59,15 @@ The fault tolerance determines how resilient the cluster is to domain (that is, Normally, only the [tablet leader](#tablet-leader) can process user-facing write and read requests. Follower reads allow you to lower read latencies by serving reads from the tablet followers. 
This is similar to reading from a cache, which can provide more read IOPS with low latency. The data might be slightly stale, but is timeline-consistent, meaning no out of order data is possible. -Follower reads are particularly beneficial in applications that can tolerate staleness. For instance, in a social media application where a post gets a million likes continuously, slightly stale reads are acceptable, and immediate updates are not necessary because the absolute number may not really matter to the end-user reading the post. In such cases, a slightly older value from the closest replica can achieve improved performance with lower latency. Follower reads are required when reading from [read replicas](#read-replica-cluster). {{}} +Follower reads are particularly beneficial in applications that can tolerate staleness. For instance, in a social media application where a post gets a million likes continuously, slightly stale reads are acceptable, and immediate updates are not necessary because the absolute number may not really matter to the end-user reading the post. In such cases, a slightly older value from the closest replica can achieve improved performance with lower latency. Follower reads are required when reading from [read replicas](#read-replica-cluster). {{}} ## Hybrid time -Hybrid time/timestamp is a monotonically increasing timestamp derived using [Hybrid Logical clock](../transactions/transactions-overview/#hybrid-logical-clocks). Multiple aspects of YugabyteDB's transaction model are based on hybrid time. {{}} +Hybrid time/timestamp is a monotonically increasing timestamp derived using [Hybrid Logical clock](../transactions/transactions-overview/#hybrid-logical-clocks). Multiple aspects of YugabyteDB's transaction model are based on hybrid time. {{}} ## Isolation levels -[Transaction](#transaction) isolation levels define the degree to which transactions are isolated from each other. Isolation levels determine how changes made by one transaction become visible to other concurrent transactions. {{}} +[Transaction](#transaction) isolation levels define the degree to which transactions are isolated from each other. Isolation levels determine how changes made by one transaction become visible to other concurrent transactions. {{}} {{}} YugabyteDB offers 3 isolation levels - [Serializable](../../explore/transactions/isolation-levels/#serializable-isolation), [Snapshot](../../explore/transactions/isolation-levels/#snapshot-isolation) and [Read committed](../../explore/transactions/isolation-levels/#read-committed-isolation) - in the {{}} API and one isolation level - [Snapshot](../../develop/learn/transactions/acid-transactions-ycql/) - in the {{}} API. @@ -79,11 +79,11 @@ YugabyteDB tries to keep the number of leaders evenly distributed across the [no ## Leader election -Amongst the [tablet](#tablet) replicas, one tablet is elected [leader](#tablet-leader) as per the [Raft](../docdb-replication/raft) protocol. {{}} +Amongst the [tablet](#tablet) replicas, one tablet is elected [leader](#tablet-leader) as per the [Raft](../docdb-replication/raft) protocol. {{}} ## Master server -The [YB-Master](../yb-master/) service is responsible for keeping system metadata, coordinating system-wide operations, such as creating, altering, and dropping tables, as well as initiating maintenance operations such as load balancing. 
{{}} +The [YB-Master](../yb-master/) service is responsible for keeping system metadata, coordinating system-wide operations, such as creating, altering, and dropping tables, as well as initiating maintenance operations such as load balancing. {{}} {{}} The master server is also typically referred as just **master**. @@ -91,7 +91,7 @@ The master server is also typically referred as just **master**. ## MVCC -MVCC stands for Multi-version Concurrency Control. It is a concurrency control method used by YugabyteDB to provide access to data in a way that allows concurrent queries and updates without causing conflicts. {{}} +MVCC stands for Multi-version Concurrency Control. It is a concurrency control method used by YugabyteDB to provide access to data in a way that allows concurrent queries and updates without causing conflicts. {{}} ## Namespace @@ -123,7 +123,7 @@ Designating one region as preferred can reduce the number of network hops needed Regardless of the preferred region setting, data is replicated across all the regions in the cluster to ensure region-level fault tolerance. -You can enable [follower reads](#follower-reads) to serve reads from non-preferred regions. In cases where the cluster has [read replicas](#read-replica-cluster) and a client connects to a read replica, reads are served from the replica; writes continue to be handled by the preferred region. {{}} +You can enable [follower reads](#follower-reads) to serve reads from non-preferred regions. In cases where the cluster has [read replicas](#read-replica-cluster) and a client connects to a read replica, reads are served from the replica; writes continue to be handled by the preferred region. {{}} ## Primary cluster @@ -131,17 +131,17 @@ A primary cluster can perform both writes and reads, unlike a [read replica clus ## Raft -Raft stands for Replication for availability and fault tolerance. This is the algorithm that YugabyteDB uses for replication guaranteeing consistency. {{}} +Raft stands for Replication for availability and fault tolerance. This is the algorithm that YugabyteDB uses for replication guaranteeing consistency. {{}} ## Read replica cluster Read replica clusters are optional clusters that can be set up in conjunction with a [primary cluster](#primary-cluster) to perform only reads; writes sent to read replica clusters get automatically rerouted to the primary cluster of the [universe](#universe). These clusters enable reads in regions that are far away from the primary cluster with timeline-consistent data. This ensures low latency reads for geo-distributed applications. -Data is brought into the read replica clusters through asynchronous replication from the primary cluster. In other words, [nodes](#node) in a read replica cluster act as Raft observers that do not participate in the write path involving the Raft leader and Raft followers present in the primary cluster. Reading from read replicas requires enabling [follower reads](#follower-reads). {{}} +Data is brought into the read replica clusters through asynchronous replication from the primary cluster. In other words, [nodes](#node) in a read replica cluster act as Raft observers that do not participate in the write path involving the Raft leader and Raft followers present in the primary cluster. Reading from read replicas requires enabling [follower reads](#follower-reads). {{}} ## Rebalancing -Rebalancing is the process of keeping an even distribution of tablets across the [nodes](#node) in a cluster. 
{{}} +Rebalancing is the process of keeping an even distribution of tablets across the [nodes](#node) in a cluster. {{}} ## Region @@ -151,24 +151,24 @@ A region refers to a defined geographical area or location where a cloud provide The number of copies of data in a YugabyteDB universe. YugabyteDB replicates data across [fault domains](#fault-domain) (for example, zones) in order to tolerate faults. [Fault tolerance](#fault-tolerance) (FT) and RF are correlated. To achieve a FT of k nodes, the universe has to be configured with a RF of (2k + 1). -The RF should be an odd number to ensure majority consensus can be established during failures. {{}} +The RF should be an odd number to ensure majority consensus can be established during failures. {{}} Each [read replica](#read-replica-cluster) cluster can also have its own replication factor. In this case, the replication factor determines how many copies of your primary data the read replica has; multiple copies ensure the availability of the replica in case of a node outage. Replicas *do not* participate in the primary cluster Raft consensus, and do not affect the fault tolerance of the primary cluster or contribute to failover. ## Sharding -Sharding is the process of mapping a table row to a [tablet](#tablet). YugabyteDB supports 2 types of sharding, Hash and Range. {{}} +Sharding is the process of mapping a table row to a [tablet](#tablet). YugabyteDB supports 2 types of sharding, Hash and Range. {{}} ## Smart driver A smart driver in the context of YugabyteDB is essentially a PostgreSQL driver with additional "smart" features that leverage the distributed nature of YugabyteDB. These smart drivers intelligently distribute application connections across the nodes and regions of a YugabyteDB cluster, eliminating the need for external load balancers. This results in balanced connections that provide lower latencies and prevent hot nodes. For geographically-distributed applications, the driver can seamlessly connect to the geographically nearest regions and availability zones for lower latency. Smart drivers are optimized for use with a distributed SQL database, and are both cluster-aware and topology-aware. They keep track of the members of the cluster as well as their locations. As nodes are added or removed from clusters, the driver updates its membership and topology information. The drivers read the database cluster topology from the metadata table, and route new connections to individual instance endpoints without relying on high-level cluster endpoints. The smart drivers are also capable of load balancing read-only connections across the available YB-TServers. -. {{}} +. {{}} ## Tablet -YugabyteDB splits a table into multiple small pieces called tablets for data distribution. The word "tablet" finds its origins in ancient history, when civilizations utilized flat slabs made of clay or stone as surfaces for writing and maintaining records. {{}} +YugabyteDB splits a table into multiple small pieces called tablets for data distribution. The word "tablet" finds its origins in ancient history, when civilizations utilized flat slabs made of clay or stone as surfaces for writing and maintaining records. {{}} {{}} Tablets are also referred as shards. @@ -184,15 +184,15 @@ In a cluster, each [tablet](#tablet) is replicated as per the [replication facto ## Tablet splitting -When a tablet reaches a threshold size, it splits into 2 new [tablets](#tablet). This is a very quick operation. 
{{}} +When a tablet reaches a threshold size, it splits into 2 new [tablets](#tablet). This is a very quick operation. {{}} ## Transaction -A transaction is a sequence of operations performed as a single logical unit of work. YugabyteDB provides [ACID](#acid) guarantees for transactions. {{}} +A transaction is a sequence of operations performed as a single logical unit of work. YugabyteDB provides [ACID](#acid) guarantees for transactions. {{}} ## TServer -The [YB-TServer](../yb-tserver) service is responsible for maintaining and managing table data in the form of tablets, as well as dealing with all the queries. {{}} +The [YB-TServer](../yb-tserver) service is responsible for maintaining and managing table data in the form of tablets, as well as dealing with all the queries. {{}} ## Universe @@ -204,19 +204,19 @@ Sometimes the terms *universe* and *cluster* are used interchangeably. The two a ## xCluster -xCluster is a type of deployment where data is replicated asynchronously between two [universes](#universe) - a primary and a standby. The standby can be used for disaster recovery. YugabyteDB supports transactional xCluster {{}}. +xCluster is a type of deployment where data is replicated asynchronously between two [universes](#universe) - a primary and a standby. The standby can be used for disaster recovery. YugabyteDB supports transactional xCluster {{}}. ## YCQL -Semi-relational SQL API that is best fit for internet-scale OLTP and HTAP apps needing massive write scalability as well as blazing-fast queries. It supports distributed transactions, strongly consistent secondary indexes, and a native JSON column type. YCQL has its roots in the Cassandra Query Language. {{}} +Semi-relational SQL API that is best fit for internet-scale OLTP and HTAP apps needing massive write scalability as well as blazing-fast queries. It supports distributed transactions, strongly consistent secondary indexes, and a native JSON column type. YCQL has its roots in the Cassandra Query Language. {{}} ## YQL -The YugabyteDB Query Layer (YQL) is the primary layer that provides interfaces for applications to interact with using client drivers. This layer deals with the API-specific aspects such as query/command compilation and the run-time (data type representations, built-in operations, and more). {{}} +The YugabyteDB Query Layer (YQL) is the primary layer that provides interfaces for applications to interact with using client drivers. This layer deals with the API-specific aspects such as query/command compilation and the run-time (data type representations, built-in operations, and more). {{}} ## YSQL -Fully-relational SQL API that is wire compatible with the SQL language in PostgreSQL. It is best fit for RDBMS workloads that need horizontal write scalability and global data distribution while also using relational modeling features such as JOINs, distributed transactions, and referential integrity (such as foreign keys). Note that YSQL reuses the native query layer of the PostgreSQL open source project. {{}} +Fully-relational SQL API that is wire compatible with the SQL language in PostgreSQL. It is best fit for RDBMS workloads that need horizontal write scalability and global data distribution while also using relational modeling features such as JOINs, distributed transactions, and referential integrity (such as foreign keys). Note that YSQL reuses the native query layer of the PostgreSQL open source project. 
{{}} ## Zone diff --git a/docs/content/preview/contribute/docs/widgets-and-shortcodes.md b/docs/content/preview/contribute/docs/widgets-and-shortcodes.md index 2399da8d9e28..3a44001391f7 100644 --- a/docs/content/preview/contribute/docs/widgets-and-shortcodes.md +++ b/docs/content/preview/contribute/docs/widgets-and-shortcodes.md @@ -84,9 +84,9 @@ This is a warning with a [link](https://www.yugabyte.com). You can add a link to a url with an icon using the `link` shortcode which takes url as a string argument. Internal and external links will have different icons. You can use the `:version` variable to expand to all versions. -- {{}} : _External link_ `{{}}` -- {{}} : _Relative internal link_ `{{}}` -- {{}} : _Full path internal link_ `{{}}` +- {{}} : _External link_ `{{}}` +- {{}} : _Relative internal link_ `{{}}` +- {{}} : _Full path internal link_ `{{}}` ## Tables diff --git a/docs/content/preview/faq/general.md b/docs/content/preview/faq/general.md index 77d1f803e9ba..2c405c8b3b53 100644 --- a/docs/content/preview/faq/general.md +++ b/docs/content/preview/faq/general.md @@ -19,310 +19,193 @@ rightNav: hideH4: true --- -### Contents - -##### YugabyteDB - -- [What is YugabyteDB?](#what-is-yugabytedb) -- [What makes YugabyteDB unique?](#what-makes-yugabytedb-unique) -- [How many major releases YugabyteDB has had so far?](#how-many-major-releases-yugabytedb-has-had-so-far) -- [Is YugabyteDB open source?](#is-yugabytedb-open-source) -- [Can I deploy YugabyteDB to production?](#can-i-deploy-yugabytedb-to-production) -- [Which companies are currently using YugabyteDB in production?](#which-companies-are-currently-using-yugabytedb-in-production) -- [What is the definition of the "Beta" feature tag?](#what-is-the-definition-of-the-beta-feature-tag) -- [How do YugabyteDB, YugabyteDB Anywhere, and YugabyteDB Aeon differ from each other?](#how-do-yugabytedb-yugabytedb-anywhere-and-yugabytedb-aeon-differ-from-each-other) -- [How do I report a security vulnerability?](#how-do-i-report-a-security-vulnerability) - -##### Evaluating YugabyteDB - -- [What are the trade-offs involved in using YugabyteDB?](#what-are-the-trade-offs-involved-in-using-yugabytedb) -- [When is YugabyteDB a good fit?](#when-is-yugabytedb-a-good-fit) -- [When is YugabyteDB not a good fit?](#when-is-yugabytedb-not-a-good-fit) -- [Any performance benchmarks available?](#any-performance-benchmarks-available) -- [What about correctness testing?](#what-about-correctness-testing) -- [How does YugabyteDB compare to other SQL and NoSQL databases?](#how-does-yugabytedb-compare-to-other-sql-and-nosql-databases) - -##### Architecture - -- [How does YugabyteDB's common document store work?](#how-does-yugabytedb-s-common-document-store-work) -- [How can YugabyteDB be both CP and ensure high availability at the same time?](#how-can-yugabytedb-be-both-cp-and-ensure-high-availability-at-the-same-time) -- [Why is a group of YugabyteDB nodes called a universe instead of the more commonly used term clusters?](#why-is-a-group-of-yugabytedb-nodes-called-a-universe-instead-of-the-more-commonly-used-term-clusters) -- [Why is consistent hash sharding the default sharding strategy?](#why-is-consistent-hash-sharding-the-default-sharding-strategy) - ## YugabyteDB -### What is YugabyteDB? - - - -YugabyteDB is a high-performance distributed SQL database for powering global, internet-scale applications. 
Built using a unique combination of high-performance document store, per-shard distributed consensus replication and multi-shard ACID transactions (inspired by Google Spanner), YugabyteDB serves both scale-out RDBMS and internet-scale OLTP workloads with low query latency, extreme resilience against failures and global data distribution. As a cloud native database, it can be deployed across public and private clouds as well as in Kubernetes environments with ease. - -YugabyteDB is developed and distributed as an [Apache 2.0 open source project](https://github.com/yugabyte/yugabyte-db/). - -### What makes YugabyteDB unique? - -YugabyteDB is a transactional database that brings together four must-have needs of cloud native apps - namely SQL as a flexible query language, low-latency performance, continuous availability, and globally-distributed scalability. Other databases do not serve all 4 of these needs simultaneously. - -- Monolithic SQL databases offer SQL and low-latency reads, but neither have the ability to tolerate failures, nor can they scale writes across multiple nodes, zones, regions, and clouds. +### What is YugabyteDB -- Distributed NoSQL databases offer read performance, high availability, and write scalability, but give up on SQL features such as relational data modeling and ACID transactions. +YugabyteDB is a high-performant, highly available and scalable distributed SQL database designed for powering global, internet-scale applications. It is fully compatible with [PostgreSQL](https://www.postgresql.org/) and provides strong [ACID](/preview/architecture/key-concepts/#acid) guarantees for distributed transactions. It can be deployed in a single region, multi-region, and multi-cloud setups. -YugabyteDB feature highlights are listed below. +{{}} -#### SQL and ACID transactions +### What makes YugabyteDB unique -- SQL [JOINs](../../quick-start/explore/ysql/#join) and [distributed transactions](../../explore/transactions/distributed-transactions-ysql/) that allow multi-row access across any number of shards at any scale. +YugabyteDB stands out as a unique database solution due to its combination of features that bring together the strengths of both traditional SQL databases and modern NoSQL systems. It is [horizontally scalable](/preview/explore/linear-scalability/), supports global geo-distribution, supports [SQL (YSQL)](/preview/explore/ysql-language-features/sql-feature-support/) and [NoSQL (YCQL)](/preview/explore/ycql-language/) APIs, is [highly performant](/preview/benchmark/) and gurantees strong transactional consistency. -- Transactional [document store](../../architecture/docdb/) backed by self-healing, strongly-consistent, synchronous [replication](../../architecture/docdb-replication/replication/). +{{}} -#### High performance and massive scalability - -- Low latency for geo-distributed applications with multiple [read consistency levels](../../architecture/docdb-replication/replication/#follower-reads) and [read replicas](../../architecture/docdb-replication/read-replicas/). - -- Linearly scalable throughput for ingesting and serving ever-growing datasets. - -#### Global data consistency - -- [Global data distribution](../../explore/multi-region-deployments/) that brings consistent data close to users through multi-region and multi-cloud deployments. Optional two-region multi-master and master-follower configurations powered by CDC-driven asynchronous replication. 
- -- [Auto-sharding and auto-rebalancing](../../architecture/docdb-sharding/sharding/) to ensure uniform load across all nodes even for very large clusters. - -#### Cloud native - -- Built for the container era with [highly elastic scaling](../../explore/linear-scalability/) and infrastructure portability, including [Kubernetes-driven orchestration](../../quick-start/kubernetes/). +### Is YugabyteDB open source? -- [Self-healing database](../../explore/fault-tolerance/) that automatically tolerates any failures common in the inherently unreliable modern cloud infrastructure. +YugabyteDB is 100% open source. It is licensed under Apache 2.0. -#### Open source +{{}} -- Fully functional distributed database available under [Apache 2.0 open source license](https://github.com/yugabyte/yugabyte-db/). +### How many major releases YugabyteDB has had so far? -#### Built-in enterprise features +YugabyteDB released its first beta, [v0.9](https://www.yugabyte.com/blog/yugabyte-has-arrived/) in November 2017. Since then, several stable and preview versions have been released. The current stable version is {{}}, and the current preview version is {{}}. -- Starting in [v1.3](https://www.yugabyte.com/blog/announcing-yugabyte-db-v1-3-with-enterprise-features-as-open-source/), YugabyteDB is the only open-source distributed SQL database to have built-in enterprise features such as Distributed Backups, Data Encryption, and Read Replicas. New features such as [Change Data Capture (CDC)](../../architecture/docdb-replication/change-data-capture/) and [2 Data Center Deployments](../../architecture/docdb-replication/async-replication/) are also included in open source. +{{}} -### How many major releases YugabyteDB has had so far? +### What is the difference between preview and stable versions? -YugabyteDB has had the following major (stable) releases: - -- [v2.20](https://www.yugabyte.com/blog/release-220-announcement/) in November 2023 -- [v2.18](https://www.yugabyte.com/blog/release-218-announcement/) in May 2023 -- [v2.16](https://www.yugabyte.com/blog/yugabytedb-216/) in December 2022 -- [v2.14](https://www.yugabyte.com/blog/announcing-yugabytedb-2-14-higher-performance-and-security/) in July 2022. -- [v2.12](https://www.yugabyte.com/blog/announcing-yugabytedb-2-12/) in February 2022. (There was no v2.10 release.) -- [v2.8](https://www.yugabyte.com/blog/announcing-yugabytedb-2-8/) in November 2021. -- [v2.6](https://www.yugabyte.com/blog/announcing-yugabytedb-2-6/) in July 2021. -- [v2.4](https://www.yugabyte.com/blog/announcing-yugabytedb-2-4/) in January 2021. -- [v2.2](https://www.yugabyte.com/blog/announcing-yugabytedb-2-2-distributed-sql-made-easy/) in July 2020. -- [v2.1](https://www.yugabyte.com/blog/yugabytedb-2-1-is-ga-scaling-new-heights-with-distributed-sql/) in February 2020. -- [v2.0](https://www.yugabyte.com/blog/announcing-yugabyte-db-2-0-ga:-jepsen-tested,-high-performance-distributed-sql/) in September 2019. -- [v1.3](https://www.yugabyte.com/blog/announcing-yugabyte-db-v1-3-with-enterprise-features-as-open-source/) in July 2019. -- [v1.2](https://www.yugabyte.com/blog/announcing-yugabyte-db-1-2-company-update-jepsen-distributed-sql/) in March 2019. -- [v1.1](https://www.yugabyte.com/blog/announcing-yugabyte-db-1-1-and-company-update/) in September 2018. -- [v1.0](https://www.yugabyte.com/blog/announcing-yugabyte-db-1-0) in May 2018. -- [v0.9 Beta](https://www.yugabyte.com/blog/yugabyte-has-arrived/) in November 2017. 
- -Releases, including upcoming releases, are outlined on the [Releases Overview](/preview/releases/) page. The roadmap for this release can be found on [GitHub](https://github.com/yugabyte/yugabyte-db#whats-being-worked-on). +Preview releases include features under active development and are recommended for development and testing only. Stable releases undergo rigorous testing for a longer period of time and are ready for production use. -### Is YugabyteDB open source? +{{}} -Starting with [v1.3](https://www.yugabyte.com/blog/announcing-yugabyte-db-v1-3-with-enterprise-features-as-open-source/), YugabyteDB is 100% open source. It is licensed under Apache 2.0 and the source is available on [GitHub](https://github.com/yugabyte/yugabyte-db). +### What are the upcoming features? -### Can I deploy YugabyteDB to production? +The roadmap for upcoming releases and the list of recently released features can be found in the [yugabyte-db](https://github.com/yugabyte/yugabyte-db) repository on GitHub. -Yes, both YugabyteDB APIs are production ready. [YCQL](https://www.yugabyte.com/blog/yugabyte-db-1-0-a-peek-under-the-hood/) achieved this status starting with v1.0 in May 2018 while [YSQL](https://www.yugabyte.com/blog/announcing-yugabyte-db-2-0-ga:-jepsen-tested,-high-performance-distributed-sql/) became production ready starting v2.0 in September 2019. +{{}} ### Which companies are currently using YugabyteDB in production? -Reference deployments are listed in [Success Stories](https://www.yugabyte.com/success-stories/). - -### What is the definition of the "Beta" feature tag? - -Some features are marked Beta in every release. Following are the points to consider: +Global organizations of all sizes leverage YugabyteDB to fulfill their application requirements. -- Code is well tested. Enabling the feature is considered safe. Some of these features enabled by default. +{{}} -- Support for the overall feature will not be dropped, though details may change in incompatible ways in a subsequent beta or GA release. +### How do I report a security vulnerability? -- Recommended only for non-production use. +Follow the steps in the [vulnerability disclosure policy](../../secure/vulnerability-disclosure-policy) to report a vulnerability to our security team. The policy outlines our commitments to you when you disclose a potential vulnerability, the reporting process, and how we will respond. -Please do try our beta features and give feedback on them on our [Slack community]({{}}) or by filing a [GitHub issue](https://github.com/yugabyte/yugabyte-db/issues). +### What are YugabyteDB Anywhere and YugabyteDB Aeon? -### How do YugabyteDB, YugabyteDB Anywhere, and YugabyteDB Aeon differ from each other? +**[YugabyteDB](../../)** is the 100% open source core database. It is the best choice for startup organizations with strong technical operations expertise looking to deploy to production with traditional DevOps tools. -[YugabyteDB](../../) is the 100% open source core database. It is the best choice for the startup organizations with strong technical operations expertise looking to deploy to production with traditional DevOps tools. +**[YugabyteDB Anywhere](../../yugabyte-platform/)** is commercial software for running a self-managed YugabyteDB-as-a-Service. It has built-in cloud native operations, enterprise-grade deployment options, and world-class support. -[YugabyteDB Anywhere](../../yugabyte-platform/) is commercial software for running a self-managed YugabyteDB-as-a-Service. 
It has built-in cloud native operations, enterprise-grade deployment options, and world-class support. It is the simplest way to run YugabyteDB in mission-critical production environments with one or more regions (across both public cloud and on-premises data centers). +**[YugabyteDB Aeon](../../yugabyte-cloud/)** is a fully-managed cloud service on Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). [Sign up](https://cloud.yugabyte.com/) to get started. -[YugabyteDB Aeon](../../yugabyte-cloud/) is Yugabyte's fully-managed cloud service on Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). [Sign up](https://cloud.yugabyte.com/) to get started. +{{}} -For a more detailed comparison between the above, see [Compare Deployment Options](https://www.yugabyte.com/compare-products/). +### When is YugabyteDB a good fit? -### How do I report a security vulnerability? +YugabyteDB is a good fit for fast-growing, cloud-native applications that need to serve business-critical data reliably, with zero data loss, high availability, and low latency. Common use cases include: -Please follow the steps in the [vulnerability disclosure policy](../../secure/vulnerability-disclosure-policy) to report a vulnerability to our security team. The policy outlines our commitments to you when you disclose a potential vulnerability, the reporting process, and how we will respond. +- Distributed Online Transaction Processing (OLTP) applications needing multi-region scalability without compromising strong consistency and low latency. For example, user identity, Retail product catalog, Financial data service. -## Evaluating YugabyteDB +- Hybrid Transactional/Analytical Processing (HTAP) (also known as Translytical) applications needing real-time analytics on transactional data. For example, user personalization, fraud detection, machine learning. -### What are the trade-offs involved in using YugabyteDB? +- Streaming applications needing to efficiently ingest, analyze, and store ever-growing data. For example, IoT sensor analytics, time series metrics, real-time monitoring. -Trade-offs depend on the type of database used as baseline for comparison. +### When is YugabyteDB not a good fit? -#### Distributed SQL +YugabyteDB is not a good fit for traditional Online Analytical Processing (OLAP) use cases that need complete ad-hoc analytics. Use an OLAP store such as [Druid](http://druid.io/druid.html) or a data warehouse such as [Snowflake](https://www.snowflake.net/). -Examples: Amazon Aurora, Google Cloud Spanner, CockroachDB, TiDB +### What are the trade-offs of using YugabyteDB? -**Benefits of YugabyteDB** +Ensuring [ACID](../../architecture/key-concepts/#acid) transactions and full compatibility with the PostgreSQL API presents challenges in a distributed environment. The trade-offs can also vary depending on the database you're comparing it to. Here are a few key considerations when switching to YugabyteDB: -- Low-latency reads and high-throughput writes. -- Cloud-neutral deployments with a Kubernetes-native database. -- 100% Apache 2.0 open source even for enterprise features. +- **Consistency vs. Latency**: YugabyteDB uses the [Raft](../../architecture/docdb-replication/raft) consensus algorithm for strong consistency in distributed systems. While this guarantees data integrity, it can result in higher write latency compared to eventually consistent databases like Cassandra. 
-**Trade-offs** +- **Increased Query Latency**: Transactions and JOINs that span multiple nodes experience inter-node latency, making queries slower than in single-node databases like PostgreSQL. -- None + {{}} + [Many projects](https://github.com/yugabyte/yugabyte-db?tab=readme-ov-file#current-roadmap) are currently in progress to match the performance of a single-node database. + {{}} -Learn more: [What is Distributed SQL?](https://www.yugabyte.com/tech/distributed-sql/) +- **Cross-Region Latency**: In multi-region or globally distributed setups, YugabyteDB replicates data across regions to ensure availability and resilience. However, this can lead to higher write latency due to cross-region coordination. -#### Monolithic SQL +- **Resource Requirements**: Being a distributed database, YugabyteDB demands more hardware and networking resources to maintain high availability and fault tolerance compared to traditional monolithic databases that run on a single machine. -Examples: PostgreSQL, MySQL, Oracle, Amazon Aurora. +- **PostgreSQL Feature Support**: Every new PostgreSQL feature must be optimized for distributed environments, which is not a simple task. Be sure to verify that the PostgreSQL features your application relies on are supported in the current version of YugabyteDB. -**Benefits of YugabyteDB** +{{}} -- Scale write throughput linearly across multiple nodes and/or geographic regions. -- Automatic failover and native repair. -- 100% Apache 2.0 open source even for enterprise features. +### What is a YugabyteDB universe -**Trade-offs** +A YugabyteDB [universe](/preview/architecture/key-concepts/#universe) comprises one [primary cluster](/preview/architecture/key-concepts/#primary-cluster) and zero or more [read replica clusters](/preview/architecture/key-concepts/#read-replica-cluster) that collectively function as a resilient and scalable distributed database. It is common to have just a primary cluster and hence the terms cluster and universe are sometimes used interchangeably but it is worthwhile to note that they are different. -- Transactions and JOINs can now span multiple nodes, thereby increasing latency. +{{}} -Learn more: [Distributed PostgreSQL on a Google Spanner Architecture – Query Layer](https://www.yugabyte.com/blog/distributed-postgresql-on-a-google-spanner-architecture-query-layer/) +### Are there any performance benchmarks available? -#### Traditional NewSQL +YugabyteDB is regularly benchmarked using a variety of standard benchmarks like [TPC-C](/preview/benchmark/tpcc/), [YCSB](/preview/benchmark/ycsb-ysql/), and [sysbench](/preview/benchmark/sysbench-ysql/). -Examples: Vitess, Citus +{{}} -**Benefits of YugabyteDB** +### How is YugabyteDB tested for correctness? -- Distributed transactions across any number of nodes. -- No single point of failure given all nodes are equal. -- 100% Apache 2.0 open source even for enterprise features. +Apart from the rigorous failure testing, YugabyteDB passes most of the scenarios in [Jepsen](https://jepsen.io/) testing. Jepsen is a methodology and toolset used to verify the correctness of distributed systems, particularly in the context of consistency models and fault tolerance and has become a standard for stress-testing distributed databases, data stores, and other distributed systems. -**Trade-offs** +{{}} -- None +### How does YugabyteDB compare to other databases? 
-Learn more: [Rise of Globally Distributed SQL Databases – Redefining Transactional Stores for Cloud Native Era](https://www.yugabyte.com/blog/rise-of-globally-distributed-sql-databases-redefining-transactional-stores-for-cloud-native-era/) +We have published detailed comparison information against multiple SQL and NoSQL databases: -#### Transactional NoSQL +- **SQL** - [CockroachDB](../comparisons/cockroachdb/), [TiDB](../comparisons/tidb/), [Vitess](../comparisons/vitess/), [Amazon Aurora](../comparisons/amazon-aurora/), [Google Spanner](../comparisons/google-spanner/) +- **NOSQL** - [MongoDB](../comparisons/mongodb/), [FoundationDB](../comparisons/foundationdb/), [Cassandra](../comparisons/cassandra/), [DynamoDB](../comparisons/amazon-dynamodb/), [CosmosDB](../comparisons/azure-cosmos/) -Examples: MongoDB, Amazon DynamoDB, FoundationDB, Azure Cosmos DB. +{{}} -**Benefits of YugabyteDB** +## PostgreSQL support -- Flexibility of SQL as query needs change in response to business changes. -- Distributed transactions across any number of nodes. -- Low latency, strongly consistent reads given that read-time quorum is avoided altogether. -- 100% Apache 2.0 open source even for enterprise features. +### How compatible is YugabyteDB with PostgreSQL? -**Trade-offs** +YugabyteDB is [wire-protocol, syntax, feature, and runtime](https://www.yugabyte.com/postgresql/postgresql-compatibility/) compatible with PostgreSQL. But that said, supporting all PostgreSQL features in a distributed system is not always feasible. -- None +{{}} -Learn more: [Why are NoSQL Databases Becoming Transactional?](https://www.yugabyte.com/blog/nosql-databases-becoming-transactional-mongodb-dynamodb-faunadb-cosmosdb/) +### Can I use my existing PostgreSQL tools and drivers with YugabyteDB? -#### Eventually Consistent NoSQL +Yes. YugabyteDB is [fully compatible](#how-compatible-is-yugabytedb-with-postgresql) with PostgreSQL and automatically works well with most of PostgreSQL tools. -Examples: Apache Cassandra, Couchbase. +{{}} -**Benefits of YugabyteDB** +### Are PostgreSQL extensions supported? -- Flexibility of SQL as query needs change in response to business changes. -- Strongly consistent, zero data loss writes. -- Strongly consistent as well as timeline-consistent reads without resorting to eventual consistency-related penalties such as read repairs and anti-entropy. -- 100% Apache 2.0 open source even for enterprise features. +YugabyteDB pre-bundles many popular extensions and these should be readily available on your cluster. But given the distributed nature of YugabyteDB, not all extensions are supported by default. -**Trade-offs** +{{}} -- Extremely short unavailability during the leader election time for all shard leaders lost during a node failure or network partition. +### How can I migrate from PostgreSQL? -Learn more: [Apache Cassandra: The Truth Behind Tunable Consistency, Lightweight Transactions & Secondary Indexes](https://www.yugabyte.com/blog/apache-cassandra-lightweight-transactions-secondary-indexes-tunable-consistency/) +YugabyteDB is fully compatible with PostgreSQL and so most PostgreSQL applications should work as is. To address corner cases, we have published a [comprehensive guide](/preview/manage/data-migration/migrate-from-postgres/) to help you migrate from PostgreSQL. -### When is YugabyteDB a good fit? +{{}} -YugabyteDB is a good fit for fast-growing, cloud native applications that need to serve business-critical data reliably, with zero data loss, high availability, and low latency. 
Common use cases include: +## Architecture -- Distributed Online Transaction Processing (OLTP) applications needing multi-region scalability without compromising strong consistency and low latency. For example, user identity, Retail product catalog, Financial data service. +### How does YugabyteDB distribute data? -- Hybrid Transactional/Analytical Processing (HTAP), also known as Translytical, applications needing real-time analytics on transactional data. For example, user personalization, fraud detection, machine learning. +The table data is split into [tablets](/preview/architecture/key-concepts/#tablet) and the table rows are mapped to the tablets via [sharding](/preview/explore/linear-scalability/data-distribution/). The tablets themselves are distributed across the various nodes in the cluster. -- Streaming applications needing to efficiently ingest, analyze, and store ever-growing data. For example, IoT sensor analytics, time series metrics, real-time monitoring. +{{}} -See some success stories at [yugabyte.com](https://www.yugabyte.com/success-stories/). +### How does YugabyteDB scale? -### When is YugabyteDB not a good fit? +YugabyteDB scales seamlessly when new nodes are added to the cluster without any service disruption. Table data is [stored distributed](#how-does-yugabytedb-distribute-data) in tablets. When new nodes are added, the rebalancer moves certain tablets to other nodes and keeps the number of tablets on each node more or less the same. As data grows, these tablets also split into two and are moved to other nodes. -YugabyteDB is not a good fit for traditional Online Analytical Processing (OLAP) use cases that need complete ad-hoc analytics. Use an OLAP store such as [Druid](http://druid.io/druid.html) or a data warehouse such as [Snowflake](https://www.snowflake.net/). +{{}} -### Any performance benchmarks available? +### How does YugabyteDB provide high availability? -[Yahoo Cloud Serving Benchmark (YCSB)](https://github.com/brianfrankcooper/YCSB/wiki) is a popular benchmarking framework for NoSQL databases. We benchmarked the Yugabyte Cloud QL (YCQL) API against standard Apache Cassandra using YCSB. YugabyteDB outperformed Apache Cassandra by increasing margins as the number of keys (data density) increased across all the 6 YCSB workload configurations. +YugabyteDB replicates [tablet](/preview/architecture/key-concepts/#tablet) data onto [followers](/preview/architecture/key-concepts/#tablet-follower) of the tablet via [RAFT](/preview/architecture/docdb-replication/raft/) consensus. This ensures that a consistent copy of the data is available in case of failures. On failures, one of the tablet followers is promoted to be the [leader](/preview/architecture/key-concepts/#tablet-leader). -[Netflix Data Benchmark (NDBench)](https://github.com/Netflix/ndbench) is another publicly available, cloud-enabled benchmark tool for data store systems. We ran NDBench against YugabyteDB for 7 days and observed P99 and P995 latencies that were orders of magnitude less than that of Apache Cassandra. +{{}} -Details for both the above benchmarks are published in [Building a Strongly Consistent Cassandra with Better Performance](https://www.yugabyte.com/blog/building-a-strongly-consistent-cassandra-with-better-performance-aa96b1ab51d6). +### How is data consistency maintained across multiple nodes? -### What about correctness testing? 
+Every write (insert, update, delete) to the data is replicated via [RAFT](/preview/architecture/docdb-replication/raft/) consensus to [tablet followers](/preview/architecture/key-concepts/#tablet-follower) as per the [replication factor (RF)](/preview/architecture/key-concepts/#replication-factor-rf) of the cluster. Before acknowledging the write operation back to the client, YugabyteDB ensures that the data is replicated to a quorum (RF/2 + 1) of followers. -[Jepsen](https://jepsen.io/) is a widely used framework to evaluate the behavior of databases under different failure scenarios. It allows for a database to be run across multiple nodes, and create artificial failure scenarios, as well as verify the correctness of the system under these scenarios. YugabyteDB 1.2 passes [formal Jepsen testing](https://www.yugabyte.com/blog/yugabyte-db-1-2-passes-jepsen-testing/). +{{}} -### How does YugabyteDB compare to other SQL and NoSQL databases? +### What is tablet splitting? -See [Compare YugabyteDB to other databases](../comparisons/) +Data is stored in [tablets](/preview/architecture/key-concepts/#tablet). As the tablet grows, the tablet splits into two. This enables some data to be moved to other nodes in the cluster. -- [Amazon Aurora](../comparisons/amazon-aurora/) -- [Google Cloud Spanner](../comparisons/google-spanner/) -- [MongoDB](../comparisons/mongodb/) -- [CockroachDB](../comparisons/cockroachdb/) +{{}} -## Architecture +### Are indexes colocated with tables? -### How does YugabyteDB's common document store work? +Indexes are not typically colocated with the base table. The sharding of indexes is based on the primary key of the index and is independent of how the main table is sharded/distributed which is based on the primary key of the table. -[DocDB](../../architecture/docdb/), YugabyteDB's distributed document store is common across all APIs, and built using a custom integration of Raft replication, distributed ACID transactions, and the RocksDB storage engine. Specifically, DocDB enhances RocksDB by transforming it from a key-value store (with only primitive data types) to a document store (with complex data types). **Every key is stored as a separate document in DocDB, irrespective of the API responsible for managing the key.** DocDB's [sharding](../../architecture/docdb-sharding/sharding/), [replication/fault-tolerance](../../architecture/docdb-replication/replication/), and [distributed ACID transactions](../../architecture/transactions/distributed-txns/) architecture are all based on the [Google Spanner design](https://research.google.com/archive/spanner-osdi2012.pdf) first published in 2012. [How We Built a High Performance Document Store on RocksDB?](https://www.yugabyte.com/blog/how-we-built-a-high-performance-document-store-on-rocksdb/) provides an in-depth look into DocDB. +{{}} +
+{{}} ### How can YugabyteDB be both CP and ensure high availability at the same time? In terms of the [CAP theorem](https://www.yugabyte.com/blog/a-for-apple-b-for-ball-c-for-cap-theorem-8e9b78600e6d), YugabyteDB is a consistent and partition-tolerant (CP) database. It ensures high availability (HA) for most practical situations even while remaining strongly consistent. While this may seem to be a violation of the CAP theorem, that is not the case. CAP treats availability as a binary option whereas YugabyteDB treats availability as a percentage that can be tuned to achieve high write availability (reads are always available as long as a single node is available). -- During network partitions or node failures, the replicas of the impacted tablets (whose leaders got partitioned out or lost) form two groups: a majority partition that can still establish a Raft consensus and a minority partition that cannot establish such a consensus (given the lack of quorum). The replicas in the majority partition elect a new leader among themselves in a matter of seconds and are ready to accept new writes after the leader election completes. For these few seconds till the new leader is elected, the DB is unable to accept new writes given the design choice of prioritizing consistency over availability. All the leader replicas in the minority partition lose their leadership during these few seconds and hence become followers. - -- Majority partitions are available for both reads and writes. Minority partitions are not available for writes, but may serve stale reads (up to a staleness as configured by the [--max_stale_read_bound_time_ms](../../reference/configuration/yb-tserver/#max-stale-read-bound-time-ms) flag). **Multi-active availability** refers to YugabyteDB's ability to dynamically adjust to the state of the cluster and serve consistent writes at any replica in the majority partition. - -- The approach above obviates the need for any unpredictable background anti-entropy operations as well as need to establish quorum at read time. As shown in the [YCSB benchmarks against Apache Cassandra](https://forum.yugabyte.com/t/ycsb-benchmark-results-for-yugabyte-and-apache-cassandra-again-with-p99-latencies/99), YugabyteDB delivers predictable p99 latencies as well as 3x read throughput that is also timeline-consistent (given no quorum is needed at read time). - -On one hand, the YugabyteDB storage and replication architecture is similar to that of [Google Cloud Spanner](https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Spanner-and-the-CAP-Theorem.html), which is also a CP database with high write availability. While Google Cloud Spanner leverages Google's proprietary network infrastructure, YugabyteDB is designed work on commodity infrastructure used by most enterprise users. On the other hand, YugabyteDB's multi-model, multi-API, and tunable read latency approach is similar to that of [Azure Cosmos DB](https://azure.microsoft.com/en-us/blog/a-technical-overview-of-azure-cosmos-db/). - -A post on our blog titled [Practical Tradeoffs in Google Cloud Spanner, Azure Cosmos DB and YugabyteDB](https://www.yugabyte.com/blog/practical-tradeoffs-in-google-cloud-spanner-azure-cosmos-db-and-yugabyte-db/) goes through the above tradeoffs in more detail. - -### Why is a group of YugabyteDB nodes called a universe instead of the more commonly used term clusters? - -A YugabyteDB universe packs a lot more functionality than what people think of when referring to a cluster. 
In fact, in certain deployment choices, the universe subsumes the equivalent of multiple clusters and some of the operational work needed to run them. Here are just a few concrete differences, which made us feel like giving it a different name would help earmark the differences and avoid confusion: - -- A YugabyteDB universe can move into new machines, availability zones (AZs), regions, and data centers in an online fashion, while these primitives are not associated with a traditional cluster. - -- You can set up multiple asynchronous replicas with just a few clicks (using YugabyteDB Anywhere). This is built into the universe as a first-class operation with bootstrapping of the remote replica and all the operational aspects of running asynchronous replicas being supported natively. In the case of traditional clusters, the source and the asynchronous replicas are independent clusters. The user is responsible for maintaining these separate clusters as well as operating the replication logic. - -- Failover to asynchronous replicas as the primary data and fallback once the original is up and running are both natively supported in a universe. - -### Why is consistent hash sharding the default sharding strategy? - -Users primarily turn to YugabyteDB for scalability reasons. Consistent hash sharding is ideal for massively scalable workloads because it distributes data evenly across all the nodes in the cluster, while retaining ease of adding nodes into the cluster. Most use cases that require scalability do not need to perform range lookups on the primary key, so consistent hash sharding is the default sharding strategy for YugabyteDB. Common applications that do not need hash sharding include user identity (user IDs do not need ordering), product catalog (product IDs are not related to one another), and stock ticker data (one stock symbol is independent of all other stock symbols). For applications that benefit from range sharding, YugabyteDB lets you select that option. - -To learn more about sharding strategies and lessons learned, see [Four Data Sharding Strategies We Analyzed in Building a Distributed SQL Database](https://www.yugabyte.com/blog/four-data-sharding-strategies-we-analyzed-in-building-a-distributed-sql-database/). +{{}} diff --git a/docs/content/preview/manage/data-migration/migrate-from-postgres.md b/docs/content/preview/manage/data-migration/migrate-from-postgres.md index 92f2f1fb69e9..d221187827dd 100644 --- a/docs/content/preview/manage/data-migration/migrate-from-postgres.md +++ b/docs/content/preview/manage/data-migration/migrate-from-postgres.md @@ -304,13 +304,13 @@ To learn more about the various useful metrics that can be monitored, see [Metri Because of the distributed nature of YugabyteDB, queries are executed quite differently from Postgres. This is because the latency across the nodes are taken into account by the query planner. Adopting the following practices will help improve the performance of your applications. -- **Single-row transactions**: YugabyteDB has optimizations to improve the performance of transactions in certain scenarios where transactions operate on a single row. Consider converting multi-statement transactions to single-statement ones to improve performace. {{}} +- **Single-row transactions**: YugabyteDB has optimizations to improve the performance of transactions in certain scenarios where transactions operate on a single row. Consider converting multi-statement transactions to single-statement ones to improve performace. 
diff --git a/docs/content/preview/manage/data-migration/migrate-from-postgres.md b/docs/content/preview/manage/data-migration/migrate-from-postgres.md index 92f2f1fb69e9..d221187827dd 100644 --- a/docs/content/preview/manage/data-migration/migrate-from-postgres.md +++ b/docs/content/preview/manage/data-migration/migrate-from-postgres.md @@ -304,13 +304,13 @@ To learn more about the various useful metrics that can be monitored, see [Metri Because of the distributed nature of YugabyteDB, queries are executed quite differently from Postgres. This is because the latency across the nodes is taken into account by the query planner. Adopting the following practices will help improve the performance of your applications. -- **Single-row transactions**: YugabyteDB has optimizations to improve the performance of transactions in certain scenarios where transactions operate on a single row. Consider converting multi-statement transactions to single-statement ones to improve performance. {{}} +- **Single-row transactions**: YugabyteDB has optimizations to improve the performance of transactions in certain scenarios where transactions operate on a single row. Consider converting multi-statement transactions to single-statement ones to improve performance. {{}} -- **Use On Conflict clause**: Use the optional ON CONFLICT clause in the INSERT statement to circumvent certain errors and avoid multiple round trips. {{}} +- **Use On Conflict clause**: Use the optional ON CONFLICT clause in the INSERT statement to circumvent certain errors and avoid multiple round trips. {{}} -- **Set statement timeouts**: Avoid getting stuck in a wait loop because of starvation by using a reasonable timeout for the statements. {{}} +- **Set statement timeouts**: Avoid getting stuck in a wait loop because of starvation by using a reasonable timeout for the statements. {{}} -- **Stored procedures**: Use stored procedures to bundle a set of statements with error handling to be executed on the server and avoid multiple round trips. {{}} +- **Stored procedures**: Use stored procedures to bundle a set of statements with error handling to be executed on the server and avoid multiple round trips. {{}} {{}} For a full list of best practices to improve performance, see [Performance tuning in YSQL](../../../develop/learn/transactions/transactions-performance-ysql/) diff --git a/docs/content/stable/architecture/design-goals.md b/docs/content/stable/architecture/design-goals.md index 777b73cf809f..86e2bc3fd406 100644 --- a/docs/content/stable/architecture/design-goals.md +++ b/docs/content/stable/architecture/design-goals.md @@ -15,15 +15,15 @@ type: docs ## Scalability -YugabyteDB scales out horizontally by adding more nodes to handle increasing data volumes and higher workloads. With YugabyteDB, you can also opt for vertical scaling by choosing more powerful infrastructure components. {{}} +YugabyteDB scales out horizontally by adding more nodes to handle increasing data volumes and higher workloads. With YugabyteDB, you can also opt for vertical scaling by choosing more powerful infrastructure components. {{}} ## High Availability -YugabyteDB ensures continuous availability, even in the face of individual node failures or network partitions. YugabyteDB achieves this by replicating data across multiple nodes and implementing failover mechanisms via leader election. {{}} +YugabyteDB ensures continuous availability, even in the face of individual node failures or network partitions. YugabyteDB achieves this by replicating data across multiple nodes and implementing failover mechanisms via leader election. {{}} ## Fault Tolerance -YugabyteDB is resilient to various types of failures, such as node crashes, network partitions, disk failures, and other hardware or software faults and failure of various fault domains. It can automatically recover from these failures without data loss or corruption. {{}} +YugabyteDB is resilient to various types of failures, such as node crashes, network partitions, disk failures, and other hardware or software faults and failure of various fault domains. It can automatically recover from these failures without data loss or corruption. {{}} ## Consistency @@ -64,7 +64,7 @@ YugabyteDB monitors and automatically re-balances the number of tablet leaders a ## Data locality -YugabyteDB supports colocated tables and databases which enables related data to be kept together on the same node for performance reasons.
{{}} ## Security @@ -111,7 +111,7 @@ In addition: ## Cassandra compatibility -[YCQL](../../api/ycql/) is a [semi-relational CQL API](../../explore/ycql-language/) that is best suited for internet-scale OLTP and HTAP applications needing massive write scalability and fast queries. YCQL supports distributed transactions, strongly-consistent secondary indexes, and a native JSON column type. YCQL has its roots in the Cassandra Query Language. {{}} +[YCQL](../../api/ycql/) is a [semi-relational CQL API](../../explore/ycql-language/) that is best suited for internet-scale OLTP and HTAP applications needing massive write scalability and fast queries. YCQL supports distributed transactions, strongly-consistent secondary indexes, and a native JSON column type. YCQL has its roots in the Cassandra Query Language. {{}} ## Performance @@ -141,7 +141,7 @@ YugabyteDB has been designed with several cloud-native principles in mind. ## Kubernetes-ready -YugabyteDB works natively in Kubernetes and other containerized environments as a stateful application. {{}} +YugabyteDB works natively in Kubernetes and other containerized environments as a stateful application. {{}} ## Open source diff --git a/docs/content/stable/architecture/key-concepts.md b/docs/content/stable/architecture/key-concepts.md index a3adb61ca725..7c8de26b75aa 100644 --- a/docs/content/stable/architecture/key-concepts.md +++ b/docs/content/stable/architecture/key-concepts.md @@ -26,7 +26,7 @@ YugabyteDB provides ACID guarantees for all [transactions](#transaction). ## CDC - Change data capture -CDC is a software design pattern used in database systems to capture and propagate data changes from one database to another in real-time or near real-time. YugabyteDB supports transactional CDC guaranteeing changes across tables are captured together. This enables use cases like real-time analytics, data warehousing, operational data replication, and event-driven architectures. {{}} +CDC is a software design pattern used in database systems to capture and propagate data changes from one database to another in real-time or near real-time. YugabyteDB supports transactional CDC guaranteeing changes across tables are captured together. This enables use cases like real-time analytics, data warehousing, operational data replication, and event-driven architectures. {{}} ## Cluster @@ -38,11 +38,11 @@ Sometimes the term *cluster* is used interchangeably with the term *universe*. H ## DocDB -DocDB is the underlying document storage engine of YugabyteDB and is built on top of a highly customized and optimized version of [RocksDB](http://rocksdb.org/). {{}} +DocDB is the underlying document storage engine of YugabyteDB and is built on top of a highly customized and optimized version of [RocksDB](http://rocksdb.org/). {{}} ## Fault domain -A fault domain is a potential point of failure. Examples of fault domains would be nodes, racks, zones, or entire regions. {{}} +A fault domain is a potential point of failure. Examples of fault domains would be nodes, racks, zones, or entire regions. {{}} ## Fault tolerance @@ -54,15 +54,15 @@ The fault tolerance determines how resilient the cluster is to domain (that is, Normally, only the [tablet leader](#tablet-leader) can process user-facing write and read requests. Follower reads allow you to lower read latencies by serving reads from the tablet followers. This is similar to reading from a cache, which can provide more read IOPS with low latency.
The data might be slightly stale, but is timeline-consistent, meaning no out of order data is possible. -Follower reads are particularly beneficial in applications that can tolerate staleness. For instance, in a social media application where a post gets a million likes continuously, slightly stale reads are acceptable, and immediate updates are not necessary because the absolute number may not really matter to the end-user reading the post. In such cases, a slightly older value from the closest replica can achieve improved performance with lower latency. Follower reads are required when reading from [read replicas](#read-replica-cluster). {{}} +Follower reads are particularly beneficial in applications that can tolerate staleness. For instance, in a social media application where a post gets a million likes continuously, slightly stale reads are acceptable, and immediate updates are not necessary because the absolute number may not really matter to the end-user reading the post. In such cases, a slightly older value from the closest replica can achieve improved performance with lower latency. Follower reads are required when reading from [read replicas](#read-replica-cluster). {{}} ## Hybrid time -Hybrid time/timestamp is a monotonically increasing timestamp derived using [Hybrid Logical clock](../transactions/transactions-overview/#hybrid-logical-clocks). Multiple aspects of YugabyteDB's transaction model are based on hybrid time. {{}} +Hybrid time/timestamp is a monotonically increasing timestamp derived using [Hybrid Logical clock](../transactions/transactions-overview/#hybrid-logical-clocks). Multiple aspects of YugabyteDB's transaction model are based on hybrid time. {{}} ## Isolation levels -[Transaction](#transaction) isolation levels define the degree to which transactions are isolated from each other. Isolation levels determine how changes made by one transaction become visible to other concurrent transactions. {{}} +[Transaction](#transaction) isolation levels define the degree to which transactions are isolated from each other. Isolation levels determine how changes made by one transaction become visible to other concurrent transactions. {{}} {{}} YugabyteDB offers 3 isolation levels - [Serializable](../../explore/transactions/isolation-levels/#serializable-isolation), [Snapshot](../../explore/transactions/isolation-levels/#snapshot-isolation) and [Read committed](../../explore/transactions/isolation-levels/#read-committed-isolation) - in the {{}} API and one isolation level - [Snapshot](../../develop/learn/transactions/acid-transactions-ycql/) - in the {{}} API. @@ -74,11 +74,11 @@ YugabyteDB tries to keep the number of leaders evenly distributed across the [no ## Leader election -Amongst the [tablet](#tablet) replicas, one tablet is elected [leader](#tablet-leader) as per the [Raft](../docdb-replication/raft) protocol. {{}} +Amongst the [tablet](#tablet) replicas, one tablet is elected [leader](#tablet-leader) as per the [Raft](../docdb-replication/raft) protocol. {{}} ## Master server -The [YB-Master](../yb-master/) service is responsible for keeping system metadata, coordinating system-wide operations, such as creating, altering, and dropping tables, as well as initiating maintenance operations such as load balancing. {{}} +The [YB-Master](../yb-master/) service is responsible for keeping system metadata, coordinating system-wide operations, such as creating, altering, and dropping tables, as well as initiating maintenance operations such as load balancing. 
{{}} {{}} The master server is also typically referred as just **master**. @@ -86,7 +86,7 @@ The master server is also typically referred as just **master**. ## MVCC -MVCC stands for Multi-version Concurrency Control. It is a concurrency control method used by YugabyteDB to provide access to data in a way that allows concurrent queries and updates without causing conflicts. {{}} +MVCC stands for Multi-version Concurrency Control. It is a concurrency control method used by YugabyteDB to provide access to data in a way that allows concurrent queries and updates without causing conflicts. {{}} ## Namespace @@ -118,7 +118,7 @@ Designating one region as preferred can reduce the number of network hops needed Regardless of the preferred region setting, data is replicated across all the regions in the cluster to ensure region-level fault tolerance. -You can enable [follower reads](#follower-reads) to serve reads from non-preferred regions. In cases where the cluster has [read replicas](#read-replica-cluster) and a client connects to a read replica, reads are served from the replica; writes continue to be handled by the preferred region. {{}} +You can enable [follower reads](#follower-reads) to serve reads from non-preferred regions. In cases where the cluster has [read replicas](#read-replica-cluster) and a client connects to a read replica, reads are served from the replica; writes continue to be handled by the preferred region. {{}} ## Primary cluster @@ -126,17 +126,17 @@ A primary cluster can perform both writes and reads, unlike a [read replica clus ## Raft -Raft stands for Replication for availability and fault tolerance. This is the algorithm that YugabyteDB uses for replication guaranteeing consistency. {{}} +Raft stands for Replication for availability and fault tolerance. This is the algorithm that YugabyteDB uses for replication guaranteeing consistency. {{}} ## Read replica cluster Read replica clusters are optional clusters that can be set up in conjunction with a [primary cluster](#primary-cluster) to perform only reads; writes sent to read replica clusters get automatically rerouted to the primary cluster of the [universe](#universe). These clusters enable reads in regions that are far away from the primary cluster with timeline-consistent data. This ensures low latency reads for geo-distributed applications. -Data is brought into the read replica clusters through asynchronous replication from the primary cluster. In other words, [nodes](#node) in a read replica cluster act as Raft observers that do not participate in the write path involving the Raft leader and Raft followers present in the primary cluster. Reading from read replicas requires enabling [follower reads](#follower-reads). {{}} +Data is brought into the read replica clusters through asynchronous replication from the primary cluster. In other words, [nodes](#node) in a read replica cluster act as Raft observers that do not participate in the write path involving the Raft leader and Raft followers present in the primary cluster. Reading from read replicas requires enabling [follower reads](#follower-reads). {{}} ## Rebalancing -Rebalancing is the process of keeping an even distribution of tablets across the [nodes](#node) in a cluster. {{}} +Rebalancing is the process of keeping an even distribution of tablets across the [nodes](#node) in a cluster. {{}} ## Region @@ -146,24 +146,24 @@ A region refers to a defined geographical area or location where a cloud provide The number of copies of data in a YugabyteDB universe. 
YugabyteDB replicates data across [fault domains](#fault-domain) (for example, zones) in order to tolerate faults. [Fault tolerance](#fault-tolerance) (FT) and RF are correlated. To achieve a FT of k nodes, the universe has to be configured with a RF of (2k + 1). -The RF should be an odd number to ensure majority consensus can be established during failures. {{}} +The RF should be an odd number to ensure majority consensus can be established during failures. {{}} Each [read replica](#read-replica-cluster) cluster can also have its own replication factor. In this case, the replication factor determines how many copies of your primary data the read replica has; multiple copies ensure the availability of the replica in case of a node outage. Replicas *do not* participate in the primary cluster Raft consensus, and do not affect the fault tolerance of the primary cluster or contribute to failover. ## Sharding -Sharding is the process of mapping a table row to a [tablet](#tablet). YugabyteDB supports 2 types of sharding, Hash and Range. {{}} +Sharding is the process of mapping a table row to a [tablet](#tablet). YugabyteDB supports 2 types of sharding, Hash and Range. {{}} ## Smart driver A smart driver in the context of YugabyteDB is essentially a PostgreSQL driver with additional "smart" features that leverage the distributed nature of YugabyteDB. These smart drivers intelligently distribute application connections across the nodes and regions of a YugabyteDB cluster, eliminating the need for external load balancers. This results in balanced connections that provide lower latencies and prevent hot nodes. For geographically-distributed applications, the driver can seamlessly connect to the geographically nearest regions and availability zones for lower latency. Smart drivers are optimized for use with a distributed SQL database, and are both cluster-aware and topology-aware. They keep track of the members of the cluster as well as their locations. As nodes are added or removed from clusters, the driver updates its membership and topology information. The drivers read the database cluster topology from the metadata table, and route new connections to individual instance endpoints without relying on high-level cluster endpoints. The smart drivers are also capable of load balancing read-only connections across the available YB-TServers. -. {{}} +. {{}} ## Tablet -YugabyteDB splits a table into multiple small pieces called tablets for data distribution. The word "tablet" finds its origins in ancient history, when civilizations utilized flat slabs made of clay or stone as surfaces for writing and maintaining records. {{}} +YugabyteDB splits a table into multiple small pieces called tablets for data distribution. The word "tablet" finds its origins in ancient history, when civilizations utilized flat slabs made of clay or stone as surfaces for writing and maintaining records. {{}} {{}} Tablets are also referred as shards. @@ -179,15 +179,15 @@ In a cluster, each [tablet](#tablet) is replicated as per the [replication facto ## Tablet splitting -When a tablet reaches a threshold size, it splits into 2 new [tablets](#tablet). This is a very quick operation. {{}} +When a tablet reaches a threshold size, it splits into 2 new [tablets](#tablet). This is a very quick operation. {{}} ## Transaction -A transaction is a sequence of operations performed as a single logical unit of work. YugabyteDB provides [ACID](#acid) guarantees for transactions. 
{{}} +A transaction is a sequence of operations performed as a single logical unit of work. YugabyteDB provides [ACID](#acid) guarantees for transactions. {{}} ## TServer -The [YB-TServer](../yb-tserver) service is responsible for maintaining and managing table data in the form of tablets, as well as dealing with all the queries. {{}} +The [YB-TServer](../yb-tserver) service is responsible for maintaining and managing table data in the form of tablets, as well as dealing with all the queries. {{}} ## Universe @@ -199,19 +199,19 @@ Sometimes the terms *universe* and *cluster* are used interchangeably. The two a ## xCluster -xCluster is a type of deployment where data is replicated asynchronously between two [universes](#universe) - a primary and a standby. The standby can be used for disaster recovery. YugabyteDB supports transactional xCluster {{}}. +xCluster is a type of deployment where data is replicated asynchronously between two [universes](#universe) - a primary and a standby. The standby can be used for disaster recovery. YugabyteDB supports transactional xCluster {{}}. ## YCQL -Semi-relational SQL API that is best fit for internet-scale OLTP and HTAP apps needing massive write scalability as well as blazing-fast queries. It supports distributed transactions, strongly consistent secondary indexes, and a native JSON column type. YCQL has its roots in the Cassandra Query Language. {{}} +Semi-relational SQL API that is best fit for internet-scale OLTP and HTAP apps needing massive write scalability as well as blazing-fast queries. It supports distributed transactions, strongly consistent secondary indexes, and a native JSON column type. YCQL has its roots in the Cassandra Query Language. {{}} ## YQL -The YugabyteDB Query Layer (YQL) is the primary layer that provides interfaces for applications to interact with using client drivers. This layer deals with the API-specific aspects such as query/command compilation and the run-time (data type representations, built-in operations, and more). {{}} +The YugabyteDB Query Layer (YQL) is the primary layer that provides interfaces for applications to interact with using client drivers. This layer deals with the API-specific aspects such as query/command compilation and the run-time (data type representations, built-in operations, and more). {{}} ## YSQL -Fully-relational SQL API that is wire compatible with the SQL language in PostgreSQL. It is best fit for RDBMS workloads that need horizontal write scalability and global data distribution while also using relational modeling features such as JOINs, distributed transactions, and referential integrity (such as foreign keys). Note that YSQL reuses the native query layer of the PostgreSQL open source project. {{}} +Fully-relational SQL API that is wire compatible with the SQL language in PostgreSQL. It is best fit for RDBMS workloads that need horizontal write scalability and global data distribution while also using relational modeling features such as JOINs, distributed transactions, and referential integrity (such as foreign keys). Note that YSQL reuses the native query layer of the PostgreSQL open source project. 
{{}} ## Zone diff --git a/docs/content/stable/manage/data-migration/migrate-from-postgres.md b/docs/content/stable/manage/data-migration/migrate-from-postgres.md index c812a7769c76..33fed0999619 100644 --- a/docs/content/stable/manage/data-migration/migrate-from-postgres.md +++ b/docs/content/stable/manage/data-migration/migrate-from-postgres.md @@ -300,13 +300,13 @@ Regularly monitor the target database to ensure it is performing efficiently. Th Because of the distributed nature of YugabyteDB, queries are executed quite differently from Postgres. This is because the latency across the nodes is taken into account by the query planner. Adopting the following practices will help improve the performance of your applications. -- **Single-row transactions**: YugabyteDB has optimizations to improve the performance of transactions in certain scenarios where transactions operate on a single row. Consider converting multi-statement transactions to single-statement ones to improve performance. {{}} +- **Single-row transactions**: YugabyteDB has optimizations to improve the performance of transactions in certain scenarios where transactions operate on a single row. Consider converting multi-statement transactions to single-statement ones to improve performance. {{}} -- **Use On Conflict clause**: Use the optional ON CONFLICT clause in the INSERT statement to circumvent certain errors and avoid multiple round trips. {{}} +- **Use On Conflict clause**: Use the optional ON CONFLICT clause in the INSERT statement to circumvent certain errors and avoid multiple round trips (see the sketch after this list). {{}} -- **Set statement timeouts**: Avoid getting stuck in a wait loop because of starvation by using a reasonable timeout for the statements. {{}} +- **Set statement timeouts**: Avoid getting stuck in a wait loop because of starvation by using a reasonable timeout for the statements. {{}} -- **Stored procedures**: Use stored procedures to bundle a set of statements with error handling to be executed on the server and avoid multiple round trips. {{}} +- **Stored procedures**: Use stored procedures to bundle a set of statements with error handling to be executed on the server and avoid multiple round trips. {{}} {{}} For a full list of best practices to improve performance, see [Performance tuning in YSQL](../../../develop/learn/transactions/transactions-performance-ysql/)
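The upsert and timeout practices in the list above can look like the following minimal YSQL sketch; the `accounts` table and the 5-second value are illustrative assumptions, not recommendations.

```sql
-- Sketch: a single INSERT ... ON CONFLICT statement replaces a
-- SELECT-then-INSERT/UPDATE pattern and saves a round trip.
INSERT INTO accounts (id, balance)
VALUES (42, 100)
ON CONFLICT (id) DO UPDATE
    SET balance = accounts.balance + EXCLUDED.balance;

-- Sketch: bound how long any statement in this session may run.
SET statement_timeout = '5s';
```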
diff --git a/docs/layouts/shortcodes/link.html b/docs/layouts/shortcodes/link.html index 88ce3d03f28c..5bf27bdcaa12 100644 --- a/docs/layouts/shortcodes/link.html +++ b/docs/layouts/shortcodes/link.html @@ -1,5 +1,7 @@ {{/* */}} -{{- $url :=.Get 0 -}} +{{- $url :=.Get "dest" -}} +{{- $text := .Get "text" -}} +{{- $before := eq (.Get "icon-before") "true" -}} {{/* get the page path so determine depth and version */}} {{- $path := split $.Page.File.Dir "/" -}} {{/* version is the first part of the path. eg: preview/stable/v2.12 ...*/}} @@ -10,5 +12,9 @@ {{- if and (strings.HasPrefix $url "http") (not (strings.HasPrefix $url "https://docs.yugabyte")) -}} {{- $icon = "fa-arrow-up-right-from-square" -}} {{- end -}} + + {{- if $before -}} {{ end -}} + {{if $text}}{{$text}} {{end}} - + {{- if not $before -}} {{- end -}} + diff --git a/docs/layouts/shortcodes/release.html b/docs/layouts/shortcodes/release.html index d1fdbc924a8f..e20c8bcd3fbc 100644 --- a/docs/layouts/shortcodes/release.html +++ b/docs/layouts/shortcodes/release.html @@ -6,6 +6,14 @@ {{- $numversions := len $versions -}} {{- $count := 0 -}} {{- range $version := $versions -}} + {{- if or (eq $version "preview") (eq $version "stable") -}} + {{- range page.Site.Data.currentVersions.dbVersions -}} + {{- if eq $version .alias -}} + {{- $version = .version -}} + {{- break -}} + {{- end -}} + {{- end -}} + {{- end -}} {{- /* retain the original text for display */ -}} {{- $orig := trim $version " " -}} {{- /* trim the spaces,+,.,x,X */ -}}