
Commit

update getting started and advanced faqs
Signed-off-by: deepthi <[email protected]>
deepthi committed Nov 21, 2024
1 parent 9770327 commit 4c3f19b
Showing 6 changed files with 38 additions and 37 deletions.
2 changes: 1 addition & 1 deletion content/en/docs/faq/advanced-configuration/vindex.md
@@ -1,5 +1,5 @@
---
title: Vindex
title: Vindexes
weight: 6
---

14 changes: 7 additions & 7 deletions content/en/docs/faq/getting-started/metrics.md
@@ -7,28 +7,28 @@ weight: 7

All Vitess components have a web UI that you can access to see the state of each component.

The first place to look is the /debug/status page.
The first place to look is the `/debug/status` page.

* This is the main landing page for a VTGate, which displays the status of a particular server. A list of tablets this VTGate process is connected to is also displayed, as this is the list of tablets that can potentially serve queries.

A second place to look is the /debug/vars page. For example, for VTGate, this page contains the following items:
A second place to look is the `/debug/vars` page. For example, for VTGate, this page contains the following items:

* VTGateApi - This is the main histogram variable to track for VTGates. It gives you a break down of all queries by command, keyspace, and type.
* VTGateApi - This is the main histogram variable to track for VTGates. It gives you a breakdown of all queries by command, keyspace, and type.
* HealthcheckConnections - It shows the number of tablet connections for query/healthcheck per keyspace, shard, and tablet type.
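
As a rough illustration of reading the variables listed above, the sketch below fetches `/debug/vars` and keeps only the entries of interest. It assumes the page returns a JSON document keyed by variable name; the helper names `fetch_debug_vars` and `select_vars` are hypothetical and are not part of Vitess.

```python
import json
import urllib.request


def select_vars(debug_vars, names):
    """Keep only the requested top-level variables that are actually present."""
    return {name: debug_vars[name] for name in names if name in debug_vars}


def fetch_debug_vars(base_url):
    """Fetch /debug/vars from a component's web UI port.

    Assumption: the endpoint returns a single JSON object.
    """
    with urllib.request.urlopen(base_url.rstrip("/") + "/debug/vars") as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call such as `select_vars(fetch_debug_vars(url), ["VTGateApi", "HealthcheckConnections"])` would then return just those two entries, where `url` is the address of your own VTGate web UI.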

There are two other pages you can use to get monitoring information from Vitess in the VTGate web UI:

* /debug/query_plans - This URL gives you all the query plans for queries going through VTGate.
* /debug/vschema - This URL shows the vschema as loaded by VTGate.
* `/debug/query_plans` - This URL gives you all the query plans for queries going through VTGate.
* `/debug/vschema` - This URL shows the VSchema as loaded by VTGate.

VTTablet has a similar web UI.

Vitess component metrics can also be scraped via /metrics. This will provide a Prometheus-format metric dump that is updated continuously. This is the recommended way to collect metrics from Vitess.

## How do you integrate Prometheus and Vitess?

There is an Prometheus exporter that is on by default that enables you to configure a Prometheus compatible scraper to grab data from the various Vitess components. All Vitess components with web UI’s export their metrics on their web UI port on /metrics.
There is a Prometheus exporter, on by default, that lets you configure a Prometheus-compatible scraper to collect data from the various Vitess components. All Vitess components export their metrics on their HTTP port at `/metrics`.

If your Vitess configuration includes running the Vitess or PlanetScaleDB Operator on Kubernetes, then you can have Prometheus or a Prometheus compatible agent running in your Kubernetes cluster. This would then scrape the metrics from Vitess automatically, as it would be run on the ports advertised and on our standard /metrics page. With the PlanetScaleDB Operator for Kubernetes, this is done for you automatically.
If your Vitess configuration includes running the Vitess Operator on Kubernetes, then you can have Prometheus or a Prometheus-compatible agent running in your Kubernetes cluster. It would then scrape the metrics from Vitess automatically, using the advertised ports and our standard `/metrics` page.

You can read more about getting the metrics into Prometheus [here](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config).
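
For a quick check outside of Prometheus, the `/metrics` output can also be read by hand. The following is a minimal sketch that parses the Prometheus text format into name/value pairs. It assumes samples carry no trailing timestamps, and the function names are hypothetical helpers rather than anything shipped with Vitess.

```python
import urllib.request


def parse_prometheus_text(text):
    """Parse Prometheus text-format lines into (name_with_labels, value) pairs."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value follows the last space (assumes no trailing timestamp).
        name, _, value = line.rpartition(" ")
        samples.append((name, float(value)))
    return samples


def fetch_metrics(base_url):
    """Fetch and parse /metrics from a Vitess component's HTTP port."""
    with urllib.request.urlopen(base_url.rstrip("/") + "/metrics") as resp:
        return parse_prometheus_text(resp.read().decode("utf-8"))
```

For regular collection, a Prometheus scrape configuration remains the recommended route; this helper is only meant for spot checks.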
2 changes: 1 addition & 1 deletion content/en/docs/faq/getting-started/overview.md
@@ -48,7 +48,7 @@ Vitess consists of a number of server processes and command-line utilities and i

The diagram below illustrates Vitess’ components and their location within Vitess’ architecture:

<img alt="Vitess Components" src="../img/vitess-components.png" width=100%>
<img alt="Vitess Components" src="/img/vitess-components.png" width=100%>

## Are microservices recommended for scaling?

7 changes: 4 additions & 3 deletions content/en/docs/faq/getting-started/topology.md
@@ -29,14 +29,15 @@ Vitess uses a plugin implementation to support multiple backend technologies for
The Topology Service interfaces are defined in our code in go/vt/topo/, specific implementations are in go/vt/topo/<name>, and we also have a set of unit tests for it in go/vt/topo/test.

{{< info >}}
If starting from scratch, please use the `etcd` implementation. The Consul implementation is deprecated, although still supported.
If starting from scratch, please use the `etcd` implementation.
{{< /info >}}

## How do I choose which topology server to use?

The first question to consider is: Do you use one already or are you required to use a specific one? If the answer to that question is yes, then you should likely implement that rather than adding a new server to run Vitess.
The first question to consider is: do you use one already or are you required to use a specific one? If the answer to that question is yes, then you should likely implement that rather than adding a new server to run Vitess.
However, in large implementations, it makes sense to run a separate topology server dedicated to Vitess. This avoids "noisy neighbor" problems.

If the answer to that question is no, then we’d recommend that you use etcd if you can, otherwise we’d recommend that you use ZooKeeper.
By default, we recommend that you use etcd if you can, otherwise you may use ZooKeeper.

## How do I implement etcd (etcd2)?

22 changes: 11 additions & 11 deletions content/en/docs/faq/getting-started/vreplication.md
@@ -5,24 +5,24 @@ weight: 6

## What is VReplication? How does it work?

VReplication is used as a building block for a number of use cases throughout Vitess. It works as a stream or combination of streams that establish replication from a source keyspace/shard into a target keyspace/shard. A given stream can replicate multiple tables. It allows Vitess to keep the data being copied in-sync by using a combination of copying rows and filtered replication.
VReplication is used as a building block for a number of features in Vitess. It works as a stream or combination of streams that establish replication from a source keyspace/shard into a target keyspace/shard. A given stream can replicate multiple tables. It allows Vitess to keep the data being copied in-sync by using a combination of copying rows and filtered replication.

Vreplication works via the following process:
VReplication works via the following process:

1. Analyzing the source table and identifying what rows it needs to copy.
2. It then very briefly locks the table and makes a note of the current GTID replication position on the source database. After it’s noted the current GTID Vreplication then unlocks the table again.
3. It selects all the rows and all the columns from GTID value 0 onward and copies from that select.
4. It then streams the copy over to Vitess to start inserting rows. Vreplication will keep copying for a period of time, around an hour, to attempt to finish the copy.
5. If Vreplication hasn’t finished in an hour, it will stop and go back to the table in order to pick up any changes that have been made since it started copying.
1. It first analyzes the source table on the source shard and identifies what rows it needs to copy.
2. It then very briefly locks the table and records the current GTID replication position on the source database. After recording the current GTID position, VReplication then unlocks the table again.
3. It selects all rows and columns that match a specified filter from GTID value 0 onward and makes a copy of the results.
4. It then streams the copy over to the target shard to start inserting rows. VReplication will keep copying for a specified period of time (default 1 hour), to attempt to finish the copy.
5. If the copying phase on the target hasn’t finished in an hour, it will stop and go back to the table in order to pick up any changes that have been made since it started copying.
6. It knows what the GTID was when it started copying and what the GTID is now. This enables it to determine what events have occurred after it performed the first select and copy.
7. It will then filter out all the events except the ones that pertain to the relevant table and will apply the changes to the destination table.

This process then repeats until Vreplication finishes copying the whole table. After the copying process finishes Vreplication will change to filtered replication to keep the table in sync.
This process then repeats until VReplication finishes copying the whole table. After the copying process finishes VReplication will transition to filtered replication to keep the table in sync between the source and the target.
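
The copy and catch-up cycle described in the steps above can be sketched as a toy in-memory model. This is not how VReplication is implemented: the list length stands in for a GTID position, `rows_per_cycle` stands in for the copy time limit, deletes are not modeled, and the rule of replaying only events for rows that were already copied is an assumption of this sketch. All names are hypothetical.

```python
def copy_with_catchup(source, binlog, rows_per_cycle, writes_during_copy=()):
    """Toy model of a row copy interleaved with catch-up from a change log.

    source: dict of primary key -> row value (the live source table)
    binlog: list of (pk, value) events; its length stands in for a GTID position
    rows_per_cycle: stand-in for the copy time limit
    writes_during_copy: (pk, value) writes, one applied to the source per cycle
    """
    target = {}
    pending_writes = list(writes_during_copy)
    last_pk = None          # highest primary key copied so far
    position = len(binlog)  # position recorded when the copy starts

    while True:
        remaining = sorted(pk for pk in source if last_pk is None or pk > last_pk)
        if not remaining and not pending_writes:
            return target

        # Copy the next batch of rows in primary key order.
        for pk in remaining[:rows_per_cycle]:
            target[pk] = source[pk]
            last_pk = pk

        # Simulate a concurrent write reaching the source during the copy.
        if pending_writes:
            pk, value = pending_writes.pop(0)
            source[pk] = value
            binlog.append((pk, value))

        # Catch up: replay events recorded since the last known position,
        # keeping only those for rows that have already been copied.
        for pk, value in binlog[position:]:
            if last_pk is not None and pk <= last_pk:
                target[pk] = value
        position = len(binlog)
```

Rows written after they were copied are fixed up by the catch-up step, while rows written before they are reached are simply picked up by a later copy batch, so the target converges on the source once the writes stop.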

## How can I use VReplication?

There are a number of higher level commands like MoveTables and Materialized Views that create Vreplication streams behind the scenes of the command. By using these higher level commands, Vitess creates VReplication rules for the user. Further use cases are listed out [here](https://vitess.io/docs/reference/features/vreplication/).
There are a number of higher level commands like MoveTables, Reshard, and Materialize that create VReplication streams behind the scenes of the command. By using these higher level commands, Vitess creates VReplication rules for the user. Further use cases are listed out [here](https://vitess.io/docs/reference/features/vreplication/).

For more information on [MoveTables](https://vitess.io/docs/user-guides/migration/move-tables/) and [Materialized Views](https://vitess.io/docs/user-guides/migration/materialize/ please follow the links provided.
For more information on [MoveTables](https://vitess.io/docs/user-guides/migration/move-tables/) and [Materialized Views](https://vitess.io/docs/user-guides/migration/materialize/) please follow the links provided.

There is a way to create VReplication rules by hand but we don’t recommend using that method as it can be challenging to configure the rules correctly.
It is possible to create VReplication rules by hand, but we don’t recommend doing that as it can be challenging to configure the rules correctly.
28 changes: 14 additions & 14 deletions content/en/docs/faq/getting-started/vschema.md
@@ -1,5 +1,5 @@
---
title: VSchema
title: VSchema and Vindexes
weight: 5
---

@@ -11,21 +11,11 @@ In contrast to a traditional database schema that contains metadata about tables

Simply put, it contains the information needed to make Vitess look and act like a single database server.

For example, the VSchema will contain the information about the sharding key for each sharded table. When the application issues a query with a WHERE clause that references the key, the VSchema information will be used to route the query to the appropriate shard.

## What is a primary Vindex and how does it work?

The Primary Vindex for a table is analogous to a database primary key.

Every sharded table must have one defined. A Primary Vindex must be unique: given an input value, it must produce a single keyspace ID. At the time of an insert to the table, the unique mapping produced by the Primary Vindex determines the target shard for the inserted row.

In Vitess, the choice of Vindex allows control of how a column value maps to a keyspace ID. In other words, a Primary Vindex in Vitess not only defines the Sharding Key, but also decides the Sharding Strategy.

Uniqueness for a Primary Vindex does not mean that the column has to be a primary key or unique key in the MySQL schema for the underlying shard. You can have multiple rows that map to the same keyspace ID. The Vindex uniqueness constraint only ensures that all rows for a keyspace ID end up in the same shard.
For example, the VSchema will contain the sharding key for each sharded table. When the application issues a query with a WHERE clause that references the key, the VSchema will be used to route the query to the appropriate shard.

## What is a Vindex and how does it work?

A Vindex provides a way to map a column value to a keyspace ID. Since each shard in Vitess covers a range of keyspace ID values, this mapping can be used to identify which shard contains a row.
A Vindex provides a way to map a column value to a keyspace ID. Since each shard in Vitess covers a range of keyspace ID values, this mapping can be used to identify which shard contains a row.
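
To make the mapping concrete, here is a minimal sketch of the idea: a function turns a column value into a keyspace ID, and a range lookup finds the shard that covers it. The SHA-256 based `toy_vindex` is an illustrative stand-in and not one of the Vindex functions Vitess provides, and the two-shard layout is a hypothetical example.

```python
import hashlib

# Hypothetical layout: two shards that split the 8-byte keyspace ID range in half.
# Each entry is (shard name, inclusive start, exclusive end).
SHARDS = [
    ("-80", 0x0000000000000000, 0x8000000000000000),
    ("80-", 0x8000000000000000, 0x10000000000000000),
]


def toy_vindex(column_value):
    """Map a column value to an 8-byte keyspace ID (illustrative hash only)."""
    digest = hashlib.sha256(str(column_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for_keyspace_id(keyspace_id):
    """Return the shard whose range covers the given keyspace ID."""
    for name, start, end in SHARDS:
        if start <= keyspace_id < end:
            return name
    raise ValueError("keyspace ID outside all shard ranges")


def shard_for_value(column_value):
    """Route a column value to a shard via its keyspace ID."""
    return shard_for_keyspace_id(toy_vindex(column_value))
```

Because the mapping is deterministic, the same column value always resolves to the same keyspace ID and therefore to the same shard, which is what lets a query that references the sharding key be routed to a single shard.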

The advantages of Vindexes stem from their flexibility:

@@ -37,6 +27,16 @@ The advantages of Vindexes stem from their flexibility:

The VSchema contains the Vindex for any sharded tables. Every VSchema must have at least one Vindex, called the Primary Vindex, defined. A variety of other Vindexes are also available to choose from, with different trade-offs, and you can choose one that best suits your needs. You can read more about other Vindexes [here](https://vitess.io/docs/reference/features/vindexes/).

## What is a primary Vindex and how does it work?

The Primary Vindex for a table is analogous to a database primary key.

Every sharded table must have one defined. A Primary Vindex must be unique: given an input value, it must produce a single keyspace ID. At the time of an insert to the table, the unique mapping produced by the Primary Vindex determines the target shard for the inserted row.

In Vitess, the choice of Vindex allows control of how a column value maps to a keyspace ID. In other words, a Primary Vindex in Vitess not only defines the Sharding Key, but also decides the Sharding Strategy.

Uniqueness for a Primary Vindex does not mean that the column has to be a primary key or unique key in the MySQL schema for the underlying shard. You can have multiple rows that map to the same keyspace ID. The Vindex uniqueness constraint only ensures that all rows for a keyspace ID end up in the same shard.

## How do I create a VSchema?

The ease of creation of a VSchema depends heavily on how your data model is constructed.
@@ -53,6 +53,6 @@ Please do keep in mind that you don’t have to have Vindex to cover every query

For a very trivial setup where there is only one unsharded keyspace, there is no need to specify a VSchema because Vitess will know that there is nowhere to route a query except to the single shard.

However, once you have sharding, having a VSchema becomes a necessity. This is because a VSchema is needed to locate and place rows row each table in a sharded keyspace.
However, once you have sharding, having a VSchema becomes a necessity. This is because a VSchema is needed to locate and place rows in each table in a sharded keyspace.

The Vitess distribution has a demo of VSchema operation [here](https://github.com/vitessio/vitess/tree/master/examples/demo).
