From 78aea200fc40153a8e314f9cfc7841c60d3d6483 Mon Sep 17 00:00:00 2001 From: yangj1211 Date: Thu, 1 Aug 2024 16:25:15 +0800 Subject: [PATCH] update docs --- .../deploy-MatrixOne-cluster-without-k8.md | 32 +- .../BI-Connection/FineBI-connection.md | 22 +- .../BI-Connection/Superset-connection.md | 18 +- .../BI-Connection/yonghong-connection.md | 12 +- .../Flink/flink-kafka-matrixone.md | 2 +- .../Flink/flink-mysql-matrixone.md | 4 +- .../Flink/flink-oracle-matrixone.md | 2 +- .../Flink/flink-sqlserver-matrixone.md | 22 +- .../Flink/flink-tidb-matrixone.md | 4 +- .../Spark/spark-hive-matrixone.md | 4 +- .../Spark/spark-mysql-matrixone.md | 4 +- .../Etl/DataX/datax-influxdb-matrixone.md | 2 +- .../Etl/DataX/datax-overview.md | 2 +- .../Etl/DataX/datax-postgresql-matrixone.md | 4 +- .../Etl/DataX/datax-sqlserver-matrixone.md | 2 +- .../Scheduling-Tools/dolphinScheduler.md | 26 +- .../Publish-Subscribe/pub-sub-overview.md | 2 +- .../Develop/Vector/cluster_centers.md | 4 +- .../MatrixOne/Develop/Vector/vector_search.md | 2 +- docs/MatrixOne/Develop/Vector/vector_type.md | 4 +- .../connect-to-matrixone-with-c#.md | 2 +- .../migrate-from-postgresql-to-matrixone.md | 14 +- .../architecture/architecture-logtail.md | 4 +- .../architecture-matrixone-operator.md | 152 +++++++++ .../architecture/architecture-proxy.md | 8 +- .../architecture-transaction-lock.md | 2 +- .../Overview/architecture/architecture-wal.md | 111 +++++-- .../feature/1.1-mysql-compatibility.md | 210 ------------ .../Overview/feature/key-feature-htap.md | 12 +- .../feature/key-feature-multi-accounts.md | 2 +- .../Overview/matrixone-introduction.md | 2 +- .../matrixone-positioning .md | 2 +- .../matrixone-vs-oltp.md | 36 +- .../Aggregate-Functions/bitmap.md | 2 +- .../Vector/cosine_distance.md | 2 +- .../Vector/cosine_similarity.md | 2 +- .../Vector/inner_product.md | 2 +- .../Functions-and-Operators/Vector/l1_norm.md | 2 +- .../Vector/l2_distance.md | 2 +- .../Functions-and-Operators/Vector/l2_norm.md | 2 +- .../Vector/normalize_l2.md | 2 +- docs/MatrixOne/Reference/System-tables.md | 313 ++++++++++-------- .../lower_case_tables_name.md | 2 +- .../Tutorial/django-python-crud-demo.md | 14 +- docs/MatrixOne/Tutorial/rag-demo.md | 6 +- .../MatrixOne/Tutorial/search-picture-demo.md | 6 +- mkdocs.yml | 2 +- 47 files changed, 581 insertions(+), 509 deletions(-) create mode 100644 docs/MatrixOne/Overview/architecture/architecture-matrixone-operator.md delete mode 100644 docs/MatrixOne/Overview/feature/1.1-mysql-compatibility.md diff --git a/docs/MatrixOne/Deploy/deploy-Matrixone-cluster/deploy-MatrixOne-cluster-without-k8.md b/docs/MatrixOne/Deploy/deploy-Matrixone-cluster/deploy-MatrixOne-cluster-without-k8.md index 1ad826821..6ea8b0686 100644 --- a/docs/MatrixOne/Deploy/deploy-Matrixone-cluster/deploy-MatrixOne-cluster-without-k8.md +++ b/docs/MatrixOne/Deploy/deploy-Matrixone-cluster/deploy-MatrixOne-cluster-without-k8.md @@ -61,7 +61,7 @@ In addition, for container management and orchestration on Kubernetes, we need t The overall deployment architecture is shown in the following figure:
</div>
The overall architecture consists of the following components: @@ -87,7 +87,7 @@ MatrixOne creates a series of Kubernetes objects based on Operator's rules that - PV:PV (Persistent Volume) is an abstract representation of a storage medium that can be viewed as a storage unit. After the PVC has been requested, the PV is created through software that implements the CSI interface and binds it to the PVC requesting the resource.
</div>
## 1\. Deploy a Kubernetes cluster @@ -181,7 +181,7 @@ docker run -d \ Once this is done, you can enter `http://1.13.2.100` (Springboard IP address) in your browser to open the Kuboard-Spray web interface, enter the username `admin`, the default password `Kuboard123`, and log into the Kuboard-Spray interface as follows: -![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-1.png) +![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-1.png?raw=true) Once logged in, you can begin deploying Kubernetes clusters. @@ -197,21 +197,21 @@ The installation interface downloads the resource package corresponding to the K Download version `spray-v2.18.0b-2_k8s-v1.23.17_v1.24-amd64` - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-2.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-2.png?raw=true) 2. After clicking **Import**, select **Load Resource Package**, select the appropriate download source, and wait for the resource package download to complete. !!! note It is recommended that you choose Docker as the container engine for the K8s cluster. After selecting Docker as the container engine for K8s, Kuboard-Spray automatically uses Docker to run the various components of the K8s cluster, including containers on the Master node and the Worker node. - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-3.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-3.png?raw=true) 3. This `pulls` the relevant mirror dependencies: - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-4.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-4.png?raw=true) 4. After the mirrored resource pack is pulled successfully, return to Kuboard-Spray's web interface and see that the corresponding version of the resource pack has been imported. - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-5.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-5.png?raw=true) #### Installing a Kubernetes Cluster @@ -219,11 +219,11 @@ This chapter will guide you through the installation of the Kubernetes cluster. 1. Select **Cluster Management** and select **Add Cluster Installation Plan**: - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-6.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-6.png?raw=true) 2. In the pop-up dialog box, define the name of the cluster, select the version of the resource package you just imported, and click **OK**. As shown in the following figure: - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-7.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-7.png?raw=true) ##### Cluster planning @@ -233,16 +233,16 @@ After defining the completion cluster name in the previous step and selecting th 1. 
Select the role and name of the corresponding node: - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-8.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-8.png?raw=true) - master node: Select the ETCD and control node and name it master0\. (You can also select the working node if you want the master node to work.) This approach improves resource utilization, but reduces the high availability of Kubernetes.) - worker node: Select only the worker node and name it node0. 2. After each node has filled in the role and node name, fill in the connection information for the corresponding node to the right, as shown in the following figure: - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-9.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-9.png?raw=true) - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-9-1.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-9-1.png?raw=true) 3. Click **Save** when you have filled out all the roles. Next you are ready to install the Kubernetes cluster. @@ -252,7 +252,7 @@ After completing all roles in the previous step and **saving** them, click **Exe 1. Click **OK** to begin installing the Kubernetes cluster as shown in the following figure: - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-10.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-10.png?raw=true) 2. When you install a Kubernetes cluster, the Kubernetes cluster is installed by executing an `ansible` script on the corresponding node. The overall event can take anywhere from 5 to 10 minutes depending on the machine configuration and network and the time to wait. @@ -275,7 +275,7 @@ After completing all roles in the previous step and **saving** them, click **Exe vim /etc/resolve.conf ``` - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-10-1.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-10-1.png?raw=true) ## 2. Deployment helm @@ -432,13 +432,13 @@ __Note:__ This chapter operates at the master0 node. 3. Once launched, use to log into MinIO's page and create the information stored by the object. As shown in the following figure, the account password is the rootUser and rootPassword set by `--set rootUser=rootuser,rootPassword=rootpass123` in the above steps: - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-13.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-13.png?raw=true) 4. Once the login is complete, you need to create an object to store the relevant information: Click **Bucket > Create Bucket** and fill in Bucket's name **minio-mo** in **Bucket Name**. Once completed, click the button **Create Bucket** at the bottom right. - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/deploy/deploy-mo-cluster-14.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/deploy/deploy-mo-cluster-14.png?raw=true) ## 5. 
MatrixOne Cluster Deployment diff --git a/docs/MatrixOne/Develop/Ecological-Tools/BI-Connection/FineBI-connection.md b/docs/MatrixOne/Develop/Ecological-Tools/BI-Connection/FineBI-connection.md index 5104a665d..c3867b512 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/BI-Connection/FineBI-connection.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/BI-Connection/FineBI-connection.md @@ -19,11 +19,11 @@ MatrixOne supports integration with the data visualization tool FineBI. This art 1. After logging into FineBI, select **Management System > Data Connection > Data Connection Management > New Data Connection** as shown below, then choose **MySQL**: - ![image-20230808174909411](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/select-mysql.png) + ![image-20230808174909411](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/select-mysql.png?raw=true) 2. Fill in the MatrixOne connection configuration, including the database name, host, port, username, and password. Other parameters can be left at their default settings. You can click the **Test Connection** button to verify if the connection is functional and then click **Save** : - ![image-20230808182330603](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/testing.png) + ![image-20230808182330603](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/testing.png?raw=true) ## Creating Visual Reports Using MatrixOne Data @@ -127,7 +127,7 @@ MatrixOne supports integration with the data visualization tool FineBI. This art You can click the **Preview** button to view the results of the SQL query and then click **OK** to save it: - ![image-20230809091306270](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/preview.png) + ![image-20230809091306270](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/preview.png?raw=true) Below are examples of all the query SQL used in this demo: @@ -232,7 +232,7 @@ MatrixOne supports integration with the data visualization tool FineBI. This art After saving the dataset, you need to click the **Update Data** button and wait for the data update to complete before proceeding with the analysis: - ![image-20230809091814920](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/update-data.png) + ![image-20230809091814920](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/update-data.png?raw=true) 4. Create Analytic Themes: @@ -241,30 +241,30 @@ MatrixOne supports integration with the data visualization tool FineBI. This art - Click **My Analysis**, then click **New Folder** to create and select a folder. - Click **New Analytic Theme**, select the dataset created in the previous step, and then click **OK**. - ![image-20230809092959252](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/create-analytic.png) + ![image-20230809092959252](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/create-analytic.png?raw=true) __Note:__ You can use the **Batch Selection** feature to select multiple datasets for theme analysis. 
- ![image-20230809092959252](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/batch-select.png) + ![image-20230809092959252](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/batch-select.png?raw=true) Click the **Add Component** button, choose the chart type, drag the fields from the left to the right as needed, double-click to modify the field visualization name, and change the component name below to describe the content of the report analyzed by the component: - ![image-20230809092959252](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/add-compon-1.png) + ![image-20230809092959252](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/add-compon-1.png?raw=true) - ![image-20230809092959252](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/add-compon-2.png) + ![image-20230809092959252](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/add-compon-2.png?raw=true) 5. Assemble Dashboards: Click **Add Dashboard** to add the components you just created to the dashboard. You can freely drag and resize the components and change the component names below to describe the report's content analyzed by the component. - ![image-20230810123913230](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/add-dashboard.png) + ![image-20230810123913230](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/add-dashboard.png?raw=true) 6. Publish Dashboards: After assembling the dashboard, click **Publish**, set the publication name, publication node, and display platform. Then click **Confirm**, and your dashboard will be successfully published. - ![image-20230810123913230](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/publish.png) + ![image-20230810123913230](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/publish.png?raw=true) Now, see the newly published dashboard under **Navigation** and see how it looks. - ![image-20230810131752645](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/published.png) + ![image-20230810131752645](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/finebi/published.png?raw=true) diff --git a/docs/MatrixOne/Develop/Ecological-Tools/BI-Connection/Superset-connection.md b/docs/MatrixOne/Develop/Ecological-Tools/BI-Connection/Superset-connection.md index 58e12c2e4..8c4b4f5f0 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/BI-Connection/Superset-connection.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/BI-Connection/Superset-connection.md @@ -84,13 +84,13 @@ Here are the steps for deploying a single-node Superset using Docker: 1. Access the Superset login page, typically at `http://ip:8080`. Then, enter your username and password to log in to Superset. - ![Superset Login Page](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-login.png) + ![Superset Login Page](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-login.png?raw=true) __Note:__ The port for Superset may be either 8080 or 8088, depending on your configuration. The username and password are the ones you set during the Superset deployment. After logging in, you will see the main interface of Superset. 
- ![Superset Main Interface](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-dashboard.png) + ![Superset Main Interface](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-dashboard.png?raw=true) 2. Create a database connection: @@ -100,11 +100,11 @@ Here are the steps for deploying a single-node Superset using Docker: Fill in the connection information for the MatrixOne database, including the host, port, username, and password. - ![Create Database Connection](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-create-db-connection.png) + ![Create Database Connection](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-create-db-connection.png?raw=true) After filling in the details, click the **CONNECT** button and then click **FINISH**. - ![Create Query](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-create-query.png) + ![Create Query](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-create-query.png?raw=true) ## Creating Visual Monitoring Dashboards @@ -112,7 +112,7 @@ Now, you can use the MatrixOne database to create a monitoring dashboard. 1. Click on **SQL > SQL Lab** on the page, select the MatrixOne database connection you created earlier, and write SQL queries to select the tables you want to monitor. - ![image-20230807201143069](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/sql-lab.png) + ![image-20230807201143069](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/sql-lab.png?raw=true) You can write multiple queries to monitor different metrics. Here are example SQL statements for some queries: @@ -194,11 +194,11 @@ Now, you can use the MatrixOne database to create a monitoring dashboard. Here, we'll use one of the queries as an example to demonstrate how to edit a visual chart. First, select the 'disk_read_write' query as the data source for the chart. In the SQL Lab, click **CREATE CHART** below the corresponding query, or if you've saved the query in the previous step, the page will redirect to the Chart editing page: - ![Create Dashboard](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-create-dashboard.png) + ![Create Dashboard](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-create-dashboard.png?raw=true) 4. In the chart editing page, choose chart type, time field, metric columns from the query, grouping columns, and other options. Once configured, select **RUN**: - ![View Dashboard](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-view-dashboard.png) + ![View Dashboard](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-view-dashboard.png?raw=true) 5. Click **UPDATE CHART > SAVE** to save the edited chart. @@ -208,10 +208,10 @@ Now, you can use the MatrixOne database to create a monitoring dashboard. Click on **Dashboards**, then click **+ DASHBOARD** to create a new dashboard or edit an existing one. 
- ![image-20230808101636134](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-add-dashboard.png) + ![image-20230808101636134](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-add-dashboard.png?raw=true) 2. In the dashboard editing page, you can drag the charts you've created from the CHARTS list on the right onto the dashboard for assembly. You can also freely adjust the position of charts, add titles, and more. - ![image-20230808102033250](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-edit-dashboard.png) + ![image-20230808102033250](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/superset/superset-edit-dashboard.png?raw=true) You have successfully connected the MatrixOne database with Superset and created a simple monitoring dashboard to visualize key metrics of the MatrixOne database. diff --git a/docs/MatrixOne/Develop/Ecological-Tools/BI-Connection/yonghong-connection.md b/docs/MatrixOne/Develop/Ecological-Tools/BI-Connection/yonghong-connection.md index ed1cb44d7..6842281aa 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/BI-Connection/yonghong-connection.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/BI-Connection/yonghong-connection.md @@ -17,34 +17,34 @@ MatrixOne supports connectivity to the intelligent data analysis tool, Yonghong Open Yonghong BI, select **Add Data Source > + (New Data Source)** on the left, and choose **MySQL** in the pop-up database options. -![Add Data Source](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/yonghong/yonghong_add_connect.png) +![Add Data Source](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/yonghong/yonghong_add_connect.png?raw=true) After filling in the connection information related to the MatrixOne database, you can select the **Test Connection** button in the upper right corner to ensure a successful connection. Once the connection is successful, click **Save** to save the data source information we just filled in. -![Connect to MatrixOne](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/yonghong/yonghong_connect.png) +![Connect to MatrixOne](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/yonghong/yonghong_connect.png?raw=true) ### Creating a Dataset In Yonghong BI, select the **Create Dataset** menu on the left, then choose the data source you added just now. You will see tables and views from the MatrixOne database. To meet your business needs, add **Custom SQL**, then click **Refresh Data**. The query results will be displayed on the right. After confirming that the query results meet expectations, click **Save** to save the dataset. -![Create Dataset](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/yonghong/yonghong_dataset.png) +![Create Dataset](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/yonghong/yonghong_dataset.png?raw=true) ### Creating Reports First, in Yonghong BI, select the **Create Report** menu on the left, then choose the appropriate **Chart Component** from the right and drag it to the left. 
-![Create Report](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/yonghong/yonghong_panel_add.png) +![Create Report](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/yonghong/yonghong_panel_add.png?raw=true) Select the dataset you just created, set the time dimension as the X-axis, and set the daily order count and active user count as the Y-axis. You can drag the measurement and dimension **fields to their respective positions as needed**. After editing, click **Save** to save the report you created. -![Create Report](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/yonghong/yonghong_report.png) +![Create Report](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/yonghong/yonghong_report.png?raw=true) ### Viewing Reports Finally, in Yonghong BI, select **View Report**, then click on the report name we created in the tree menu on the left. You will be able to view the report we created above. -![View Report](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/yonghong/yonghong_result.png) +![View Report](https://github.com/matrixorigin/artwork/blob/main/docs/develop/bi-connection/yonghong/yonghong_result.png?raw=true) You have successfully connected to the MatrixOne database using Yonghong BI and created a simple report for visualizing MatrixOne data. diff --git a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-kafka-matrixone.md b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-kafka-matrixone.md index f5618e9be..3352b101a 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-kafka-matrixone.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-kafka-matrixone.md @@ -338,7 +338,7 @@ After executing the above command, you will wait on the console to enter the mes {"id": 50, "name": "xiaoli", "age": 42} ``` -![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/develop/flink/message.png) +![](https://github.com/matrixorigin/artwork/blob/main/docs/develop/flink/message.png?raw=true) ### Step Five: View Implementation Results diff --git a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-mysql-matrixone.md b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-mysql-matrixone.md index be51aab3b..f79efb163 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-mysql-matrixone.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-mysql-matrixone.md @@ -30,7 +30,7 @@ This practice requires the installation and deployment of the following software An example configuration is shown in the following figure:
</div>
2. Add project dependencies, edit the `pom.xml` file in the root of your project, and add the following to the file: @@ -248,7 +248,7 @@ After connecting to MatrixOne using a MySQL client, create the database you need 3. Run `MoRead.Main()` in IDEA with the following result: - ![MoRead execution results](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/develop/flink/moread.png) + ![MoRead execution results](https://github.com/matrixorigin/artwork/blob/main/docs/develop/flink/moread.png?raw=true) ### Step Three: Write MySQL Data to MatrixOne diff --git a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-oracle-matrixone.md b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-oracle-matrixone.md index fe5db3981..d0c35f4c0 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-oracle-matrixone.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-oracle-matrixone.md @@ -138,5 +138,5 @@ select * from oracle_empt; ```
</div>
\ No newline at end of file
diff --git a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-sqlserver-matrixone.md b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-sqlserver-matrixone.md
index 3758338aa..7436e93e7 100644
--- a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-sqlserver-matrixone.md
+++ b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-sqlserver-matrixone.md
@@ -44,13 +44,13 @@ values (1, 'Lisa', 25, '2010-10-12', '0'),

    ```sql
    exec sp_helpsrvrolemember 'sysadmin';
    ```

<div align="center">
</div>

2. Query whether CDC (Change Data Capture) is enabled for the current database:

<div align="center">
</div>
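
    The check itself can be written as a query along these lines (a sketch; it assumes the database used in this walkthrough is named `sstomo`):

    ```sql
    select name, is_cdc_enabled from sys.databases where name = 'sstomo';
    ```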
Remarks: 0: means not enabled; 1: means enabled @@ -68,7 +68,7 @@ values (1, 'Lisa', 25, '2010-10-12', '0'), ```
</div>
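
    Likewise, the table-level flag can be inspected with a query of this shape (illustrative):

    ```sql
    select name, is_tracked_by_cdc from sys.tables where name = 'sqlserver_data';
    ```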
Remarks: 0: means not enabled; 1: means enabled If not, execute the following sql to turn it on: @@ -87,7 +87,7 @@ values (1, 'Lisa', 25, '2010-10-12', '0'), Looking at the system tables under the database, you will see more cdc-related data tables, where cdc.dbo_sqlserver_flink_CT is the record of all DML operations that record the source tables, each corresponding to an instance table.
</div>
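
    Since the generated capture tables all live under the `cdc` schema, they can be listed with a query like this (illustrative):

    ```sql
    select name from sys.tables where schema_name(schema_id) = 'cdc';
    ```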
5. Verify that the CDC agent starts properly @@ -101,19 +101,19 @@ values (1, 'Lisa', 25, '2010-10-12', '0'), If the status is `Stopped`, you need to turn on the CDC agent.
</div>

    Open the CDC agent in a Windows environment: on the machine where the SQL Server database is installed, open Microsoft SQL Server Management Studio, right-click **SQL Server Agent** (at the location shown in the image below) and click Start:

<div align="center">
</div>

    Once started, query the agent status again to confirm that the status has changed to `Running`:

<div align="center">
</div>

At this point, CDC (Change Data Capture) configuration for the table sqlserver_data is fully complete.

@@ -209,7 +209,7 @@ select * from sqlserver_data;
```

<div align="center">
</div>
### Inserting data to SQL Server @@ -230,7 +230,7 @@ select * from sstomo.sqlserver_data; ```
</div>
### Deleting incremental data in SQL Server @@ -244,7 +244,7 @@ delete from sstomo.dbo.sqlserver_data where id in(3,4); Query table data in mo, these two rows have been deleted synchronously:
</div>
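
The same check can be expressed as a query on the MatrixOne side (illustrative); the deleted keys should no longer be found:

```sql
select count(*) from sstomo.sqlserver_data where id in (3, 4);
-- expected result: 0
```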
### Adding new data to SQL Server @@ -258,5 +258,5 @@ update sstomo.dbo.sqlserver_data set age = 18 where id in(1,2); Query table data in MatrixOne, the two rows have been updated in sync:
</div>
\ No newline at end of file diff --git a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-tidb-matrixone.md b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-tidb-matrixone.md index fa1fe0418..05ceaeec9 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-tidb-matrixone.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Flink/flink-tidb-matrixone.md @@ -139,7 +139,7 @@ select * from EMPQ; ```
</div>
Data can be found to have been imported @@ -151,7 +151,7 @@ delete from EMPQ_cdc where empno=1; ```
</div>
Query table data in MatrixOne, this row has been deleted synchronously. diff --git a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Spark/spark-hive-matrixone.md b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Spark/spark-hive-matrixone.md index fdb175b26..26b2daae9 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Spark/spark-hive-matrixone.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Spark/spark-hive-matrixone.md @@ -28,7 +28,7 @@ This practice requires the installation and deployment of the following software - **JDK** 1.8
</div>
2. Add a project dependency and edit the contents of `pom.xml` in the project root as follows: @@ -127,7 +127,7 @@ CREATE TABLE `users` ( Copy the three configuration files "etc/hadoop/core-site.xml" and "hdfs-site.xml" in the Hadoop root and "conf/hive-site.xml" in the Hive root to the "resource" directory of your project.
</div>
### Step five: Write the code diff --git a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Spark/spark-mysql-matrixone.md b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Spark/spark-mysql-matrixone.md index e718e6bbc..d82d6a797 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Spark/spark-mysql-matrixone.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/Computing-Engine/Spark/spark-mysql-matrixone.md @@ -27,7 +27,7 @@ This practice requires the installation and deployment of the following software - **JDK** 1.8
</div>
2. Add a project dependency and edit the contents of `pom.xml` in the project root as follows: @@ -152,7 +152,7 @@ After connecting to MatrixOne using a MySQL client, create the database you need 3. Run `MoRead.Main()` in IDEA with the following result: - ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/develop/spark/moread.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/develop/spark/moread.png?raw=true) ### Step Three: Write MySQL Data to MatrixOne diff --git a/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-influxdb-matrixone.md b/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-influxdb-matrixone.md index fae37d8b6..78719f640 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-influxdb-matrixone.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-influxdb-matrixone.md @@ -57,7 +57,7 @@ vim /etc/influxdb/influxdb.conf ```
</div>
### Restart influxdb diff --git a/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-overview.md b/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-overview.md index de37d74d0..131a3a85c 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-overview.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-overview.md @@ -10,7 +10,7 @@ MatrixOne is highly compatible with MySQL 8.0, but since DataX's included MySQL MatrixOneWriter leverages the DataX framework to get the generated protocol data from Reader and generates the appropriate `insert into...` statement based on the `writeMode` you configured. When a primary key or unique index conflict is encountered, conflicting rows are excluded and writes continue. For performance optimization reasons, we took the `PreparedStatement + Batch` approach and set the `rewriteBatchedStatements=true` option to buffer the data into the thread context's buffer. A write request is triggered only when the amount of data in the buffer reaches a predetermined threshold. -![DataX](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/develop/Computing-Engine/datax-write/datax.png) +![DataX](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Computing-Engine/datax-write/datax.png?raw=true) !!! note You need to have at least `insert into ...` permissions to execute the entire task. Whether you need additional permissions depends on your `preSql` and `postSql` in the task configuration. diff --git a/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-postgresql-matrixone.md b/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-postgresql-matrixone.md index aa2e64e1e..bdaf2bd73 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-postgresql-matrixone.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-postgresql-matrixone.md @@ -196,11 +196,11 @@ python ./bin/datax.py ./job/pgsql2mo.json #in the datax directory When the task is complete, print the overall operation:
</div>
### View data in a MatrixOne table
</div>
\ No newline at end of file diff --git a/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-sqlserver-matrixone.md b/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-sqlserver-matrixone.md index 976489264..c0c22a295 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-sqlserver-matrixone.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/Etl/DataX/datax-sqlserver-matrixone.md @@ -115,5 +115,5 @@ select * from test_2; ```
</div>
\ No newline at end of file diff --git a/docs/MatrixOne/Develop/Ecological-Tools/Scheduling-Tools/dolphinScheduler.md b/docs/MatrixOne/Develop/Ecological-Tools/Scheduling-Tools/dolphinScheduler.md index 672c36eb1..8d30f9708 100644 --- a/docs/MatrixOne/Develop/Ecological-Tools/Scheduling-Tools/dolphinScheduler.md +++ b/docs/MatrixOne/Develop/Ecological-Tools/Scheduling-Tools/dolphinScheduler.md @@ -40,13 +40,13 @@ MatrixOne supports integration with DolphinScheduler, a visual DAG workflow task Use the default username `admin` and password `dolphinscheduler123`. Access the DolphinScheduler web user interface by visiting `http://ip:12345/dolphinscheduler/ui`, as shown below: - ![image-20230809145317885](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809145317885.png) + ![image-20230809145317885](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809145317885.png?raw=true) 4. Create a Data Source: Click on **Data Source Center > Create Data Source** and enter the MatrixOne data connection information. Afterward, click on **Test Connection**; if the connection is successful, click **OK** to save it: - ![image-20230809145935857](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809145935857.png) + ![image-20230809145935857](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809145935857.png?raw=true) ### Step 2: Create a Project Workflow @@ -54,7 +54,7 @@ MatrixOne supports integration with DolphinScheduler, a visual DAG workflow task In the **Security Center**, click on **Create Tenant** and enter the tenant name, as shown below: - ![image-20230809160632965](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809160632965.png) + ![image-20230809160632965](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809160632965.png?raw=true) !!! Note In a production environment, it is not recommended to use `root` as the tenant. @@ -63,7 +63,7 @@ MatrixOne supports integration with DolphinScheduler, a visual DAG workflow task In **Project Management**, click on **Create Project** and enter the project name, as shown below: - ![image-20230809150528364](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809150528364.png) + ![image-20230809150528364](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809150528364.png?raw=true) 3. Create a Workflow and Add Nodes: @@ -71,11 +71,11 @@ MatrixOne supports integration with DolphinScheduler, a visual DAG workflow task The node created in this step is for creating a table, and the SQL statement is used to create a table. - ![image-20230809151554568](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809151554568.png) + ![image-20230809151554568](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809151554568.png?raw=true) Next, create **Insert Data** and **Query Data** nodes in a similar way. 
The dependency relationship between these three nodes is shown below, and you can manually connect them: - ![image-20230809153149428](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809153149428.png) + ![image-20230809153149428](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809153149428.png?raw=true) The SQL statements for these three nodes are as follows: @@ -95,30 +95,30 @@ MatrixOne supports integration with DolphinScheduler, a visual DAG workflow task Connect these three nodes based on their dependency relationship, then click **Save**. Enter the **Workflow Name**, select the previously created **Tenant**, choose **Parallel** as the execution policy, and click **OK**. - ![image-20230809161503945](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809161503945.png) + ![image-20230809161503945](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809161503945.png?raw=true) Once the workflow is created, you can see it in the **Workflow Relations** page with the status "Workflow Offline": - ![image-20230809161909925](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809161909925.png) + ![image-20230809161909925](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809161909925.png?raw=true) Similarly, you can also see the defined workflow in the **Workflow Definitions** page with the status "Offline": - ![image-20230809162411368](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809162411368.png) + ![image-20230809162411368](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809162411368.png?raw=true) 4. Publish and Run the Workflow: A workflow must be published before it can be run. 
Click the **Publish** button to publish the workflow created earlier: - ![image-20230809162245088](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809162245088.png) + ![image-20230809162245088](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809162245088.png?raw=true) After publishing, the workflow status will appear as follows: - ![image-20230809163722777](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809163722777.png) + ![image-20230809163722777](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809163722777.png?raw=true) Next, click the **Run** button, set the configuration parameters before starting, and then click **OK**: - ![image-20230809162828049](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809162828049.png) + ![image-20230809162828049](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809162828049.png?raw=true) Finally, return to the **Project Overview** to check whether the workflow and the three tasks below it have run successfully, as shown below: - ![image-20230809163533339](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809163533339.png) + ![image-20230809163533339](https://github.com/matrixorigin/artwork/blob/main/docs/develop/Scheduling-tool/image-20230809163533339.png?raw=true) diff --git a/docs/MatrixOne/Develop/Publish-Subscribe/pub-sub-overview.md b/docs/MatrixOne/Develop/Publish-Subscribe/pub-sub-overview.md index a52d214b2..6a2cf6bfc 100644 --- a/docs/MatrixOne/Develop/Publish-Subscribe/pub-sub-overview.md +++ b/docs/MatrixOne/Develop/Publish-Subscribe/pub-sub-overview.md @@ -58,7 +58,7 @@ The publish subscription feature has several typical application scenarios: This chapter will give an example of how three tenants, sys, acc1, and acc2, currently exist in a MatrixOne cluster, operating on the three tenants in order of operation: -![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/develop/pub-sub/data-share.png) +![](https://github.com/matrixorigin/artwork/blob/main/docs/develop/pub-sub/data-share.png?raw=true) 1. **Publisher**: sys tenant creates database sub1 with table t1 and publishes pub1: diff --git a/docs/MatrixOne/Develop/Vector/cluster_centers.md b/docs/MatrixOne/Develop/Vector/cluster_centers.md index a3b75edb2..dcae4b7bd 100644 --- a/docs/MatrixOne/Develop/Vector/cluster_centers.md +++ b/docs/MatrixOne/Develop/Vector/cluster_centers.md @@ -88,7 +88,7 @@ Suppose we have annual shopping data for a set of customers, including their ann A good cluster usually appears as a distinctly separated group in the visualization. As can be seen from the figure below, the cluster center selection is more appropriate.
</div>
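
Since a k-means center is simply the per-cluster mean of the features, the two centers can be sanity-checked with plain aggregation once each customer has been assigned to its nearest center (a sketch; the `customers` table and its columns are illustrative):

```sql
select cluster_id,
       avg(annual_income)   as center_income,
       avg(annual_spending) as center_spending
from customers
group by cluster_id;
```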
By identifying cluster centers, we can divide our customers into two groups: those with middle income and middle consumption levels (cluster center A) and those with higher income and higher consumption levels (cluster center B). Merchants can tailor their product positioning to each group's consumption characteristics, such as offering better value for money for Cluster Center A and high-end or luxury brands for Cluster Center B. @@ -153,7 +153,7 @@ A music streaming service wants to divide users into groups based on their prefe Use t-SNE to reduce high-dimensional data to 2D and visualize clustering results. As can be seen from the figure below, the data points are clearly separated by cluster centers in the space after dimension reduction, which increases confidence in the correctness of the cluster centers.
</div>

By determining the cluster centers, we can divide users into two groups. Cluster 1 consists mainly of users who prefer rock and hip-hop, which may represent users seeking modern, rhythmic music. Cluster 2 consists of users who prefer pop and jazz, which may represent users who like melodic, relaxed music. Media companies can then recommend music styles to users based on these preferences.
diff --git a/docs/MatrixOne/Develop/Vector/vector_search.md b/docs/MatrixOne/Develop/Vector/vector_search.md
index 7ad105b22..6fa2a4d88 100644
--- a/docs/MatrixOne/Develop/Vector/vector_search.md
+++ b/docs/MatrixOne/Develop/Vector/vector_search.md
@@ -5,7 +5,7 @@

Vector retrieval means retrieving, from a given vector dataset, the K vectors (K-Nearest Neighbor, KNN) closest to a query vector by some measure. It is a technique for finding vectors similar to a given query vector in large-scale, high-dimensional vector data. Vector retrieval has a wide range of applications in many AI fields, such as image retrieval, text retrieval, speech recognition, and recommendation systems. Vector retrieval is very different from traditional database retrieval: scalar search in a traditional database targets structured data for exact queries, while vector search targets the vectorized form of unstructured data for similarity retrieval, which can only approximate the best match.

<div align="center">
</div>
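
In SQL terms, a K-nearest-neighbor lookup is an `ORDER BY` over a distance function such as `l2_distance`, plus a `LIMIT` (a sketch; the table, column, and query vector are illustrative):

```sql
-- assumes a table like: create table items (id int, embedding vecf32(3));
select id
from items
order by l2_distance(embedding, '[1, 2, 3]')
limit 10;
```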
Matrixone currently supports vector retrieval using the following distance measure functions: diff --git a/docs/MatrixOne/Develop/Vector/vector_type.md b/docs/MatrixOne/Develop/Vector/vector_type.md index 04c00b538..9f1e675e7 100644 --- a/docs/MatrixOne/Develop/Vector/vector_type.md +++ b/docs/MatrixOne/Develop/Vector/vector_type.md @@ -5,7 +5,7 @@ In a database, vectors are usually a set of numbers that are arranged in a particular way to represent some data or feature. These vectors can be one-dimensional arrays, multi-dimensional arrays, or data structures with higher dimensions. In machine learning and data analysis, vectors are used to represent data points, features, or model parameters. They are typically used to process unstructured data, such as pictures, speech, text, etc., to transform the unstructured data into embedding vectors through machine learning models and subsequently process and analyze the data.
</div>
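
As a minimal sketch of how such embeddings are stored and read back (table name and values are illustrative):

```sql
create table embeddings (id int, vec vecf32(3));
insert into embeddings values (1, '[0.1, 0.2, 0.3]');
select * from embeddings;
```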
## Matrixone support vector type @@ -31,7 +31,7 @@ Matrixone currently supports vectors of type `float32` and `float64`, called `ve ```python import binascii # 'value' is a NumPy object - def to\_binary(value): if value is None: return value + def to_binary(value): if value is None: return value # small endian floating point array value = np.asarray(value, dtype=' DDL** Click **Copy**, first copy this SQL to a text editor for text editing Name the filer as *pg_ddl.sql* and save it locally on the springboard machine. - ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-1.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-1.png?raw=true) 2. Use `pg2mysql` translation tool to convert *pg_ddl.sql* file to MySQL format DDL:** @@ -103,19 +103,19 @@ Here we take the TPCH dataset as an example and migrate the 8 tables of the TPCH 1. Open DBeaver, select the table to be migrated from PostgreSQL, right-click and select **Export Data**: - ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-2.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-2.png?raw=true) 2. In the **Conversion Target > Export Target** window, select **Database**, click **Next**; in the **Table Mapping** window, select **Target Container**, and select the MatrixOne database for the target container *tpch*: - ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-3.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-3.png?raw=true) - ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-4.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-4.png?raw=true) 3. In the **Extraction Settings** and **Data Loading Settings** windows, set the number of selected extractions and inserts. To trigger MatrixOne's direct write S3 strategy, it is recommended to fill in 5000: - ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-5.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-5.png?raw=true) - ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-6.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-6.png?raw=true) 4. After completing the settings, DBeaver starts to migrate the data, and after completion, DBeaver will prompt that the migration is successful. @@ -146,7 +146,7 @@ Here we take the TPCH dataset as an example and migrate the 8 tables of the TPCH 1. Open DBeaver, select the table to be migrated from PostgreSQL, right-click and select **Generate SQL > DDL > Copy**, first copy this SQL to a text editor, and name the text editor *pg_ddl.sql*, saved locally on the springboard machine. - ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-1.png) + ![](https://github.com/matrixorigin/artwork/blob/main/docs/migrate/PostgreSQL-1.png?raw=true) 2. Use `pg2mysql` translation tool to convert *pg_ddl.sql* file to MySQL format DDL:** diff --git a/docs/MatrixOne/Overview/architecture/architecture-logtail.md b/docs/MatrixOne/Overview/architecture/architecture-logtail.md index 53b056ddb..cce8709a2 100644 --- a/docs/MatrixOne/Overview/architecture/architecture-logtail.md +++ b/docs/MatrixOne/Overview/architecture/architecture-logtail.md @@ -61,7 +61,7 @@ Based on the logtail table, from the moment a pull request is received, the prim 3. 
Convert the collected log information into the Logtail protocol format and return it as a response to CN. -![Pull Workflow](https://github.com/matrixorigin/artwork/blob/main/docs/overview/architecture/logtail-arch-1.png) +![Pull Workflow](https://github.com/matrixorigin/artwork/blob/main/docs/overview/architecture/logtail-arch-1.png?raw=true) ``` type RespBuilder interface { @@ -91,4 +91,4 @@ The primary purpose of push is to synchronize incremental logs from TN to CN in If a table has not been updated for a long time, how does a CN become aware of it? Here, a heartbeat mechanism is introduced, with a default of 2 ms. In TN's commit queue, a heartbeat transaction is placed, which performs no substantial work but consumes a timestamp, triggering a Logtail send to notify CN that all table data had updates sent previously, pushing the CN's timestamp watermark. -![Push Workflow](https://github.com/matrixorigin/artwork/blob/main/docs/overview/architecture/logtail-arch-2.png) +![Push Workflow](https://github.com/matrixorigin/artwork/blob/main/docs/overview/architecture/logtail-arch-2.png?raw=true) diff --git a/docs/MatrixOne/Overview/architecture/architecture-matrixone-operator.md b/docs/MatrixOne/Overview/architecture/architecture-matrixone-operator.md new file mode 100644 index 000000000..aa97e416c --- /dev/null +++ b/docs/MatrixOne/Overview/architecture/architecture-matrixone-operator.md @@ -0,0 +1,152 @@ +# Matrixone-Operator Design and Implementation Details + +MatrixOne is a cloud-native distributed database that naturally adapts to cloud infrastructure and is optimized for cloud-oriented cost models. And unlike typical SaaS services, databases in serious scenarios often need to follow the application, running on the same infrastructure as the application, out of a need for performance and data security. To serve as many users as possible, MatrixOne needs to adapt to all types of public, private, and even hybrid clouds. And the largest of these conventions is Kubernetes (K8S). That's why MatrixOne uses K8S as the default operating environment for distributed deployments, adapting to different clouds in a unified way. MatrixOne-Operator is exactly MatrixOne's automated deployment operations software on the K8S. It extends the K8S to provide external operations management capabilities of the MatrixOne cluster in a K8S-style declarative API. + +This article will explain the design and implementation of MatrixOne-Operator and share our empirical thinking. + +## MatrixOne-Operator Design + +Although K8S natively provides a StatefulSet API to serve the orchestration of stateful apps, K8S natively does not support managing app state due to the difficulty of uniform abstraction of app layer state across different stateful apps. To solve this problem, the Operator model emerged. A typical K8S Operator consists of an API and a controller: + +- API + +Typically declared through K8S' CustomResourceDefinition (CRD) object, after committing a K8S CRD to K8S's api-server, the api-server registers a corresponding Restful API with itself. All K8SClients can get, LIST, POST, DELETE, etc. on this newly declared API in a similar way to manipulating native resources. By convention, within each API object, the `.spec` structure is managed by the user to declare the desired state of the object, and the `.status` structure is managed by the controller below to expose the actual state of the object. 

- Controller

A controller is a continuously running piece of code that watches a set of K8S objects, including the API objects we just defined. Based on the desired state declared in those objects and the actual state collected from the real world (note: the actual state is collected from reality and then written back into `.status`, not read from `.status`), it performs automated actions that drive the actual state toward the desired state. This process runs in a loop, commonly pictured as a control loop, or in some places given the more classical name reconciliation loop, a word deliberately in keeping with K8S's notion of "orchestration."

The following diagram gives a general description of the process, using a simplified MatrixOneCluster API as an example:

<div align="center">
+ +

MatrixOne-Operator provides not only workload APIs such as MatrixOneCluster for managing MO clusters, but also task APIs such as backup and restore, and resource APIs such as object-storage buckets. Each API and its controller is designed with its own considerations in mind, but all of them are built in the pattern described above. Next, we'll explore the tradeoffs in each API's design.

## Cluster API Design

A distributed MO cluster consists of multiple components such as log services, transaction nodes, compute nodes, and proxies, and compute nodes additionally have explicit heterogeneity requirements so that deployments can be optimized for load and capability across clouds, cloud-edge setups, and more. Centralizing the management of an entire cluster into one API object managed by one controller, while easy to use, is a code maintenance nightmare. Therefore, MatrixOne-Operator committed to the principle of **loosely coupled, fine-grained APIs** from the beginning of its design, defining APIs with clear responsibilities such as LogSet, CNSet, ProxySet, and BucketClaim, each with a controller independent of the others. To preserve ease of use, the MatrixOneCluster API was also introduced. The MatrixOneCluster controller does not duplicate the work of the other controllers: when a cluster requires a LogSet to provide the log service, the MatrixOneCluster controller simply creates a LogSet object and delegates the rest to the LogSet controller.

<div align="center">
+ +

With this design, although there are many APIs, users only ever need to care about the MatrixOneCluster API, while the developers of MatrixOne-Operator, when adding features or fixing problems, usually work within a problem domain no larger than one fine-grained API and its controller.

Of course, there are dependencies between the API objects. For example, transaction nodes and compute nodes rely on the HAKeeper running inside the log service to obtain cluster information for service discovery. This requires that, when the cluster is deployed, the log service is started and the HAKeeper bootstrap completed before the transaction and compute nodes can continue to start. While this kind of logic could be implemented in the MatrixOneCluster controller, that would leak the business knowledge of the other controllers into it, and the individual controller implementations would become coupled again. Therefore, in mo-operator, we implement all such cross-component dependency logic on the depending side, and each component exposes itself to the outside world only through the conventional `.status` field. For example, when the controller reconciles a CNSet, it actively waits for the LogSet the CNSet points to to become ready before proceeding; neither the LogSet controller nor the upper MatrixOneCluster controller needs to be aware of this.

The loosely coupled, fine-grained APIs also adapt well to heterogeneous CN orchestration scenarios. In MatrixOne-Operator, besides the convenient pattern of declaring multiple CN groups in a MatrixOneCluster for heterogeneous orchestration, it is also possible to directly create a CNSet that joins an existing cluster, which means the new CNSet can be deployed in a different K8S cluster; with network-level support, this enables MO orchestration across clouds or in cloud-edge scenarios.

As individual controllers iterate, MatrixOne-Operator also tends to add new features by adding new API objects. For example, when implementing object storage management, MatrixOne-Operator needs to ensure that the object storage paths used by different clusters never overlap and that they are cleaned up automatically after a cluster is destroyed. MatrixOne-Operator's solution is to add a BucketClaim API, modeled on the control logic of the K8S PersistentVolumeClaim, which completes the lifecycle management of an object storage path in a standalone controller, avoiding complex race-condition handling and code coupling issues.

## Controller implementation

K8S provides the controller-runtime package to help developers implement their own controllers, but for the sake of generality its interface design is relatively low-level:

```
Reconcile(ctx context.Context, req Request) (Result, error)
```

The controller needs to implement this Reconcile interface and register it through controller-runtime, declaring which objects to watch along with filtering rules for the watch. controller-runtime then calls the controller's Reconcile method every time a watched object changes, or when a Reconcile is retried, passing the identifier of the target object in the req parameter. 
There will be a lot of template code within this interface, usually representable by pseudocode like this:

```
func reconcile(Namespace+Name of object A) {
    Get the Spec of object A
    if object A is being deleted {
        Execute the cleanup logic
        Update the cleanup progress to A.status
        Remove the finalizer on object A
    } else {
        Add the finalizer to A
        Execute the reconcile logic
        Update the reconcile progress to A.status
    }
}
```

Similar logic recurs across community controller implementations, and developers have to care about much more than the business logic: handling finalizers correctly so that resources are never leaked, updating progress and errors to `.status` promptly for visibility, and finer details such as loggers that carry request context and kube clients that read through a cache.

Since MatrixOne-Operator does not need to be general-purpose, it uses a more specialized abstraction, the Actor interface:

```
type Actor[T client.Object] interface {
    Observe(*Context[T]) (Action[T], error)
    Finalize(*Context[T]) (done bool, err error)
}

type Action[T client.Object] func(*Context[T]) error
```

Behind it, a generic controller framework takes care of all the logic and details of the template code above, preparing within `Context[T]` the object that currently needs to be reconciled, along with Logger, EventRecorder, and KubeClient instances that already carry the proper context. Then:

- when reconciling an object that is not being deleted, the framework calls `Actor.Observe` to let the real business logic perform the reconciliation;

- when reconciling an object that is being deleted, the framework calls `Actor.Finalize` to perform resource cleanup in the business logic, retrying until Finalize reports completion, and only then removes the object's finalizer.

The state machine for an object is as follows: + +
+ +
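The following condensed Go sketch shows how a generic loop like this can drive the Actor interface. It is an illustration rather than the actual mo-operator code, and it assumes the `Context[T]` type above exposes the target object as `ctx.Obj` plus an `Update` helper, and that a `finalizerName` constant exists:

```go
// reconcile owns the finalizer bookkeeping and the delete/non-delete
// branching once, so every Actor can contain pure business logic.
func reconcile[T client.Object](ctx *Context[T], actor Actor[T]) error {
	obj := ctx.Obj
	if obj.GetDeletionTimestamp().IsZero() {
		// Live object: make sure the finalizer is set before doing anything
		// that creates external resources, then let Observe pick an Action.
		if controllerutil.AddFinalizer(obj, finalizerName) {
			if err := ctx.Update(obj); err != nil {
				return err
			}
		}
		action, err := actor.Observe(ctx)
		if err != nil || action == nil {
			return err // a nil Action means the object is already in sync
		}
		return action(ctx)
	}
	// Object is being deleted: retry Finalize until cleanup is done,
	// and only then remove the finalizer so K8S can release the object.
	done, err := actor.Finalize(ctx)
	if err != nil || !done {
		return err
	}
	controllerutil.RemoveFinalizer(obj, finalizerName)
	return ctx.Update(obj)
}
```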
+ +Under this framework, implementing the create and destroy parts of an API object's lifecycle in a controller is straightforward: combined with MO's operational knowledge, it comes down to calling the K8S API to request storage, deploy workloads, and configure service discovery, or, conversely, destroying all the external resources the controller created once the object enters its delete phase. Reconciling an update to an object is likewise regular diff logic. Taking MatrixOneCluster's `.spec.cnSets` field as an example, it can be represented by the following pseudocode:

```
func sync(c MatrixOneCluster) {
    existingCNSets := collect all CNSets of this cluster
    for _, desired := range c.spec.cnSets {
        cnSet := build CNSet(desired)
        if _, ok := existingCNSets[cnSet.Name]; ok {
            // 1. The CNSet exists: update it toward the desired state.
            ....
            // 2. Mark this CNSet as still wanted in the desired state.
            delete(existingCNSets, cnSet.Name)
        } else {
            // The CNSet does not exist yet: create it.
            ....
        }
    }
    for _, orphan := range existingCNSets {
        // Clean up CNSets that exist in the cluster but not in the desired state.
    }
}
```

More error-prone is the update logic for ConfigMaps / Secrets. MO, like many applications, requires a configuration file and must restart to re-read it after each update; the configuration is usually stored in a K8S native ConfigMap object. A common pitfall is that the content of a ConfigMap object is mutable, whereas most applications read the configuration file inside the ConfigMap only once at startup and never reload it later. As a result, looking at the content of the ConfigMap a Pod currently references does not tell you the configuration the Pod is actually running with (the ConfigMap content may have changed since startup). Moreover, if you want the application to roll after a ConfigMap change, a common practice is to put a hash of the ConfigMap's content into an Annotation on the PodTemplate and update that Annotation on every ConfigMap change, so that each change triggers a rolling update of the application. But modifying a ConfigMap in place can still produce surprises: + +
+ +
+ +For example, assume the ConfigMap hash in the Annotation is updated from 123 to 321, and the Pods started with 321 never become ready because of a problem in the new configuration. At this point, with the right update policy, the rolling update will get stuck, limiting the blast radius of the failure. However, the Pods that have not yet been rolled have also already picked up the new version of the ConfigMap, and as soon as one of their containers restarts or a Pod is rebuilt, it will come up with the broken configuration and fail too. This is clearly different from updating the image or any other field: for those, the not-yet-updated Pods still belong to the old ReplicaSet/ControllerRevision, neither a restart nor a rebuild will pick up the new configuration, and the failure range stays manageable.

The root of the problem is that the content of the ConfigMap is not part of the Pod's spec, so modifying a ConfigMap in place conflicts with the Pod's **immutable infrastructure** principle.

Therefore, MatrixOne-Operator designs all objects referenced by a Pod to be immutable. In the case of ConfigMaps, each time a component's configuration is updated via the CRD, MatrixOne-Operator generates a new ConfigMap and rolls all replicas of the component onto it: + +
+ +
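As an illustration of this pattern (the function and file names here are hypothetical, not the actual implementation), deriving the ConfigMap name from a hash of its content guarantees that every configuration change produces a brand-new object instead of an in-place edit:

```go
package controllers

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// buildConfigMap names the ConfigMap after its content hash: changing the
// configuration yields a new object rather than mutating the existing one.
func buildConfigMap(namespace, component, configText string) *corev1.ConfigMap {
	sum := sha256.Sum256([]byte(configText))
	immutable := true
	return &corev1.ConfigMap{
		ObjectMeta: metav1.ObjectMeta{
			Name:      fmt.Sprintf("%s-config-%s", component, hex.EncodeToString(sum[:])[:8]),
			Namespace: namespace,
		},
		// K8S-native immutability also rejects any accidental in-place edit.
		Immutable: &immutable,
		Data:      map[string]string{"config.toml": configText},
	}
}
```

The PodTemplate then references the new name, so a normal rolling update migrates replicas onto the new configuration while Pods that have not yet been updated keep reading the old, immutable one.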
+ +Based on this principle, at any moment we can determine the exact configuration inside every Pod from the current Pod spec alone, and the rolling update problem above is solved as well.

## Application State Management

In addition to lifecycle management of the application itself, MatrixOne-Operator has another important responsibility: managing the application's internal state. Distributed systems usually already manage their own state through heartbeats or similar mechanisms, so why should the Operator do extra work here?

The reason is that the Operator carries automated O&M knowledge in its code. For example, during a rolling update the Operator knows exactly which Pod will be rebuilt or restarted next, so it can adjust in-application state ahead of time, such as migrating the load off that Pod to minimize the impact of the rolling update. There are two common ways to implement this kind of application state management:

- Synchronize application state in the Pod's own lifecycle hooks, such as an InitContainer, the PostStart hook, and the PreStop hook.

- Call the application's management interface from the Operator's reconcile loop to adjust application state.

Mode 1 is simpler to implement, while Mode 2 is more self-contained and flexible and copes better with complex scenarios. For example, when scaling in a CNSet, the sessions on the CN Pods being removed should be migrated to other CN Pods before those Pods are stopped. If this action is placed in the Pod's PreStop hook, it cannot be undone. In practice, a CNSet may be scaled in and then scaled out again before the scale-in has completed (especially with auto-scaling turned on). In that case, the reconcile loop inside the Operator can work out that the CNs still going offline can simply be reused: it calls the management interface inside MO to restore those CNs to the serving state, stops migrating their sessions away, and lets the Proxy route new sessions to them again, without having to provision new CNs.

## Summary

As a mainstream way to extend the orchestration capabilities of K8S, the Operator pattern today has mature base libraries and tool chains, and there are plenty of mature open source projects in the community to reference, so developing an Operator on K8S is no longer a novel topic. But the real complexity always hides in the details of the actual business, and solving those problems requires a thorough understanding that combines knowledge of K8S with knowledge of your own system's domain. As a cloud-native distributed database, MatrixOne shares many design concepts and much domain knowledge with other cloud-native systems. Hopefully this short article not only helps you understand the design and implementation of mo-operator, but also serves as an empirical reference when you design your own Operator.

## Reference Documents

To learn about MatrixOne-Operator deployment operations, see the chapter [Operator Administration](../../Deploy/MatrixOne-Operator-mgmt.md)
diff --git a/docs/MatrixOne/Overview/architecture/architecture-proxy.md b/docs/MatrixOne/Overview/architecture/architecture-proxy.md
index 27ec35a1c..2f06a8596 100644
--- a/docs/MatrixOne/Overview/architecture/architecture-proxy.md
+++ b/docs/MatrixOne/Overview/architecture/architecture-proxy.md
@@ -4,7 +4,9 @@ Proxy, the sole component in MatrixOne responsible for load balancing and SQL re
 
 The architecture diagram of its SQL request distribution is as follows:
 
-![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/proxy/proxy-arch.png?raw=true)
+
+ +
- The Kubernetes Library layer utilizes built-in Kubernetes features to ensure high availability and load balancing of the Proxy layer. - SQL Proxy implements long connections, allowlists, and SQL request distribution, achieving load balancing and request forwarding for CNs. @@ -14,7 +16,9 @@ The architecture diagram of its SQL request distribution is as follows: Based on the multi-CN architecture of MatrixOne's storage-compute separation and the responsibilities of Proxy, the concept of CN label groups is introduced in HAKeeper and Proxy, that is, CN collections with fixed names and quantities. -![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/proxy/proxy-arch-2.png?raw=true) +
+ +
As shown in the figure above, the technical implementation process is explained as follows: diff --git a/docs/MatrixOne/Overview/architecture/architecture-transaction-lock.md b/docs/MatrixOne/Overview/architecture/architecture-transaction-lock.md index 8718d8619..a98d7f2ad 100644 --- a/docs/MatrixOne/Overview/architecture/architecture-transaction-lock.md +++ b/docs/MatrixOne/Overview/architecture/architecture-transaction-lock.md @@ -90,7 +90,7 @@ In a pessimistic mode, multiple TN nodes exist in the MatrixOne cluster. Thus, i ### Lock Service -![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/architecture/lockservice.png) +![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/architecture/lockservice.png?raw=true) MatrixOne has implemented LockService to provide lock services, including locking, unlocking, lock conflict detection, lock waiting, and deadlock detection. diff --git a/docs/MatrixOne/Overview/architecture/architecture-wal.md b/docs/MatrixOne/Overview/architecture/architecture-wal.md index bf4736669..9288584c4 100644 --- a/docs/MatrixOne/Overview/architecture/architecture-wal.md +++ b/docs/MatrixOne/Overview/architecture/architecture-wal.md @@ -8,7 +8,7 @@ MatrixOne's WAL is a physical log that records where each row of updates occurs, Commit Pipeline is a component that handles transaction commits. The memtable is updated before committing, persisting WAL entries, and the time taken to perform these tasks determines the performance of the commit. Persistent WAL entry involves IO and is time consuming. The commit pipeline is used in MatrixOne to asynchronously persist WAL entries without blocking updates in memory. -![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/overview/architecture/wal_Commit_Pipeline.png) +![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/architecture/wal_Commit_Pipeline.png?raw=true) **The transaction commit process is:** @@ -29,13 +29,13 @@ Checkpoint writes dirty data to Storage, destroys old log entries, and frees up - Dump DML modifications. DML changes are stored in various blocks in the memtable. Logtail Mgr is a memory module that records which blocks are changed for each transaction. Scan transactions between \[t0,t1] on Logtail Mgr, initiate background transactions to dump these blocks onto Storage, and record addresses in metadata. This way, all DML changes committed before t1 can be traced to addresses in the metadata. In order to do checkpoints in time to keep the WAL from growing infinitely, even if the block in the interval changes only one line, it needs to be dumped.
- +
- Scan the Catalog to dump DDL and metadata changes. The Catalog is a tree that records all DDL and metadata information, and each node on the tree records the timestamp at which its change occurred. When scanning, collect all changes that fall within \[t0,t1].
- +
- Destroy the old WAL entries. The LSN corresponding to each transaction is stored in Logtail Mgr. Based on the timestamp, find the last transaction before t1 and tell Log Backend to clean up all logs before that transaction's LSN.

@@ -48,21 +48,27 @@ MatrixOne's WAL can be written in various Log Backends. The original Log Backend

- Append, write a log entry asynchronously when committing a transaction:

-``` Append(entry) (Lsn, error) ```
+```
+Append(entry) (Lsn, error)
+```

- Read, batch read log entries on reboot:

-``` Read(Lsn, maxSize) (entry, Lsn, error) ```
+```
+Read(Lsn, maxSize) (entry, Lsn, error)
+```

- The Truncate interface destroys all log entries before the given LSN, freeing up space:

-``` Truncate(lsn Lsn) error ```
+```
+Truncate(lsn Lsn) error
+```

## Group Commit

Group Commit accelerates the persistence of log entries. Persisting a log entry involves IO and is time-consuming, and it is often the bottleneck for commits. To reduce latency, log entries are written to Log Backend in bulk. For example, fsync is expensive in a file system: if every entry were fsynced individually, it would take a lot of time, whereas in a file-system-based Log Backend multiple entries can be written and fsynced together, so the total cost of flushing these entries approximates the cost of flushing a single one.

-![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/overview/architecture/wal_Group_Commit.png)
+![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/architecture/wal_Group_Commit.png?raw=true)

Concurrent writes are supported in the Log Service, so the flush times of individual entries can overlap, which further reduces the total time to write an entry and improves commit concurrency.

@@ -70,7 +76,7 @@

Because entries are written to Log Backend concurrently to speed things up, the order in which writes complete may differ from the order in which they were issued, so the LSNs generated inside Log Backend do not match the logical LSNs that the upper layers pass to the Driver. Truncate and restart have to handle these out-of-order LSNs. To keep the LSNs in Log Backend roughly ordered and the out-of-order span small, a window of logical LSNs is maintained: if some early log entry has not yet been written successfully, writing new entries to Log Backend is paused. For example, if the window length is 7 and the entry with LSN 13 in the figure has not yet returned, it blocks any entry with an LSN greater than or equal to 20.

-![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/overview/architecture/wal_Log_Backend.png)
+![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/architecture/wal_Log_Backend.png?raw=true)

Logs in Log Backend are destroyed with the truncate operation, which removes all entries before a specified LSN; every entry before that LSN must correspond to a logical LSN smaller than the logical truncate point. For example, in the figure the logical truncate point is 7, which corresponds to entry 11 in Log Backend; but the logical LSNs of entries 5, 6, 7, and 10 in Log Backend are all greater than 7 and cannot be truncated, so Log Backend can only truncate up to 4.

@@ -80,7 +86,13 @@ On restart, those discontinuous entries at the beginning and end are skipped. Fo

Each write transaction corresponds to one log entry and consists of an LSN, Transaction Context, and multiple Commands.
-``` +---------------------------------------------------------+ | Transaction Entry | +-----+---------------------+-----------+-----------+- -+ | LSN | Transaction Context | Command-1 | Command-2 | ... | +-----+---------------------+-----------+-----------+- -+ ``` +``` ++---------------------------------------------------------+ +| Transaction Entry | ++-----+---------------------+-----------+-----------+- -+ +| LSN | Transaction Context | Command-1 | Command-2 | ... | ++-----+---------------------+-----------+-----------+- -+ +``` **LSN**: Each log entry corresponds to one LSN. The LSN is incremented continuously and is used to delete entries when doing checkpoints. @@ -90,11 +102,23 @@ Each write transaction corresponds to one log entry and consists of an LSN, Tran - CommitTS is the timestamp of the end. - Memo records where a transaction changes data. Upon reboot, this information is restored to Logtail Mgr and used for checkpointing. -``` +---------------------------+ | Transaction Context | +---------+----------+------+ | StartTS | CommitTS | Memo | +---------+----------+------+ ``` +``` ++---------------------------+ +| Transaction Context | ++---------+----------+------+ +| StartTS | CommitTS | Memo | ++---------+----------+------+ +``` **Transaction Commands**: Each write operation in a transaction corresponds to one or more commands. log entry logs all commands in the transaction. -| Operator | Command | | :----------------- | :---------------- | | DDL | Update Catalog | | Insert | Update Catalog | | | Append | | Delete | Delete | | Compact&Merge | Update Catalog | +| Operator | Command | +| :----------------- | :---------------- | +| DDL | Update Catalog | +| Insert | Update Catalog | +| | Append | +| Delete | Delete | +| Compact&Merge | Update Catalog | - Operators: The DN in MatrixOne is responsible for committing transactions, writing log entries into Log Backend, doing checkpoints. DN supports build library, delete library, build table, delete table, update table structure, insert, delete, while background automatically triggers sorting. The update operation is split into insert and delete. @@ -116,42 +140,91 @@ Each write transaction corresponds to one log entry and consists of an LSN, Tran The Catalog is database, table, segment, and block from top to bottom. An Updata Catalog Command corresponds to a Catalog Entry. One Update Catalog Command per ddl or with the new metadata. The Update Catalog Command contains Dest and EntryNode. -``` +-------------------+ | Update Catalog | +-------+-----------+ | Dest | EntryNode | +-------+-----------+ ``` +``` ++-------------------+ +| Update Catalog | ++-------+-----------+ +| Dest | EntryNode | ++-------+-----------+ +``` Dest is where this command works, recording the id of the corresponding node and his ancestor node. Upon reboot, via Dest, locate the location of the action on the Catalog. -| Type | Dest | | :------------------|:------------------------------------------| | Update Database | database id | | Update Table | database id,table id | | Update Segment | database id,table id,segment id | | Update Block | atabase id,table id,segment id,block id | +| Type | Dest | +| :------------------|:------------------------------------------| +| Update Database | database id | +| Update Table | database id,table id | +| Update Segment | database id,table id,segment id | +| Update Block | atabase id,table id,segment id,block id | EntryNode records when an entry was created and deleted. If entry is not deleted, the deletion time is 0. 
If the current transaction is being created or deleted, the corresponding time is UncommitTS. -``` +-------------------+ | Entry Node | +---------+---------+ | Create@ | Delete@ | +---------+---------+ ``` +``` ++-------------------+ +| Entry Node | ++---------+---------+ +| Create@ | Delete@ | ++---------+---------+ +``` For segment and block, Entry Node also records metaLoc, deltaLoc, which are the addresses recorded on S3 for data and deletion, respectively. -``` +----------------------------------------+ | Entry Node | +---------+---------+---------+----------+ | Create@ | Delete@ | metaLoc | deltaLoc | +---------+---------+---------+----------+ ``` +``` + +----------------------------------------+ + | Entry Node | + +---------+---------+---------+----------+ + | Create@ | Delete@ | metaLoc | deltaLoc | + +---------+---------+---------+----------+ +``` For tables, Entry Node also documents the table structure schema. -``` +----------------------------+ | Entry Node | +---------+---------+--------+ | Create@ | Delete@ | schema | +---------+---------+--------+ ``` +``` + +----------------------------+ + | Entry Node | + +---------+---------+--------+ + | Create@ | Delete@ | schema | + +---------+---------+--------+ +```
&nbsp;&nbsp;&nbsp;2. &nbsp;Append
The inserted data and the location of that data are documented in the Append Command.

```
+-------------------------------------------+
|               Append Command              |
+--------------+--------------+- -+-------+
| AppendInfo-1 | AppendInfo-2 | ... | Batch |
+--------------+--------------+- -+-------+
```

- Batch is the inserted data.

- AppendInfo. The data in one Append Command may span multiple blocks. Each block corresponds to one AppendInfo, which records the position of the data within the Command's Batch (the pointer to data) and the destination of the data in the block.

```
+------------------------------------------------------------------------------+
|                                  AppendInfo                                  |
+-----------------+------------------------------------------------------------+
| pointer to data |                        destination                         |
+--------+--------+-------+----------+------------+----------+--------+--------+
| offset | length | db id | table id | segment id | block id | offset | length |
+--------+--------+-------+----------+------------+----------+--------+--------+
```
&nbsp;&nbsp;&nbsp;3. &nbsp;Delete Command
Each Delete Command contains only one delete from a block. -``` +---------------------------+ | Delete Command | +-------------+-------------+ | Destination | Delete Mask | +-------------+-------------+ ``` +``` ++---------------------------+ +| Delete Command | ++-------------+-------------+ +| Destination | Delete Mask | ++-------------+-------------+ +``` - Destination record on which Block Delete occurred. - Delete Mask records the deleted line number. diff --git a/docs/MatrixOne/Overview/feature/1.1-mysql-compatibility.md b/docs/MatrixOne/Overview/feature/1.1-mysql-compatibility.md deleted file mode 100644 index ba4e7668a..000000000 --- a/docs/MatrixOne/Overview/feature/1.1-mysql-compatibility.md +++ /dev/null @@ -1,210 +0,0 @@ -# **MySQL Compatibility** - -This documentation primarily introduces the compatibility comparison information between the MySQL mode of MatrixOne database and the native MySQL database. - -MatrixOne is highly compatible with the MySQL 8.0 protocol and commonly used features and syntax of MySQL 8.0. Additionally, MatrixOne provides support for commonly used MySQL-related tools, including Navicat, MySQL Workbench, JDBC, etc. However, due to the different technical architecture of MatrixOne and its ongoing development and improvement, some functionalities are not yet supported. This section will mainly discuss the differences between the MySQL mode of MatrixOne database and the native MySQL database from the following aspects: - -- DDL Statements -- DCL Statements -- DML Statements -- Advanced SQL Features -- Data Types -- Indexes and Constraints -- Partition -- Functions and Operators -- Storage Engine -- Transaction -- Security and Permissions -- Backup and Restore -- System Variables -- Programming Language -- Peripheral Tools - -## DDL statements - -### About DATABASE - -* A database with a Chinese name is not supported. -* `ENCRYPTION` are currently supported but do not work. -* `ALTER DATABASE` is not supported. -* Only the `utf8mb4` character set and `utf8mb4_bin` collation are supported by default and cannot be changed. - -### About TABLE - -* The `CREATE TABLE .. AS SELECT` statement is not supported. -* Support `AUTO_INCREMENT` in the column definition, but not the `AUTO_INCREMENT` custom start value in a table definition. -* `CHARACTER SET/CHARSET` and `COLLATE` in column definitions are not supported. -* `ENGINE=` in the table definition is not supported. -* The clauses: `CHANGE [COLUMN]`, `MODIFY [COLUMN]`, `RENAME COLUMN`, `ADD [CONSTRAINT [symbol]] PRIMARY KEY`, `DROP PRIMARY KEY`, and `ALTER COLUMN ORDER BY` can be freely combined in `ALTER TABLE`, these are not supported to be used with other clauses for the time being. -* Temporary tables currently do not support using `ALTER TABLE` to modify the table structure. -* Tables created using `CREATE TABLE ... CLUSTER BY...` do not allow modifications to the table structure using `ALTER TABLE`. -* `ALTER TABLE` does not support `PARTITION` related operations. -* Support defining `Cluster by column` clauses to pre-sort a column to speed up queries. - -### About VIEW - -* `CREATE OR REPLACE VIEW` is not supported. -* The `with check option` clause is not supported, but MatrixOne simply ignores' ENGINE= '. -* The `DEFINER` and `SQL SECURITY` clauses are not supported. - -### About SEQUENCE - -* MySQL does not support `SEQUENCE` objects, but MatrixOne can create a sequence through `CREATE SEQUENCE`, and the syntax of MatrixOne is the same as PostgreSQL. 
-* When using `SEQUENCE` in a table, you must pay attention to the `auto_increment` and `sequence` cannot be used together; otherwise, an error will be occured. - -## DCL Statement - -### About ACCOUNT - -* Multi Account is a unique function of MatrixOne, including related statements such as `CREATE/ALTER/DROP ACCOUNT`. - -### About Permission - -* `GRANT`, authorization logic is different from MySQL. - -* `REVOLE`, the recovery logic is different from MySQL. - -### About SHOW - -* MatrixOne does not support performing SHOW operations on certain objects, including `TRIGGER`, `FUNCTION`, `EVENT`, `PROCEDURE`, `ENGINE`, and so on. -* Due to architectural differences, MatrixOne has implemented some SHOW commands solely for syntactic compatibility; these commands will not produce any output, such as `SHOW STATUS/PRIVILEGES`, etc. -* Although some commands have the same syntax as MySQL, their results differ significantly from MySQL due to different implementations. These commands include `SHOW GRANTS`, `SHOW ERRORS`, `SHOW PROCESSLIST`, `SHOW VARIABLES`. -* For the purpose of its own management, MatrixOne offers several unique SHOW commands such as `SHOW BACKEND SERVERS`, `SHOW ACCOUNTS`, `SHOW ROLES`, `SHOW NODE LIST`, and others. - -### About SET - -* The system variables in MatrixOne differ significantly from MySQL, with most only providing syntactic compatibility. The parameters that can be set at present include: `ROLE`, `SQL_MODE`, and `TIME_ZONE`. - -## DML Statements - -### About SELECT - -* `SELECT...FOR UPDATE` only supports single-table queries. - -### About INSERT - -* MatrixOne does not support modifiers such as `LOW_PRIORITY`, `DELAYED`, `HIGH_PRIORITY`, `IGNORE`. - -### About UPDATE - -* MatrixOne does not support the use of `LOW_PRIORITY` and `IGNORE` modifiers. - -### About DELETE - -* MatrixOne does not support modifiers such as `LOW_PRIORITY`, `QUICK`, or `IGNORE`. - -### About Subqueries - -* MatrixOne does not support multi-level associated subqueries in `IN`. - -### About LOAD - -* MatrixOne supports `SET`, but only in the form of `SET columns_name=nullif(expr1,expr2)`. -* MatrixOne does not support `ESCAPED BY`. -* MatrixOne supports `LOAD DATA LOCAL` on the client side, but the `--local-infle` parameter must be added when connecting. -* MatrixOne supports the import of `JSONlines` files but requires some unique syntax. -* MatrixOne supports importing files from object storage but requires some unique syntax. - -### About EXPLAIN - -* MatrixOne's `Explain` and `Explain Analyze` printing formats refer to PostgreSQL, which differs from MySQL. -* JSON-type output is not supported. - -### other - -* The `REPLACE` statement does not currently support rows of values ​​inserted using the `VALUES row_constructor_list` parameter. - -## Advanced SQL Features - -* Triggers are not supported. -* Stored procedures are not supported. -* Event dispatchers are not supported. -* Custom functions are not supported. -* Materialized views are not supported. - -## Data Types - -* BOOL: Different from MySQL's Boolean value type, which is int, MatrixOne's `Boolean` value is a new type whose value can only be `True` or `False`. -* DECIMAL: `DECIMAL(P, D)`, the maximum precision of the effective number P and the number of digits after the decimal point D of MatrixOne is 38 digits, and MySQL is 65 and 30, respectively. -* Float numbers: The usage of `Float(M,D)` and `Double(M,D)` is discarded after MySQL 8.0.17, but MatrixOne still retains this usage. 
-* DATETIME: The maximum value range of MySQL is `'1000-01-01 00:00:00'` to `'9999-12-31 23:59:59'`, and the maximum range of MatrixOne is `'0001-01 -01 00:00:00'` to `'9999-12-31 23:59:59'`. -* TIMESTAMP: The maximum value range of MySQL is `'1970-01-01 00:00:01.000000'` UTC to `'2038-01-19 03:14:07.999999'` UTC, the maximum range of MatrixOne is `'0001- 01-01 00:00:00'` UTC to `'9999-12-31 23:59:59'` UTC. -* MatrixOne supports `UUID` type. -* Spatial types are not supported. -* `BIT` and `SET` types are not supported. -* `MEDIUMINT` type is not supported. - -## Indexes and Constraints - -* Secondary indexes only implement syntax and have no speedup effect. -* Foreign keys do not support the `ON CASCADE DELETE` cascade delete. - -## Partition Support - -* Only support `KEY`, `HASH`, `RANGE`, `RANGE COLUMNS`, `LIST`, `LIST COLUMNS` six partition types. -* Subpartitions implement only syntax, not functionality. - -## Functions and Operators - -### Aggregate Functions - -* Support MatrixOne-specific Median function. - -### Date and Time Functions - -* MatrixOne's `TO_DATE` function is the same as MySQL's `STR_TO_DATE` function. - -### CAST Function - -* The type conversion rules are pretty different from MySQL; see [CAST](../../Reference/Operators/operators/cast-functions-and-operators/cast.md). - -### Window functions - -* Only `RANK`, `DENSE_RANK`, `ROW_NUMBER` are supported. - -### JSON functions - -* Only `JSON_UNQUOTE`, `JSON_QUOTE`, `JSON_EXTRACT` are supported. - -### System Management functions - -- `CURRENT_ROLE_NAME()`, `CURRENT_ROLE()`, `CURRENT_USER_NAME()`, `CURRENT_USER()`, `PURGE_LOG()` are supported. - -## TAE Storage Engine - -* MatrixOne's TAE storage engine is independently developed and does not support MySQL's InnoDB, MyISAM, or other engines. -* There is only a TAE storage engine in MatrixOne; there is no need to use `ENGINE=XXX` to change the engine. - -## Security and Permissions - -* Only using `ALTER USER` can change the password. -* Does not support modifying the upper limit of user connections. -* Connection IP whitelisting is not supported. -* Does not support `LOAD` file authorization management. -* Can support `SELECT INTO` file authorization management through the `CREATE STAGE` section. - -## Transaction - -* MatrixOne defaults to optimistic transactions. -* different from MySQL, DDL statements in MatrixOne are transactional, and DDL operations can be rolled back within a transaction. -* Table-level lock `LOCK/UNLOCK TABLE` is not supported. - -## Backup and Restore - -* The mysqldump backup tool is not supported; only the modump tool is supported. -* Physical backups are supported. -* Does not support binlog log backup. -* Incremental backups are not supported. - -## System variables - -* MatrixOne's `lower_case_table_names` has 5 modes; the default is 1. - -## Programming language - -* Java, Python, Golang connectors, and ORM are basically supported, and connectors and ORMs in other languages ​​may encounter compatibility issues. - -## Other support tools - -* Navicat, DBeaver, MySQL Workbench, and HeidiSQL are basically supported, but the support for table design functions could be better due to the incomplete ability of ALTER TABLE. -* The xtrabackup backup tool is not supported. 
diff --git a/docs/MatrixOne/Overview/feature/key-feature-htap.md b/docs/MatrixOne/Overview/feature/key-feature-htap.md index 34b783212..067e9eddd 100644 --- a/docs/MatrixOne/Overview/feature/key-feature-htap.md +++ b/docs/MatrixOne/Overview/feature/key-feature-htap.md @@ -20,7 +20,9 @@ MatrixOne implements HTAP through modular storage, calculation, transaction arch The overall technical architecture of MatrixOne adopts a separate architecture of storage and computation. The modular design separates the database's computation, storage, and transaction processing into independent modules, thus forming a database system with independent scalability for each component. As shown in the following figure, MatrixOne is composed of three independent layers: -![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/htap/mo-htap-arch.png?raw=true) +
+ +
- **Computation layer**, with Compute Node as the unit, realizes serverless computation and transaction processing. It has its Cache, supporting random restarts and scaling; multiple Compute Nodes can calculate parallel to improve query efficiency. - **Transaction layer**, composed of Transaction Node and Log Service, provides complete log service and metadata information, with built-in Logtail for storing recently written new data. @@ -48,7 +50,9 @@ At the execution level, MatrixOne will route it to different processing links ac ##### Write Request Processing -![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/htap/write.png?raw=true) +
+ +
As shown in the figure, when processing write requests (INSERT/UPDATE/DELETE): @@ -62,7 +66,9 @@ From the figure above, it is known that small data volume OLTP-type write reques ##### Read Request Processing -![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/htap/read.png?raw=true) +
+ +
As shown in the figure, the CN node will first check the subscribed Logtail data when handling read requests. If the data directly hits Logtail, it is in the latest part of the written data and can be directly returned. If it does not hit Logtail, CN will check its cache and other visible CNs. If it hits the cache, it will directly return the result. If it does not hit the cache, CN will judge whether a large amount of data needs to be read through the execution plan. Multiple CN nodes will read in parallel from the object storage if it exceeds a certain threshold (such as 200 block sizes). A single CN node will read from object storage if it does not exceed the threshold. diff --git a/docs/MatrixOne/Overview/feature/key-feature-multi-accounts.md b/docs/MatrixOne/Overview/feature/key-feature-multi-accounts.md index 401abc745..05c623506 100644 --- a/docs/MatrixOne/Overview/feature/key-feature-multi-accounts.md +++ b/docs/MatrixOne/Overview/feature/key-feature-multi-accounts.md @@ -68,7 +68,7 @@ Both traditional models have specific challenges: The multi-account capability of MatrixOne brings a new architectural approach. accounts still share a MatrixOne cluster, and unified account O&M and management can be performed through system accounts. In addition, the isolation of data and resources is realized through the built-in multi-account capability. Each account can independently expand and contract resources, further reducing the difficulty of operation and maintenance. This approach meets not only the requirements for isolation but also the requirements for low resource and operation and maintenance costs. -mo-account-arch +mo-account-arch |Multi-account mode|Data isolation degree|Resource cost|Resource isolation|Operation and maintenance complexity| |---|---|---|---|---| diff --git a/docs/MatrixOne/Overview/matrixone-introduction.md b/docs/MatrixOne/Overview/matrixone-introduction.md index deef87853..2126857d4 100644 --- a/docs/MatrixOne/Overview/matrixone-introduction.md +++ b/docs/MatrixOne/Overview/matrixone-introduction.md @@ -4,7 +4,7 @@ MatrixOne is a hyper-converged cloud & edge native distributed database with a s MatrixOne touts significant features, including real-time HTAP, multi-tenancy, stream computation, extreme scalability, cost-effectiveness, enterprise-grade availability, and extensive MySQL compatibility. MatrixOne unifies tasks traditionally performed by multiple databases into one system by offering a comprehensive ultra-hybrid data solution. This consolidation simplifies development and operations, minimizes data fragmentation, and boosts development agility. -![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/mo-new-arch.png?raw=true) +![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/architecture/archi-en-1.png?raw=true) MatrixOne is optimally suited for scenarios requiring real-time data input, large data scales, frequent load fluctuations, and a mix of procedural and analytical business operations. It caters to use cases such as mobile internet apps, IoT data applications, real-time data warehouses, SaaS platforms, and more. 
diff --git a/docs/MatrixOne/Overview/matrixone-vs-other_databases/matrixone-positioning .md b/docs/MatrixOne/Overview/matrixone-vs-other_databases/matrixone-positioning .md index 3ff51ffa7..1ceb6e775 100644 --- a/docs/MatrixOne/Overview/matrixone-vs-other_databases/matrixone-positioning .md +++ b/docs/MatrixOne/Overview/matrixone-vs-other_databases/matrixone-positioning .md @@ -4,7 +4,7 @@ Among the large and complex data technology stack and various database products, The database product closest to MatrixOne among the industry's existing database offerings is SingleStore, both of which use a unified storage model that supports the convergence of OLTP, OLAP, and a host of other data loads and data types, while also both having cloud native and flexible scalability as their core architectural capabilities. -![mo\_vs\_singlestore](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/overview/mo-other-database/mo_vs_singlestore.png) +![mo_vs_singlestore](https://github.com/matrixorigin/artwork/blob/main/docs/overview/mo-other-database/mo_vs_singlestore.png?raw=true) - Architecturally, MatrixOne is a fully cloud-native and containerized database. MatrixOne draws on Snowflake's computational separation design for [cloud-native data warehouses](https://event.cwi.nl/lsde/papers/p215-dageville-snowflake.pdf), completely handing over storage to shared storage on the cloud, while fully building the compute layer into a stateless container. At the same time, to accommodate the processing of fast write requests by OLTP-type loads, MatrixOne adds the concepts of TN and LogService to support high-frequency writes with block storage, ensures high availability of write log WALs with Raft triple copy consistency guarantee, and asynchronously drops WALs into shared storage. Unlike SingleStore, which extends from the Share-nothing architecture to cloud-native memory separation, it only puts cold data in shared storage (see [SingleStore architecture paper](https://dl.acm.org/doi/pdf/10.1145/3514221.3526055)) and still requires data fragmentation and rebalancing. MatrixOne, on the other hand, is consistent with Snowflake and is entirely based on shared storage without any data fragmentation. diff --git a/docs/MatrixOne/Overview/matrixone-vs-other_databases/matrixone-vs-oltp.md b/docs/MatrixOne/Overview/matrixone-vs-other_databases/matrixone-vs-oltp.md index 11ede5e66..4d83df077 100644 --- a/docs/MatrixOne/Overview/matrixone-vs-other_databases/matrixone-vs-oltp.md +++ b/docs/MatrixOne/Overview/matrixone-vs-other_databases/matrixone-vs-oltp.md @@ -24,7 +24,7 @@ OLTP databases can also be divided into centralized databases, distributed datab It is worth noting that there are no strict dividing criteria for these three classifications, and each database has gradually begun to integrate the capabilities of other route products as it has evolved in practice. Oracle's RAC architecture, for example, is a typical shared storage architecture with some scalability. Products like CockroachDB and TiDB are also evolving toward cloud-native and shared storage. In practice, OLTP is the most widely needed database scenario, and products along all three technical routes are also used by a large number of users.
- +
## OLTP Features of MatrixOne

@@ -44,20 +44,48 @@ There are two differences from Aurora:

Of course, MatrixOne isn't limited to OLTP capabilities, and MatrixOne's ability to accommodate other loads is significantly different from Aurora's positioning.

-![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/overview/mo-other-database/mo_vs_aurora.png)
+![](https://github.com/matrixorigin/artwork/blob/main/docs/overview/mo-other-database/mo_vs_aurora.png?raw=true)

## MatrixOne versus MySQL

Since MatrixOne's primary goal is to be compatible with MySQL, and MySQL itself is [the world's most popular open source database](https://db-engines.com/en/ranking), a large portion of MatrixOne's users migrate to MatrixOne from open source MySQL; so here we compare MatrixOne to MySQL in detail.

-| | | MySQL | MatrixOne | | | -------------------------- | -------------------------------------- | ---------------------------------- | -------------------------------------------------- | | Version | 8.0.37 | Latest Version | ---------------------------------------------------------------------- | | | License | GPL License | Apache License 2.0 | | | | Schema | Centralized Database | Distributed Cloud Native Database | | | | Load Type | OLTP, Analytical Load Depends on Enterprise Heatwave | HTAP, Timing | | Storage Format | RBAC-Based Functions Base Window | Row Storage Engine | InnoDB/MyIsam | TAE | | Interactions | SQL | SQL | | | Deployment Method | Standalone Deployment/Master/Slave Deployment | Standalone Deployment/Master/Slave Deployment/Distributed Deployment/K8s Deployment | | Scale Out Capabilities | Reliance on Split Table Middleware Implementation | Natural Support | | Transaction Capabilities | Pessimistic Transactions/Optimistic Transactions + ANSI 4 Isolation Levels (InnoDB Engine) | Pessimistic Transactions/Optimistic Transactions + RC/SI | | | | Data Types | Base Numerical Values, Time Date, Characters, JSON, Spatial | Base Numerical Values, Time, Dates, Characters, JSON, Vector | Indexes and Constraints | Primary Keys, Unique Foreign Keys, Unique Foreign Keys, Unique Foreign Keys | Foreign Keys
| | MySQL | MatrixOne |
| ------------------ | ----------------- | ---------------------- |
| Versions | 8.0.37 | Latest version |
| License | GPL License | Apache License 2.0 |
| Architecture | Centralized database | Distributed cloud-native database |
| Load Types | OLTP; analytical loads rely on the enterprise-edition HeatWave | HTAP, time-series |
| Storage Formats | Row store | Column store |
| Storage Engines | InnoDB/MyISAM | TAE |
| Interaction | SQL | SQL |
| Deployment Mode | Standalone deployment/master-slave deployment | Standalone deployment/master-slave deployment/distributed deployment/K8s deployment |
| Horizontal Scalability | Dependent on database/table sharding middleware | Native support |
| Transaction Capability | Pessimistic/optimistic transactions + ANSI 4 isolation levels (InnoDB engine) | Pessimistic/optimistic transactions + RC/SI |
| Data Types | Basic numeric, date/time, character, JSON, spatial | Basic numeric, date/time, character, JSON, vector |
| Indexes and Constraints | Primary key, secondary index, unique key, foreign key | Primary key, secondary index, unique key, foreign key |
| Access Control | RBAC-based | RBAC-based |
| Window Functions | Basic window functions | Basic window functions, time sliding window |
| Advanced SQL Capabilities | Triggers, stored procedures | Not supported |
+| Streaming Computing | Not Supported | Streaming Writes/kafka Connector/Dynamic Tables | +| UDF | UDF for SQL and C | UDF for SQL and Python | UDF for SQL and Python | +| Multi-tenancy | Not Supported | Supported | +| Data Sharing | Not Supported | Support for Inter-tenant Data Sharing | +| Programming Languages | Most Languages | Java, Python, Golang Connector and ORM Basic Support | +| Common Visualization Management Tools | Navicat, DBeaver, MySQL Workbench, DataGrip, HeidiSQL, etc. | Consistent with MySQL | +| Backup Tools | Logical Backup, Physical Backup | Logical Backup, Physical Backup, Snapshot Backup | Logical Backup, Physical Backup, Snapshot Backup | +| CDC Competencies | Yes | No | +| OLTP Performance | Standalone excellent, non-scalable | Standalone good, scalable | +| OLAP Performance | Poor | Excellent, Scalable | +| High Volume Write Performance | Poor | Excellent, Scalable | +| Storage Space | Limited to Disk | Unlimited Expansion | Additional details can be found in [MatrixOne's MySQL compatibility details](../../Overview/feature/mysql-compatibility.md). Overall, MatrixOne is a highly MySQL-compatible cloud-native HTAP database that works seamlessly with most MySQL-based applications. At the same time, MatrixOne naturally has great scalability and the ability to support other types of business loads. In addition, based on MatrixOne's memory separation and multi-tenancy features, users have the flexibility to design their application architecture with MatrixOne as a one-stop shop for load isolation issues previously addressed by applications, middleware, or other databases.
- +
For MySQL users, MatrixOne is a more appropriate option if they experience bottlenecks with: diff --git a/docs/MatrixOne/Reference/Functions-and-Operators/Aggregate-Functions/bitmap.md b/docs/MatrixOne/Reference/Functions-and-Operators/Aggregate-Functions/bitmap.md index b2562ad6e..b86440f71 100644 --- a/docs/MatrixOne/Reference/Functions-and-Operators/Aggregate-Functions/bitmap.md +++ b/docs/MatrixOne/Reference/Functions-and-Operators/Aggregate-Functions/bitmap.md @@ -8,7 +8,7 @@ We can use only one bit to identify the presence or absence of an element, 1 for We specify that the maximum width of bitmap is 32768 (2^15 = 4K), and for non-negative integer n, take its lower 15 bits (binary) as the position in bitmap and the other high bits as the number of the bitmap bucket. The following diagram shows the logic of bitmap: -![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/reference/bitmap.png) +![](https://github.com/matrixorigin/artwork/blob/main/docs/reference/bitmap.png?raw=true) Each bucket is a bitmap, and since the buckets are orthogonal, each bucket doing the operation (or,bit_count) can be done only in the current bucket, regardless of the other buckets. diff --git a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/cosine_distance.md b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/cosine_distance.md index 040da6596..e168af55b 100644 --- a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/cosine_distance.md +++ b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/cosine_distance.md @@ -7,7 +7,7 @@ The `COSINE_DISTANCE()` function is used to calculate the cosine distance betwee Cosine Distance is a measure of the difference in direction between two vectors, usually defined as 1 minus [Cosine Similarity](cosine_similarity.md). The value of the cosine distance ranges from 0 to 2. 0 means that both vectors are in exactly the same direction (minimum distance). 2 means that the two vectors are in exactly the opposite direction (maximum distance). In text analysis, cosine distance can be used to measure similarities between documents. Since it considers only the direction of the vector and not the length, it is fair for comparisons between long and short text.
- +
## Function syntax diff --git a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/cosine_similarity.md b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/cosine_similarity.md index 75ef4b7d6..cef4ecfc4 100644 --- a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/cosine_similarity.md +++ b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/cosine_similarity.md @@ -5,7 +5,7 @@ `cosine_similarity()` is a cosine similarity that measures the cosine value of the angle between two vectors, indicating their similarity by how close they are in multidimensional space, where 1 means exactly similar and -1 means completely different. Cosine similarity is calculated by dividing the inner product of two vectors by the product of their l2 norm.
- +
## **Function syntax** diff --git a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/inner_product.md b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/inner_product.md index 9c2ef20bd..ecfc3fe20 100644 --- a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/inner_product.md +++ b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/inner_product.md @@ -4,7 +4,7 @@ The `INNER PRODUCT` function is used to calculate the inner/dot product between two vectors. It is the result of multiplying the corresponding elements of two vectors and then adding them. -![inner_product](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/reference/vector/inner_product.png?raw=true) +![inner_product](https://github.com/matrixorigin/artwork/blob/main/docs/reference/vector/inner_product.png?raw=true) ## **Function syntax** diff --git a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/l1_norm.md b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/l1_norm.md index 2b8914ab7..17fc82678 100644 --- a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/l1_norm.md +++ b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/l1_norm.md @@ -5,7 +5,7 @@ The `l1_norm` function is used to calculate the `l1`/Manhattan/TaxiCab norm. The `l1` norm is obtained by summing the absolute values of the vector elements.
- +
You can use the `l1` norm to calculate the `l1` distance. diff --git a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/l2_distance.md b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/l2_distance.md index 8b1ea1ec8..af02023b6 100644 --- a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/l2_distance.md +++ b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/l2_distance.md @@ -7,7 +7,7 @@ The `L2_DISTANCE()` function is used to calculate the Euclidean distance between L2 distance, also known as Euclidean Distance, is one of the most commonly used distance measures in vector spaces. It measures the straight line distance between two points in multidimensional space. l2 distance has many practical applications, including areas such as machine learning, computer vision, and spatial analysis.
- +
## Function syntax diff --git a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/l2_norm.md b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/l2_norm.md index c198d318c..aa90611d4 100644 --- a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/l2_norm.md +++ b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/l2_norm.md @@ -5,7 +5,7 @@ The `l2_norm` function is used to calculate the `l2`/Euclidean norm. The `l2` norm is obtained by performing a square root operation on the sum of squares of the vector elements.
- +
## **Function syntax** diff --git a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/normalize_l2.md b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/normalize_l2.md index 1751b9390..e36ebfec3 100644 --- a/docs/MatrixOne/Reference/Functions-and-Operators/Vector/normalize_l2.md +++ b/docs/MatrixOne/Reference/Functions-and-Operators/Vector/normalize_l2.md @@ -7,7 +7,7 @@ The `NORMALIZE_L2()` function performs Euclidean normalization on vectors. The L2 norm is the square root of the sum of the squares of the vector elements, so the purpose of L2 normalization is to make the length (or norm) of the vector 1, which is often referred to as a unit vector. This normalization method is particularly useful in machine learning, especially when dealing with feature vectors. It can help standardize the scale of features and thus improve the performance of algorithms.
- +
## Function syntax diff --git a/docs/MatrixOne/Reference/System-tables.md b/docs/MatrixOne/Reference/System-tables.md index 2e43c6a13..2c948e1c4 100644 --- a/docs/MatrixOne/Reference/System-tables.md +++ b/docs/MatrixOne/Reference/System-tables.md @@ -10,68 +10,25 @@ The `mo_catalog` is used to store metadata about MatrixOne objects such as: data The concept of multi-tenancy was introduced with MatrixOne version 0.6, and the default `sys` tenant behaves slightly differently from other tenants. The system table `mo_account`, which serves multi-tenant management, is only visible to `sys` tenants; it is not visible to other tenants. -### mo_database table - -| column | type | comments | -| ---------------- | --------------- | --------------------------------------- | -| dat_id | bigint unsigned | Primary key ID | -| datname | varchar(100) | Database name | -| dat_catalog_name | varchar(100) | Database catalog name, default as `def` | -| dat_createsql | varchar(100) | Database creation SQL statement | -| owner | int unsigned | Role id | -| creator | int unsigned | User id | -| created_time | timestamp | Create time | -| account_id | int unsigned | Account id | -| dat_type | varchar(23) | Database type, common library or subscription library | - -### mo_tables table - -| column | type | comments | -| -------------- | --------------- | ------------------------------------------------------------ | -| rel_id | bigint unsigned | Primary key, table ID | -| relname | varchar(100) | Name of the table, index, view, and so on. | -| reldatabase | varchar(100) | The database that contains this relation. reference mo_database.datname | -| reldatabase_id | bigint unsigned | The database id that contains this relation. reference mo_database.datid | -| relpersistence | varchar(100) | p = permanent table, t = temporary table | -| relkind | varchar(100) | r = ordinary table, e = external table, i = index, S = sequence, v = view, m = materialized view | -| rel_comment | varchar(100) | | -| rel_createsql | varchar(100) | Table creation SQL statement | -| created_time | timestamp | Create time | -| creator | int unsigned | Creator ID | -| owner | int unsigned | Creator's default role id | -| account_id | int unsigned | Account id | -| partitioned | blob | Partition by statement | -| partition_info | blob | the information of partition | -| viewdef | blob | View definition statement | -| constraint | varchar(5000) | Table related constraints | -| catalog_version | INT UNSIGNED(0) | Version number of the system table | - -### mo_columns table +### mo_indexes table -| column | type | comments | -| --------------------- | --------------- | ------------------------------------------------------------ | -| att_uniq_name | varchar(256) | Primary Key. Hidden, composite primary key, format is like "${att_relname_id}-${attname}" | -| account_id | int unsigned | accountID | -| att_database_id | bigint unsigned | databaseID | -| att_database | varchar(256) | database Name | -| att_relname_id | bigint unsigned | table id | -| att_relname | varchar(256) | The table this column belongs to.(references mo_tables.relname) | -| attname | varchar(256) | The column name | -| atttyp | varchar(256) | The data type of this column (zero for a dropped column). | -| attnum | int | The number of the column. Ordinary columns are numbered from 1 up. | -| att_length | int | bytes count for the type. | -| attnotnull | tinyint(1) | This represents a not-null constraint. 
| -| atthasdef | tinyint(1) | This column has a default expression or generation expression. | -| att_default | varchar(1024) | default expression | -| attisdropped | tinyint(1) | This column has been dropped and is no longer valid. A dropped column is still physically present in the table, but is ignored by the parser and so cannot be accessed via SQL. | -| att_constraint_type | char(1) | p = primary key constraint, n=no constraint | -| att_is_unsigned | tinyint(1) | unsigned or not | -| att_is_auto_increment | tinyint(1) | auto increment or not | -| att_comment | varchar(1024) | comment | -| att_is_hidden | tinyint(1) | hidden or not | -| attr_has_update | tinyint(1) | This columns has update expression | -| attr_update | varchar(1024) | update expression | -| attr_is_clusterby | tinyint(1) | Whether this column is used as the cluster by keyword to create the table | +| column | type | comments | +| -----------------| --------------- | ----------------- | +| id | BIGINT UNSIGNED(64) | index ID | +| table_id | BIGINT UNSIGNED(64) | ID of the table where the index resides | +| database_id | BIGINT UNSIGNED(64) | ID of the database where the index resides | +| name | VARCHAR(64) | name of the index | +| type | VARCHAR(11) | The type of index, including primary key index (PRIMARY), unique index (UNIQUE), secondary index (MULTIPLE) | +| algo_table_type | VARCHAR(11) | Algorithm for creating indexes | +| algo_table_type | VARCHAR(11) | Hidden table types for multi-table indexes | +| | algo_params | VARCHAR(2048) | Parameters for indexing algorithms | +| is_visible | TINYINT(8) | Whether the index is visible, 1 means visible, 0 means invisible (currently all MatrixOne indexes are visible indexes) | +| hidden | TINYINT(8) | Whether the index is hidden, 1 is a hidden index, 0 is a non-hidden index| +| comment | VARCHAR(2048) | Comment information for the index | +| column_name | VARCHAR(256) | The column name of the constituent columns of the index | +| ordinal_position | INT UNSIGNED(32) | Column ordinal in index, starting from 1 | +| options | TEXT(0) | options option information for index | +| index_table_name | VARCHAR(5000) | The table name of the index table corresponding to the index, currently only the unique index contains the index table | ### mo_table_partitions table @@ -88,6 +45,22 @@ The concept of multi-tenancy was introduced with MatrixOne version 0.6, and the | options | TEXT(0) | Partition options information, currently set to NULL. | | partition_table_name | VARCHAR(1024) | The name of the subtable corresponding to the current partition. 

+### mo_user table
+
+| column | type | comments |
+| --------------------- | ------------ | ------------------- |
+| user_id | int | user id, primary key |
+| user_host | varchar(100) | user host address |
+| user_name | varchar(100) | user name |
+| authentication_string | varchar(100) | authentication string encrypted with password |
+| status | varchar(8) | open, locked, expired |
+| created_time | timestamp | user created time |
+| expired_time | timestamp | user expired time |
+| login_type | varchar(16) | ssl/password/other |
+| creator | int | the creator id who created this user |
+| owner | int | the admin id for this user |
+| default_role | int | the default role id for this user |
+
### mo_account table (Only visible for `sys` account)
 
| column | type | comments |
@@ -100,33 +73,31 @@ The concept of multi-tenancy was introduced with MatrixOne version 0.6, and the
| suspended_time | TIMESTAMP | Time of the account's status is changed |
| version | bigint unsigned | the version status of the current account|
 
+### mo_database table
+
+| column | type | comments |
+| ---------------- | --------------- | --------------------------------------- |
+| dat_id | bigint unsigned | Primary key ID |
+| datname | varchar(100) | Database name |
+| dat_catalog_name | varchar(100) | Database catalog name, default as `def` |
+| dat_createsql | varchar(100) | Database creation SQL statement |
+| owner | int unsigned | Role id |
+| creator | int unsigned | User id |
+| created_time | timestamp | Create time |
+| account_id | int unsigned | Account id |
+| dat_type | varchar(23) | Database type: ordinary database or subscription database |
+
### mo_role table
 
| column | type | comments |
| ------------ | ------------ | ----------------------------- |
-| role_id | int unsigned | role id, primary key |
+| role_id | int unsigned | role id, primary key |
| role_name | varchar(100) | role name |
| creator | int unsigned | user_id |
| owner | int unsigned | MOADMIN/ACCOUNTADMIN ownerid |
| created_time | timestamp | create time |
| comments | text | comment |
 
-### mo_user table
-
-| column | type | comments |
-| --------------------- | ------------ | ------------------- |
-| user_id | int | user id, primary key |
-| user_host | varchar(100) | user host address |
-| user_name | varchar(100) | user name |
-| authentication_string | varchar(100) | authentication string encrypted with password |
-| status | varchar(8) | open,locked,expired |
-| created_time | timestamp | user created time |
-| expired_time | timestamp | user expired time |
-| login_type | varchar(16) | ssl/password/other |
-| creator | int | the creator id who created this user |
-| owner | int | the admin id for this user |
-| default_role | int | the default role id for this user |
-
### mo_user_grant table
 
| column | type | comments |
@@ -162,6 +133,57 @@ The concept of multi-tenancy was introduced with MatrixOne version 0.6, and the
| granted_time | timestamp | granted time |
| with_grant_option | bool | If permission granting is permitted |
 
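+Together, `mo_user`, `mo_role`, and the grant tables describe MatrixOne's user-role mapping. The following is a minimal, hedged sketch of resolving that mapping; it assumes `mo_user_grant` links `user_id` to `role_id`, as its name suggests:
+
+```sql
+-- Minimal sketch: list each user's granted roles.
+SELECT u.user_name, r.role_name
+FROM mo_catalog.mo_user_grant g
+JOIN mo_catalog.mo_user u ON u.user_id = g.user_id
+JOIN mo_catalog.mo_role r ON r.role_id = g.role_id;
+```
+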
+### mo_user_defined_function table
+
+| column | type | comments |
+| -----------------| --------------- | ----------------- |
+| function_id | INT(32) | ID of the function, primary key |
+| name | VARCHAR(100) | the name of the function |
+| owner | INT UNSIGNED(32) | ID of the role who created the function |
+| args | TEXT(0) | Argument list for the function |
+| rettype | VARCHAR(20) | return type of the function |
+| body | TEXT(0) | function body |
+| language | VARCHAR(20) | language used by the function |
+| db | VARCHAR(100) | database where the function is located |
+| definer | VARCHAR(50) | name of the user who defined the function |
+| modified_time | TIMESTAMP(0) | time when the function was last modified |
+| created_time | TIMESTAMP(0) | creation time of the function |
+| type | VARCHAR(10) | type of function, default FUNCTION |
+| security_type | VARCHAR(10) | security processing method, uniform value DEFINER |
+| comment | VARCHAR(5000) | Comment on the function |
+| character_set_client | VARCHAR(64) | Client character set: utf8mb4 |
+| collation_connection | VARCHAR(64) | Connection collation: utf8mb4_0900_ai_ci |
+| database_collation | VARCHAR(64) | Database collation: utf8mb4_0900_ai_ci |
+
+### mo_mysql_compatbility_mode table
+
+| column | type | comments |
+| -----------------| --------------- | ----------------- |
+| configuration_id | INT(32) | Configuration item id, an auto-increment column used as the primary key to distinguish different configurations |
+| account_id | INT(32) | Tenant id of the configuration |
+| account_name | VARCHAR(300) | The name of the tenant where the configuration is located |
+| dat_name | VARCHAR(5000) | The name of the database where the configuration resides |
+| variable_name | VARCHAR(300) | The name of the variable |
+| variable_value | VARCHAR(5000) | The value of the variable |
+| system_variables | BOOL(0) | Whether it is a system variable (compatibility variables are added in addition to system variables) |
+
+### mo_pubs table
+
+| column | type | comments |
+| -----------------| --------------- | ----------------- |
+| pub_name | VARCHAR(64) | Publication name |
+| database_name | VARCHAR(5000) | Name of the published database |
+| database_id | BIGINT UNSIGNED(64) | ID of the published database, corresponding to dat_id in the mo_database table |
+| all_table | BOOL(0) | Whether the publication contains all tables in the database corresponding to database_id |
+| all_account | BOOL(0) | Whether all accounts can subscribe to the publication |
+| table_list | TEXT(0) | When not all tables are published, the list of tables contained in the publication; table names correspond to tables in the database identified by database_id |
+| account_list | TEXT(0) | When not all accounts may subscribe, the list of accounts allowed to subscribe to the publication |
+| created_time | TIMESTAMP(0) | Time when the publication was created |
+| owner | INT UNSIGNED(32) | Role ID of the publication's creator |
+| creator | INT UNSIGNED(32) | ID of the user who created the publication |
+| comment | TEXT(0) | Comment on the publication |
+
### mo_stages table
 
| column | type | comments |
@@ -240,76 +262,73 @@ The concept of multi-tenancy was introduced with MatrixOne version 0.6, and the
| lock_content | VARCHAR(65535) | Point locks represent individual values, range locks represent ranges, usually in the form of "low - high". Note that transactions may involve multiple locks, but only the first lock is shown here.|
| lock_mode | VARCHAR(65535) | Indicates the mode of the lock, either exclusive or shared. |
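+
+The two lock columns above are the tail of the `mo_locks` view description. A minimal, hedged sketch of querying it, assuming the view is exposed under `mo_catalog` as the other catalog objects are:
+
+```sql
+-- Minimal sketch: show the content and mode of currently held locks.
+SELECT lock_content, lock_mode
+FROM mo_catalog.mo_locks;
+```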

-### mo_user_defined_function table
+### `mo_transactions` view
 
-| column | type | comments |
-| -----------------| --------------- | ----------------- |
-| function_id | INT(32) | ID of the function, primary key |
-| name | VARCHAR(100) | the name of the function |
-| owner | INT UNSIGNED(32) | ID of the role who created the function |
-| args | TEXT(0) | Argument list for the function |
-| rettype | VARCHAR(20) | return type of the function |
-| body | TEXT(0) | function body |
-| language | VARCHAR(20) | language used by the function |
-| db | VARCHAR(100) | database where the function is located |
-| definer | VARCHAR(50) | name of the user who defined the function |
-| modified_time | TIMESTAMP(0) | time when the function was last modified |
-| created_time | TIMESTAMP(0) | creation time of the function |
-| type | VARCHAR(10) | type of function, default FUNCTION |
-| security_type | VARCHAR(10) | security processing method, uniform value DEFINER |
-| comment | VARCHAR(5000) | Create a comment for the function |
-| character_set_client | VARCHAR(64) | Client character set: utf8mb4 |
-| collation_connection | VARCHAR(64) | Connection sort: utf8mb4_0900_ai_ci |
-| database_collation | VARCHAR(64) | Database connection collation: utf8mb4_0900_ai_ci |
-
-### mo_mysql_compatbility_mode table
-
-| column | type | comments |
-| -----------------| --------------- | ----------------- |
-| configuration_id | INT(32) | Configuration item id, self-incrementing column, used as primary key to distinguish between different configurations |
-| account_id | INT(32) | Tenant id of the configuration |
-| account_name | VARCHAR(300) | The name of the tenant where the configuration is located |
-| dat_name | VARCHAR(5000) | The name of the database where the configuration resides |
-| variable_name | VARCHAR(300) | The name of the variable |
-| variable_value | VARCHAR(5000) | The name of the database where the configuration resides. |
-| variable_value | VARCHAR(5000) | The value of the variable |
-| system_variables | BOOL(0) | if it is a system variable (compatibility variables are added in addition to system variables) |
+| column | type | comments |
+| ------------- | --------------- | ------------------------------------ |
+| cn_id | VARCHAR(65535) | ID that uniquely identifies the CN (Compute Node). |
+| txn_id | VARCHAR(65535) | The ID that uniquely identifies the transaction. |
+| create_ts | VARCHAR(65535) | Records the transaction creation timestamp in RFC3339Nano format ("2006-01-02T15:04:05.999999999Z07:00"). |
+| snapshot_ts | VARCHAR(65535) | Represents the snapshot timestamp of the transaction, expressed in both physical and logical time. |
+| prepared_ts | VARCHAR(65535) | Indicates the prepared timestamp of the transaction, in both physical and logical time. |
+| commit_ts | VARCHAR(65535) | Indicates the commit timestamp of the transaction, in both physical and logical time. |
+| txn_mode | VARCHAR(65535) | Identifies the transaction mode, which can be either pessimistic or optimistic. |
+| isolation | VARCHAR(65535) | Indicates the isolation level of the transaction, either SI (Snapshot Isolation) or RC (Read Committed). |
+| user_txn | VARCHAR(65535) | Indicates a user transaction, i.e., a transaction created by a SQL operation performed by a user connecting to MatrixOne via a client. |
+| txn_status | VARCHAR(65535) | Indicates the current state of the transaction, with possible values including active, committed, aborting, aborted. In the distributed transaction 2PC model, this would also include prepared and committing. |
+| table_id | VARCHAR(65535) | Indicates the ID of the table involved in the transaction. |
+| lock_key | VARCHAR(65535) | Indicates the type of lock, either range or point. |
+| lock_content | VARCHAR(65535) | Point locks represent individual values, range locks represent ranges, usually in the form of "low - high". Note that transactions may involve multiple locks, but only the first lock is shown here. |
+| lock_mode | VARCHAR(65535) | Indicates the mode of the lock, either exclusive or shared. |
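+
+A hedged sketch of using this view, assuming it is exposed under `mo_catalog` with the columns documented above:
+
+```sql
+-- Minimal sketch: list currently active transactions and their modes.
+SELECT txn_id, txn_mode, isolation, txn_status
+FROM mo_catalog.mo_transactions
+WHERE txn_status = 'active';
+```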

-### mo_pubs table
+### mo_columns table
 
-| column | type | comments |
-| -----------------| --------------- | ----------------- |
-| pub_name | VARCHAR(64) | publication name|
-| database_name | VARCHAR(5000) | The name of the published data |
-| database_id | BIGINT UNSIGNED(64) | ID of the publishing database, corresponding to dat_id in the mo_database table |
-| all_table | BOOL(0) | Whether the publishing library contains all tables in the database corresponding to database_id |
-| all_account | BOOL(0) | Whether all accounts can subscribe to the library |
-| table_list | TEXT(0) | When it is not all table, publish the list of tables contained in the library, and the table name corresponds to the table under the database corresponding to database_id|
-| account_list | TEXT(0) |Account list that is allowed to subscribe to the publishing library when it is not all accounts|
-| created_time | TIMESTAMP(0) | Time when the release repository was created |
-| owner | INT UNSIGNED(32) | The role ID corresponding to the creation of the release library |
-| creator | INT UNSIGNED(32) | The ID of the user who created the release library |
-| comment | TEXT(0) | Remarks for creating a release library |
+| column | type | comments |
+| --------------------- | --------------- | ------------------------------------------------------------ |
+| att_uniq_name | varchar(256) | Primary Key. Hidden, composite primary key, format is like "${att_relname_id}-${attname}" |
+| account_id | int unsigned | accountID |
+| att_database_id | bigint unsigned | databaseID |
+| att_database | varchar(256) | database Name |
+| att_relname_id | bigint unsigned | table id |
+| att_relname | varchar(256) | The table this column belongs to (references mo_tables.relname) |
+| attname | varchar(256) | The column name |
+| atttyp | varchar(256) | The data type of this column (zero for a dropped column). |
+| attnum | int | The number of the column. Ordinary columns are numbered from 1 up. |
+| att_length | int | bytes count for the type. |
+| attnotnull | tinyint(1) | This represents a not-null constraint. |
+| atthasdef | tinyint(1) | This column has a default expression or generation expression. |
+| att_default | varchar(1024) | default expression |
+| attisdropped | tinyint(1) | This column has been dropped and is no longer valid. A dropped column is still physically present in the table, but is ignored by the parser and so cannot be accessed via SQL. |
+| att_constraint_type | char(1) | p = primary key constraint, n = no constraint |
+| att_is_unsigned | tinyint(1) | unsigned or not |
+| att_is_auto_increment | tinyint(1) | auto increment or not |
+| att_comment | varchar(1024) | comment |
+| att_is_hidden | tinyint(1) | hidden or not |
+| attr_has_update | tinyint(1) | This column has an update expression |
+| attr_update | varchar(1024) | update expression |
+| attr_is_clusterby | tinyint(1) | Whether this column is used as the cluster by keyword to create the table |
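+
+For example, the hedged sketch below describes the columns of a table; `db1` and `t1` are hypothetical names:
+
+```sql
+-- Minimal sketch: describe the columns of a hypothetical table db1.t1.
+SELECT attname, atttyp, attnotnull, att_default
+FROM mo_catalog.mo_columns
+WHERE att_database = 'db1' AND att_relname = 't1'
+ORDER BY attnum;
+```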

-### mo_indexes table
+### mo_tables table
 
-| column | type | comments |
-| -----------------| --------------- | ----------------- |
-| id | BIGINT UNSIGNED(64) | index ID |
-| table_id | BIGINT UNSIGNED(64) | ID of the table where the index resides |
-| database_id | BIGINT UNSIGNED(64) | ID of the database where the index resides |
-| name | VARCHAR(64) | name of the index |
-| type | VARCHAR(11) | The type of index, including primary key index (PRIMARY), unique index (UNIQUE), secondary index (MULTIPLE) |
-| algo_table_type | VARCHAR(11) | Algorithm for creating indexes |
-| algo_table_type | VARCHAR(11) | Hidden table types for multi-table indexes |
-| | algo_params | VARCHAR(2048) | Parameters for indexing algorithms |
-| is_visible | TINYINT(8) | Whether the index is visible, 1 means visible, 0 means invisible (currently all MatrixOne indexes are visible indexes) |
-| hidden | TINYINT(8) | Whether the index is hidden, 1 is a hidden index, 0 is a non-hidden index|
-| comment | VARCHAR(2048) | Comment information for the index |
-| column_name | VARCHAR(256) | The column name of the constituent columns of the index |
-| ordinal_position | INT UNSIGNED(32) | Column ordinal in index, starting from 1 |
-| options | TEXT(0) | options option information for index |
-| index_table_name | VARCHAR(5000) | The table name of the index table corresponding to the index, currently only the unique index contains the index table |
+| column | type | comments |
+| -------------- | --------------- | ------------------------------------------------------------ |
+| rel_id | bigint unsigned | Primary key, table ID |
+| relname | varchar(100) | Name of the table, index, view, and so on. |
+| reldatabase | varchar(100) | The database that contains this relation. References mo_database.datname |
+| reldatabase_id | bigint unsigned | The database id that contains this relation. References mo_database.dat_id |
+| relpersistence | varchar(100) | p = permanent table, t = temporary table |
+| relkind | varchar(100) | r = ordinary table, e = external table, i = index, S = sequence, v = view, m = materialized view |
+| rel_comment | varchar(100) | |
+| rel_createsql | varchar(100) | Table creation SQL statement |
+| created_time | timestamp | Create time |
+| creator | int unsigned | Creator ID |
+| owner | int unsigned | Creator's default role id |
+| account_id | int unsigned | Account id |
+| partitioned | blob | Partition by statement |
+| partition_info | blob | Partition information |
+| viewdef | blob | View definition statement |
+| constraint | varchar(5000) | Table related constraints |
+| catalog_version | INT UNSIGNED(0) | Version number of the system table |
 
## `system_metrics` database
 
@@ -509,9 +528,9 @@ The description of columns in the `ENGINES` table is as follows:
- `XA`: Whether the storage engine supports XA transactions.
- `SAVEPOINTS`: Whether the storage engine supports `savepoints`.
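+
+A hedged sketch of reading this catalog through the MySQL-compatible `information_schema`, assuming the `ENGINES` columns listed above:
+
+```sql
+-- Minimal sketch: list storage engines and their transaction capabilities.
+SELECT engine, support, transactions, xa, savepoints
+FROM information_schema.engines;
+```
+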
-### `PARTITIONS` table
+### `PARTITIONS` view
 
-The description of columns in the `PARTITIONS` table is as follows:
+The description of columns in the `PARTITIONS` view is as follows:
 
- `TABLE_CATALOG`: The name of the catalog to which the table belongs. This value is always def.
- `TABLE_SCHEMA`: The name of the schema (database) to which the table belongs.
@@ -560,7 +579,7 @@ Fields in the `PROCESSLIST` view are described as follows:
- `QUERY_START`: Query start time.
- `CLIENT_HOST`: client address
 
-### `SCHEMATA` table
+### `SCHEMATA` view
 
The `SCHEMATA` table provides information about databases. The table data is equivalent to the result of the `SHOW DATABASES` statement. Fields in the `SCHEMATA` table are described as follows:
@@ -608,7 +627,7 @@ Fields in the `USER_PRIVILEGES` table are described as follows:
- `PRIVILEGE_TYPE`: The privilege type to be granted. Only one privilege type is shown in each row.
- `IS_GRANTABLE`: If you have the `GRANT OPTION` privilege, the value is `YES`; otherwise, the value is `NO`.
 
-### `VIEW` view
+### `VIEWS` view
 
- `TABLE_CATALOG`: The name of the catalog the view belongs to. The value is `def`.
- `TABLE_SCHEMA`: The name of the database to which the view belongs.
diff --git a/docs/MatrixOne/Reference/Variable/system-variables/lower_case_tables_name.md b/docs/MatrixOne/Reference/Variable/system-variables/lower_case_tables_name.md
index 9c3116fad..dd3129645 100644
--- a/docs/MatrixOne/Reference/Variable/system-variables/lower_case_tables_name.md
+++ b/docs/MatrixOne/Reference/Variable/system-variables/lower_case_tables_name.md
@@ -64,7 +64,7 @@ mysql> select Aa from Tt;--Name comparison is case sensitive
 
### Parameter set to 1
 
-将 `lower_case_table_names` Set to 1. identifiers are stored in lowercase and name comparisons are case insensitive.
+Set `lower_case_table_names` to 1. Identifiers are stored in lowercase and name comparisons are case-insensitive.
 
**Example**
 
diff --git a/docs/MatrixOne/Tutorial/django-python-crud-demo.md b/docs/MatrixOne/Tutorial/django-python-crud-demo.md
index 34f23ea7e..7ea56fe6d 100644
--- a/docs/MatrixOne/Tutorial/django-python-crud-demo.md
+++ b/docs/MatrixOne/Tutorial/django-python-crud-demo.md
@@ -77,7 +77,7 @@ Before you begin, confirm that you have downloaded and installed the following s
 
Enter the ip of your server in your browser (here we enter the native IP address: 127.0.0.1:8000) and the port number. If it starts normally, the output is as follows:
 
- ![](https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/tutorial/django/django-1.png)
+ ![](https://github.com/matrixorigin/artwork/blob/main/docs/tutorial/django/django-1.png?raw=true)
 
4. We found the DATABASES configuration item in the project's settings.py file and modified its information to:
@@ -156,7 +156,7 @@ The Django model uses its own ORM. The above class name represents the database
 
ORM correspondence table:
 
<div align="center">
- +
Refer to: for more model field types. @@ -214,7 +214,7 @@ python3 manage.py runserver 0.0.0.0:8000 Enter the ip of your server in your browser (here we enter the native IP address: 127.0.0.1:8000) and the port number. If it starts normally, the output is as follows:
- +
- Connecting to the database to query the data, you can see that the data was successfully inserted: @@ -260,13 +260,13 @@ python3 manage.py runserver 0.0.0.0:8000 Enter the ip of your server in your browser (here we enter the native IP address: 127.0.0.1:8000) and the port number. If it starts normally, the output is as follows:
- +
The command line results are:
- +
## Update Data @@ -298,7 +298,7 @@ python3 manage.py runserver 0.0.0.0:8000 Enter the ip of your server in your browser (here we enter the native IP address: 127.0.0.1:8000) and the port number. If it starts normally, the output is as follows:
- +
- Looking at the *testmodel\_book table*, you can see that the data was updated successfully: @@ -340,7 +340,7 @@ python3 manage.py runserver 0.0.0.0:8000 Enter the ip of your server in your browser (here we enter the native IP address: 127.0.0.1:8000) and the port number. If it starts normally, the output is as follows:
- +
- Looking at the *testmodel\_book table*, you can see that the data was successfully deleted. diff --git a/docs/MatrixOne/Tutorial/rag-demo.md b/docs/MatrixOne/Tutorial/rag-demo.md index 356576fa2..01ae6b52c 100644 --- a/docs/MatrixOne/Tutorial/rag-demo.md +++ b/docs/MatrixOne/Tutorial/rag-demo.md @@ -7,13 +7,13 @@ RAG, known as Retrieval-Augmented Generation, is a technology that combines info For example, when I asked GPT about the latest version of MatrixOne, it didn't give an answer.
- +
In addition, these models can sometimes produce misleading, factually incorrect content. For example, when I asked about the relationship between Lu Xun and Zhou Shuren, GPT responded with confident nonsense.
 
<div align="center">
- +
To solve the above problem, we could retrain the LLM, but at a high cost. The main advantage of RAG is that it avoids retraining the model for specific tasks. Its high availability and low barrier to entry make it one of the most popular architectures in LLM systems, and many LLM applications are built on it. The core idea of RAG is that, when generating responses, the model does not rely only on what it learned during training; it also draws on external, up-to-date, proprietary sources of information, so users can improve the model's output by enriching the input with additional external knowledge bases as the situation requires.
@@ -27,7 +27,7 @@ RAG's workflow typically consists of the following steps:
 
The following is a flow chart for Native RAG:
 
<div align="center">
- +
As you can see, the retrieval stage plays a crucial role in the RAG architecture, and MatrixOne's vector retrieval capability provides powerful data retrieval support for building RAG applications.
diff --git a/docs/MatrixOne/Tutorial/search-picture-demo.md b/docs/MatrixOne/Tutorial/search-picture-demo.md
index 401ee3de5..7683cfac9 100644
--- a/docs/MatrixOne/Tutorial/search-picture-demo.md
+++ b/docs/MatrixOne/Tutorial/search-picture-demo.md
@@ -5,7 +5,7 @@ Currently, graphic and text search applications cover a wide range of areas. In
 
The following is a flow chart of a graphic search:
 
<div align="center">
- +
As you can see, building image and text search applications involves vectorized storage and retrieval of images, and MatrixOne's vector capabilities and multiple retrieval methods provide critical technical support for such applications.
@@ -224,13 +224,13 @@ if __name__ == "__main__":
 
In the image search results, the first picture on the left is the reference picture. As you can see, the retrieved pictures are very similar to it:
 
<div align="center">
- +
As you can see from the text search results, the searched image matches the input text:
- +
## Reference Documents diff --git a/mkdocs.yml b/mkdocs.yml index ea40f82bd..4e3e84606 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -289,7 +289,7 @@ nav: - SQL Mode: MatrixOne/Reference/Variable/system-variables/sql-mode.md - Save query result support: MatrixOne/Reference/Variable/system-variables/save_query_result.md - Timezone support: MatrixOne/Reference/Variable/system-variables/timezone.md - - Lowercase table names support: MatrixOne/Reference/Variable/system-variables/lower_case_tables_name.md + - Lower case table names support: MatrixOne/Reference/Variable/system-variables/lower_case_tables_name.md - Foreign key checking support: MatrixOne/Reference/Variable/system-variables/foreign_key_checks.md - User-specified case consistency support for query result set column names: MatrixOne/Reference/Variable/system-variables/keep_user_target_list_in_result.md - Custom variable: MatrixOne/Reference/Variable/custom-variable.md