From 6d86852b64e515dbe2f88a71c215a828bd6ba826 Mon Sep 17 00:00:00 2001 From: MoonGyu1 Date: Wed, 16 Aug 2023 15:20:50 +0900 Subject: [PATCH 01/10] Create template --- design/concurrent-tree-editing.md | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) create mode 100644 design/concurrent-tree-editing.md diff --git a/design/concurrent-tree-editing.md b/design/concurrent-tree-editing.md new file mode 100644 index 000000000..9cf76b99b --- /dev/null +++ b/design/concurrent-tree-editing.md @@ -0,0 +1,29 @@ +--- +title: concurrent-tree-editing +target-version: 0.4.6 +--- + +# Concurrent Tree Editing + +## Summary + +Write a brief description of the feature here. + +### Goals + +List the specific goals of the proposal. How will we know that this has +succeeded? + +### Non-Goals + +What is out of scope for this proposal? Listing non-goals helps to focus +discussion and make progress. + +## Proposal Details + +This is where we detail how to use the feature with snippet or API and describe +the internal implementation. + +### Risks and Mitigation + +What are the risks of this proposal and how do we mitigate. From 285c11f36105a37ea50afccd357631ca4b925dae Mon Sep 17 00:00:00 2001 From: MoonGyu1 Date: Wed, 16 Aug 2023 15:20:50 +0900 Subject: [PATCH 02/10] Create template --- design/concurrent-tree-editing.md | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) create mode 100644 design/concurrent-tree-editing.md diff --git a/design/concurrent-tree-editing.md b/design/concurrent-tree-editing.md new file mode 100644 index 000000000..9cf76b99b --- /dev/null +++ b/design/concurrent-tree-editing.md @@ -0,0 +1,29 @@ +--- +title: concurrent-tree-editing +target-version: 0.4.6 +--- + +# Concurrent Tree Editing + +## Summary + +Write a brief description of the feature here. + +### Goals + +List the specific goals of the proposal. How will we know that this has +succeeded? + +### Non-Goals + +What is out of scope for this proposal? 
Listing non-goals helps to focus +discussion and make progress. + +## Proposal Details + +This is where we detail how to use the feature with snippet or API and describe +the internal implementation. + +### Risks and Mitigation + +What are the risks of this proposal and how do we mitigate. From b376d0329bc6975ae95c6315198fc77e156ce085 Mon Sep 17 00:00:00 2001 From: MoonGyu1 Date: Fri, 25 Aug 2023 11:11:33 +0900 Subject: [PATCH 03/10] Add draft docs --- design/concurrent-tree-editing.md | 151 ++++++++++++++++++++++++++++-- 1 file changed, 143 insertions(+), 8 deletions(-) diff --git a/design/concurrent-tree-editing.md b/design/concurrent-tree-editing.md index 9cf76b99b..b5fb94677 100644 --- a/design/concurrent-tree-editing.md +++ b/design/concurrent-tree-editing.md @@ -7,23 +7,158 @@ target-version: 0.4.6 ## Summary -Write a brief description of the feature here. +In Yorkie, users can create and edit JSON-like documents using JSON-like data structures such as `Primitive`, `Object`, `Array`, `Text`, and `Tree`. Among these, the `Tree` structure is used to represent the document model of a tree-based text editor, similar to XML. + +This document introduces the `Tree` data structure, and explains the operations provided by `Tree`, focusing on the `Tree` coordinate system and the logic of the `Tree.Edit` operation. Furthermore, it explains how this logic ensures eventual consistency in concurrent document editing scenarios. ### Goals -List the specific goals of the proposal. How will we know that this has -succeeded? +This document aims to help new SDK contributors understand the overall `Tree`` data structure and explain how Yorkie ensures consistency when multiple clients are editing concurrently. ### Non-Goals -What is out of scope for this proposal? Listing non-goals helps to focus -discussion and make progress. +This document focuses on `Tree.Edit` operations rather than `Tree.Style`. 
## Proposal Details -This is where we detail how to use the feature with snippet or API and describe -the internal implementation. +### JSON-like Tree + +In yorkie, a JSON-like `Tree` is used to represent the document model of a tree-based text editor. + +This tree-based document model resembles XML tree and consists of element nodes and text nodes. element nodes can have attributes, and text nodes contain a string as their value. For example: + +[이미지] + +**Operation** + +The JSON-like `Tree` provides specialized operations tailored for text editing rather than typical operations of a general tree. To specify the operation's range, an `index` is used. For example: + +[이미지] + +These `index`es are assigned in order at positions where the user's cursor can reach. These `index`es draw inspiration from ProseMirror's index and share a similar structural concept. + +1. `Tree.Edit` + +Users can use the `Edit` operation to insert or delete nodes within the `Tree`. + +[코드] + +Where `fromIdx` is the starting position of editing, `toIdx` is the ending position, and `contents` represent the nodes to be inserted. If `contents` are omitted, the operation only deletes nodes between `fromIdx` and `toIdx`. + +[이미지] + +[코드] + +Similarly, users can specify the editing range using a `path` that leads to the `Tree`'s node in the type of `[]int`. + +2. `Tree.Style` + +Users can use the `Style` operation to specify attributes for the element nodes in the `Tree`. + +[코드] + +### Implementation of Edit Operation + +**Tree Coordinate System** + +[이미지] + +Yorkie implements the above data structure to create a JSON-like `Document`, which consists of different layers, each with its own coordinate system. The dependency graph above can be divided into three main groups. The **JSON-like** group directly used by users to edit JSON-like `Document`s. The **CRDT** Group is utilized from the JSON-like group to resolve conflicts in concurrent editing situations. 
Finally, the **common** group is used for the detailed implementation of CRDT group and serves general purposes. + +Thus, the JSON-like `Tree`, introduced in this document, has dependencies such as **'`Tree` → `CRDTTree` → `IndexTree`'**, and each layer has its own coordinate system: + +[이미지] + +These coordinate systems transform in the order of '`index(path)` → `IndexTree.TreePos` → `CRDTTree.TreeNodeID` → `CRDTTree.TreePos`'. + +[이미지] + +1. `index` → `IndexTree.TreePos` + +The `index` is the coordinate system used by users for local editing. This `index` is received from the user, and is converted to `IndexTree.TreePos`. This `IndexTree.TreePos` represents the physical position within the local tree and is used for actual tree editing. + +2. `IndexTree.TreePos` → (`CRDTTree.TreeNodeID`) → `CRDTTree.TreePos` + +Next, the obtained `IndexTree.TreePos` is transformed into the logical coordinate system of the distributed tree, represented by `CRDTTree.TreePos`. To achieve this, the given physical position, `IndexTree.TreePos`, is used to find the parent node and left sibling node. Then, a `CRDTTree.TreePos` is created using the unique IDs of the parent node and left sibling node, which are `CRDTTree.TreeNodeID`. This coordinate system is used in subsequent `Tree.Edit` and `Tree.Style` operations. + +In the case of remote editing, where the local coordinate system is received from the user in local editing, there is no need for Step 1 since changes are pulled from the server using `ChangePack` to synchronize the changes. (For more details: [링크]) + +**Tree.Edit Logic** + +The core process of the `Tree.Edit` operation is as follows: + +1. Find `CRDTTree.TreePos` from the given `fromIdx` and `toIdx` (local editing only). +2. Find the corresponding left sibling node and parent node within the `IndexTree` based on `CRDTTree.TreePos`. +3. Delete nodes in the range of `fromTreePos` to `toTreePos`. +4. 
Insert the given nodes at the appropriate positions (insert operation only). + +**[STEP 1]** Find `CRDTTree.TreePos` from the given `fromIdx` and `toIdx` (local editing only) [링크] + +In the case of local editing, the given `index`es are converted to `CRDTTree.TreePos`. The detailed process is the same as described in the 'Tree Coordinate System' above. + +**[STEP 2]** Find the corresponding left sibling node and parent node within the `IndexTree` based on `CRDTTree.TreePos` [링크] + +2-1. For text nodes, if necessary, split nodes at the appropriate positions to find the left sibling node. + +2-2. Determine the sequence of nodes and find the appropriate position. Since `Clone`s[링크] of each client might exist in different states, the `findFloorNode` function is used to find the closest node (lower bound). + +**[STEP 3]** Delete nodes in the range of `fromTreePos` to `toTreePos` [링크] + +3-1. Traverse the range and identify nodes to be removed. If a node is an element node and doesn't include both opening and closing tags, it is excluded from removal. + +3-2. Update the `latestCreatedAtMapByActor` information for each node and mark nodes with tombstones in the `IndexTree` to indicate removal. + +**[STEP 4]** Insert the given nodes at the appropriate positions (insert operation only) [링크] + +4-1. If the left sibling node at the insertion position is the same as the parent node, it means the node will be inserted as the leftmost child of the parent. Hence, the node is inserted at the leftmost position of the parent's children list. + +4-2. Otherwise, the new node is inserted to the right of the left sibling node. + +### How to Guarantee Eventual Consistency + +**Coverage** + +[이미지] + +Using conditions such as range type, node type, and edit type, 27 possible cases of concurrent editing can be represented. + +[이미지] + +Eventual consistency is guaranteed for these 27 cases. 
In addition, eventual consistency is ensured for the following edge cases: + +- Selecting multiple nodes in a multi-level range +- Selecting only a part of nodes (e.g., selecting only the opening tag or closing tag of the node) + +**How does it work?** + +- `lastCreatedAtMapByActor` + +[코드] + +`latestCreatedAtMapByActor` is a map that stores the latest creation time by actor for the nodes included in the editing range. However, relying solely on the typical `lamport` clocks that represent local clock of clients, it's not possible to determine if two events are causally related or concurrent. For instance: + +[이미지] + +In the case of the example above, during the process of synchronizing operations between clients A and B, client A is unaware of the existence of '`c`' when client B performs `Edit(0,2)`. As a result, an issue arises where the element '`c`', which is within the contained range, gets deleted together. + +To address this, the `lastCreatedAtMapByActor` is utilized during operation execution to store final timestamp information for each actor. Subsequently, this information allows us to ascertain the causal relationship between the two events. + +- Restricted to only `insertAfter` + +[코드] + +To ensure consistency in concurrent editing scenarios, only the `insertAfter` operation is allowed, rather than `insertBefore`, similar to conventional CRDT algorithms. To achieve this, `CRDTTree.TreePos` takes a form that includes `LeftSiblingID`, thus always maintaining a reference to the left sibling node. + +If the left sibling node is the same as the parent node, it indicates that the node is positioned at the far left of the parent's children list. + +- `FindOffset` + +[코드] + +During the traversal of the given range in `traverseInPosRange` (STEP3), the process of converting the provided `CRDTTree.TreePos` to an `IndexTree.TreePos` is executed. To determine the `offset` for this conversion, the `FindOffset` function is utilized. 
In doing so, calculating the `offset` excluding the removed nodes prevents potential issues that can arise in concurrent editing scenarios. ### Risks and Mitigation -What are the risks of this proposal and how do we mitigate. +- In the current conflict resolution policy of Yorkie, when both insert and delete operations occur simultaneously, even if the insert range is included in the delete range, the inserted node remains after synchronization. This might not always reflect the user's intention accurately. + +- The `Tree.Edit` logic uses index-based traversal instead of node-based traversal for a clearer implementation. This might lead to a performance impact. If this becomes a concern, switching to node-based traversal can be considered. From 9af96347b6de089fdfa55172a12db1d3c33946e9 Mon Sep 17 00:00:00 2001 From: gyuwonMoon <78714820+MoonGyu1@users.noreply.github.com> Date: Fri, 25 Aug 2023 12:18:25 +0900 Subject: [PATCH 04/10] Add images --- design/concurrent-tree-editing.md | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/design/concurrent-tree-editing.md b/design/concurrent-tree-editing.md index b5fb94677..82b0698be 100644 --- a/design/concurrent-tree-editing.md +++ b/design/concurrent-tree-editing.md @@ -13,7 +13,7 @@ This document introduces the `Tree` data structure, and explains the operations ### Goals -This document aims to help new SDK contributors understand the overall `Tree`` data structure and explain how Yorkie ensures consistency when multiple clients are editing concurrently. +This document aims to help new SDK contributors understand the overall `Tree` data structure and explain how Yorkie ensures consistency when multiple clients are editing concurrently. ### Non-Goals @@ -27,13 +27,14 @@ In yorkie, a JSON-like `Tree` is used to represent the document model of a tree- This tree-based document model resembles XML tree and consists of element nodes and text nodes. 
element nodes can have attributes, and text nodes contain a string as their value. For example: -[이미지] + + **Operation** The JSON-like `Tree` provides specialized operations tailored for text editing rather than typical operations of a general tree. To specify the operation's range, an `index` is used. For example: -[이미지] + These `index`es are assigned in order at positions where the user's cursor can reach. These `index`es draw inspiration from ProseMirror's index and share a similar structural concept. @@ -45,7 +46,7 @@ Users can use the `Edit` operation to insert or delete nodes within the `Tree`. Where `fromIdx` is the starting position of editing, `toIdx` is the ending position, and `contents` represent the nodes to be inserted. If `contents` are omitted, the operation only deletes nodes between `fromIdx` and `toIdx`. -[이미지] + [코드] @@ -61,17 +62,17 @@ Users can use the `Style` operation to specify attributes for the element nodes **Tree Coordinate System** -[이미지] + Yorkie implements the above data structure to create a JSON-like `Document`, which consists of different layers, each with its own coordinate system. The dependency graph above can be divided into three main groups. The **JSON-like** group directly used by users to edit JSON-like `Document`s. The **CRDT** Group is utilized from the JSON-like group to resolve conflicts in concurrent editing situations. Finally, the **common** group is used for the detailed implementation of CRDT group and serves general purposes. Thus, the JSON-like `Tree`, introduced in this document, has dependencies such as **'`Tree` → `CRDTTree` → `IndexTree`'**, and each layer has its own coordinate system: -[이미지] + These coordinate systems transform in the order of '`index(path)` → `IndexTree.TreePos` → `CRDTTree.TreeNodeID` → `CRDTTree.TreePos`'. -[이미지] + 1. 
`index` → `IndexTree.TreePos` @@ -118,11 +119,11 @@ In the case of local editing, the given `index`es are converted to `CRDTTree.Tre **Coverage** -[이미지] + Using conditions such as range type, node type, and edit type, 27 possible cases of concurrent editing can be represented. -[이미지] + Eventual consistency is guaranteed for these 27 cases. In addition, eventual consistency is ensured for the following edge cases: @@ -137,7 +138,7 @@ Eventual consistency is guaranteed for these 27 cases. In addition, eventual con `latestCreatedAtMapByActor` is a map that stores the latest creation time by actor for the nodes included in the editing range. However, relying solely on the typical `lamport` clocks that represent local clock of clients, it's not possible to determine if two events are causally related or concurrent. For instance: -[이미지] + In the case of the example above, during the process of synchronizing operations between clients A and B, client A is unaware of the existence of '`c`' when client B performs `Edit(0,2)`. As a result, an issue arises where the element '`c`', which is within the contained range, gets deleted together. From 105bead60e2123cfac724cdbea3391be6b5e94d9 Mon Sep 17 00:00:00 2001 From: gyuwonMoon <78714820+MoonGyu1@users.noreply.github.com> Date: Fri, 25 Aug 2023 15:59:15 +0900 Subject: [PATCH 05/10] Add path example --- design/concurrent-tree-editing.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/design/concurrent-tree-editing.md b/design/concurrent-tree-editing.md index 82b0698be..f9b473532 100644 --- a/design/concurrent-tree-editing.md +++ b/design/concurrent-tree-editing.md @@ -27,17 +27,19 @@ In yorkie, a JSON-like `Tree` is used to represent the document model of a tree- This tree-based document model resembles XML tree and consists of element nodes and text nodes. element nodes can have attributes, and text nodes contain a string as their value. 
For example: - + **Operation** -The JSON-like `Tree` provides specialized operations tailored for text editing rather than typical operations of a general tree. To specify the operation's range, an `index` is used. For example: +The JSON-like `Tree` provides specialized operations tailored for text editing rather than typical operations of a general tree. To specify the operation's range, an `index` or `path` is used. For example: - + These `index`es are assigned in order at positions where the user's cursor can reach. These `index`es draw inspiration from ProseMirror's index and share a similar structural concept. +In the case of a `path`, it contains `offset`s of each node from the root node as elements except the last. The last element of the `path` represents the position in the parent node. For example, the `path` of the position between '`k`' and '`i`' is `[1, 4]`. The first element of the `path` is the `offset` of the `` in `

` and the second element represents the position between '`k`' and '`i`' in ``. + 1. `Tree.Edit` Users can use the `Edit` operation to insert or delete nodes within the `Tree`. @@ -119,7 +121,7 @@ In the case of local editing, the given `index`es are converted to `CRDTTree.Tre **Coverage** - + Using conditions such as range type, node type, and edit type, 27 possible cases of concurrent editing can be represented. From 9bcfa93bbb81eb9a2fc51179f29af2e0d7e1bef6 Mon Sep 17 00:00:00 2001 From: gyuwonMoon <78714820+MoonGyu1@users.noreply.github.com> Date: Fri, 25 Aug 2023 16:45:55 +0900 Subject: [PATCH 06/10] Add code lines --- design/concurrent-tree-editing.md | 31 ++++++++++++++++--------------- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/design/concurrent-tree-editing.md b/design/concurrent-tree-editing.md index f9b473532..2fe6fe31f 100644 --- a/design/concurrent-tree-editing.md +++ b/design/concurrent-tree-editing.md @@ -44,21 +44,21 @@ In the case of a `path`, it contains `offset`s of each node from the root node a Users can use the `Edit` operation to insert or delete nodes within the `Tree`. -[코드] +https://github.com/yorkie-team/yorkie/blob/fd3b15c7d2c482464b6c8470339bcc497204114e/pkg/document/json/tree.go#L115-L131 Where `fromIdx` is the starting position of editing, `toIdx` is the ending position, and `contents` represent the nodes to be inserted. If `contents` are omitted, the operation only deletes nodes between `fromIdx` and `toIdx`. -[코드] - Similarly, users can specify the editing range using a `path` that leads to the `Tree`'s node in the type of `[]int`. +https://github.com/yorkie-team/yorkie/blob/fd3b15c7d2c482464b6c8470339bcc497204114e/pkg/document/json/tree.go#L217-L237 + 2. `Tree.Style` Users can use the `Style` operation to specify attributes for the element nodes in the `Tree`. 
-[코드] +https://github.com/yorkie-team/yorkie/blob/fd3b15c7d2c482464b6c8470339bcc497204114e/pkg/document/json/tree.go#L239-L268 ### Implementation of Edit Operation @@ -66,9 +66,9 @@ Users can use the `Style` operation to specify attributes for the element nodes -Yorkie implements the above data structure to create a JSON-like `Document`, which consists of different layers, each with its own coordinate system. The dependency graph above can be divided into three main groups. The **JSON-like** group directly used by users to edit JSON-like `Document`s. The **CRDT** Group is utilized from the JSON-like group to resolve conflicts in concurrent editing situations. Finally, the **common** group is used for the detailed implementation of CRDT group and serves general purposes. +Yorkie implements the above [data structure](https://github.com/yorkie-team/yorkie/blob/main/design/data-structure.md) to create a JSON-like `Document`, which consists of different layers, each with its own coordinate system. The dependency graph above can be divided into three main groups. The **JSON-like** group directly used by users to edit JSON-like `Document`s. The **CRDT** Group is utilized from the JSON-like group to resolve conflicts in concurrent editing situations. Finally, the **common** group is used for the detailed implementation of CRDT group and serves general purposes. -Thus, the JSON-like `Tree`, introduced in this document, has dependencies such as **'`Tree` → `CRDTTree` → `IndexTree`'**, and each layer has its own coordinate system: +Thus, the JSON-like `Tree`, introduced in this document, has dependencies such as '`Tree` → `CRDTTree` → `IndexTree`', and each layer has its own coordinate system: @@ -84,7 +84,7 @@ The `index` is the coordinate system used by users for local editing. This `inde Next, the obtained `IndexTree.TreePos` is transformed into the logical coordinate system of the distributed tree, represented by `CRDTTree.TreePos`. 
To achieve this, the given physical position, `IndexTree.TreePos`, is used to find the parent node and left sibling node. Then, a `CRDTTree.TreePos` is created using the unique IDs of the parent node and left sibling node, which are `CRDTTree.TreeNodeID`. This coordinate system is used in subsequent `Tree.Edit` and `Tree.Style` operations. -In the case of remote editing, where the local coordinate system is received from the user in local editing, there is no need for Step 1 since changes are pulled from the server using `ChangePack` to synchronize the changes. (For more details: [링크]) +In the case of remote editing, where the local coordinate system is received from the user in local editing, there is no need for Step 1 since changes are pulled from the server using `ChangePack` to synchronize the changes. Refer to the [document-editing](https://github.com/yorkie-team/yorkie/blob/main/design/document-editing.md) for more details. **Tree.Edit Logic** @@ -95,23 +95,23 @@ The core process of the `Tree.Edit` operation is as follows: 3. Delete nodes in the range of `fromTreePos` to `toTreePos`. 4. Insert the given nodes at the appropriate positions (insert operation only). -**[STEP 1]** Find `CRDTTree.TreePos` from the given `fromIdx` and `toIdx` (local editing only) [링크] +**[[STEP 1]](https://github.com/yorkie-team/yorkie/blob/fd3b15c7d2c482464b6c8470339bcc497204114e/pkg/document/json/tree.go#L121C1-L128)** Find `CRDTTree.TreePos` from the given `fromIdx` and `toIdx` (local editing only) In the case of local editing, the given `index`es are converted to `CRDTTree.TreePos`. The detailed process is the same as described in the 'Tree Coordinate System' above. 
-**[STEP 2]** Find the corresponding left sibling node and parent node within the `IndexTree` based on `CRDTTree.TreePos` [링크] +**[[STEP 2]](https://github.com/yorkie-team/yorkie/blob/fd3b15c7d2c482464b6c8470339bcc497204114e/pkg/document/crdt/tree.go#L572C1-L580C3)** Find the corresponding left sibling node and parent node within the `IndexTree` based on `CRDTTree.TreePos` 2-1. For text nodes, if necessary, split nodes at the appropriate positions to find the left sibling node. 2-2. Determine the sequence of nodes and find the appropriate position. Since `Clone`s[링크] of each client might exist in different states, the `findFloorNode` function is used to find the closest node (lower bound). -**[STEP 3]** Delete nodes in the range of `fromTreePos` to `toTreePos` [링크] +**[[STEP 3]](https://github.com/yorkie-team/yorkie/blob/fd3b15c7d2c482464b6c8470339bcc497204114e/pkg/document/crdt/tree.go#L582-L640)** Delete nodes in the range of `fromTreePos` to `toTreePos` 3-1. Traverse the range and identify nodes to be removed. If a node is an element node and doesn't include both opening and closing tags, it is excluded from removal. 3-2. Update the `latestCreatedAtMapByActor` information for each node and mark nodes with tombstones in the `IndexTree` to indicate removal. -**[STEP 4]** Insert the given nodes at the appropriate positions (insert operation only) [링크] +**[[STEP 4]](https://github.com/yorkie-team/yorkie/blob/fd3b15c7d2c482464b6c8470339bcc497204114e/pkg/document/crdt/tree.go#L642-L681)** Insert the given nodes at the appropriate positions (insert operation only) 4-1. If the left sibling node at the insertion position is the same as the parent node, it means the node will be inserted as the leftmost child of the parent. Hence, the node is inserted at the leftmost position of the parent's children list. @@ -127,7 +127,7 @@ Using conditions such as range type, node type, and edit type, 27 possible cases -Eventual consistency is guaranteed for these 27 cases. 
In addition, eventual consistency is ensured for the following edge cases: +Eventual consistency is guaranteed for these [27 cases](https://github.com/yorkie-team/yorkie/blob/fd3b15c7d2c482464b6c8470339bcc497204114e/test/integration/tree_test.go#L736-L2094). In addition, eventual consistency is ensured for the following edge cases: - Selecting multiple nodes in a multi-level range - Selecting only a part of nodes (e.g., selecting only the opening tag or closing tag of the node) @@ -136,7 +136,8 @@ Eventual consistency is guaranteed for these 27 cases. In addition, eventual con - `lastCreatedAtMapByActor` -[코드] +https://github.com/yorkie-team/yorkie/blob/81137b32d0d1d3d36be5b63652e5ab0273f536de/pkg/document/operations/tree_edit.go#L36-L38 + `latestCreatedAtMapByActor` is a map that stores the latest creation time by actor for the nodes included in the editing range. However, relying solely on the typical `lamport` clocks that represent local clock of clients, it's not possible to determine if two events are causally related or concurrent. For instance: @@ -148,7 +149,7 @@ To address this, the `lastCreatedAtMapByActor` is utilized during operation exec - Restricted to only `insertAfter` -[코드] +https://github.com/yorkie-team/yorkie/blob/422901861aedbd3a86fdcb9cf3b5740d6daf38eb/pkg/index/tree.go#L552-L570 To ensure consistency in concurrent editing scenarios, only the `insertAfter` operation is allowed, rather than `insertBefore`, similar to conventional CRDT algorithms. To achieve this, `CRDTTree.TreePos` takes a form that includes `LeftSiblingID`, thus always maintaining a reference to the left sibling node. 
@@ -156,7 +157,7 @@ If the left sibling node is the same as the parent node, it indicates that the n - `FindOffset` -[코드] +https://github.com/yorkie-team/yorkie/blob/422901861aedbd3a86fdcb9cf3b5740d6daf38eb/pkg/index/tree.go#L393-L412 During the traversal of the given range in `traverseInPosRange` (STEP3), the process of converting the provided `CRDTTree.TreePos` to an `IndexTree.TreePos` is executed. To determine the `offset` for this conversion, the `FindOffset` function is utilized. In doing so, calculating the `offset` excluding the removed nodes prevents potential issues that can arise in concurrent editing scenarios. From a3170decae499c9702cd9fba26d4db2a502d6e74 Mon Sep 17 00:00:00 2001 From: gyuwonMoon <78714820+MoonGyu1@users.noreply.github.com> Date: Fri, 25 Aug 2023 17:18:26 +0900 Subject: [PATCH 07/10] Update data structure related to Tree --- design/data-structure.md | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/design/data-structure.md b/design/data-structure.md index b736810a2..bfab2f745 100644 --- a/design/data-structure.md +++ b/design/data-structure.md @@ -1,6 +1,6 @@ --- title: data-structure -target-version: 0.3.1 +target-version: 0.4.6 --- # Data Structures @@ -28,7 +28,9 @@ The `json` and `crdt` package has data structures for representing the contents Below is the dependency graph of data structures used in a JSON-like document. -![data-structure](./media/data-structure.png) + + + The data structures can be divided into three groups: @@ -45,7 +47,9 @@ JSON-like data strucutres are used when editing JSON-like documents. - `Primitive`: represents primitive data like `string`, `number`, `boolean`, `null`, etc. - `Object`: represents [object type](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object) of JavaScript. Just like JavaScript, you can use `Object` as [hash table](https://en.wikipedia.org/wiki/Hash_table). 
- `Array`: represents [array type](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array) of JavaScript. You can also use `Array` as [list](https://en.wikipedia.org/wiki/List_(abstract_data_type)). -- `Text`: represents text with style attributes in rich text editors such as [Quill](https://github.com/yorkie-team/yorkie-js-sdk/blob/main/examples/quill.html). Users can express styles such as bold, italic, and underline to text content. Of course, it can represent just a plain text in text-based editors such as [CodeMirror](https://github.com/yorkie-team/yorkie-js-sdk/blob/main/examples/index.html). It supports collaborative editing; multiple users can modify parts of the contents without conflict. +- `Text`: represents text with style attributes in rich text editors such as [Quill](https://quilljs.com/). Users can express styles such as bold, italic, and underline to text content. Of course, it can represent just a plain text in text-based editors such as [CodeMirror](https://codemirror.net). It supports collaborative editing; multiple users can modify parts of the contents without conflict. +- `Counter`: represents a counter in the document. As a proxy for the CRDT counter, it is used when the user manipulates the counter from the outside. +- `Tree`: represents CRDT-based tree structure that is used to represent the document tree of text-based editor such as [ProseMirror](https://prosemirror.net/). JSON-like data structures can be edited through proxies. For example: @@ -72,7 +76,7 @@ CRDT data structures are used by JSON-like group to resolve conflicts in concurr - `ElementRHT`: similar to `RHT`, but has elements as values. - `RGATreeList`: extended `RGA(Replicated Growable Array)` with an additional index tree. The index tree manages the indices of elements and provides faster access to elements at the int-based index. 
- `RGATreeSplit`: extended `RGATreeList` allowing characters to be represented as blocks rather than each single character. - +- `CRDTTree`: represents the CRDT tree with an index tree structure'. It resolves conflicts arising from concurrent editing. ### Common Group Common data structures can be used for general purposes. @@ -80,7 +84,8 @@ Common data structures can be used for general purposes. - [`SplayTree`](https://en.wikipedia.org/wiki/Splay_tree): A tree that moves nodes to the root by splaying. This is effective when user frequently access the same location, such as text editing. We use `SplayTree` as an index tree to give each node a weight, and to quickly access the node based on the index. - [`LLRBTree`](https://en.wikipedia.org/wiki/Left-leaning_red%E2%80%93black_tree): A tree simpler than Red-Black Tree. Newly added `floor` method finds the node of the largest key less than or equal to the given key. - [`Trie`](https://en.wikipedia.org/wiki/Trie): A data structure that can quickly search for prefixes of sequence data such as strings. We use `Trie` to remove nested events when the contents of the `Document` are modified at once. - +- `IndexTree`: A tree implementation to represent a document of text-based editors. + ### Risks and Mitigation We can replace the data structures with better ones for some reason, such as performance. For example, `SplayTree` used in `RGATreeList` can be replaced with [TreeList](https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/list/TreeList.html). 
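The `floor` lookup that the data-structure notes above attribute to `LLRBTree` (largest key less than or equal to the given key), and that STEP 2-2 relies on through `findFloorNode`, can be sketched as follows. This is only an illustration under stated assumptions: it swaps the left-leaning red-black tree for a sorted slice with binary search, and the function name `floor` and the integer keys are hypothetical, not Yorkie's API.

```go
package main

import (
	"fmt"
	"sort"
)

// floor returns the largest key in sortedKeys that is less than or equal to
// target. The boolean is false when every key is greater than target.
// sortedKeys must be sorted in ascending order.
// NOTE: hypothetical helper for illustration; Yorkie uses an LLRB tree instead.
func floor(sortedKeys []int, target int) (int, bool) {
	// sort.Search returns the first index whose key is strictly greater
	// than target, or len(sortedKeys) if there is no such key.
	i := sort.Search(len(sortedKeys), func(i int) bool {
		return sortedKeys[i] > target
	})
	if i == 0 {
		// No key is less than or equal to target.
		return 0, false
	}
	return sortedKeys[i-1], true
}

func main() {
	keys := []int{10, 20, 30}
	fmt.Println(floor(keys, 25)) // closest key at or below 25
	fmt.Println(floor(keys, 30)) // exact match
	fmt.Println(floor(keys, 5))  // nothing at or below 5
}
```

A lower-bound lookup of this kind lets a replica resolve a position even when the exact key it was given is not present locally, which is the situation the document describes for clones that are in different states.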
From 2691c5f421c5a1a485af9bfda2750d99eb68260d Mon Sep 17 00:00:00 2001 From: gyuwonMoon <78714820+MoonGyu1@users.noreply.github.com> Date: Fri, 25 Aug 2023 17:32:11 +0900 Subject: [PATCH 08/10] Replace JSON-like tree to XML-like tree --- design/concurrent-tree-editing.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/design/concurrent-tree-editing.md b/design/concurrent-tree-editing.md index 2fe6fe31f..7c3a0bfa9 100644 --- a/design/concurrent-tree-editing.md +++ b/design/concurrent-tree-editing.md @@ -21,9 +21,9 @@ This document focuses on `Tree.Edit` operations rather than `Tree.Style`. ## Proposal Details -### JSON-like Tree +### XML-like Tree -In yorkie, a JSON-like `Tree` is used to represent the document model of a tree-based text editor. +In yorkie, a XML-like `Tree` is used to represent the document model of a tree-based text editor. This tree-based document model resembles XML tree and consists of element nodes and text nodes. element nodes can have attributes, and text nodes contain a string as their value. For example: @@ -32,7 +32,7 @@ This tree-based document model resembles XML tree and consists of element nodes **Operation** -The JSON-like `Tree` provides specialized operations tailored for text editing rather than typical operations of a general tree. To specify the operation's range, an `index` or `path` is used. For example: +The XML-like `Tree` provides specialized operations tailored for text editing rather than typical operations of a general tree. To specify the operation's range, an `index` or `path` is used. For example: @@ -68,7 +68,7 @@ https://github.com/yorkie-team/yorkie/blob/fd3b15c7d2c482464b6c8470339bcc4972041 Yorkie implements the above [data structure](https://github.com/yorkie-team/yorkie/blob/main/design/data-structure.md) to create a JSON-like `Document`, which consists of different layers, each with its own coordinate system. The dependency graph above can be divided into three main groups. 
The **JSON-like** group directly used by users to edit JSON-like `Document`s. The **CRDT** Group is utilized from the JSON-like group to resolve conflicts in concurrent editing situations. Finally, the **common** group is used for the detailed implementation of CRDT group and serves general purposes. -Thus, the JSON-like `Tree`, introduced in this document, has dependencies such as '`Tree` → `CRDTTree` → `IndexTree`', and each layer has its own coordinate system: +Thus, the `Tree`, introduced in this document, has dependencies such as '`Tree` → `CRDTTree` → `IndexTree`', and each layer has its own coordinate system: From 2fa0a56ed3cedb77eefe52ac7814c0fece9c722a Mon Sep 17 00:00:00 2001 From: MoonGyu1 Date: Fri, 25 Aug 2023 17:50:06 +0900 Subject: [PATCH 09/10] Rename docs concurrent-tree-editing to tree --- design/{concurrent-tree-editing.md => tree.md} | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) rename design/{concurrent-tree-editing.md => tree.md} (99%) diff --git a/design/concurrent-tree-editing.md b/design/tree.md similarity index 99% rename from design/concurrent-tree-editing.md rename to design/tree.md index 7c3a0bfa9..76fa283cf 100644 --- a/design/concurrent-tree-editing.md +++ b/design/tree.md @@ -1,9 +1,9 @@ --- -title: concurrent-tree-editing +title: tree target-version: 0.4.6 --- -# Concurrent Tree Editing +# Tree ## Summary @@ -29,7 +29,6 @@ This tree-based document model resembles XML tree and consists of element nodes - **Operation** The XML-like `Tree` provides specialized operations tailored for text editing rather than typical operations of a general tree. To specify the operation's range, an `index` or `path` is used. 
For example: @@ -138,7 +137,6 @@ Eventual consistency is guaranteed for these [27 cases](https://github.com/yorki https://github.com/yorkie-team/yorkie/blob/81137b32d0d1d3d36be5b63652e5ab0273f536de/pkg/document/operations/tree_edit.go#L36-L38 - `latestCreatedAtMapByActor` is a map that stores the latest creation time by actor for the nodes included in the editing range. However, relying solely on the typical `lamport` clocks that represent local clock of clients, it's not possible to determine if two events are causally related or concurrent. For instance: From 0293c8dea2e9b9ef2735997bee6d18ca40c14ba2 Mon Sep 17 00:00:00 2001 From: MoonGyu1 Date: Fri, 25 Aug 2023 18:11:21 +0900 Subject: [PATCH 10/10] Remove link at Clone --- design/tree.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/tree.md b/design/tree.md index 76fa283cf..ab7071acc 100644 --- a/design/tree.md +++ b/design/tree.md @@ -102,7 +102,7 @@ In the case of local editing, the given `index`es are converted to `CRDTTree.Tre 2-1. For text nodes, if necessary, split nodes at the appropriate positions to find the left sibling node. -2-2. Determine the sequence of nodes and find the appropriate position. Since `Clone`s[링크] of each client might exist in different states, the `findFloorNode` function is used to find the closest node (lower bound). +2-2. Determine the sequence of nodes and find the appropriate position. Since `Clone`s of each client might exist in different states, the `findFloorNode` function is used to find the closest node (lower bound). **[[STEP 3]](https://github.com/yorkie-team/yorkie/blob/fd3b15c7d2c482464b6c8470339bcc497204114e/pkg/document/crdt/tree.go#L582-L640)** Delete nodes in the range of `fromTreePos` to `toTreePos`