Add update dataflow in architecture document #601

Merged
merged 43 commits into from
Aug 7, 2020
Commits
837ebdd
feat: update flow
hlts2 Jul 28, 2020
d8a0097
feat: add steps of update dataflow
hlts2 Jul 28, 2020
2de629b
fix: typo
hlts2 Jul 30, 2020
a971a4a
fix: improve contents
hlts2 Jul 30, 2020
c00b169
fix: on -> in
hlts2 Jul 30, 2020
a4ca8fa
feat: add image
hlts2 Jul 30, 2020
290dd47
feat: add image tag
hlts2 Jul 30, 2020
4653f56
fix: update image
hlts2 Jul 30, 2020
a7760b4
Update docs/overview/architecture.md
hlts2 Jul 30, 2020
04614c3
Update docs/overview/architecture.md
hlts2 Jul 30, 2020
90659fa
Update docs/overview/architecture.md
hlts2 Jul 30, 2020
f2523d6
Update docs/overview/architecture.md
hlts2 Jul 30, 2020
f5a63d4
Update docs/overview/architecture.md
hlts2 Jul 30, 2020
a7d45c2
fix: update image
hlts2 Jul 30, 2020
400e5da
Update docs/overview/architecture.md
hlts2 Jul 31, 2020
b93010f
Update docs/overview/architecture.md
hlts2 Jul 31, 2020
81d4e46
Update docs/overview/architecture.md
hlts2 Aug 3, 2020
873f477
Update docs/overview/architecture.md
hlts2 Aug 3, 2020
d525d0c
Update docs/overview/architecture.md
hlts2 Aug 3, 2020
abf8c06
Update docs/overview/architecture.md
hlts2 Aug 3, 2020
3bd4502
Update docs/overview/architecture.md
hlts2 Aug 3, 2020
d50b247
feat: git rebase master
hlts2 Aug 4, 2020
58cfca3
Update docs/overview/architecture.md
hlts2 Aug 5, 2020
2dccc93
Update docs/overview/architecture.md
hlts2 Aug 5, 2020
4560895
Update docs/overview/architecture.md
hlts2 Aug 5, 2020
3244c88
Update docs/overview/architecture.md
hlts2 Aug 5, 2020
16cf30c
Update docs/overview/architecture.md
hlts2 Aug 5, 2020
a3d61d9
Update docs/overview/architecture.md
hlts2 Aug 5, 2020
849582c
Update docs/overview/architecture.md
hlts2 Aug 5, 2020
d68a0c9
Update docs/overview/architecture.md
hlts2 Aug 5, 2020
5823dc8
Update docs/overview/architecture.md
hlts2 Aug 5, 2020
041fc48
fix: apply suggestion
hlts2 Aug 5, 2020
5b82036
fix: apply suggestion
hlts2 Aug 5, 2020
1191359
fix: apply suggestion
hlts2 Aug 5, 2020
7a1f04a
fix: apply suggestion
hlts2 Aug 5, 2020
35a012b
Update docs/overview/architecture.md
hlts2 Aug 5, 2020
80ea004
Update docs/overview/architecture.md
hlts2 Aug 6, 2020
0559f28
Update docs/overview/architecture.md
hlts2 Aug 6, 2020
c05f724
Update docs/overview/architecture.md
hlts2 Aug 6, 2020
85a0ec3
Update docs/overview/architecture.md
hlts2 Aug 6, 2020
75fc31c
Update docs/overview/architecture.md
hlts2 Aug 6, 2020
f766a05
fix: all of the -> all the
hlts2 Aug 6, 2020
7b789e0
Update docs/overview/architecture.md
hlts2 Aug 6, 2020
1 change: 1 addition & 0 deletions assets/docs/update_flow.drawio

Large diffs are not rendered by default.

Binary file added assets/docs/update_flow.png
32 changes: 31 additions & 1 deletion docs/overview/architecture.md
@@ -111,7 +111,37 @@ When the user searches a vector from Vald:
15. Vald Egress Filter will return the filtered result to the Vald Filter Gateway.
16. Vald Filter Gateway will return the final result to the Vald Ingress.

<!-- ### Update -->
### Update

<img src="../../assets/docs/update_flow.png" />

When the user updates a vector in Vald:

1. Vald Ingress receives the request from the user. The request includes the existing vector ID(s) and the new vector(s) to be updated.
2. Vald Ingress will forward the request to the Vald Filter Gateway to pre-process the request data.
3. Vald Filter Gateway will forward the request to the user-defined Vald Ingress Filter. After the Vald Ingress Filter receives the request, it will perform the pre-processing logic defined by the user, for example, padding the vector to match the vector dimension in Vald.
4. After the request is processed by the user-defined Vald Ingress Filter, the result will be returned to the Vald Filter Gateway.
5. Vald Filter Gateway will forward the processed data to the Vald Meta Gateway. Vald Meta Gateway resolves the vector ID(s) that the user supplied in the Insert step to the UUID(s) that Vald uses internally, which are stored in Vald Meta.
6. Vald Meta Gateway will forward the request to the Vald Meta to confirm whether the metadata, which contains the vector ID(s), exists or not.
7. Vald Meta gets the UUID(s) by vector ID(s). It returns an error if no UUID is found.
8. If Vald Meta Gateway gets the UUID(s), Vald Meta Gateway will forward the request with the UUID(s) to the Vald Backup Gateway.
9. Vald Backup Gateway splits the update step into the deletion step and the insertion step. First is the deletion step. Vald Backup Gateway will forward the deletion request with the UUID(s) to the Vald LB Gateway.
10. Vald LB Gateway will broadcast the request with the UUID(s) to the Vald Agents. Each Vald Agent will delete the vector data and the metadata if it finds the corresponding UUID(s) in its in-memory graph index.
11. Each Vald Agent will return success to the Vald LB Gateway if it deletes the request data successfully.
12. After Vald LB Gateway receives success with the location info (e.g. IP address of pod) from the Vald Agent, Vald LB Gateway will return success to the Vald Backup Gateway.
13. Vald Backup Gateway will forward the request with the UUID(s) to the Vald Compressor.
14. Vald Compressor will forward the UUID(s) to the Vald Backup Manager.
15. Vald Backup Manager will delete the data with the same UUID(s).
16. The insertion step described in step 9 will start after the deletion step completes. Vald Backup Gateway will forward the insertion request to the Vald LB Gateway. Vald LB Gateway will determine which Vald Agent(s) will process the request based on the resource usage of the nodes and pods, and the number of vector replicas.
17. Vald LB Gateway will forward each pair of UUID and vector data to the selected Vald Agents in parallel. Each Vald Agent will insert the vector(s) and the UUID(s) into an in-memory vector queue. The vector queue will be committed to the graph index by a `CreateIndex` instruction, which will be executed by the Vald Index Manager.
18. If Vald Agent successfully inserts the request data, it will return success with its location info (e.g. IP address of pod) to the Vald LB Gateway.
19. After Vald LB Gateway receives success from the selected Vald Agents, it will return IP addresses of all selected Vald Agents to the Vald Backup Gateway.
20. Vald Backup Gateway will asynchronously send all the inserted data (including vector, vector ID, UUID and IP address) to the Vald Compressor. Vald Compressor will compress the vector data asynchronously to reduce the size of the vector data.
21. Vald Compressor will forward all the compressed data (including compressed vector, vector ID, UUID and IP address) to the Vald Backup Manager.
22. Vald Backup Manager will store all the data in a persistent layer such as MySQL or Cassandra to prevent data loss in Vald.
23. Vald Backup Gateway returns success to the Vald Meta Gateway.
24. Vald Meta Gateway will return success to the Vald Filter Gateway.
25. Vald Filter Gateway will return success to the Vald Ingress.
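
The update flow above boils down to three ideas: resolve the vector ID to a UUID, delete the old data everywhere, then insert the new data on a few selected agents and back it up. The following is a minimal in-memory sketch of that sequence, not Vald's actual implementation. The names `Agent`, `update`, `NotFoundError`, and the `replicas` parameter are hypothetical, and the agent selection rule (fewest stored vectors) is only an assumed stand-in for the resource-based selection done by the Vald LB Gateway. Filtering and compression are omitted.

```python
from dataclasses import dataclass, field


class NotFoundError(Exception):
    """Raised when a vector ID has no UUID in the metadata store (step 7)."""


@dataclass
class Agent:
    """Hypothetical stand-in for a Vald Agent, identified by its pod IP."""
    ip: str
    index: dict = field(default_factory=dict)  # committed graph index: UUID -> vector
    queue: dict = field(default_factory=dict)  # in-memory vector queue: UUID -> vector

    def delete(self, uuid):
        # Steps 10-11: delete only if this agent actually holds the UUID.
        found = uuid in self.index or uuid in self.queue
        self.index.pop(uuid, None)
        self.queue.pop(uuid, None)
        return found

    def insert(self, uuid, vector):
        # Step 17: the vector waits in the queue until create_index() runs.
        self.queue[uuid] = vector
        return self.ip

    def create_index(self):
        # Stand-in for the CreateIndex instruction: commit the queue to the index.
        self.index.update(self.queue)
        self.queue.clear()


def update(vector_id, vector, meta, agents, backup, replicas=2):
    # Steps 5-7: resolve the user-facing vector ID to the internal UUID.
    uuid = meta.get(vector_id)
    if uuid is None:
        raise NotFoundError(vector_id)

    # Steps 9-15: deletion step. Broadcast to every agent, then drop the backup.
    for agent in agents:
        agent.delete(uuid)
    backup.pop(uuid, None)

    # Steps 16-19: insertion step on the selected agents.
    # Assumption: "fewest stored vectors" is a crude proxy for resource usage.
    selected = sorted(agents, key=lambda a: len(a.index) + len(a.queue))[:replicas]
    ips = [agent.insert(uuid, vector) for agent in selected]

    # Steps 20-22: back up the new data together with the agent locations.
    backup[uuid] = {"vector": vector, "vector_id": vector_id, "ips": ips}
    return ips
```

Calling `update("doc-1", [0.9, 0.8], meta, agents, backup)` with `meta = {"doc-1": "uuid-1"}` removes any old copy of `uuid-1` from all agents first, so a replica that is no longer selected cannot keep serving the stale vector. The new vector becomes searchable on an agent only after `create_index()` commits its queue.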

<!-- ### Upsert -->
