add doc of search-pic-demo #1075

Merged
merged 3 commits into from
May 27, 2024

Conversation

yangj1211
Collaborator

What type of PR is this?

  • Enhancement
  • Displaying
  • Typo
  • Doc Request

Which issue(s) this PR fixes:

issue #1052

What this PR does / why we need it:

@yangj1211 yangj1211 requested a review from dengn May 22, 2024 15:49

@zuyu zuyu left a comment

Awesome work on using MO as a vector db!

Just a few minor change requests.

The following flowchart shows the image (text) to image search process:

<div align="center">
<img src=https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/artwork/docs/tutorial/Vector/search-image.png width=80% heigth=80%/>

Please upload the image to https://github.com/matrixorigin/artwork, and then update the URL to https://github.com/matrixorigin/artwork/blob/main/docs/tutorial/Vector/search-image.png.

Ditto for img_search.png and text_search_pic.png below.

- Download and install the `pymysql` tool. Use the following command to install `pymysql`:

```
pip3 install pymysql
```

Please be consistent with pip and pip3:

  • either use pip here, or
  • change pip below to pip3

Connect to MatrixOne and create a table named `pic_tab` to store the image paths and their corresponding vectors.

```sql
create table pic_tab(pic_path varchar(200),embedding vecf64(512));
```

nit: add a whitespace before embedding


### Traverse the image paths

Define the method `find_img_files` to traverse the local image folder. Here I have stored images of five kinds of fruit locally in advance (apple, banana, blueberry, cherry, and apricot), several images per category, all in 。jpg format.

nit: .jpg, instead of 。jpg

for file in files:
    if file.lower().endswith('.jpg'):
        full_path = os.path.join(root, file)
        img_files.append(full_path) #Build the full file path

nit: add one more whitespace before and after #.

Ditto for the comments for storage_img() below.

Btw, black is recommended for formatting Python code.
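
For reference, the loop quoted above presumably sits inside an `os.walk` traversal. Below is a minimal sketch of what a complete `find_img_files` could look like, formatted roughly the way `black` would leave it. The `directory` parameter name and the enclosing `os.walk` loop are assumptions, since the diff hunk only shows the inner loop.

```python
import os


def find_img_files(directory):
    """Return the full paths of all .jpg files under `directory` (recursive)."""
    img_files = []
    # os.walk yields (root, dirs, files) for every folder below `directory`.
    for root, _dirs, files in os.walk(directory):
        for file in files:
            if file.lower().endswith(".jpg"):
                full_path = os.path.join(root, file)  # Build the full file path
                img_files.append(full_path)
    return img_files
```

Note the two spaces before `#` and the single space after it, which is the comment style requested in this review.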

```python
    return img_files
```

- Vectorize the images and store them in MO

nit: how about replacing MO with "MatrixOne (hereinafter referred to as MO)"?

img_features = img_features .detach().tolist() #Detach the tensor and convert it to a list
embeddings = img_features [0]
insert_sql = "insert into pic_tab(pic_path,embedding) values (%s, normalize_l2(%s))"
data_to_insert = (file_path, str(embeddings ))

nit: remove extra whitespaces, i.e., = in two assignments above and str(embeddings ).
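
With the extra whitespace removed, the insert step could be factored roughly as follows. This is only a sketch: `build_insert` is a hypothetical helper, not part of the PR, and it takes a plain nested list of floats so that it runs without the model. `normalize_l2` is the SQL function already used in the quoted statement.

```python
def build_insert(file_path, img_features):
    """Return the SQL text and parameters for one row of pic_tab.

    `img_features` is assumed to be a nested list such as the result of
    `tensor.detach().tolist()`, i.e. one embedding per input image.
    """
    embeddings = img_features[0]  # Embedding of the first (only) image
    insert_sql = (
        "insert into pic_tab(pic_path, embedding) "
        "values (%s, normalize_l2(%s))"
    )
    # The vector is passed in its string form, e.g. "[0.1, 0.2]".
    data_to_insert = (file_path, str(embeddings))
    return insert_sql, data_to_insert
```

The returned pair can then be handed to the database cursor, for example `cursor.execute(*build_insert(path, features))`.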

def create_idx(n):
    cursor.execute('SET GLOBAL experimental_ivf_index = 1')
    create_sql='create index idx_pic using ivfflat on pic_tab(embedding) lists=%s op_type "vector_l2_ops"'
    cursor.execute(create_sql,n)

nit:

  • remove the extra whitespace before lists
  • add a whitespace between create_sql, and n
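
Applying both nits, `create_idx` might read as below. The sketch passes the cursor in explicitly (the quoted code uses a module-level `cursor`) and wraps `n` in a tuple, so it can be exercised with any object that has an `execute(sql, args)` method. Both choices are assumptions made for illustration.

```python
def create_idx(cursor, n):
    """Create an IVFFlat index on pic_tab.embedding with `n` lists."""
    # Both statements are taken from the quoted diff.
    cursor.execute("SET GLOBAL experimental_ivf_index = 1")
    create_sql = (
        "create index idx_pic using ivfflat on pic_tab(embedding) "
        'lists=%s op_type "vector_l2_ops"'
    )
    cursor.execute(create_sql, (n,))
```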

if img_path:
result_path = [img_path]+[path for path_tuple in data for path in path_tuple]
else:
result_path = [path for path_tuple in data for path in path_tuple]

nit: the indentation is NOT consistent w/ other places.

Keep comments below consistent so that there are two whitespaces after any code, and one whitespace before the actual comment.
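
One way to make the indentation consistent and avoid repeating the comprehension is sketched below. `build_result_path` is a hypothetical wrapper, and `data` is assumed to be the list of one-element tuples returned by `cursor.fetchall()`.

```python
def build_result_path(data, img_path=None):
    """Flatten fetched rows into a list of paths, query image first if given."""
    result_path = [path for path_tuple in data for path in path_tuple]
    if img_path:
        result_path = [img_path] + result_path  # Show the query image first
    return result_path
```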

show_img(img_path,1,4)
#text = ["Banana"]
#text_search_img(text,3)
#show_img(None,1,3)

Add the comment `# Uncomment below for text_search_img` above the commented-out lines.

@dengn
Collaborator

dengn commented May 27, 2024

Thanks for the nice suggestions, @zuyu! Please make the necessary changes, @yangj1211.

@dengn dengn merged commit 985fd79 into matrixorigin:main May 27, 2024
1 check passed
@yangj1211 yangj1211 deleted the search-pic branch June 6, 2024 06:01

3 participants