[Bug]: [null & default] Search results show 0.0 rather "None" when upsert "None" for the nullable field after insert #35924

Closed
1 task done
binbinlv opened this issue Sep 3, 2024 · 4 comments
Assignees
Labels
kind/bug Issues or changes related a bug priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Milestone

Comments

@binbinlv
Contributor

binbinlv commented Sep 3, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: master-20240903-6130a854
- Deployment mode(standalone or cluster): both
- MQ type(rocksmq, pulsar or kafka):    all
- SDK version(e.g. pymilvus v2.0.0rc2): 2.5.0rc74
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Search results show 0.0 rather than None when None is upserted into the nullable field after an insert.

data: ['["id: 1, distance: 0.0, entity: {\'int32\': 10, \'nullableFid\': 0.0}", "id: 2, distance: 24.031944274902344, entity: {\'int32\': 10, \'nullableFid\': 0.0}"]']

Expected Behavior

Search results should show None rather than 0.0 when None is upserted into the nullable field after an insert.

Steps To Reproduce

from pymilvus import CollectionSchema, FieldSchema
from pymilvus import Collection
from pymilvus import connections
from pymilvus import DataType
from pymilvus import Partition
from pymilvus import utility
import json
import random

connections.connect()

dim = 128
int64_field = FieldSchema(name="int64", dtype=DataType.INT64, is_primary=True)
double_field = FieldSchema(name="nullableFid", dtype=DataType.DOUBLE, nullable=True)
int32_field = FieldSchema(name="int32", dtype=DataType.INT64, default_value=10)
float_vector = FieldSchema(name="float_vector", dtype=DataType.FLOAT_VECTOR, dim=dim)
schema = CollectionSchema(fields=[int64_field, double_field, int32_field,float_vector])
utility.drop_collection("test")
collection = Collection("test", schema=schema)
res = collection.schema
print(res)

index = {"index_type": "DISKANN", "metric_type": "L2", "params": {}}

nb = 2
vectors = [[random.random() for _ in range(dim)] for _ in range(nb)]
data = [[1,2], [3.0,None],[4,None], vectors]
# equivalent to data1 = [[1,2], [None,None],[None,None], vectors]
data1 = [[1,2], [],[], vectors]

collection.insert(data=data)
collection.upsert(data=data1)
collection.create_index("float_vector", index, index_name="index_name_1")
collection.load()
collection.flush()
res = collection.num_entities
print(res)
default_search_params = {"metric_type": "L2", "params": {}}
limit = 10
nq = 1
import time
start = time.time()
res1 = collection.search(vectors[:nq], "float_vector", default_search_params, limit, "int64 >= 0", output_fields=["nullableFid", "int32"])
end = time.time() - start
print(res1)
print("search successfully in %f s" % end)
start = time.time()
res = collection.query("int64>=0", output_fields=["nullableFid","int32"])
end = time.time() - start
print(res)
print("query successfully in %f s" % end)
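To make the failure explicit instead of reading it off the printed output, a check along these lines could be appended to the script. This is only a sketch: `rows_with_unexpected_value` is a hypothetical helper, not part of pymilvus, and it assumes the rows are plain dicts shaped like the entities printed in this report. The `buggy` and `fixed` samples are hand-written to mirror the values reported here.

```python
def rows_with_unexpected_value(rows, field, expected=None):
    """Return the rows whose `field` does not hold `expected`.

    Hypothetical helper for this report. `rows` is assumed to be a list
    of dicts; a row that lacks `field` entirely is treated as holding None.
    """
    if expected is None:
        # Use an identity check so that 0.0 is never mistaken for None.
        return [row for row in rows if row.get(field) is not None]
    return [row for row in rows if row.get(field) != expected]


# Hand-written samples mirroring the entity values shown in this issue.
buggy = [
    {"int64": 1, "nullableFid": 0.0, "int32": 10},
    {"int64": 2, "nullableFid": 0.0, "int32": 10},
]
fixed = [
    {"int64": 1, "nullableFid": None, "int32": 10},
    {"int64": 2, "nullableFid": None, "int32": 10},
]

bad_before = rows_with_unexpected_value(buggy, "nullableFid")
bad_after = rows_with_unexpected_value(fixed, "nullableFid")
```

With the real script, the list passed in would be the result of `collection.query(...)`, and a non-empty return value would indicate the bug.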

Milvus Log

https://grafana-4am.zilliz.cc/explore?orgId=1&panes=%7B%22CsX%22:%7B%22datasource%22:%22vhI6Vw67k%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bcluster%3D%5C%22devops%5C%22,namespace%3D%5C%22chaos-testing%5C%22,pod%3D~%5C%22default-null-test-yqvkb.%2A%5C%22%7D%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22vhI6Vw67k%22%7D%7D%5D,%22range%22:%7B%22from%22:%22now-1h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1

Anything else?

No response

@binbinlv binbinlv added kind/bug Issues or changes related a bug triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Sep 3, 2024
@binbinlv binbinlv added this to the 2.5.0 milestone Sep 3, 2024
@binbinlv
Contributor Author

binbinlv commented Sep 3, 2024

If only the insert or only the upsert is performed, the results are correct:

  1. only insert:
    data = [[1,2], [3.0,None],[4,None], vectors]

search results:
data: ['["id: 1, distance: 0.0, entity: {'nullableFid': 3.0, 'int32': 4}", "id: 2, distance: 25.33065414428711, entity: {'nullableFid': None, 'int32': 10}"]']
query results:
data: ["{'int32': 4, 'int64': 1, 'nullableFid': 3.0}", "{'int32': 10, 'int64': 2, 'nullableFid': None}"]

  2. only upsert:
    data1 = [[1,2], [],[], vectors]

search results:
data: ['["id: 1, distance: 0.0, entity: {'nullableFid': None, 'int32': 10}", "id: 2, distance: 22.605209350585938, entity: {'nullableFid': None, 'int32': 10}"]']
query results:
data: ["{'nullableFid': None, 'int32': 10, 'int64': 1}", "{'nullableFid': None, 'int32': 10, 'int64': 2}"]
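The two cases above suggest the semantics the combined insert-then-upsert run should follow. The toy model below is a sketch of that expectation in plain Python, not Milvus code: `fill_row`, `upsert`, and the `SCHEMA` dict are hypothetical names, and the model assumes an upsert replaces the whole row for a primary key, that an omitted field with a default takes the default, and that an omitted nullable field becomes None.

```python
def fill_row(row, schema):
    """Resolve omitted or None values using defaults and nullability."""
    out = {}
    for name, spec in schema.items():
        value = row.get(name)
        if value is None:
            if "default" in spec:
                value = spec["default"]
            elif not spec.get("nullable", False):
                raise ValueError(f"field {name!r} is required")
        out[name] = value
    return out


def upsert(store, row, schema, pk="int64"):
    """Replace the whole stored row for this primary key."""
    store[row[pk]] = fill_row(row, schema)


# Toy description of the fields used in the repro script.
SCHEMA = {
    "int64": {},
    "nullableFid": {"nullable": True},
    "int32": {"default": 10},
}

store = {}
# Equivalent of data = [[1,2], [3.0,None], [4,None], vectors]
upsert(store, {"int64": 1, "nullableFid": 3.0, "int32": 4}, SCHEMA)
upsert(store, {"int64": 2, "nullableFid": None, "int32": None}, SCHEMA)
after_insert = {pk: dict(row) for pk, row in store.items()}

# Equivalent of data1 = [[1,2], [], [], vectors]
upsert(store, {"int64": 1}, SCHEMA)
upsert(store, {"int64": 2}, SCHEMA)
after_upsert = {pk: dict(row) for pk, row in store.items()}
```

Under these assumptions `after_upsert` holds None for `nullableFid` and 10 for `int32` in both rows, which is the result expected from the real collection.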

@binbinlv binbinlv changed the title [Bug]: Search results show 0.0 rather "None" when upsert "None" for the nullable field after insert [Bug]: [nullable] Search results show 0.0 rather "None" when upsert "None" for the nullable field after insert Sep 3, 2024
@binbinlv binbinlv changed the title [Bug]: [nullable] Search results show 0.0 rather "None" when upsert "None" for the nullable field after insert [Bug]: [null & default] Search results show 0.0 rather "None" when upsert "None" for the nullable field after insert Sep 3, 2024
@binbinlv binbinlv added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Sep 4, 2024
@binbinlv
Contributor Author

binbinlv commented Sep 4, 2024

  1. Running the script above the first time reproduces this issue.
  2. Running it a second time reproduces issue [Bug]: [null & default] Search reports error "proto: cannot parse invalid wire-format data" after insert and upsert "None" data in "nonable" field #35952.
  3. Running it a third time reproduces this issue again.
  4. Running it a fourth time reproduces issue [Bug]: [null & default] Proxy is restarted during search after insert and upsert "None" data in "nonable" field #35953.
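Because the outcome differs from run to run, it may help to record each attempt rather than stopping at the first failure. The harness below is a minimal sketch: `tally_outcomes` is a hypothetical helper, and `make_flaky` is a stand-in that only imitates an alternating pattern so the example runs without a Milvus server. It does not reproduce real Milvus behavior.

```python
def tally_outcomes(run_once, attempts=4):
    """Call `run_once` repeatedly and record what each attempt did."""
    outcomes = []
    for _ in range(attempts):
        try:
            outcomes.append(("ok", run_once()))
        except Exception as exc:
            # Keep going so that later attempts are still observed.
            outcomes.append(("error", type(exc).__name__))
    return outcomes


def make_flaky():
    """Stand-in for the repro script: fails on every second call."""
    calls = {"n": 0}

    def run():
        calls["n"] += 1
        if calls["n"] % 2 == 0:
            raise RuntimeError("simulated failure")
        return "wrong value"

    return run


results = tally_outcomes(make_flaky(), attempts=4)
```

In practice `run_once` would wrap the insert, upsert, and search steps of the script and return the searched entities.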

sre-ci-robot pushed a commit that referenced this issue Sep 6, 2024
fix not append valid data when transfer to insert record and add a tiny
check when in groupBy field.
#35924

Signed-off-by: lixinguo <[email protected]>
Co-authored-by: lixinguo <[email protected]>
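The commit message mentions valid data not being appended. As an illustration only, the toy model below shows how a nullable column stored as raw values plus a parallel list of validity flags would read back as 0.0 instead of None if the flags were lost along the way. This is an assumption about the general mechanism, not a description of the Milvus code; `read_column` and the variable names are hypothetical.

```python
def read_column(values, valid=None):
    """Decode a nullable column stored as raw values plus validity flags."""
    if not valid:
        # Without validity flags every slot is taken as a real value,
        # so the placeholder stored for a null leaks through.
        return list(values)
    return [value if ok else None for value, ok in zip(values, valid)]


values = [0.0, 0.0]      # placeholder slots for two null doubles
valid = [False, False]   # both entries are null

with_validity = read_column(values, valid)   # [None, None]
without_validity = read_column(values)       # [0.0, 0.0]
```

The second call matches the symptom reported here: a null double shows up as 0.0.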
@smellthemoon
Contributor

#36027 has been merged. Could you help verify? @binbinlv

@binbinlv
Contributor Author

binbinlv commented Sep 9, 2024

Verified and fixed.

pymilvus: 2.5.0rc78
milvus:master-20240908-208c8a23

results:

data: ['["id: 1, distance: 0.0, entity: {\'nullableFid\': None, \'int32\': 10}", "id: 2, distance: 22.366727828979492, entity: {\'nullableFid\': None, \'int32\': 10}"]']
search successfully in 0.050094 s
data: ["{'nullableFid': None, 'int32': 10, 'int64': 1}", "{'nullableFid': None, 'int32': 10, 'int64': 2}"]
query successfully in 0.013904 s

@binbinlv binbinlv closed this as completed Sep 9, 2024
chyezh pushed a commit to chyezh/milvus that referenced this issue Sep 11, 2024
…36027)

fix not append valid data when transfer to insert record and add a tiny
check when in groupBy field.
milvus-io#35924

Signed-off-by: lixinguo <[email protected]>
Co-authored-by: lixinguo <[email protected]>