[Bug]: [benchmark][standalone][multipleChunkedEnable] hybrid_search raises error incomplete query result, missing id %!s(int64=6408144), len(searchIDs) = 100, len(queryIDs) = 93, collection=453463256739087164: inconsistent requery result in concurrent DQL scene #37143

Closed
wangting0128 opened this issue Oct 25, 2024 · 12 comments
Labels
kind/bug Issues or changes related a bug test/benchmark benchmark test triage/accepted Indicates an issue or PR is ready to be actively worked on.

@wangting0128
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: master-20241025-ad2df904-amd64
- Deployment mode (standalone or cluster): standalone
- MQ type (rocksmq, pulsar or kafka): rocksmq
- SDK version (e.g. pymilvus v2.0.0rc2): 2.5.0rc97
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

argo task: memory-opt-scenes-mn2cd

Test case execution succeeds when the parameter multipleChunkedEnable is turned off
server:

NAME                                                              READY   STATUS      RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
memory-opt-scenes-mn2cd-1-etcd-0                                  1/1     Running     0               3h16m   10.104.19.37    4am-node28   <none>           <none>
memory-opt-scenes-mn2cd-1-milvus-standalone-586f69c78f-qg4kv      1/1     Running     1 (3h15m ago)   3h16m   10.104.34.205   4am-node37   <none>           <none>
memory-opt-scenes-mn2cd-1-minio-9b8fd7bcb-q5gg8                   1/1     Running     0               3h16m   10.104.30.75    4am-node38   <none>           <none>

client log:

[2024-10-25 06:12:31,186 - ERROR - fouram]: RPC error: [hybrid_search], <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=6408144), len(searchIDs) = 100, len(queryIDs) = 93, collection=453463256739087164: inconsistent requery result)>, <Time:{'RPC start': '2024-10-25 06:12:10.021510', 'RPC error': '2024-10-25 06:12:31.186452'}> (decorators.py:140)
[2024-10-25 06:12:31,187 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=6408144), len(searchIDs) = 100, len(queryIDs) = 93, collection=453463256739087164: inconsistent requery result)>, [requestId: 11a0ddea-9298-11ef-ad0d-32a2e3b7d54b] (api_request.py:57)
[2024-10-25 06:12:31,187 - ERROR - fouram]: [CheckFunc] hybrid_search request check failed, response:<MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=6408144), len(searchIDs) = 100, len(queryIDs) = 93, collection=453463256739087164: inconsistent requery result)> (func_check.py:101)
[2024-10-25 06:12:44,388 -  INFO - fouram]: Print locust final stats. (locust_runner.py:56)
[2024-10-25 06:12:44,388 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2024-10-25 06:12:44,388 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-10-25 06:12:44,388 -  INFO - fouram]: grpc     hybrid_search                                                                   4736 4736(100.00%) |  21220   21024   21696  21024 |    0.44        0.44 (stats.py:789)
[2024-10-25 06:12:44,388 -  INFO - fouram]: grpc     query                                                                           4808     0(0.00%) |    807       6   21259     13 |    0.45        0.00 (stats.py:789)
[2024-10-25 06:12:44,388 -  INFO - fouram]: grpc     search                                                                          4648     0(0.00%) |    742       7   21012     12 |    0.43        0.00 (stats.py:789)
[2024-10-25 06:12:44,388 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-10-25 06:12:44,388 -  INFO - fouram]:          Aggregated                                                                     14192 4736(33.37%) |   7598       6   21696     18 |    1.32        0.44 (stats.py:789)
[2024-10-25 06:12:44,388 -  INFO - fouram]:  (stats.py:790)
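
For triage, the numeric fields embedded in the error message can be pulled out mechanically. The following is a minimal sketch, assuming the message layout seen in the error lines of the client log above; `parse_requery_error` is a hypothetical helper, not part of pymilvus or fouram. The `%!s(int64=...)` fragment is what Go's fmt package prints when a `%s` verb receives an int64, so the pattern matches it literally.

```python
import re
from typing import Optional

# Pattern for the message text seen in the client log. The layout is an
# assumption based on these log lines, not a documented format.
_REQUERY_ERR = re.compile(
    r"missing id %!s\(int64=(?P<missing_id>-?\d+)\), "
    r"len\(searchIDs\) = (?P<search_ids>\d+), "
    r"len\(queryIDs\) = (?P<query_ids>\d+), "
    r"collection=(?P<collection>\d+)"
)


def parse_requery_error(message: str) -> Optional[dict]:
    """Extract the numeric fields from an 'incomplete query result' message."""
    match = _REQUERY_ERR.search(message)
    if match is None:
        return None
    fields = {name: int(value) for name, value in match.groupdict().items()}
    # Difference between the two reported counts.
    fields["shortfall"] = fields["search_ids"] - fields["query_ids"]
    return fields


sample = (
    "incomplete query result, missing id %!s(int64=6408144), "
    "len(searchIDs) = 100, len(queryIDs) = 93, "
    "collection=453463256739087164: inconsistent requery result"
)
info = parse_requery_error(sample)
print(info)
```

For the message in this report, the shortfall is 7 (100 IDs from the search phase, 93 from the query phase).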

Expected Behavior

No response

Steps To Reproduce

1. deploy a standalone Milvus with queryNode.segcore.multipleChunkedEnable=true
2. create a collection with fields ['id', 'float_vector', 'varchar_1', 'varchar_2', 'json_1', 'int64_1']
3. build index of vector field 'float_vector': IVF_SQ8
4. insert 10m rows
5. flush
6. build index again
7. load collection
8. concurrent requests:
   - query
   - search
   - hybrid_search <- raises error

Milvus Log

No response

Anything else?

test result:

[2024-10-25 06:12:44,390 -  INFO - fouram]: [PerfTemplate] Report data: 
{'server': {'deploy_tool': 'helm',
            'deploy_mode': 'standalone',
            'config_name': 'standalone_8c16m',
            'config': {'standalone': {'resources': {'limits': {'cpu': '64.0', 'memory': '64Gi'}, 'requests': {'cpu': '16.0', 'memory': '32Gi'}}},
                       'cluster': {'enabled': False},
                       'etcd': {'replicaCount': 1, 'metrics': {'enabled': True, 'podMonitor': {'enabled': True}}},
                       'minio': {'mode': 'standalone', 'metrics': {'podMonitor': {'enabled': True}}},
                       'pulsar': {'enabled': False},
                       'metrics': {'serviceMonitor': {'enabled': True}},
                       'log': {'level': 'debug'},
                       'extraConfigFiles': {'user.yaml': 'queryNode:\n  segcore:\n    multipleChunkedEnable: true\n'},
                       'image': {'all': {'repository': 'harbor.milvus.io/milvus/milvus', 'tag': 'master-20241025-ad2df904-amd64'}}},
            'host': 'memory-opt-scenes-mn2cd-1-milvus.qa-milvus.svc.cluster.local',
            'port': '19530',
            'uri': ''},
 'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_concurrent_locust_custom_parameters',
            'test_case_params': {'dataset_params': {'metric_type': 'L2',
                                                    'dim': 128,
                                                    'scalars_params': {'varchar_1': {'params': {'max_length': 65535},
                                                                                     'other_params': {'dataset': 'laion2b_url'}},
                                                                       'varchar_2': {'params': {'max_length': 65535},
                                                                                     'other_params': {'dataset': 'laion2b_caption'}},
                                                                       'json_1': {'other_params': {'dataset': 'laion2b_json'}},
                                                                       'int64_1': {'other_params': {'dataset': 'laion2b_int64'}}},
                                                    'dataset_name': 'sift',
                                                    'dataset_size': '10m',
                                                    'ni_per': 5000},
                                 'collection_params': {'other_fields': ['varchar_1', 'varchar_2', 'json_1', 'int64_1'], 'shards_num': 1},
                                 'index_params': {'index_type': 'IVF_SQ8', 'index_param': {'nlist': 1024}},
                                 'concurrent_params': {'concurrent_number': 10, 'during_time': '3h', 'interval': 20, 'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'query',
                                                       'weight': 1,
                                                       'params': {'expr': '',
                                                                  'output_fields': ['varchar_1', 'varchar_2', 'json_1', 'int64_1'],
                                                                  'timeout': 3000,
                                                                  'random_data': True,
                                                                  'random_count': 100,
                                                                  'random_range': [0, 5000000],
                                                                  'field_name': 'id',
                                                                  'field_type': 'int64'}},
                                                      {'type': 'search',
                                                       'weight': 1,
                                                       'params': {'nq': 10,
                                                                  'top_k': 10,
                                                                  'search_param': {'nprobe': 32},
                                                                  'output_fields': ['varchar_1', 'varchar_2', 'json_1', 'int64_1'],
                                                                  'timeout': 3000,
                                                                  'random_data': True}},
                                                      {'type': 'hybrid_search',
                                                       'weight': 1,
                                                       'params': {'nq': 10,
                                                                  'top_k': 10,
                                                                  'reqs': [{'anns_field': 'float_vector', 'search_param': {'nprobe': 64}}],
                                                                  'rerank': {'RRFRanker': []},
                                                                  'output_fields': ['varchar_1', 'varchar_2', 'json_1', 'int64_1'],
                                                                  'timeout': 3000,
                                                                  'random_data': True}}]},
            'run_id': 2024102550222166,
            'datetime': '2024-10-25 02:57:02.238890',
            'client_version': '2.5.0'},
 'result': {'test_result': {'index': {'RT': 153.9338},
                            'insert': {'total_time': 492.7142, 'VPS': 20295.7414, 'batch_time': 0.2464, 'batch': 5000},
                            'flush': {'RT': 3.0174},
                            'load': {'RT': 6.4894},
                            'Locust': {'Aggregated': {'Requests': 14192,
                                                      'Fails': 4736,
                                                      'RPS': 1.32,
                                                      'fail_s': 0.33,
                                                      'RT_max': 21696.05,
                                                      'RT_avg': 7598.02,
                                                      'TP50': 18,
                                                      'TP99': 21000.0},
                                       'hybrid_search': {'Requests': 4736,
                                                         'Fails': 4736,
                                                         'RPS': 0.44,
                                                         'fail_s': 1.0,
                                                         'RT_max': 21696.05,
                                                         'RT_avg': 21220.45,
                                                         'TP50': 21000.0,
                                                         'TP99': 21000.0},
                                       'query': {'Requests': 4808,
                                                 'Fails': 0,
                                                 'RPS': 0.45,
                                                 'fail_s': 0.0,
                                                 'RT_max': 21259.65,
                                                 'RT_avg': 807.07,
                                                 'TP50': 13,
                                                 'TP99': 20000.0},
                                       'search': {'Requests': 4648,
                                                  'Fails': 0,
                                                  'RPS': 0.43,
                                                  'fail_s': 0.0,
                                                  'RT_max': 21012.5,
                                                  'RT_avg': 742.38,
                                                  'TP50': 12,
                                                  'TP99': 20000.0}}}}}
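
The per-task counts in the report can be cross-checked against the aggregate row. This is a short sketch using only numbers copied from the report above. It also suggests that `fail_s` is the failed fraction of requests rather than a per-second rate; that reading is an inference from the arithmetic, not something the report states.

```python
# Request and failure counts copied from the 'Locust' section of the report.
tasks = {
    "hybrid_search": {"Requests": 4736, "Fails": 4736},
    "query": {"Requests": 4808, "Fails": 0},
    "search": {"Requests": 4648, "Fails": 0},
}
aggregated = {"Requests": 14192, "Fails": 4736, "fail_s": 0.33}

# Sum the per-task rows and compare with the aggregate row.
total_requests = sum(task["Requests"] for task in tasks.values())
total_fails = sum(task["Fails"] for task in tasks.values())
fail_ratio = total_fails / total_requests

print("requests match:", total_requests == aggregated["Requests"])
print("fails match:", total_fails == aggregated["Fails"])
print("failure percentage:", round(fail_ratio * 100, 2))
```

Every failure in the run comes from hybrid_search, which failed on all of its requests, while query and search had none.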
@wangting0128 wangting0128 added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. test/benchmark benchmark test labels Oct 25, 2024
@wangting0128 wangting0128 added this to the 2.5.0 milestone Oct 25, 2024
@xiaofan-luan
Collaborator

This seems to be a retrieve issue.

I thought @congqixia was working on retrieving by segmentID and offset instead of retrieving directly by ID. How is that going?
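
The error text suggests a two-phase flow: the search returns primary keys, a follow-up query (the "requery") fetches the output fields for those keys, and the request fails when a key from the first phase is absent from the second. Below is a minimal sketch of that check under this assumption. `find_missing_ids` and `check_requery` are hypothetical names and the data is a toy stand-in, not Milvus code.

```python
def find_missing_ids(search_ids, query_ids):
    """Return the search-phase IDs that the requery did not return, in order."""
    returned = set(query_ids)
    return [pk for pk in search_ids if pk not in returned]


def check_requery(search_ids, query_ids):
    """Raise if the requery result does not cover every searched ID."""
    missing = find_missing_ids(search_ids, query_ids)
    if missing:
        raise RuntimeError(
            f"incomplete query result, missing id {missing[0]}, "
            f"len(searchIDs) = {len(search_ids)}, "
            f"len(queryIDs) = {len(query_ids)}"
        )


# Toy data mirroring the counts in the report: 100 IDs searched, 93 retrieved.
search_ids = list(range(100))
query_ids = [pk for pk in search_ids if pk % 15 != 0]

try:
    check_requery(search_ids, query_ids)
except RuntimeError as err:
    print(err)
```

A lookup by ID in the second phase can only succeed if the row is still reachable under that ID at requery time, which is presumably why retrieving by segmentID and offset is being considered.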

@yanliang567 yanliang567 added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 26, 2024
@yanliang567 yanliang567 removed their assignment Oct 26, 2024
@wangting0128
Contributor Author

same case, same error

image: master-20241029-7dd66511-amd64
server:

NAME                                                            READY   STATUS             RESTARTS         AGE     IP              NODE         NOMINATED NODE   READINESS GATES
memory-opt-scenes-q2wkh-1-etcd-0                                1/1     Running            0                3h27m   10.104.24.202   4am-node29   <none>           <none>
memory-opt-scenes-q2wkh-1-milvus-standalone-84c54dc586-7r525    1/1     Running            1 (3h25m ago)    3h27m   10.104.32.66    4am-node39   <none>           <none>
memory-opt-scenes-q2wkh-1-minio-6c8f7d8984-djpf5                1/1     Running            0                3h27m   10.104.21.41    4am-node24   <none>           <none>

client log:
[Screenshot 2024-10-30 10 50 06]

@sunby
Contributor

sunby commented Oct 30, 2024

/assign

@xiaofan-luan
Collaborator

/assign @wangting0128
please help on verifying

@wangting0128
Contributor Author

/assign @wangting0128 please help on verifying

verification failed

image: 2.5-20241031-6b9b6999-amd64
test case name: test_hybrid_search_locust_multi_ddl_dql_hybrid_search_cluster

client log:
[Screenshot 2024-11-01 11 53 07]

@sunby
Contributor

sunby commented Nov 1, 2024

Oh sorry, I tested with a 100,000-row dataset and did not notice this problem, but it appeared when I tested with 1 million rows. I have found the root cause and will fix it in another PR.

@xiaofan-luan
Collaborator

/assign @wangting0128
please help on verifying it

@wangting0128
Contributor Author

verification passed

argo task: memory-opt-scenes-7vrcm
image: master-20241108-a0315783-amd64

@wangting0128
Contributor Author

reproduced

argo task: multi-vector-corn-1-1731852000
test case name: test_hybrid_search_locust_dml_dql_cluster
image: master-20241116-00edec2e-amd64

server:

NAME                                                              READY   STATUS      RESTARTS         AGE     IP              NODE         NOMINATED NODE   READINESS GATES
multi-vector-corn-1-1731852000-4-etcd-0                           1/1     Running     0                12h     10.104.19.114   4am-node28   <none>           <none>
multi-vector-corn-1-1731852000-4-etcd-1                           1/1     Running     0                12h     10.104.15.44    4am-node20   <none>           <none>
multi-vector-corn-1-1731852000-4-etcd-2                           1/1     Running     0                12h     10.104.23.206   4am-node27   <none>           <none>
multi-vector-corn-1-1731852000-4-milvus-datanode-58bb488f8lf8zn   1/1     Running     3 (12h ago)      12h     10.104.13.99    4am-node16   <none>           <none>
multi-vector-corn-1-1731852000-4-milvus-indexnode-74444994n6drz   1/1     Running     1 (12h ago)      12h     10.104.14.132   4am-node18   <none>           <none>
multi-vector-corn-1-1731852000-4-milvus-mixcoord-f8f4b66472kqdt   1/1     Running     2 (12h ago)      12h     10.104.14.133   4am-node18   <none>           <none>
multi-vector-corn-1-1731852000-4-milvus-proxy-688d585fdf-89zr2    1/1     Running     3 (12h ago)      12h     10.104.13.98    4am-node16   <none>           <none>
multi-vector-corn-1-1731852000-4-milvus-querynode-5795f596fl8bv   1/1     Running     3 (10h ago)      12h     10.104.1.48     4am-node10   <none>           <none>
multi-vector-corn-1-1731852000-4-minio-0                          1/1     Running     0                12h     10.104.23.205   4am-node27   <none>           <none>
multi-vector-corn-1-1731852000-4-minio-1                          1/1     Running     0                12h     10.104.15.43    4am-node20   <none>           <none>
multi-vector-corn-1-1731852000-4-minio-2                          1/1     Running     0                12h     10.104.20.113   4am-node22   <none>           <none>
multi-vector-corn-1-1731852000-4-minio-3                          1/1     Running     0                12h     10.104.19.122   4am-node28   <none>           <none>
multi-vector-corn-1-1731852000-4-pulsar-bookie-0                  1/1     Running     0                12h     10.104.24.211   4am-node29   <none>           <none>
multi-vector-corn-1-1731852000-4-pulsar-bookie-1                  1/1     Running     0                12h     10.104.21.102   4am-node24   <none>           <none>
multi-vector-corn-1-1731852000-4-pulsar-bookie-2                  1/1     Running     0                12h     10.104.27.18    4am-node31   <none>           <none>
multi-vector-corn-1-1731852000-4-pulsar-bookie-init-2zmz8         0/1     Completed   0                12h     10.104.19.105   4am-node28   <none>           <none>
multi-vector-corn-1-1731852000-4-pulsar-broker-0                  1/1     Running     0                12h     10.104.33.116   4am-node36   <none>           <none>
multi-vector-corn-1-1731852000-4-pulsar-proxy-0                   1/1     Running     0                12h     10.104.21.97    4am-node24   <none>           <none>
multi-vector-corn-1-1731852000-4-pulsar-pulsar-init-dwgl5         0/1     Completed   0                12h     10.104.4.209    4am-node11   <none>           <none>
multi-vector-corn-1-1731852000-4-pulsar-recovery-0                1/1     Running     0                12h     10.104.6.27     4am-node13   <none>           <none>
multi-vector-corn-1-1731852000-4-pulsar-zookeeper-0               1/1     Running     0                12h     10.104.19.109   4am-node28   <none>           <none>
multi-vector-corn-1-1731852000-4-pulsar-zookeeper-1               1/1     Running     0                12h     10.104.17.79    4am-node23   <none>           <none>
multi-vector-corn-1-1731852000-4-pulsar-zookeeper-2               1/1     Running     0                12h     10.104.15.57    4am-node20   <none>           <none>

client log:

[2024-11-17 14:53:49,595 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: 50e960fe-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:53:49,729 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: 4ffe9c18-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:53:50,803 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: 6a74fd8a-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:53:51,055 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: 8396580e-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:53:51,056 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: 6ab5725c-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:53:51,056 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: 6a84410a-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:53:51,290 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: 9bc85d82-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:53:51,291 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: 9bb9b886-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:53:51,545 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: b470fb64-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:57:07,188 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: c1213fa4-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:57:09,427 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: c135a20a-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:57:33,728 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: c20031aa-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:57:34,078 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: c2005f90-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:57:34,090 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: d9a95d0e-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:57:34,091 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: f16f7b76-a4f3-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:57:34,095 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: 09a3e6be-a4f4-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:57:34,103 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: 2cdc51ca-a4f4-11ef-b26a-0678f81f0d21] (api_request.py:57)
[2024-11-17 14:57:34,376 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=0), len(searchIDs) = 100, len(queryIDs) = 99, collection=453994766249493140: inconsistent requery result)>, [requestId: 09a4b102-a4f4-11ef-b26a-0678f81f0d21] (api_request.py:57)

test steps:

        concurrent test and calculation of RT and QPS

        :purpose:  `DML & DQL`
            verify DML & DQL scenario,
            which has 4 vector fields(IVF_FLAT, HNSW, DISKANN, IVF_SQ8) and scalar fields: `int64_1`, `varchar_1`

        :test steps:
            1. create collection with fields:
                'float_vector': 128dim,
                'float_vector_1': 128dim,
                'float_vector_2': 128dim,
                'float_vector_3': 128dim,
                scalar field: int64_1, varchar_1
            2. build indexes:
                IVF_FLAT: 'float_vector'
                HNSW: 'float_vector_1',
                DISKANN: 'float_vector_2'
                IVF_SQ8: 'float_vector_3'
                INVERTED: 'int64_1', 'varchar_1'
                default scalar index: 'id'
            3. insert 1 million data
            4. flush collection
            5. build indexes again using the same params
            6. load collection
                replica: 1
            7. concurrent request:
                - insert
                - delete
                - flush
                - load
                - search
                - hybrid_search
                - query

@wangting0128 wangting0128 reopened this Nov 18, 2024
@wangting0128
Contributor Author

different case, same error

argo task: inverted-corn-1731877200
test case name: test_inverted_locust_hnsw_diskann_dml_dql_cluster
image: master-20241116-00edec2e-amd64

server:

NAME                                                              READY   STATUS      RESTARTS         AGE     IP              NODE         NOMINATED NODE   READINESS GATES
inverted-corn-177200-8-91-5695-etcd-0                             1/1     Running     0                4h11m   10.104.17.80    4am-node23   <none>           <none>
inverted-corn-177200-8-91-5695-etcd-1                             1/1     Running     0                4h11m   10.104.34.84    4am-node37   <none>           <none>
inverted-corn-177200-8-91-5695-etcd-2                             1/1     Running     0                4h11m   10.104.19.179   4am-node28   <none>           <none>
inverted-corn-177200-8-91-5695-milvus-datanode-c477df79-nwjg6     1/1     Running     2 (4h10m ago)    4h11m   10.104.25.23    4am-node30   <none>           <none>
inverted-corn-177200-8-91-5695-milvus-indexnode-56c59475f82p6k7   1/1     Running     2 (4h10m ago)    4h11m   10.104.18.151   4am-node25   <none>           <none>
inverted-corn-177200-8-91-5695-milvus-indexnode-56c59475f8gmphm   1/1     Running     2 (4h10m ago)    4h11m   10.104.14.187   4am-node18   <none>           <none>
inverted-corn-177200-8-91-5695-milvus-indexnode-56c59475f8njtfj   1/1     Running     2 (4h10m ago)    4h11m   10.104.26.219   4am-node32   <none>           <none>
inverted-corn-177200-8-91-5695-milvus-indexnode-56c59475f8r4fpj   1/1     Running     2 (4h10m ago)    4h11m   10.104.13.174   4am-node16   <none>           <none>
inverted-corn-177200-8-91-5695-milvus-mixcoord-558b646699-lg784   1/1     Running     2 (4h10m ago)    4h11m   10.104.18.149   4am-node25   <none>           <none>
inverted-corn-177200-8-91-5695-milvus-proxy-6bdfd9f975-v4jxp      1/1     Running     2 (4h10m ago)    4h11m   10.104.32.118   4am-node39   <none>           <none>
inverted-corn-177200-8-91-5695-milvus-querynode-64fcfdcb-c9xhs    1/1     Running     2 (4h10m ago)    4h11m   10.104.32.119   4am-node39   <none>           <none>
inverted-corn-177200-8-91-5695-milvus-querynode-64fcfdcb-ls5gh    1/1     Running     3 (4h10m ago)    4h11m   10.104.27.133   4am-node31   <none>           <none>
inverted-corn-177200-8-91-5695-minio-0                            1/1     Running     0                4h11m   10.104.18.156   4am-node25   <none>           <none>
inverted-corn-177200-8-91-5695-minio-1                            1/1     Running     0                4h11m   10.104.17.78    4am-node23   <none>           <none>
inverted-corn-177200-8-91-5695-minio-2                            1/1     Running     0                4h11m   10.104.19.177   4am-node28   <none>           <none>
inverted-corn-177200-8-91-5695-minio-3                            1/1     Running     0                4h11m   10.104.34.83    4am-node37   <none>           <none>
inverted-corn-177200-8-91-5695-pulsar-bookie-0                    1/1     Running     0                4h11m   10.104.17.81    4am-node23   <none>           <none>
inverted-corn-177200-8-91-5695-pulsar-bookie-1                    1/1     Running     0                4h11m   10.104.19.180   4am-node28   <none>           <none>
inverted-corn-177200-8-91-5695-pulsar-bookie-2                    1/1     Running     0                4h11m   10.104.34.85    4am-node37   <none>           <none>
inverted-corn-177200-8-91-5695-pulsar-bookie-init-lpxmq           0/1     Completed   0                4h11m   10.104.18.150   4am-node25   <none>           <none>
inverted-corn-177200-8-91-5695-pulsar-broker-0                    1/1     Running     0                4h11m   10.104.34.75    4am-node37   <none>           <none>
inverted-corn-177200-8-91-5695-pulsar-proxy-0                     1/1     Running     0                4h11m   10.104.6.139    4am-node13   <none>           <none>
inverted-corn-177200-8-91-5695-pulsar-pulsar-init-d822r           0/1     Completed   0                4h11m   10.104.13.173   4am-node16   <none>           <none>
inverted-corn-177200-8-91-5695-pulsar-recovery-0                  1/1     Running     0                4h11m   10.104.19.169   4am-node28   <none>           <none>
inverted-corn-177200-8-91-5695-pulsar-zookeeper-0                 1/1     Running     0                4h11m   10.104.19.178   4am-node28   <none>           <none>
inverted-corn-177200-8-91-5695-pulsar-zookeeper-1                 1/1     Running     0                4h10m   10.104.17.90    4am-node23   <none>           <none>
inverted-corn-177200-8-91-5695-pulsar-zookeeper-2                 1/1     Running     0                4h9m    10.104.33.220   4am-node36   <none>           <none> 

client log:

[2024-11-17 22:27:04,435 - ERROR - fouram]: (api_response) : [Collection.search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=3), len(searchIDs) = 1000, len(queryIDs) = 196, collection=454001310100619578: inconsistent requery result)>, [requestId: 049d0a80-a533-11ef-86b4-6ebe96e6d0f5] (api_request.py:57)
[2024-11-17 22:27:04,456 - ERROR - fouram]: (api_response) : [Collection.search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=3), len(searchIDs) = 1000, len(queryIDs) = 204, collection=454001310100619578: inconsistent requery result)>, [requestId: 04c5a814-a533-11ef-86b4-6ebe96e6d0f5] (api_request.py:57)
[2024-11-17 22:27:23,160 - ERROR - fouram]: (api_response) : [Collection.search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=3), len(searchIDs) = 1000, len(queryIDs) = 182, collection=454001310100619578: inconsistent requery result)>, [requestId: 0fb759de-a533-11ef-86b4-6ebe96e6d0f5] (api_request.py:57)
[2024-11-17 22:27:30,639 - ERROR - fouram]: (api_response) : [Collection.search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=3), len(searchIDs) = 1000, len(queryIDs) = 180, collection=454001310100619578: inconsistent requery result)>, [requestId: 146865a4-a533-11ef-86b4-6ebe96e6d0f5] (api_request.py:57)
[2024-11-17 22:27:31,219 - ERROR - fouram]: (api_response) : [Collection.search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=3), len(searchIDs) = 1000, len(queryIDs) = 192, collection=454001310100619578: inconsistent requery result)>, [requestId: 1485430e-a533-11ef-86b4-6ebe96e6d0f5] (api_request.py:57)

...

[2024-11-17 22:55:07,864 - ERROR - fouram]: (api_response) : [Collection.search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=3), len(searchIDs) = 1000, len(queryIDs) = 351, collection=454001310100619578: inconsistent requery result)>, [requestId: f01c88e8-a536-11ef-86b4-6ebe96e6d0f5] (api_request.py:57)
[2024-11-17 22:55:24,828 - ERROR - fouram]: (api_response) : [Collection.search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=3), len(searchIDs) = 1000, len(queryIDs) = 351, collection=454001310100619578: inconsistent requery result)>, [requestId: fa116288-a536-11ef-86b4-6ebe96e6d0f5] (api_request.py:57)
[2024-11-17 22:55:27,691 - ERROR - fouram]: (api_response) : [Collection.search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=3), len(searchIDs) = 1000, len(queryIDs) = 348, collection=454001310100619578: inconsistent requery result)>, [requestId: fbbefdf2-a536-11ef-86b4-6ebe96e6d0f5] (api_request.py:57)
[2024-11-17 22:55:31,487 - ERROR - fouram]: (api_response) : [Collection.search] <MilvusException: (code=2200, message=incomplete query result, missing id %!s(int64=3), len(searchIDs) = 1000, len(queryIDs) = 367, collection=454001310100619578: inconsistent requery result)>, [requestId: fdedae2a-a536-11ef-86b4-6ebe96e6d0f5] (api_request.py:57)
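
For triage, the error lines above can be reduced to a few numbers per request. The sketch below is a hypothetical helper, not part of fouram or Milvus; it only assumes the message layout seen in this log. The `%!s(int64=3)` fragment is what Go's `fmt` prints when a `%s` verb receives an integer, so the missing id here is `3`.

```python
import re

# Matches the requery error text as it appears in the client log above.
# "%!s(int64=N)" is Go fmt's rendering of an int64 passed to a %s verb.
REQUERY_ERR = re.compile(
    r"missing id %!s\(int64=(?P<missing_id>-?\d+)\), "
    r"len\(searchIDs\) = (?P<search>\d+), "
    r"len\(queryIDs\) = (?P<query>\d+), "
    r"collection=(?P<collection>\d+)"
)


def parse_requery_errors(lines):
    """Return one record per line that carries the requery error."""
    records = []
    for line in lines:
        m = REQUERY_ERR.search(line)
        if m is None:
            continue  # unrelated log line
        search, query = int(m["search"]), int(m["query"])
        records.append({
            "missing_id": int(m["missing_id"]),
            "search_ids": search,
            "query_ids": query,
            # Fraction of searched ids that the requery actually returned.
            "returned_ratio": query / search,
            "collection": m["collection"],
        })
    return records


# Message portions of the first and last error lines quoted above.
sample = [
    "message=incomplete query result, missing id %!s(int64=3), "
    "len(searchIDs) = 1000, len(queryIDs) = 196, "
    "collection=454001310100619578: inconsistent requery result",
    "message=incomplete query result, missing id %!s(int64=3), "
    "len(searchIDs) = 1000, len(queryIDs) = 367, "
    "collection=454001310100619578: inconsistent requery result",
]
records = parse_requery_errors(sample)
```

Running this over the full client log would show how `returned_ratio` changes across the run, which may help correlate the failures with the concurrent insert/delete/flush activity.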

test steps:

        concurrent test and calculation of RT and QPS

        :purpose:  `vector: memory and disk index`
            verify concurrent DML & DQL scenario which has 4 float_vector fields & 16 scalar fields

        :test steps:
            1. create collection with fields:
                'float_vector': 128dim,
                'float_vector_1': 128dim,
                'float_vector_2': 200dim,
                'float_vector_3': 200dim,
                'int8_1', 'int16_1', 'int32_1', 'int64_1', 'double_1', 'float_1', 'varchar_1', 'bool_1',
                'int8_2', 'int16_2', 'int32_2', 'int64_2', 'double_2', 'float_2', 'varchar_2', 'bool_2'
            2. build indexes:
                HNSW: 'float_vector'
                DISKANN_IP: 'float_vector_1'
                HNSW: 'float_vector_2'
                DISKANN_L2: 'float_vector_3'
                scalar_default_index: 'int8_1', 'int16_1', 'int32_1', 'int64_1', 'double_1', 'float_1', 'varchar_1'
                scalar_INVERTED_index: 'int8_2', 'int16_2', 'int32_2', 'int64_2', 'double_2', 'float_2', 'varchar_2', 'bool_2'
            3. insert 5 million entities
            4. flush collection
            5. build indexes again using the same params
            6. load collection
            7. concurrent request:
                - insert
                - delete
                - flush
                - load
                - search
                - hybrid_search
                - query
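
The field and index layout in steps 1 and 2 can be written down as plain data to sanity-check the counts. This is only an illustrative sketch: the dictionary keys are labels made up for this example, not Milvus `index_type` strings, and no pymilvus calls are involved.

```python
# Vector fields and their dimensions, as listed in step 1.
VECTOR_FIELDS = {
    "float_vector": 128,
    "float_vector_1": 128,
    "float_vector_2": 200,
    "float_vector_3": 200,
}

# Scalar fields: each type appears twice, with suffix _1 and _2.
SCALAR_TYPES = ["int8", "int16", "int32", "int64", "double", "float", "varchar", "bool"]
SCALAR_FIELDS = [f"{t}_{n}" for n in (1, 2) for t in SCALAR_TYPES]

# Index plan from step 2. Keys are labels for this sketch only.
INDEX_PLAN = {
    "HNSW": ["float_vector", "float_vector_2"],
    "DISKANN (IP)": ["float_vector_1"],
    "DISKANN (L2)": ["float_vector_3"],
    "scalar default": [f"{t}_1" for t in SCALAR_TYPES if t != "bool"],
    "scalar INVERTED": [f"{t}_2" for t in SCALAR_TYPES],
}


def unindexed_fields():
    """Return the fields from step 1 that no index in step 2 covers."""
    indexed = {name for fields in INDEX_PLAN.values() for name in fields}
    all_fields = list(VECTOR_FIELDS) + SCALAR_FIELDS
    return [name for name in all_fields if name not in indexed]


print(len(VECTOR_FIELDS), len(SCALAR_FIELDS), unindexed_fields())
```

By this reading of the steps, `bool_1` is the only field without an index, since the default scalar index list stops at `varchar_1`.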

@xiaofan-luan
Collaborator

Please leave the environment running and don't delete the data, so that @sunby can investigate.

@wangting0128
Contributor Author

Not reproduced recently.

argo task:fouramf-bitmap-scenes-6km2j
image:master-20241121-b983ef9f-amd64

Closing it for now.
