[Bug]: [benchmark][standalone] Milvus panic panic: runtime error: index out of range [-1] in concurrent dql scene #35505

Closed
1 task done
wangting0128 opened this issue Aug 16, 2024 · 6 comments
Assignees
Labels
kind/bug Issues or changes related a bug test/benchmark benchmark test triage/accepted Indicates an issue or PR is ready to be actively worked on.
Milestone

Comments

@wangting0128
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: master-20240814-c6ae7d4d-amd64
- Deployment mode (standalone or cluster): standalone
- MQ type (rocksmq, pulsar or kafka): rocksmq
- SDK version (e.g. pymilvus v2.0.0rc2): 2.5.0rc36
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

test case name: test_bitmap_locust_pk_varchar_dql_standalone

server:

NAME                                                              READY   STATUS             RESTARTS           AGE     IP              NODE         NOMINATED NODE   READINESS GATES
wt-test-etcd-0                                                    1/1     Running            0                  38h     10.104.18.219   4am-node25   <none>           <none>
wt-test-milvus-standalone-5d9cc7f44f-fbbwg                        1/1     Running            2 (10h ago)        38h     10.104.23.101   4am-node27   <none>           <none>
wt-test-minio-7b8f7b7444-l9znf                                    1/1     Running            0                  38h     10.104.24.28    4am-node29   <none>           <none>
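
The restart count in the pod listing above is the main server-side sign of the panic. As a rough sketch (the `restarted_pods` helper and its regular expression are illustrative and not part of the benchmark framework), the listing can be scanned for pods with a non-zero RESTARTS column like this:

```python
import re

# Pod listing copied from above, trimmed to the columns the sketch needs.
POD_TABLE = """\
NAME                                         READY   STATUS    RESTARTS      AGE
wt-test-etcd-0                               1/1     Running   0             38h
wt-test-milvus-standalone-5d9cc7f44f-fbbwg   1/1     Running   2 (10h ago)   38h
wt-test-minio-7b8f7b7444-l9znf               1/1     Running   0             38h
"""

# RESTARTS is either "0" or "2 (10h ago)", so the parenthesised part is optional.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<ready>\d+/\d+)\s+(?P<status>\S+)\s+"
    r"(?P<restarts>\d+)(?:\s+\((?P<last>[^)]*)\))?\s+"
)


def restarted_pods(table: str) -> dict:
    """Return {pod name: restart count} for pods that restarted at least once."""
    found = {}
    for line in table.splitlines()[1:]:  # skip the header row
        match = ROW.match(line)
        if match and int(match.group("restarts")) > 0:
            found[match.group("name")] = int(match.group("restarts"))
    return found


print(restarted_pods(POD_TABLE))
```

Only the standalone pod is reported here; the etcd and minio pods show no restarts.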
Screenshot 2024-08-16 10 59 27 Screenshot 2024-08-16 10 59 46

client log:
fouram_log.log.zip
image

Expected Behavior

No response

Steps To Reproduce

concurrent test and calculation of RT and QPS

        :purpose:  `primary key: VARCHAR`
            1. build a `BITMAP` index on the VARCHAR primary key and on all 12 supported scalar fields
            2. build `INVERTED`, `Trie`, and `STL_SORT` indexes on the other 22 scalar fields
            3. create 4 vector fields, each of a different vector type
            4. search with different expressions on the BITMAP-indexed fields

        :test steps:
            1. create collection with fields:
                'binary_vector': 128dim
                'float16_vector': 128dim
                'bfloat16_vector': 128dim
                'sparse_float_vector': sparse_range=[1, 100] <- the range of non-zero values of a sparse vector
                'id': primary key type is VARCHAR

                all scalar fields: varchar max_length=100, array max_capacity=9
            2. build indexes:
                BIN_IVF_FLAT: 'binary_vector'
                IVF_SQ8: 'float16_vector'
                HNSW: 'bfloat16_vector'
                SPARSE_WAND: 'sparse_float_vector'
                BITMAP: 'id', '*_1' all supported field names
                INVERTED: 'array_float_1', 'array_double_1', 'float_2', 'double_2', 'bool_2', 'array_int8_2',
                          'array_int16_2', 'array_int32_2', 'array_int64_2', 'array_varchar_2', 'array_bool_2',
                          'array_float_2', 'array_double_2'
                Trie: 'varchar_2'
                STL_SORT: 'float_1', 'double_1', 'int8_2', 'int16_2', 'int32_2', 'int64_2'
            3. insert 5 million rows
                'id': [-1000, 1000)
            4. flush collection
            5. build indexes again using the same params
            6. load collection
            7. send concurrent requests:
                - search
                - query
                - hybrid_search
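
Because `id` is a VARCHAR primary key, the search filter `id >= "100"` used in the concurrent step compares strings rather than numbers. The sketch below is illustrative only: it assumes the ids are the decimal strings of the integers in [-1000, 1000) and that VARCHAR comparison is lexicographic, and it uses plain Python string comparison as a stand-in for the server-side filter.

```python
# Assumption: ids are the decimal strings of the integers in [-1000, 1000).
ids = [str(i) for i in range(-1000, 1000)]

# Lexicographic comparison, standing in for a filter on a VARCHAR field.
lexicographic = [s for s in ids if s >= "100"]

# Numeric comparison, which is what the filter would mean on an INT64 field.
numeric = [s for s in ids if int(s) >= 100]

print(len(lexicographic), len(numeric))

# Strings such as "2" or "11" pass the lexicographic filter even though they
# are numerically below 100; every negative id fails because "-" sorts
# before the digits.
print("2" in lexicographic, "11" in lexicographic, "-5" in lexicographic)
```

Under these assumptions the string filter selects a different, larger set of ids than a numeric reading of the expression would suggest, which is worth keeping in mind when reasoning about how many rows each search has to scan.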

Milvus Log

No response

Anything else?

test result:

[2024-08-15 17:04:58,426 -  INFO - fouram]: Print locust final stats. (locust_runner.py:56)
[2024-08-15 17:04:58,426 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2024-08-15 17:04:58,426 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-08-15 17:04:58,426 -  INFO - fouram]: grpc     hybrid_search                                                                   4551 4551(100.00%) |      0       0       0      0 |    7.54        7.54 (stats.py:789)
[2024-08-15 17:04:58,426 -  INFO - fouram]: grpc     query                                                                           4451 4451(100.00%) |      0       0       0      0 |    7.38        7.38 (stats.py:789)
[2024-08-15 17:04:58,426 -  INFO - fouram]: grpc     search                                                                          4509 4509(100.00%) |      0       0       0      0 |    7.48        7.48 (stats.py:789)
[2024-08-15 17:04:58,426 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-08-15 17:04:58,426 -  INFO - fouram]:          Aggregated                                                                     13511 13511(100.00%) |      0       0       0      0 |   22.40       22.40 (stats.py:789)
[2024-08-15 17:04:58,426 -  INFO - fouram]:  (stats.py:790)
[2024-08-15 17:04:58,430 -  INFO - fouram]: [PerfTemplate] Report data: 
{'server': {'deploy_tool': '',
            'deploy_mode': '',
            'config_name': '',
            'config': {},
            'host': 'wt-test-milvus.qa-milvus.svc.cluster.local',
            'port': '19530',
            'uri': ''},
 'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_bitmap_locust_pk_varchar_dql_standalone',
            'test_case_params': {'dataset_params': {'metric_type': 'JACCARD',
                                                    'vector_field_name': 'binary_vector',
                                                    'dim': 128,
                                                    'sparse_range': [1, 100],
                                                    'max_length': 100,
                                                    'varchar_filled': True,
                                                    'scalars_index': {'id': {'index_type': 'BITMAP'},
                                                                      'int8_1': {'index_type': 'BITMAP'},
                                                                      'int16_1': {'index_type': 'BITMAP'},
                                                                      'int32_1': {'index_type': 'BITMAP'},
                                                                      'int64_1': {'index_type': 'BITMAP'},
                                                                      'varchar_1': {'index_type': 'BITMAP'},
                                                                      'bool_1': {'index_type': 'BITMAP'},
                                                                      'array_int8_1': {'index_type': 'BITMAP'},
                                                                      'array_int16_1': {'index_type': 'BITMAP'},
                                                                      'array_int32_1': {'index_type': 'BITMAP'},
                                                                      'array_int64_1': {'index_type': 'BITMAP'},
                                                                      'array_varchar_1': {'index_type': 'BITMAP'},
                                                                      'array_bool_1': {'index_type': 'BITMAP'},
                                                                      'array_float_1': {'index_type': 'INVERTED'},
                                                                      'array_double_1': {'index_type': 'INVERTED'},
                                                                      'float_2': {'index_type': 'INVERTED'},
                                                                      'double_2': {'index_type': 'INVERTED'},
                                                                      'bool_2': {'index_type': 'INVERTED'},
                                                                      'array_int8_2': {'index_type': 'INVERTED'},
                                                                      'array_int16_2': {'index_type': 'INVERTED'},
                                                                      'array_int32_2': {'index_type': 'INVERTED'},
                                                                      'array_int64_2': {'index_type': 'INVERTED'},
                                                                      'array_varchar_2': {'index_type': 'INVERTED'},
                                                                      'array_bool_2': {'index_type': 'INVERTED'},
                                                                      'array_float_2': {'index_type': 'INVERTED'},
                                                                      'array_double_2': {'index_type': 'INVERTED'},
                                                                      'varchar_2': {'index_type': 'Trie'},
                                                                      'float_1': {'index_type': 'STL_SORT'},
                                                                      'double_1': {'index_type': 'STL_SORT'},
                                                                      'int8_2': {'index_type': 'STL_SORT'},
                                                                      'int16_2': {'index_type': 'STL_SORT'},
                                                                      'int32_2': {'index_type': 'STL_SORT'},
                                                                      'int64_2': {'index_type': 'STL_SORT'}},
                                                    'vectors_index': {'float16_vector': {'index_type': 'IVF_SQ8',
                                                                                         'index_param': {'nlist': 1024},
                                                                                         'metric_type': 'L2'},
                                                                      'bfloat16_vector': {'index_type': 'HNSW',
                                                                                          'index_param': {'M': 8, 'efConstruction': 200},
                                                                                          'metric_type': 'L2'},
                                                                      'sparse_float_vector': {'index_type': 'SPARSE_WAND',
                                                                                              'index_param': {'drop_ratio_build': 0.2},
                                                                                              'metric_type': 'IP'}},
                                                    'scalars_params': {'array_int8_1': {'params': {'max_capacity': 9}},
                                                                       'array_int16_1': {'params': {'max_capacity': 9}},
                                                                       'array_int32_1': {'params': {'max_capacity': 9}},
                                                                       'array_int64_1': {'params': {'max_capacity': 9}},
                                                                       'array_double_1': {'params': {'max_capacity': 9}},
                                                                       'array_float_1': {'params': {'max_capacity': 9}},
                                                                       'array_varchar_1': {'params': {'max_capacity': 9}},
                                                                       'array_bool_1': {'params': {'max_capacity': 9}},
                                                                       'array_int8_2': {'params': {'max_capacity': 9}},
                                                                       'array_int16_2': {'params': {'max_capacity': 9}},
                                                                       'array_int32_2': {'params': {'max_capacity': 9}},
                                                                       'array_int64_2': {'params': {'max_capacity': 9}},
                                                                       'array_double_2': {'params': {'max_capacity': 9}},
                                                                       'array_float_2': {'params': {'max_capacity': 9}},
                                                                       'array_varchar_2': {'params': {'max_capacity': 9}},
                                                                       'array_bool_2': {'params': {'max_capacity': 9}},
                                                                       'id': {'other_params': {'dataset': 'random_algorithm',
                                                                                               'algorithm_params': {'algorithm_name': 'specify_scope',
                                                                                                                    'specify_range': [-1000, 1000],
                                                                                                                    'max_capacity': 1}}}},
                                                    'dataset_name': 'local',
                                                    'dataset_size': 5000000,
                                                    'ni_per': 5000},
                                 'collection_params': {'other_fields': ['float16_vector', 'bfloat16_vector', 'sparse_float_vector', 'int8_1', 'int16_1',
                                                                        'int32_1', 'int64_1', 'double_1', 'float_1', 'varchar_1', 'bool_1', 'json_1',
                                                                        'array_int8_1', 'array_int16_1', 'array_int32_1', 'array_int64_1', 'array_double_1',
                                                                        'array_float_1', 'array_varchar_1', 'array_bool_1', 'int8_2', 'int16_2', 'int32_2',
                                                                        'int64_2', 'double_2', 'float_2', 'varchar_2', 'bool_2', 'json_2', 'array_int8_2',
                                                                        'array_int16_2', 'array_int32_2', 'array_int64_2', 'array_double_2', 'array_float_2',
                                                                        'array_varchar_2', 'array_bool_2'],
                                                       'shards_num': 2,
                                                       'varchar_id': True},
                                 'resource_groups_params': {'reset': False},
                                 'database_user_params': {'reset_rbac': False, 'reset_db': False},
                                 'index_params': {'index_type': 'BIN_IVF_FLAT', 'index_param': {'nlist': 2048}},
                                 'concurrent_params': {'concurrent_number': 20, 'during_time': 1800, 'interval': 20, 'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'search',
                                                       'weight': 1,
                                                       'params': {'nq': 1000,
                                                                  'top_k': 10,
                                                                  'search_param': {'nprobe': 16},
                                                                  'expr': 'id >= "100"',
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'output_fields': ['*'],
                                                                  'ignore_growing': False,
                                                                  'group_by_field': None,
                                                                  'timeout': None,
                                                                  'random_data': True,
                                                                  'check_task': 'check_search_output',
                                                                  'check_items': {'output_fields': ['float16_vector', 'bfloat16_vector', 'sparse_float_vector',
                                                                                                    'int8_1', 'int16_1', 'int32_1', 'int64_1', 'double_1',
                                                                                                    'float_1', 'varchar_1', 'bool_1', 'json_1', 'array_int8_1',
                                                                                                    'array_int16_1', 'array_int32_1', 'array_int64_1',
                                                                                                    'array_double_1', 'array_float_1', 'array_varchar_1',
                                                                                                    'array_bool_1', 'int8_2', 'int16_2', 'int32_2', 'int64_2',
                                                                                                    'double_2', 'float_2', 'varchar_2', 'bool_2', 'json_2',
                                                                                                    'array_int8_2', 'array_int16_2', 'array_int32_2',
                                                                                                    'array_int64_2', 'array_double_2', 'array_float_2',
                                                                                                    'array_varchar_2', 'array_bool_2', 'id',
                                                                                                    'binary_vector']}}},
                                                      {'type': 'query',
                                                       'weight': 1,
                                                       'params': {'ids': None,
                                                                  'expr': 'id > "-1" && ',
                                                                  'output_fields': ['id', 'binary_vector', 'int64_1'],
                                                                  'offset': None,
                                                                  'limit': None,
                                                                  'ignore_growing': False,
                                                                  'partition_names': None,
                                                                  'timeout': None,
                                                                  'random_data': True,
                                                                  'random_count': 10,
                                                                  'random_range': [-1000, 1000],
                                                                  'field_name': 'id',
                                                                  'field_type': 'varchar',
                                                                  'check_task': 'check_query_output',
                                                                  'check_items': None}},
                                                      {'type': 'hybrid_search',
                                                       'weight': 1,
                                                       'params': {'nq': 10,
                                                                  'top_k': 10,
                                                                  'reqs': [{'search_param': {'nprobe': 128},
                                                                            'anns_field': 'binary_vector',
                                                                            'expr': '(int64_1 % 10) == 1',
                                                                            'top_k': 100},
                                                                           {'search_param': {'nprobe': 64},
                                                                            'anns_field': 'float16_vector',
                                                                            'expr': 'ARRAY_LENGTH(array_int16_1) >= 5 && array_contains_any(array_bool_1, '
                                                                                    '[True])',
                                                                            'top_k': 10},
                                                                           {'search_param': {'ef': 32},
                                                                            'anns_field': 'bfloat16_vector',
                                                                            'expr': '(int32_1 % 100) <= 50',
                                                                            'top_k': 30},
                                                                           {'search_param': {'drop_ratio_search': 0.1},
                                                                            'anns_field': 'sparse_float_vector',
                                                                            'expr': '(varchar_1 like "1%") && (bool_1 == True)'}],
                                                                  'rerank': {'RRFRanker': []},
                                                                  'output_fields': ['*'],
                                                                  'ignore_growing': False,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'timeout': None,
                                                                  'random_data': True,
                                                                  'check_task': 'check_search_output',
                                                                  'check_items': {'output_fields': ['float16_vector', 'bfloat16_vector', 'sparse_float_vector',
                                                                                                    'int8_1', 'int16_1', 'int32_1', 'int64_1', 'double_1',
                                                                                                    'float_1', 'varchar_1', 'bool_1', 'json_1', 'array_int8_1',
                                                                                                    'array_int16_1', 'array_int32_1', 'array_int64_1',
                                                                                                    'array_double_1', 'array_float_1', 'array_varchar_1',
                                                                                                    'array_bool_1', 'int8_2', 'int16_2', 'int32_2', 'int64_2',
                                                                                                    'double_2', 'float_2', 'varchar_2', 'bool_2', 'json_2',
                                                                                                    'array_int8_2', 'array_int16_2', 'array_int32_2',
                                                                                                    'array_int64_2', 'array_double_2', 'array_float_2',
                                                                                                    'array_varchar_2', 'array_bool_2', 'id', 'binary_vector'],
                                                                                  'nq': 10}}}]},
            'run_id': 2024081557998754,
            'datetime': '2024-08-15 09:56:39.518759',
            'client_version': '2.2'},
 'result': {'test_result': {'index': {'RT': 12317.9538,
                                      'float16_vector': {'RT': 4527.8769},
                                      'bfloat16_vector': {'RT': 2034.6799},
                                      'sparse_float_vector': {'RT': 1229.0079},
                                      'id': {'RT': 384.1398},
                                      'int8_1': {'RT': 0.5238},
                                      'int16_1': {'RT': 0.5216},
                                      'int32_1': {'RT': 0.52},
                                      'int64_1': {'RT': 0.5199},
                                      'varchar_1': {'RT': 0.5222},
                                      'bool_1': {'RT': 0.5216},
                                      'array_int8_1': {'RT': 0.5227},
                                      'array_int16_1': {'RT': 0.5211},
                                      'array_int32_1': {'RT': 0.5208},
                                      'array_int64_1': {'RT': 0.5202},
                                      'array_varchar_1': {'RT': 0.7305},
                                      'array_bool_1': {'RT': 0.5233},
                                      'array_float_1': {'RT': 0.5232},
                                      'array_double_1': {'RT': 0.5213},
                                      'float_2': {'RT': 0.5218},
                                      'double_2': {'RT': 0.5194},
                                      'bool_2': {'RT': 0.5211},
                                      'array_int8_2': {'RT': 0.5193},
                                      'array_int16_2': {'RT': 0.5214},
                                      'array_int32_2': {'RT': 0.5256},
                                      'array_int64_2': {'RT': 0.5214},
                                      'array_varchar_2': {'RT': 0.5258},
                                      'array_bool_2': {'RT': 0.5227},
                                      'array_float_2': {'RT': 0.5225},
                                      'array_double_2': {'RT': 0.5216},
                                      'varchar_2': {'RT': 0.5231},
                                      'float_1': {'RT': 0.5207},
                                      'double_1': {'RT': 0.5223},
                                      'int8_2': {'RT': 0.5224},
                                      'int16_2': {'RT': 0.526},
                                      'int32_2': {'RT': 0.521},
                                      'int64_2': {'RT': 0.5249}},
                            'insert': {'total_time': 1049.0983, 'VPS': 4765.9976, 'batch_time': 1.0491, 'batch': 5000},
                            'flush': {'RT': 2.519},
                            'load': {'RT': 557.6463},
                            'Locust': {'Aggregated': {'Requests': 13511,
                                                      'Fails': 13511,
                                                      'RPS': 22.4,
                                                      'fail_s': 1.0,
                                                      'RT_max': 0,
                                                      'RT_avg': 0.0,
                                                      'TP50': 0,
                                                      'TP99': 0},
                                       'hybrid_search': {'Requests': 4551,
                                                         'Fails': 4551,
                                                         'RPS': 7.54,
                                                         'fail_s': 1.0,
                                                         'RT_max': 0,
                                                         'RT_avg': 0.0,
                                                         'TP50': 0,
                                                         'TP99': 0},
                                       'query': {'Requests': 4451, 'Fails': 4451, 'RPS': 7.38, 'fail_s': 1.0, 'RT_max': 0, 'RT_avg': 0.0, 'TP50': 0, 'TP99': 0},
                                       'search': {'Requests': 4509,
                                                  'Fails': 4509,
                                                  'RPS': 7.48,
                                                  'fail_s': 1.0,
                                                  'RT_max': 0,
                                                  'RT_avg': 0.0,
                                                  'TP50': 0,
                                                  'TP99': 0}}}}}
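
As a quick sanity check of the report above, the following sketch recomputes the aggregated Locust counters and the insert throughput from the per-task numbers. The values are copied from the report; the `summarize` helper is illustrative, and reading `VPS` as rows inserted per second is an assumption.

```python
# Per-task Locust counters copied from the report.
locust = {
    "hybrid_search": {"Requests": 4551, "Fails": 4551, "RPS": 7.54},
    "query": {"Requests": 4451, "Fails": 4451, "RPS": 7.38},
    "search": {"Requests": 4509, "Fails": 4509, "RPS": 7.48},
}
aggregated = {"Requests": 13511, "Fails": 13511, "RPS": 22.4}


def summarize(tasks: dict) -> dict:
    """Sum per-task counters and derive the overall failure ratio."""
    requests = sum(t["Requests"] for t in tasks.values())
    fails = sum(t["Fails"] for t in tasks.values())
    rps = round(sum(t["RPS"] for t in tasks.values()), 2)
    return {"Requests": requests, "Fails": fails, "RPS": rps,
            "fail_ratio": fails / requests}


summary = summarize(locust)
print(summary)

# Insert phase: dataset_size rows written in batches of ni_per rows.
dataset_size, ni_per, total_time = 5_000_000, 5000, 1049.0983
rows_per_second = dataset_size / total_time      # assumed meaning of 'VPS'
batch_time = total_time / (dataset_size / ni_per)
print(round(rows_per_second, 4), round(batch_time, 4))
```

The per-task counters add up to the `Aggregated` row, and every request in all three task types failed, which is consistent with the server being unavailable for the whole concurrent phase rather than one request type being affected.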
@wangting0128 wangting0128 added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. test/benchmark benchmark test labels Aug 16, 2024
@wangting0128 wangting0128 added this to the 2.5.0 milestone Aug 16, 2024
@yanliang567 yanliang567 added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 16, 2024
@yanliang567 yanliang567 removed their assignment Aug 16, 2024
@xiaofan-luan
Collaborator

/assign @longjiquan

@wangting0128
Contributor Author

Reproduced again

argo task: fouramf-l2hz4
test case name: test_bitmap_locust_pk_varchar_dql_cluster
image: master-20240827-9868fe4e-amd64

server:
querynode panic

[2024-08-27 13:40:06,810 -  INFO - fouram]: [Base] Deploy initial state: 
I0827 09:57:34.093466    1153 request.go:665] Waited for 1.119604591s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/triggers.tekton.dev/v1alpha1?timeout=32s
I0827 09:57:44.297915    1153 request.go:665] Waited for 4.202034525s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/storage.k8s.io/v1beta1?timeout=32s
NAME                                                       READY   STATUS      RESTARTS         AGE     IP              NODE         NOMINATED NODE   READINESS GATES
fouramf-l2hz4-66-6778-etcd-0                               1/1     Running     0                5m21s   10.104.16.33    4am-node21   <none>           <none>
fouramf-l2hz4-66-6778-etcd-1                               1/1     Running     0                5m20s   10.104.23.208   4am-node27   <none>           <none>
fouramf-l2hz4-66-6778-etcd-2                               1/1     Running     0                5m20s   10.104.34.112   4am-node37   <none>           <none>
fouramf-l2hz4-66-6778-milvus-datanode-546fb74858-9zwml     1/1     Running     1 (4m45s ago)    5m21s   10.104.13.84    4am-node16   <none>           <none>
fouramf-l2hz4-66-6778-milvus-indexnode-5bfc648777-5dqc2    1/1     Running     1 (4m53s ago)    5m21s   10.104.9.92     4am-node14   <none>           <none>
fouramf-l2hz4-66-6778-milvus-indexnode-5bfc648777-5dtcj    1/1     Running     1 (4m53s ago)    5m21s   10.104.4.43     4am-node11   <none>           <none>
fouramf-l2hz4-66-6778-milvus-indexnode-5bfc648777-69hxr    1/1     Running     1 (4m48s ago)    5m21s   10.104.1.28     4am-node10   <none>           <none>
fouramf-l2hz4-66-6778-milvus-indexnode-5bfc648777-w7bft    1/1     Running     0                5m21s   10.104.14.245   4am-node18   <none>           <none>
fouramf-l2hz4-66-6778-milvus-mixcoord-686899dc84-t2znf     1/1     Running     0                5m21s   10.104.14.246   4am-node18   <none>           <none>
fouramf-l2hz4-66-6778-milvus-proxy-6c4bf6b946-ldn4b        1/1     Running     0                5m21s   10.104.14.242   4am-node18   <none>           <none>
fouramf-l2hz4-66-6778-milvus-querynode-67cfdc8777-q42sv    1/1     Running     0                5m21s   10.104.6.59     4am-node13   <none>           <none>
fouramf-l2hz4-66-6778-minio-0                              1/1     Running     0                5m21s   10.104.30.138   4am-node38   <none>           <none>
fouramf-l2hz4-66-6778-minio-1                              1/1     Running     0                5m20s   10.104.16.34    4am-node21   <none>           <none>
fouramf-l2hz4-66-6778-minio-2                              1/1     Running     0                5m20s   10.104.27.114   4am-node31   <none>           <none>
fouramf-l2hz4-66-6778-minio-3                              1/1     Running     0                5m20s   10.104.23.211   4am-node27   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-bookie-0                      1/1     Running     0                5m21s   10.104.30.139   4am-node38   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-bookie-1                      1/1     Running     0                5m20s   10.104.18.207   4am-node25   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-bookie-2                      1/1     Running     0                5m20s   10.104.16.35    4am-node21   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-bookie-init-664ww             0/1     Completed   0                5m21s   10.104.14.247   4am-node18   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-broker-0                      1/1     Running     0                5m20s   10.104.9.93     4am-node14   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-proxy-0                       1/1     Running     0                5m20s   10.104.14.248   4am-node18   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-pulsar-init-l5tkt             0/1     Completed   0                5m21s   10.104.4.41     4am-node11   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-recovery-0                    1/1     Running     0                5m21s   10.104.4.42     4am-node11   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-zookeeper-0                   1/1     Running     0                5m21s   10.104.16.31    4am-node21   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-zookeeper-1                   1/1     Running     0                4m32s   10.104.23.213   4am-node27   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-zookeeper-2                   1/1     Running     0                3m57s   10.104.27.116   4am-node31   <none>           <none> (base.py:261)
[2024-08-27 13:40:06,810 -  INFO - fouram]: [Cmd Exe]  kubectl get pods  -n qa-milvus  -o wide | grep -E 'NAME|fouramf-l2hz4-66-6778-milvus|fouramf-l2hz4-66-6778-minio|fouramf-l2hz4-66-6778-etcd|fouramf-l2hz4-66-6778-pulsar|fouramf-l2hz4-66-6778-zookeeper|fouramf-l2hz4-66-6778-kafka|fouramf-l2hz4-66-6778-log|fouramf-l2hz4-66-6778-tikv'  (util_cmd.py:14)
[2024-08-27 13:40:27,301 -  INFO - fouram]: [CliClient] pod details of release(fouramf-l2hz4-66-6778): 
 I0827 13:40:08.071193    1225 request.go:665] Waited for 1.162876928s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/discovery.k8s.io/v1?timeout=32s
I0827 13:40:18.071448    1225 request.go:665] Waited for 3.996499094s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/tekton.dev/v1?timeout=32s
NAME                                                              READY   STATUS      RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
fouramf-l2hz4-66-6778-etcd-0                                      1/1     Running     0               3h47m   10.104.16.33    4am-node21   <none>           <none>
fouramf-l2hz4-66-6778-etcd-1                                      1/1     Running     0               3h47m   10.104.23.208   4am-node27   <none>           <none>
fouramf-l2hz4-66-6778-etcd-2                                      1/1     Running     0               3h47m   10.104.34.112   4am-node37   <none>           <none>
fouramf-l2hz4-66-6778-milvus-datanode-546fb74858-9zwml            1/1     Running     1 (3h47m ago)   3h47m   10.104.13.84    4am-node16   <none>           <none>
fouramf-l2hz4-66-6778-milvus-indexnode-5bfc648777-5dqc2           1/1     Running     1 (3h47m ago)   3h47m   10.104.9.92     4am-node14   <none>           <none>
fouramf-l2hz4-66-6778-milvus-indexnode-5bfc648777-5dtcj           1/1     Running     1 (3h47m ago)   3h47m   10.104.4.43     4am-node11   <none>           <none>
fouramf-l2hz4-66-6778-milvus-indexnode-5bfc648777-69hxr           1/1     Running     1 (3h47m ago)   3h47m   10.104.1.28     4am-node10   <none>           <none>
fouramf-l2hz4-66-6778-milvus-indexnode-5bfc648777-w7bft           1/1     Running     0               3h47m   10.104.14.245   4am-node18   <none>           <none>
fouramf-l2hz4-66-6778-milvus-mixcoord-686899dc84-t2znf            1/1     Running     0               3h47m   10.104.14.246   4am-node18   <none>           <none>
fouramf-l2hz4-66-6778-milvus-proxy-6c4bf6b946-ldn4b               1/1     Running     0               3h47m   10.104.14.242   4am-node18   <none>           <none>
fouramf-l2hz4-66-6778-milvus-querynode-67cfdc8777-q42sv           1/1     Running     1 (80m ago)     3h47m   10.104.6.59     4am-node13   <none>           <none>
fouramf-l2hz4-66-6778-minio-0                                     1/1     Running     0               3h47m   10.104.30.138   4am-node38   <none>           <none>
fouramf-l2hz4-66-6778-minio-1                                     1/1     Running     0               3h47m   10.104.16.34    4am-node21   <none>           <none>
fouramf-l2hz4-66-6778-minio-2                                     1/1     Running     0               3h47m   10.104.27.114   4am-node31   <none>           <none>
fouramf-l2hz4-66-6778-minio-3                                     1/1     Running     0               3h47m   10.104.23.211   4am-node27   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-bookie-0                             1/1     Running     0               3h47m   10.104.30.139   4am-node38   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-bookie-1                             1/1     Running     0               3h47m   10.104.18.207   4am-node25   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-bookie-2                             1/1     Running     0               3h47m   10.104.16.35    4am-node21   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-bookie-init-664ww                    0/1     Completed   0               3h47m   10.104.14.247   4am-node18   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-broker-0                             1/1     Running     0               3h47m   10.104.9.93     4am-node14   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-proxy-0                              1/1     Running     0               3h47m   10.104.14.248   4am-node18   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-pulsar-init-l5tkt                    0/1     Completed   0               3h47m   10.104.4.41     4am-node11   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-recovery-0                           1/1     Running     0               3h47m   10.104.4.42     4am-node11   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-zookeeper-0                          1/1     Running     0               3h47m   10.104.16.31    4am-node21   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-zookeeper-1                          1/1     Running     0               3h47m   10.104.23.213   4am-node27   <none>           <none>
fouramf-l2hz4-66-6778-pulsar-zookeeper-2                          1/1     Running     0               3h46m   10.104.27.116   4am-node31   <none>           <none>
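
The two pod listings above differ mainly in the RESTARTS column: the querynode goes from `0` in the initial state to `1 (80m ago)` in the later snapshot. As a rough sketch for spotting this kind of change automatically, the snippet below compares two pasted `kubectl get pods -o wide` tables. The function names and the regular expression are illustrative assumptions and are not part of the fouram test framework.

```python
import re

# Matches a pod row: NAME, READY (e.g. 1/1), STATUS, RESTARTS with an optional
# "(... ago)" suffix, then AGE. Header and klog lines do not match and are skipped.
ROW = re.compile(r"^(\S+)\s+(\d+/\d+)\s+(\S+)\s+(\d+)(?:\s+\([^)]*\))?\s+(\S+)")


def restarts(table):
    """Return {pod name: restart count} parsed from pasted kubectl output."""
    counts = {}
    for line in table.splitlines():
        match = ROW.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(4))
    return counts


def restarted_between(before, after):
    """Return {pod name: (old count, new count)} for pods whose count grew."""
    old, new = restarts(before), restarts(after)
    return {
        name: (old.get(name, 0), count)
        for name, count in new.items()
        if count > old.get(name, 0)
    }


# Rows abbreviated from the two listings in this comment.
BEFORE = """
NAME                                                       READY   STATUS      RESTARTS         AGE
fouramf-l2hz4-66-6778-milvus-datanode-546fb74858-9zwml     1/1     Running     1 (4m45s ago)    5m21s
fouramf-l2hz4-66-6778-milvus-querynode-67cfdc8777-q42sv    1/1     Running     0                5m21s
"""
AFTER = """
NAME                                                       READY   STATUS      RESTARTS        AGE
fouramf-l2hz4-66-6778-milvus-datanode-546fb74858-9zwml     1/1     Running     1 (3h47m ago)   3h47m
fouramf-l2hz4-66-6778-milvus-querynode-67cfdc8777-q42sv    1/1     Running     1 (80m ago)     3h47m
"""

print(restarted_between(BEFORE, AFTER))
```

Only the querynode is reported here, because the datanode restart happened before the first snapshot was taken.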
Screenshot 2024-08-29 11 44 23

concurrent_number=100
client log:
fouram_log.log
Screenshot 2024-08-29 11 43 39

@longjiquan
Contributor

This issue can be reproduced under the following conditions:

  1. There are many segments.
  2. In the reduce phase, a segment returns a result the first time it is selected, and that result has the same pk as the result from the previously selected segment but a larger timestamp.
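
To make the second condition concrete, here is a hypothetical sketch of a reduce step that deduplicates rows by pk across segments and keeps the row with the largest timestamp. It is not the Milvus reduce code; `merge_latest` and the `(pk, ts)` row layout are assumptions made for illustration. One plausible reading of the condition is that a reducer which overwrites "the last offset picked from the current segment" finds that list empty on the segment's first selection, so `len(list) - 1` evaluates to -1, and in Go indexing a slice with -1 panics with `index out of range [-1]`. The sketch avoids that by recording the winning segment and offset per pk before building the per-segment selections.

```python
def merge_latest(segment_results):
    """Pick, per pk, the row with the largest timestamp across all segments.

    segment_results: one list per segment, each holding (pk, ts) tuples.
    Returns one sorted list of selected row offsets per segment.
    """
    winner = {}  # pk -> (ts, segment index, offset inside that segment)
    for seg, rows in enumerate(segment_results):
        for offset, (pk, ts) in enumerate(rows):
            # First sighting of pk, or a newer version of an already seen pk.
            if pk not in winner or ts > winner[pk][0]:
                winner[pk] = (ts, seg, offset)

    # Build the selections only after every duplicate has been resolved, so no
    # step has to touch the "last entry" of a list that may still be empty.
    selected = [[] for _ in segment_results]
    for _, seg, offset in winner.values():
        selected[seg].append(offset)
    for offsets in selected:
        offsets.sort()
    return selected


# Segment 1 is selected for the first time with the same pk as segment 0
# and a larger timestamp, which is the situation described above.
print(merge_latest([[("a", 1)], [("a", 2)]]))  # prints [[], [0]]
```

The trade-off in this sketch is an extra pass over the winners in exchange for never indexing relative to the end of a per-segment list.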

@longjiquan
Contributor

I believe @congqixia can help to solve this issue.
/assign @congqixia

@wangting0128
Contributor Author

wangting0128 commented Sep 10, 2024

different case, same panic

argo task: fouramf-bitmap-scenes-7jvtx
test case name: test_bitmap_locust_dql_dml_partition_key_repeated_cluster

server:
queryNode panic

[2024-09-09 10:07:49,366 -  INFO - fouram]: [Base] Deploy initial state: 
I0909 06:37:01.971185     399 request.go:665] Waited for 1.178444307s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/acme.cert-manager.io/v1?timeout=32s
I0909 06:37:12.170349     399 request.go:665] Waited for 4.193803599s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/resolution.tekton.dev/v1beta1?timeout=32s
NAME                                                              READY   STATUS             RESTARTS          AGE     IP              NODE         NOMINATED NODE   READINESS GATES
fouramf-bitmap-scenes-7jvtx-12-etcd-0                             1/1     Running            0                 5m25s   10.104.16.93    4am-node21   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-etcd-1                             1/1     Running            0                 5m25s   10.104.15.168   4am-node20   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-etcd-2                             1/1     Running            0                 5m24s   10.104.17.68    4am-node23   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-datanode-5fff94449b-6xq25   1/1     Running            2 (4m39s ago)     5m25s   10.104.13.9     4am-node16   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-datanode-5fff94449b-x9rdf   1/1     Running            3 (4m26s ago)     5m25s   10.104.32.240   4am-node39   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-datanode-5fff94449b-xfhq9   1/1     Running            3 (4m28s ago)     5m25s   10.104.30.230   4am-node38   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-indexnode-58d7d4d6cc6xdjs   1/1     Running            3 (4m27s ago)     5m25s   10.104.19.43    4am-node28   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-indexnode-58d7d4d6ccfdc8q   1/1     Running            3 (4m29s ago)     5m25s   10.104.34.148   4am-node37   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-mixcoord-5fd57b48f8-cjhbk   1/1     Running            3 (4m29s ago)     5m25s   10.104.32.238   4am-node39   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-proxy-69fc7d7b6-mg79j       1/1     Running            3 (4m27s ago)     5m25s   10.104.32.239   4am-node39   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-querynode-5dc96d68c-gq68k   1/1     Running            3 (4m28s ago)     5m25s   10.104.14.27    4am-node18   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-minio-0                            1/1     Running            0                 5m25s   10.104.16.90    4am-node21   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-minio-1                            1/1     Running            0                 5m25s   10.104.15.166   4am-node20   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-minio-2                            1/1     Running            0                 5m25s   10.104.17.67    4am-node23   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-minio-3                            1/1     Running            0                 5m25s   10.104.32.249   4am-node39   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-bookie-0                    1/1     Running            0                 5m25s   10.104.17.65    4am-node23   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-bookie-1                    1/1     Running            0                 5m25s   10.104.16.94    4am-node21   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-bookie-2                    1/1     Running            0                 5m24s   10.104.19.46    4am-node28   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-bookie-init-jbv87           0/1     Completed          0                 5m25s   10.104.32.241   4am-node39   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-broker-0                    1/1     Running            0                 5m25s   10.104.15.162   4am-node20   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-proxy-0                     1/1     Running            0                 5m25s   10.104.5.167    4am-node12   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-pulsar-init-mpb76           0/1     Completed          0                 5m25s   10.104.14.26    4am-node18   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-recovery-0                  1/1     Running            0                 5m25s   10.104.4.28     4am-node11   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-zookeeper-0                 1/1     Running            0                 5m25s   10.104.17.63    4am-node23   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-zookeeper-1                 1/1     Running            0                 4m31s   10.104.15.172   4am-node20   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-zookeeper-2                 1/1     Running            0                 3m55s   10.104.16.96    4am-node21   <none>           <none> (base.py:261)
[2024-09-09 10:07:49,366 -  INFO - fouram]: [Cmd Exe]  kubectl get pods  -n qa-milvus  -o wide | grep -E 'NAME|fouramf-bitmap-scenes-7jvtx-12-milvus|fouramf-bitmap-scenes-7jvtx-12-minio|fouramf-bitmap-scenes-7jvtx-12-etcd|fouramf-bitmap-scenes-7jvtx-12-pulsar|fouramf-bitmap-scenes-7jvtx-12-zookeeper|fouramf-bitmap-scenes-7jvtx-12-kafka|fouramf-bitmap-scenes-7jvtx-12-log|fouramf-bitmap-scenes-7jvtx-12-tikv'  (util_cmd.py:14)
[2024-09-09 10:08:10,038 -  INFO - fouram]: [CliClient] pod details of release(fouramf-bitmap-scenes-7jvtx-12): 
 I0909 10:07:50.856990     504 request.go:665] Waited for 1.171001391s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/milvus.io/v1alpha1?timeout=32s
I0909 10:08:00.857864     504 request.go:665] Waited for 3.998044935s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/discovery.k8s.io/v1?timeout=32s
NAME                                                              READY   STATUS             RESTARTS          AGE     IP              NODE         NOMINATED NODE   READINESS GATES
fouramf-bitmap-scenes-7jvtx-12-etcd-0                             1/1     Running            0                 3h36m   10.104.16.93    4am-node21   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-etcd-1                             1/1     Running            0                 3h36m   10.104.15.168   4am-node20   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-etcd-2                             1/1     Running            0                 3h36m   10.104.17.68    4am-node23   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-datanode-5fff94449b-6xq25   1/1     Running            2 (3h35m ago)     3h36m   10.104.13.9     4am-node16   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-datanode-5fff94449b-x9rdf   1/1     Running            3 (3h35m ago)     3h36m   10.104.32.240   4am-node39   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-datanode-5fff94449b-xfhq9   1/1     Running            3 (3h35m ago)     3h36m   10.104.30.230   4am-node38   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-indexnode-58d7d4d6cc6xdjs   1/1     Running            3 (3h35m ago)     3h36m   10.104.19.43    4am-node28   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-indexnode-58d7d4d6ccfdc8q   1/1     Running            3 (3h35m ago)     3h36m   10.104.34.148   4am-node37   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-mixcoord-5fd57b48f8-cjhbk   1/1     Running            3 (3h35m ago)     3h36m   10.104.32.238   4am-node39   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-proxy-69fc7d7b6-mg79j       1/1     Running            3 (3h35m ago)     3h36m   10.104.32.239   4am-node39   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-milvus-querynode-5dc96d68c-gq68k   1/1     Running            5 (19m ago)       3h36m   10.104.14.27    4am-node18   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-minio-0                            1/1     Running            0                 3h36m   10.104.16.90    4am-node21   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-minio-1                            1/1     Running            0                 3h36m   10.104.15.166   4am-node20   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-minio-2                            1/1     Running            0                 3h36m   10.104.17.67    4am-node23   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-minio-3                            1/1     Running            0                 3h36m   10.104.32.249   4am-node39   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-bookie-0                    1/1     Running            0                 3h36m   10.104.17.65    4am-node23   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-bookie-1                    1/1     Running            0                 3h36m   10.104.16.94    4am-node21   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-bookie-2                    1/1     Running            0                 3h36m   10.104.19.46    4am-node28   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-bookie-init-jbv87           0/1     Completed          0                 3h36m   10.104.32.241   4am-node39   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-broker-0                    1/1     Running            0                 3h36m   10.104.15.162   4am-node20   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-proxy-0                     1/1     Running            0                 3h36m   10.104.5.167    4am-node12   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-pulsar-init-mpb76           0/1     Completed          0                 3h36m   10.104.14.26    4am-node18   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-recovery-0                  1/1     Running            0                 3h36m   10.104.4.28     4am-node11   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-zookeeper-0                 1/1     Running            0                 3h36m   10.104.17.63    4am-node23   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-zookeeper-1                 1/1     Running            0                 3h35m   10.104.15.172   4am-node20   <none>           <none>
fouramf-bitmap-scenes-7jvtx-12-pulsar-zookeeper-2                 1/1     Running            0                 3h34m   10.104.16.96    4am-node21   <none>           <none> 
Screenshot 2024-09-10 11 49 44 Screenshot 2024-09-10 11 50 12

client log:
client.log

test steps:

        concurrent test and calculation of RT and QPS

        :purpose:  `partition_key on scalar int64_1 field`, shards_num=16
            verify the DQL & DML scenario,
            which has 1 vector field (IVF_SQ8) and builds a `BITMAP` index on all 12 supported scalar fields

        :test steps:
            1. create collection with fields:
                'float_vector': 128dim

                'int64_1': partition_key, num_partitions=1024, data range: 0 ~ 9
                all scalar fields: varchar max_length=100, array max_capacity=13
            2. build indexes:
                IVF_SQ8: 'float_vector'

                BITMAP: all scalar fields
            3. insert 5 million entities
            4. flush collection
            5. build indexes again using the same params
            6. load collection
                replica: 1
            7. concurrent request:
                - search
                - query
                - hybrid_search
                - load
                - insert
                - delete: delete all
                - flush: ignore RateLimiter

congqixia added a commit to congqixia/milvus that referenced this issue Sep 13, 2024
sre-ci-robot pushed a commit that referenced this issue Sep 14, 2024
congqixia added a commit to congqixia/milvus that referenced this issue Sep 14, 2024
sre-ci-robot pushed a commit that referenced this issue Sep 14, 2024
…36274)

Cherry-pick from master
pr: #35826
Related to #35505

---------

Signed-off-by: Congqi Xia <[email protected]>
congqixia added a commit to congqixia/milvus that referenced this issue Sep 23, 2024
sre-ci-robot pushed a commit that referenced this issue Sep 24, 2024
congqixia added a commit to congqixia/milvus that referenced this issue Sep 24, 2024
sre-ci-robot pushed a commit that referenced this issue Sep 24, 2024
… multi segments (#36433) (#36460)

Cherry-pick from master
pr: #36433
Related to #35505 #36362

Signed-off-by: Congqi Xia <[email protected]>
@wangting0128
Contributor Author

verification passed

image: master-20240920-bfd68cc0-amd64
