Client and server interaction has a long latency. #2011
Hi @Daniel-blue, thanks for the report. How do you start vineyardd? It would help to include your test code here so we can identify the source of the latency.
`put: create_buffer_request --> seal_request --> create_data_request --> persist_request --> put_name_request`
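Since these five requests are issued sequentially, the network round-trip time is paid once per request. A back-of-the-envelope sketch of how that adds up (the 0.5 ms RTT is an assumed figure for illustration, not a measurement):

```python
# Estimate how sequential client-server round trips accumulate for one put().
# The RTT value below is an assumption for illustration, not a measurement.
rtt_ms = 0.5  # assumed client-server round-trip time in milliseconds

put_requests = [
    "create_buffer_request",
    "seal_request",
    "create_data_request",
    "persist_request",
    "put_name_request",
]

per_put_ms = rtt_ms * len(put_requests)
print(f"one put() ~= {per_put_ms:.1f} ms of pure network latency")

# Writing 1000 chunks one by one pays that cost 1000 times.
chunks = 1000
total_s = per_put_ms * chunks / 1000
print(f"{chunks} puts ~= {total_s:.2f} s spent on round trips alone")
```

Even a sub-millisecond RTT becomes seconds of overhead once thousands of small objects are written one at a time, which matches the behavior reported below.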
Hi @Daniel-blue, how do you start vineyardd? Could you please provide some details about it?
Have you started etcd?
We deployed the Vineyard server and client following the guide at https://v6d.io/docs.html, then used `kubectl exec` to enter the client pod and ran the operations with Python 3.
Hi @Daniel-blue. Thanks for the details. In the first part, you can reduce the memory allocation in the vineyard server by adding the … In the second part:

```python
import time
from threading import Thread

import numpy as np

import vineyard
from vineyard.io.recordbatch import RecordBatchStream

chunk_size = 1000

def stream_producer(vineyard_client):
    data = np.random.rand(10, 10).astype(np.float32)
    stream = RecordBatchStream.new(vineyard_client)
    vineyard_client.persist(stream.id)
    vineyard_client.put_name(stream.id, "stream11")

    # Put all chunks up front so only the stream writes are timed below.
    chunk_list = []
    for _ in range(chunk_size):
        chunk_id = vineyard_client.put(data)
        chunk_list.append(chunk_id)

    start = time.time()
    writer = stream.open_writer(vineyard_client)
    for chunk_id in chunk_list:  # append each chunk, not the last id repeatedly
        writer.append(chunk_id)
    writer.finish()
    end = time.time()
    per_chunk = (end - start) / chunk_size
    print(f"Producer sent {chunk_size} chunks in {end - start:.5f} seconds, per chunk cost {per_chunk:.5f} seconds")

def stream_consumer(vineyard_client):
    start = time.time()
    stream_id = vineyard_client.get_name("stream11", wait=True)
    stream = vineyard_client.get(stream_id)
    reader = stream.open_reader(vineyard_client)
    count = 0
    while True:
        try:
            chunk_id = reader.next_chunk_id()
            # data = vineyard_client.get(chunk_id)
            count += 1
        except StopIteration:
            break
    end = time.time()
    per_chunk = (end - start) / chunk_size
    print(f"Consumer received {count} chunks in {end - start:.5f} seconds, per chunk cost {per_chunk:.5f} seconds")

if __name__ == "__main__":
    endpoint = "172.20.6.103:9600"
    rpc_client = vineyard.connect(endpoint=endpoint)

    producer_thread = Thread(target=stream_producer, args=(rpc_client,))
    producer_thread.start()
    producer_thread.join()

    consumer_thread = Thread(target=stream_consumer, args=(rpc_client,))
    consumer_thread.start()
    consumer_thread.join()
```
Would it be effective to merge these requests into one, and to change the ordered map used for names to an `unordered_map`?
Our scenario may be closer to the third case, 'the client and server are distributed'. Does 'putting multiple get operations into a single batch' mean that the metadata does not include the data object? How can this be done?
It's hard to say whether that would reduce latency much.
Yes, you could try it with multiple threads.
You can replace https://github.com/v6d-io/v6d/blob/main/python/vineyard/core/client.py#L600-L606
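The 'multiple threads' suggestion can be sketched as follows. Since this snippet is meant to run without a live vineyardd, `fake_get` (a sleep) stands in for the real `client.get` round trip; the 5 ms delay and the worker count are illustrative assumptions only:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Simulated round trip: stands in for vineyard_client.get(object_id).
# The 5 ms delay is an assumption for illustration only.
def fake_get(object_id):
    time.sleep(0.005)
    return object_id

object_ids = list(range(64))

# Sequential gets: one round trip after another.
start = time.time()
sequential = [fake_get(oid) for oid in object_ids]
sequential_time = time.time() - start

# Batched gets: overlap the round trips across a thread pool.
start = time.time()
with ThreadPoolExecutor(max_workers=16) as pool:
    batched = list(pool.map(fake_get, object_ids))
batched_time = time.time() - start

assert sequential == batched  # same results, lower wall-clock latency
print(f"sequential: {sequential_time:.3f}s, batched: {batched_time:.3f}s")
```

The speedup comes purely from overlapping network waits, so it helps most when each get is dominated by round-trip time rather than server-side work.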
Describe your problem
During testing, we found that the interaction between the client and the server (the vineyard instance) has a significant impact on latency, especially in scenarios involving multiple consecutive interactions. What is the purpose of breaking down the process between the client and the server into multiple interactions? Is there potential for improvement?