Reload buffer configuration #18933

Open
jszwedko opened this issue Oct 25, 2023 · 1 comment
Labels
domain: buffers Anything related to Vector's memory/disk buffers domain: reload Anything related to reloading Vector (updating configuration) meta: confirmed A bug that has been reproduced or confirmed. type: bug A code related bug.

Comments

@jszwedko
Member

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

It looks like Vector doesn't apply changes to sink buffer configuration on reload, for either memory or disk buffers. With the configuration below, I started Vector, scraped the internal metrics, updated max_events, triggered a reload (SIGHUP), and then scraped the internal metrics again.

Before update:

❯ curl localhost:8657/metrics | grep buffer | grep sink1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 64881  100 64881    0     0  11.5M      0 --:--:-- --:--:-- --:--:-- 61.8M
vector_buffer_byte_size{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 4811492 1698250398509
vector_buffer_events{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 3713 1698250398509
vector_buffer_max_event_size{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 10000 1698250398509
vector_buffer_received_bytes_total{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 5337749 1698250398509
vector_buffer_received_events_total{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 4119 1698250398509
vector_buffer_sent_bytes_total{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 526257 1698250398509
vector_buffer_sent_events_total{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 406 1698250398509

After update:

vector on  master [$] is 📦 v0.34.0 via  v21.1.0 via 💎 v3.1.4 via 🦀 v1.72.1 with 🦬 on ☁️
❯ curl localhost:8657/metrics | grep buffer | grep sink1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 64880  100 64880    0     0  11.1M      0 --:--:-- --:--:-- --:--:-- 61.8M
vector_buffer_byte_size{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 5329566 1698250402508
vector_buffer_events{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 4113 1698250402508
vector_buffer_max_event_size{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 10000 1698250402508
vector_buffer_received_bytes_total{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 5855823 1698250402508
vector_buffer_received_events_total{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 4519 1698250402508
vector_buffer_sent_bytes_total{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 526257 1698250402508
vector_buffer_sent_events_total{buffer_type="memory",component_id="sink1",component_kind="sink",component_name="sink1",component_type="vector",host="COMP-J4C4P27K9Q",stage="0"} 406 1698250402508

Here we can see that vector_buffer_max_event_size stays at 10000 after the reload, even though max_events was changed to 20000.

I saw the same behavior with disk buffers when updating max_size.
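
For anyone repeating the before/after comparison above without reading the scrape output by eye, here is a minimal sketch that pulls one gauge out of saved Prometheus text output. The function names (gauge_value, limit_changed) are hypothetical helpers, not part of Vector, and the parsing only handles the simple `name{labels} value timestamp` lines shown in this issue.

```python
import re
from typing import Optional

# One sample line in Prometheus text format: name{labels} value [timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)(?:\s+\d+)?$'
)
# One label pair inside the braces: key="value"
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def gauge_value(scrape: str, metric: str, component_id: str) -> Optional[float]:
    """Return the value of `metric` for `component_id`, or None if absent.

    Lines that are not labelled samples (for example the curl progress
    meter) simply fail to match and are skipped.
    """
    for line in scrape.splitlines():
        m = SAMPLE_RE.match(line.strip())
        if m is None or m.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(m.group("labels")))
        if labels.get("component_id") == component_id:
            return float(m.group("value"))
    return None


def limit_changed(before: str, after: str, metric: str, component_id: str) -> bool:
    """True if the gauge differs between the two scrapes."""
    return gauge_value(before, metric, component_id) != gauge_value(
        after, metric, component_id
    )
```

Fed the two scrapes from this issue with metric "vector_buffer_max_event_size" and component "sink1", limit_changed should report no change, which matches the behavior described here.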

Configuration

data_dir: /tmp/vector/
sources:
  source0:
    format: json
    interval: 0.01
    type: demo_logs
    decoding:
      codec: json
  sources1:
    type: internal_metrics
sinks:
  sink0:
    buffer:
      type: memory
      max_events: 10000 # I updated this to 20000 and triggered a reload
    type: vector
    inputs:
      - source0
    address: http://localhost:8081
  sink1:
    type: prometheus_exporter
    address: 0.0.0.0:8657
    inputs:
      - sources1
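
The "update max_events and trigger a reload" step can be scripted roughly as follows. This is a sketch under assumptions: set_max_events and reload_vector are hypothetical helper names, the config is assumed to contain `max_events: <integer>` on its own line as in the file above, and the caller has to supply the PID of the running Vector process. SIGHUP is the reload signal seen in the debug output in this issue; signal.SIGHUP is only available on Unix-like systems.

```python
import os
import re
import signal


def set_max_events(config_text: str, new_value: int) -> str:
    """Return config_text with each `max_events: <n>` value replaced.

    Only the digits are rewritten, so indentation and any trailing
    comment on the line are preserved.
    """
    return re.sub(
        r"^([ \t]*max_events:[ \t]*)\d+",
        lambda m: f"{m.group(1)}{new_value}",
        config_text,
        flags=re.MULTILINE,
    )


def reload_vector(pid: int) -> None:
    """Ask the Vector process with this PID to reload its configuration."""
    os.kill(pid, signal.SIGHUP)
```

A reproduction would read the config file, write back the result of set_max_events(text, 20000), call reload_vector with the process ID, and then scrape the metrics endpoint again.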

Version

vector 0.33.0

Debug Output

2023-10-25T16:10:08.859523Z  INFO vector::app: Log level is enabled. level="vector=info,codec=info,vrl=info,file_source=info,tower_limit=info,rdkafka=info,buffers=info,lapin=info,kube=info"
2023-10-25T16:10:08.860269Z  INFO vector::app: Loading configs. paths=["/tmp/tmp.yaml"]
2023-10-25T16:10:08.887439Z  INFO vector::topology::running: Running healthchecks.
2023-10-25T16:10:08.887590Z  INFO vector: Vector has started. debug="false" version="0.33.0" arch="aarch64" revision=""
2023-10-25T16:10:08.887608Z  INFO vector::app: API is disabled, enable by setting `api.enabled` to `true` and use commands like `vector top`.
2023-10-25T16:10:08.887985Z  INFO vector::sinks::prometheus::exporter: Building HTTP server. address=0.0.0.0:8657
2023-10-25T16:10:08.890246Z  INFO vector::topology::builder: Healthcheck passed.
2023-10-25T16:10:08.890913Z ERROR vector::topology::builder: msg="Healthcheck failed." error=Request failed: status: Unavailable, message: "error trying to connect: tcp connect error: Connection refused (os error 61)", details: [], metadata: MetadataMap { headers: {} } component_kind="sink" component_type="vector" component_id=sink1 component_name=sink1
2023-10-25T16:10:09.897414Z  WARN sink{component_kind="sink" component_id=sink1 component_type=vector component_name=sink1}:request{request_id=1}: vector::sinks::util::retries: Retrying after error. error=Request failed: status: Unavailable, message: "error trying to connect: tcp connect error: Connection refused (os error 61)", details: [], metadata: MetadataMap { headers: {} } internal_log_rate_limit=true
2023-10-25T16:10:10.900333Z  WARN sink{component_kind="sink" component_id=sink1 component_type=vector component_name=sink1}:request{request_id=1}: vector::sinks::util::retries: Internal log [Retrying after error.] is being suppressed to avoid flooding.
2023-10-25T16:10:21.918383Z  WARN sink{component_kind="sink" component_id=sink1 component_type=vector component_name=sink1}:request{request_id=1}: vector::sinks::util::retries: Internal log [Retrying after error.] has been suppressed 4 times.
2023-10-25T16:10:21.918424Z  WARN sink{component_kind="sink" component_id=sink1 component_type=vector component_name=sink1}:request{request_id=1}: vector::sinks::util::retries: Retrying after error. error=Request failed: status: Unavailable, message: "error trying to connect: tcp connect error: Connection refused (os error 61)", details: [], metadata: MetadataMap { headers: {} } internal_log_rate_limit=true
2023-10-25T16:10:26.936517Z  INFO vector::signal: Signal received. signal="SIGHUP"
2023-10-25T16:10:26.937859Z  INFO vector::topology::running: Reloading running topology with new configuration.
2023-10-25T16:10:26.971718Z  INFO vector::topology::running: Running healthchecks.
2023-10-25T16:10:26.971789Z  INFO vector::topology::running: New configuration loaded successfully.
2023-10-25T16:10:26.971797Z  INFO vector: Vector has reloaded. path=[File("/tmp/tmp.yaml", None)]
2023-10-25T16:10:26.972798Z ERROR vector::topology::builder: msg="Healthcheck failed." error=Request failed: status: Unavailable, message: "error trying to connect: tcp connect error: Connection refused (os error 61)", details: [], metadata: MetadataMap { headers: {} } component_kind="sink" component_type="vector" component_id=sink1 component_name=sink1
2023-10-25T16:10:27.979495Z  WARN sink{component_kind="sink" component_id=sink1 component_type=vector component_name=sink1}:request{request_id=1}: vector::sinks::util::retries: Internal log [Retrying after error.] is being suppressed to avoid flooding.
2023-10-25T16:10:31.992526Z  WARN sink{component_kind="sink" component_id=sink1 component_type=vector component_name=sink1}:request{request_id=1}: vector::sinks::util::retries: Internal log [Retrying after error.] has been suppressed 4 times.
2023-10-25T16:10:31.992580Z  WARN sink{component_kind="sink" component_id=sink1 component_type=vector component_name=sink1}:request{request_id=1}: vector::sinks::util::retries: Retrying after error. error=Request failed: status: Unavailable, message: "error trying to connect: tcp connect error: Connection refused (os error 61)", details: [], metadata: MetadataMap { headers: {} } internal_log_rate_limit=true
^C2023-10-25T16:10:32.554209Z  INFO vector::signal: Signal received. signal="SIGINT"
2023-10-25T16:10:32.554545Z  INFO vector: Vector has stopped.
2023-10-25T16:10:32.557065Z  INFO vector::topology::running: Shutting down... Waiting on running components. remaining_components="sink1, source0" time_remaining="59 seconds left"
2023-10-25T16:10:34.998659Z  WARN sink{component_kind="sink" component_id=sink1 component_type=vector component_name=sink1}:request{request_id=1}: vector::sinks::util::retries: Internal log [Retrying after error.] is being suppressed to avoid flooding.
2023-10-25T16:10:37.557019Z  INFO vector::topology::running: Shutting down... Waiting on running components. remaining_components="sink1" time_remaining="54 seconds left"

Example Data

No response

Additional Context

No response

References

No response

@jszwedko jszwedko added type: bug A code related bug. domain: buffers Anything related to Vector's memory/disk buffers domain: reload Anything related to reloading Vector (updating configuration) labels Oct 25, 2023
@neuronull neuronull added the meta: confirmed A bug that has been reproduced or confirmed. label Oct 25, 2023
@StephenWakely
Contributor

This is really hard. There are many edge cases and areas where problems could occur in enabling this.


3 participants