feat(new source): A generic http_scrape source #13793

Merged
merged 58 commits into from
Aug 26, 2022

neuronull
Contributor

@neuronull neuronull commented Aug 1, 2022

Closes: #3532.
Closes: #3702.

This is an implementation of a generic HTTP scraping source.

Much of the scraping logic was leveraged from the existing prometheus_scrape source.
Or rather, the common/generic logic was extracted and now both prometheus_scrape and http_scrape leverage the common logic.
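To illustrate the idea of sharing scrape logic between two sources, here is a minimal sketch. It is not the code from this PR: the `ScrapeHandler` trait, the `LineHandler` and `MetricsHandler` types, and the `scrape` function are all hypothetical names, events are plain strings, and the HTTP request is replaced by a caller-supplied closure so the example runs with the standard library alone.

```rust
use std::time::Duration;

/// Hypothetical hook each source implements; the shared loop does not need
/// to know how a response body becomes events.
trait ScrapeHandler {
    /// Turn one response body into zero or more events (plain strings here).
    fn on_response(&mut self, endpoint: &str, body: &str) -> Vec<String>;
}

/// Stand-in for a generic source: one event per non-empty line.
struct LineHandler;

impl ScrapeHandler for LineHandler {
    fn on_response(&mut self, _endpoint: &str, body: &str) -> Vec<String> {
        body.lines()
            .filter(|l| !l.trim().is_empty())
            .map(str::to_owned)
            .collect()
    }
}

/// Stand-in for a metrics source: skips `#` comment lines and tags each
/// remaining line with the endpoint it came from.
struct MetricsHandler;

impl ScrapeHandler for MetricsHandler {
    fn on_response(&mut self, endpoint: &str, body: &str) -> Vec<String> {
        body.lines()
            .filter(|l| !l.trim().is_empty() && !l.starts_with('#'))
            .map(|l| format!("{endpoint}: {l}"))
            .collect()
    }
}

/// Shared loop (assumed shape): fetch every endpoint once per round, hand the
/// body to the handler, and sleep for `interval` between rounds.
fn scrape<H, F>(
    handler: &mut H,
    endpoints: &[&str],
    rounds: usize,
    interval: Duration,
    mut fetch: F,
) -> Vec<String>
where
    H: ScrapeHandler,
    F: FnMut(&str) -> Result<String, String>,
{
    let mut events = Vec::new();
    for round in 0..rounds {
        for &endpoint in endpoints {
            match fetch(endpoint) {
                Ok(body) => events.extend(handler.on_response(endpoint, &body)),
                Err(err) => eprintln!("scrape of {endpoint} failed: {err}"),
            }
        }
        if round + 1 < rounds {
            std::thread::sleep(interval);
        }
    }
    events
}

fn main() {
    // Fake fetcher standing in for an HTTP GET.
    let fetch = |_: &str| Ok::<String, String>("# HELP up\nup 1\n".to_string());
    let endpoints = ["http://localhost:9090/metrics"];
    let generic = scrape(&mut LineHandler, &endpoints, 1, Duration::from_millis(0), fetch);
    let metrics = scrape(&mut MetricsHandler, &endpoints, 1, Duration::from_millis(0), fetch);
    println!("{} generic events, {} metric events", generic.len(), metrics.len());
}
```

The point of the split is that the interval loop and error handling live in one place, while each source only supplies the response-to-event step.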

NOTES

  • I suspect I might be asked to move the common logic out of src/sources/http_scrape/mod.rs and into somewhere in the Vector common area. Considering the effort to extract components into crates, the prometheus_scrape crate would otherwise depend on the http_scrape crate. I wasn't sure exactly where to put the common logic.

TODO

  • The placement of sending the HttpScrapeEventsSent seems wrong to me, but I was not sure how to get the data needed to send the event, after the stream was written.
  • (int test) TLS test case using certs for the dufs container
  • (int test) add a test case to validate headers applied
  • (int test) add a shutdown signal test
  • (int test) validate that a failure occurs when invalid auth is passed, or when auth is omitted but required (i.e., I see provisions for happy-path testing but still need to assert the failure in this case).
  • (int test) same as above but for invalid endpoint
  • Multi-value header support
  • Add log_namespace support

Un-implemented features tracked in the below issues

#13889 (OAuth support)
#13888 (Compression support)

@neuronull neuronull added the domain: sources, source: http_server, domain: external docs, and ci-condition: integration tests enable labels Aug 1, 2022
@neuronull neuronull self-assigned this Aug 1, 2022
@netlify

netlify bot commented Aug 1, 2022

Deploy Preview for vector-project ready!

Name Link
🔨 Latest commit 33c2364
🔍 Latest deploy log https://app.netlify.com/sites/vector-project/deploys/6308d8ec5ffdce0008889239
😎 Deploy Preview https://deploy-preview-13793--vector-project.netlify.app

@github-actions github-actions bot added the domain: ci Anything related to Vector's CI environment label Aug 1, 2022
@github-actions

Soak Test Results

Baseline: d4f0afa
Comparison: 29f763a
Total Vector CPUs: 4

Explanation

A soak test is an integrated performance test for Vector in a repeatable rig, with varying Vector configurations. What follows is a statistical summary of a brief Vector run for each configuration across the SHAs given above. The goal of these tests is to determine, quickly, whether Vector's performance is changed by a pull request, and to what degree. Where appropriate, units are scaled per core.

The table below, if present, lists those experiments that have experienced a statistically significant change in throughput between the baseline and comparison SHAs, with 90.0% confidence, OR have been detected as newly erratic. Negative values mean that the baseline is faster; positive values mean that the comparison is faster. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed.

No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:

Fine details of change detection per experiment.
experiment Δ mean Δ mean % confidence baseline mean baseline stdev baseline stderr baseline outlier % baseline CoV comparison mean comparison stdev comparison stderr comparison outlier % comparison CoV erratic declared erratic
http_to_http_acks 186.11KiB 1.06 58.60% 17.21MiB 7.74MiB 161.75KiB 0 0.44957 17.39MiB 7.67MiB 160.42KiB 0 0.441298 True True
datadog_agent_remap_blackhole_acks 668.83KiB 1.04 100.00% 62.72MiB 4.56MiB 94.84KiB 0 0.0726138 63.37MiB 2.99MiB 62.53KiB 0 0.0471495 False False
syslog_splunk_hec_logs 149.08KiB 0.88 100.00% 16.55MiB 637.94KiB 13.0KiB 0 0.0376335 16.7MiB 593.04KiB 12.1KiB 0 0.0346795 False False
splunk_hec_route_s3 150.37KiB 0.82 96.32% 17.82MiB 2.47MiB 51.47KiB 0 0.138559 17.97MiB 2.41MiB 50.33KiB 0 0.133917 False False
socket_to_socket_blackhole 184.94KiB 0.77 100.00% 23.34MiB 186.49KiB 3.81KiB 0 0.0078014 23.52MiB 169.05KiB 3.45KiB 0 0.00701743 False False
syslog_regex_logs2metric_ddmetrics 91.7KiB 0.73 100.00% 12.26MiB 643.32KiB 13.1KiB 0 0.0512347 12.35MiB 634.12KiB 12.92KiB 0 0.0501356 False False
syslog_humio_logs 117.72KiB 0.69 100.00% 16.69MiB 105.02KiB 2.14KiB 0 0.00614526 16.8MiB 107.49KiB 2.2KiB 0 0.00624655 False False
datadog_agent_remap_blackhole 427.16KiB 0.69 99.39% 60.37MiB 5.59MiB 116.67KiB 0 0.0926454 60.79MiB 4.94MiB 103.06KiB 0 0.081224 False False
syslog_log2metric_splunk_hec_metrics 63.54KiB 0.35 99.91% 17.86MiB 562.14KiB 11.46KiB 0 0.0307385 17.92MiB 747.87KiB 15.23KiB 0 0.0407524 False False
http_pipelines_blackhole_acks 3.89KiB 0.31 85.26% 1.23MiB 114.62KiB 2.33KiB 0 0.0911243 1.23MiB 64.99KiB 1.33KiB 0 0.0515067 False False
syslog_loki 37.82KiB 0.25 97.66% 14.81MiB 353.55KiB 7.23KiB 0 0.0233111 14.85MiB 739.23KiB 15.03KiB 0 0.0486193 False False
syslog_log2metric_humio_metrics 24.66KiB 0.19 92.49% 12.77MiB 381.19KiB 7.78KiB 0 0.0291516 12.79MiB 563.13KiB 11.46KiB 0 0.0429846 False False
splunk_hec_to_splunk_hec_logs_noack 10.2KiB 0.04 64.25% 23.83MiB 430.54KiB 8.79KiB 0 0.0176422 23.84MiB 330.7KiB 6.75KiB 0 0.0135454 False False
enterprise_http_to_http -551.2B -0 5.93% 23.85MiB 250.93KiB 5.12KiB 0 0.0102741 23.85MiB 249.77KiB 5.11KiB 0 0.0102268 False False
splunk_hec_indexer_ack_blackhole -924.21B -0 2.60% 23.74MiB 956.29KiB 19.45KiB 0 0.0393299 23.74MiB 966.91KiB 19.66KiB 0 0.0397682 False False
splunk_hec_to_splunk_hec_logs_acks -2.61KiB -0.01 8.18% 23.75MiB 880.75KiB 17.91KiB 0 0.0362148 23.74MiB 888.37KiB 18.07KiB 0 0.0365321 False False
file_to_blackhole -41.38KiB -0.04 31.22% 95.34MiB 3.23MiB 67.0KiB 0 0.0338916 95.3MiB 3.76MiB 78.22KiB 0 0.0394404 False False
http_to_http_json -30.12KiB -0.12 98.65% 23.85MiB 332.87KiB 6.8KiB 0 0.013629 23.82MiB 494.73KiB 10.11KiB 0 0.020281 False False
http_to_http_noack -113.23KiB -0.46 100.00% 23.85MiB 245.86KiB 5.03KiB 0 0.0100669 23.73MiB 1.15MiB 24.0KiB 0 0.0485017 False False
fluent_elasticsearch -409.62KiB -0.5 100.00% 79.47MiB 53.08KiB 1.07KiB 0 0.000652167 79.07MiB 4.3MiB 88.44KiB 0 0.0544174 False False
http_pipelines_blackhole -9.59KiB -0.55 99.99% 1.7MiB 46.83KiB 979.96B 0 0.026947 1.69MiB 109.03KiB 2.22KiB 0 0.063083 False False
datadog_agent_remap_datadog_logs_acks -560.32KiB -0.87 100.00% 62.92MiB 3.11MiB 65.04KiB 0 0.049445 62.37MiB 4.32MiB 89.96KiB 0 0.069261 False False
http_pipelines_no_grok_blackhole -100.99KiB -0.91 100.00% 10.84MiB 198.28KiB 4.05KiB 0 0.0178607 10.74MiB 1.03MiB 21.36KiB 0 0.095465 False False
datadog_agent_remap_datadog_logs -785.68KiB -1.22 100.00% 62.9MiB 415.95KiB 8.52KiB 0 0.0064567 62.13MiB 4.3MiB 89.55KiB 0 0.0692116 False False
http_text_to_http_json -989.91KiB -2.49 100.00% 38.84MiB 944.64KiB 19.28KiB 0 0.0237464 37.87MiB 966.55KiB 19.73KiB 0 0.0249174 False False
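The criteria in the explanation above can be checked by hand against rows of the table. The following is a rough sketch of that arithmetic, not the soak rig's actual implementation: the function and constant names are made up for illustration, and treating the ±8.87% bound as a strict inequality is an assumption based on the wording "more than".

```rust
/// Thresholds quoted in the bot's explanation.
const MIN_CONFIDENCE_PCT: f64 = 90.0;
const MIN_ABS_DELTA_PCT: f64 = 8.87;
const ERRATIC_COV: f64 = 0.3;

/// Coefficient of variation: standard deviation divided by the mean.
fn coefficient_of_variation(mean: f64, stdev: f64) -> f64 {
    stdev / mean
}

/// An experiment is erratic if its CoV is greater than 0.3.
fn is_erratic(cov: f64) -> bool {
    cov > ERRATIC_COV
}

/// A throughput change is reported only if it is both confident enough and
/// large enough (assumed strict on the size bound).
fn is_interesting(delta_mean_pct: f64, confidence_pct: f64) -> bool {
    confidence_pct >= MIN_CONFIDENCE_PCT && delta_mean_pct.abs() > MIN_ABS_DELTA_PCT
}

fn main() {
    // http_to_http_acks baseline: mean 17.21 MiB, stdev 7.74 MiB (rounded
    // table values, so the CoV differs slightly from the reported 0.44957).
    let cov = coefficient_of_variation(17.21, 7.74);
    println!("CoV = {cov:.3}, erratic = {}", is_erratic(cov));

    // http_text_to_http_json: delta mean -2.49 %, confidence 100 %.
    println!("interesting = {}", is_interesting(-2.49, 100.0));
}
```

Under these assumptions, http_to_http_acks comes out as erratic, and the largest regression in the table (http_text_to_http_json at -2.49%) stays below the reporting threshold, which matches the "No interesting changes" summary.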

@neuronull neuronull requested review from bruceg and hhromic August 22, 2022 17:33
@github-actions

Soak Test Results

Baseline: 8e2201b
Comparison: 1193263
Total Vector CPUs: 4

No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:

Fine details of change detection per experiment.
experiment Δ mean Δ mean % confidence baseline mean baseline stdev baseline stderr baseline outlier % baseline CoV comparison mean comparison stdev comparison stderr comparison outlier % comparison CoV erratic declared erratic
syslog_regex_logs2metric_ddmetrics 280.38KiB 2.31 100.00% 11.85MiB 535.18KiB 10.91KiB 0 0.0440826 12.13MiB 527.04KiB 10.75KiB 0 0.0424327 False False
socket_to_socket_blackhole 318.36KiB 1.4 100.00% 22.2MiB 293.73KiB 6.0KiB 0 0.0129196 22.51MiB 239.97KiB 4.9KiB 0 0.0104092 False False
datadog_agent_remap_blackhole 769.57KiB 1.22 100.00% 61.49MiB 4.14MiB 86.38KiB 0 0.0673914 62.24MiB 3.27MiB 68.16KiB 0 0.0524813 False False
syslog_splunk_hec_logs 181.92KiB 1.09 100.00% 16.27MiB 704.8KiB 14.36KiB 0 0.0422974 16.45MiB 450.7KiB 9.2KiB 0 0.0267557 False False
syslog_log2metric_splunk_hec_metrics 176.8KiB 1 100.00% 17.29MiB 1.07MiB 22.27KiB 0 0.0616926 17.46MiB 1019.49KiB 20.78KiB 0 0.0570013 False False
syslog_humio_logs 117.2KiB 0.71 100.00% 16.08MiB 454.4KiB 9.28KiB 0 0.027586 16.2MiB 516.68KiB 10.57KiB 0 0.0311457 False False
http_pipelines_blackhole_acks 7.31KiB 0.59 99.16% 1.21MiB 109.85KiB 2.24KiB 0 0.0885494 1.22MiB 80.33KiB 1.64KiB 0 0.0643701 False False
syslog_log2metric_humio_metrics 33.01KiB 0.26 82.04% 12.36MiB 791.82KiB 16.16KiB 0 0.0625585 12.39MiB 910.08KiB 18.53KiB 0 0.0717144 False False
splunk_hec_to_splunk_hec_logs_noack 25.42KiB 0.1 95.50% 23.81MiB 518.82KiB 10.58KiB 0 0.0212714 23.84MiB 341.93KiB 6.98KiB 0 0.0140044 False False
http_pipelines_no_grok_blackhole 8.23KiB 0.08 23.40% 10.65MiB 586.47KiB 11.97KiB 0 0.0537483 10.66MiB 1.2MiB 24.91KiB 0 0.112237 False False
enterprise_http_to_http -1.09KiB -0 12.08% 23.85MiB 246.33KiB 5.03KiB 0 0.0100856 23.85MiB 251.81KiB 5.15KiB 0 0.0103104 False False
http_to_http_acks -262.47B -0 0.09% 17.17MiB 7.97MiB 166.66KiB 0 0.464213 17.17MiB 7.94MiB 165.84KiB 0 0.462113 True True
splunk_hec_to_splunk_hec_logs_acks -4.87KiB -0.02 15.87% 23.76MiB 839.6KiB 17.08KiB 0 0.0345069 23.75MiB 851.69KiB 17.32KiB 0 0.035011 False False
splunk_hec_indexer_ack_blackhole -6.88KiB -0.03 22.24% 23.76MiB 829.35KiB 16.87KiB 0 0.0340817 23.75MiB 864.97KiB 17.59KiB 0 0.0355555 False False
file_to_blackhole -58.21KiB -0.06 59.77% 95.37MiB 2.29MiB 47.42KiB 0 0.023979 95.31MiB 2.45MiB 50.8KiB 0 0.0256474 False False
http_to_http_json -36.11KiB -0.15 99.48% 23.84MiB 340.67KiB 6.96KiB 0 0.0139503 23.81MiB 532.81KiB 10.87KiB 0 0.0218509 False False
fluent_elasticsearch -140.71KiB -0.17 100.00% 79.47MiB 54.84KiB 1.11KiB 0 0.000673725 79.34MiB 1.32MiB 27.24KiB 0 0.0166737 False False
http_to_http_noack -72.62KiB -0.3 99.61% 23.82MiB 601.71KiB 12.3KiB 0 0.0246637 23.75MiB 1.05MiB 21.96KiB 0 0.0443431 False False
datadog_agent_remap_blackhole_acks -275.77KiB -0.44 97.88% 61.86MiB 4.72MiB 98.37KiB 0 0.0763619 61.59MiB 3.26MiB 68.08KiB 0 0.0528498 False False
splunk_hec_route_s3 -150.9KiB -0.79 97.73% 18.7MiB 2.28MiB 47.51KiB 0 0.12195 18.55MiB 2.21MiB 46.13KiB 0 0.118994 False False
syslog_loki -168.52KiB -1.18 100.00% 13.99MiB 893.8KiB 18.28KiB 0 0.0623786 13.83MiB 1013.83KiB 20.61KiB 0 0.0715982 False False
datadog_agent_remap_datadog_logs_acks -981.48KiB -1.51 100.00% 63.29MiB 3.9MiB 81.31KiB 0 0.0615806 62.33MiB 4.43MiB 92.21KiB 0 0.0710536 False False
datadog_agent_remap_datadog_logs -1.04MiB -1.63 100.00% 63.38MiB 727.17KiB 14.88KiB 0 0.0112023 62.34MiB 4.06MiB 84.59KiB 0 0.0651313 False False
http_pipelines_blackhole -32.59KiB -1.94 100.00% 1.64MiB 66.57KiB 1.36KiB 0 0.0395844 1.61MiB 136.08KiB 2.77KiB 0 0.0825109 False False
http_text_to_http_json -1.33MiB -3.36 100.00% 39.52MiB 1.05MiB 21.88KiB 0 0.0264914 38.19MiB 1.08MiB 22.62KiB 0 0.0283259 False False

Review thread on src/sources/http_scrape/scrape.rs (resolved)
Review thread on src/sources/http_scrape/tests.rs (resolved)
Review thread on src/sources/util/http_scrape.rs (outdated, resolved)
@neuronull neuronull requested review from bruceg and removed request for hhromic August 24, 2022 20:46
@github-actions

Soak Test Results

Baseline: 8426cb2
Comparison: 75bec10
Total Vector CPUs: 4

No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:

Fine details of change detection per experiment.
experiment Δ mean Δ mean % confidence baseline mean baseline stdev baseline stderr baseline outlier % baseline CoV comparison mean comparison stdev comparison stderr comparison outlier % comparison CoV erratic declared erratic
datadog_agent_remap_blackhole_acks 1.01MiB 1.6 100.00% 63.3MiB 4.28MiB 89.04KiB 0 0.0675335 64.32MiB 2.93MiB 61.3KiB 0 0.0455706 False False
syslog_regex_logs2metric_ddmetrics 159.41KiB 1.27 100.00% 12.21MiB 589.02KiB 12.0KiB 0 0.0470862 12.37MiB 589.82KiB 12.02KiB 0 0.0465568 False False
datadog_agent_remap_blackhole 756.71KiB 1.18 100.00% 62.52MiB 4.1MiB 85.5KiB 0 0.0656175 63.26MiB 2.7MiB 56.33KiB 0 0.0426533 False False
syslog_log2metric_splunk_hec_metrics 197.19KiB 1.12 100.00% 17.21MiB 1.12MiB 23.3KiB 0 0.064878 17.4MiB 1.14MiB 23.71KiB 0 0.0653761 False False
http_pipelines_blackhole_acks 13.02KiB 1.08 99.99% 1.18MiB 138.93KiB 2.83KiB 0 0.115315 1.19MiB 85.92KiB 1.75KiB 0 0.0705506 False False
syslog_log2metric_humio_metrics 96.82KiB 0.73 100.00% 12.98MiB 188.88KiB 3.86KiB 0 0.0142055 13.08MiB 473.37KiB 9.64KiB 0 0.0353433 False False
splunk_hec_route_s3 124.11KiB 0.66 93.09% 18.23MiB 2.35MiB 48.86KiB 0 0.128755 18.35MiB 2.28MiB 47.66KiB 0 0.124094 False False
http_to_http_acks 95.99KiB 0.54 31.53% 17.23MiB 8.05MiB 168.27KiB 0 0.467005 17.32MiB 7.97MiB 166.0KiB 0 0.45967 True True
datadog_agent_remap_datadog_logs_acks 159.53KiB 0.25 84.94% 61.97MiB 2.98MiB 62.41KiB 0 0.0481561 62.12MiB 4.41MiB 91.74KiB 0 0.0709284 False False
syslog_splunk_hec_logs 29.85KiB 0.18 87.70% 16.5MiB 740.13KiB 15.06KiB 0 0.0437853 16.53MiB 595.51KiB 12.15KiB 0 0.0351676 False False
datadog_agent_remap_datadog_logs 92.44KiB 0.15 77.23% 61.29MiB 865.3KiB 17.71KiB 0 0.0137834 61.38MiB 3.58MiB 74.53KiB 0 0.0582346 False False
splunk_hec_indexer_ack_blackhole 22.75KiB 0.09 62.82% 23.75MiB 928.36KiB 18.88KiB 0 0.03817 23.77MiB 839.99KiB 17.1KiB 0 0.0345043 False False
splunk_hec_to_splunk_hec_logs_acks 8.57KiB 0.04 29.68% 23.76MiB 794.16KiB 16.16KiB 0 0.0326276 23.77MiB 769.64KiB 15.67KiB 0 0.031609 False False
splunk_hec_to_splunk_hec_logs_noack 965.22B 0 7.71% 23.84MiB 336.37KiB 6.87KiB 0 0.0137782 23.84MiB 337.73KiB 6.9KiB 0 0.0138335 False False
enterprise_http_to_http -712.28B -0 7.78% 23.85MiB 247.91KiB 5.06KiB 0 0.0101503 23.85MiB 244.97KiB 5.01KiB 0 0.01003 False False
file_to_blackhole -38.25KiB -0.04 28.00% 95.34MiB 3.4MiB 70.43KiB 0 0.0356268 95.3MiB 3.85MiB 80.16KiB 0 0.0404058 False False
syslog_humio_logs -12.75KiB -0.07 99.93% 16.64MiB 122.34KiB 2.5KiB 0 0.00717907 16.63MiB 138.32KiB 2.83KiB 0 0.00812265 False False
fluent_elasticsearch -124.75KiB -0.15 99.95% 79.47MiB 52.78KiB 1.07KiB 0 0.000648406 79.35MiB 1.74MiB 35.89KiB 0 0.0219807 False False
http_to_http_json -39.79KiB -0.16 99.76% 23.84MiB 344.47KiB 7.03KiB 0 0.014106 23.8MiB 542.15KiB 11.06KiB 0 0.0222372 False False
http_to_http_noack -95.17KiB -0.39 99.99% 23.84MiB 407.65KiB 8.33KiB 0 0.0166971 23.74MiB 1.1MiB 22.96KiB 0 0.0463705 False False
http_pipelines_blackhole -8.64KiB -0.52 99.96% 1.62MiB 39.72KiB 831.46B 0 0.0239571 1.61MiB 112.53KiB 2.29KiB 0 0.0682273 False False
http_pipelines_no_grok_blackhole -94.61KiB -0.85 100.00% 10.81MiB 202.3KiB 4.13KiB 0 0.0182659 10.72MiB 1016.58KiB 20.68KiB 0 0.0925793 False False
http_text_to_http_json -335.6KiB -0.85 100.00% 38.62MiB 1.35MiB 28.18KiB 0 0.0348975 38.29MiB 1.15MiB 24.03KiB 0 0.0300164 False False
syslog_loki -166.76KiB -1.18 100.00% 13.85MiB 304.47KiB 6.24KiB 0 0.0214582 13.69MiB 675.33KiB 13.73KiB 0 0.0481619 False False
socket_to_socket_blackhole -1.36MiB -5.68 100.00% 23.84MiB 282.99KiB 5.78KiB 0 0.0115902 22.48MiB 761.07KiB 15.54KiB 0 0.0330493 False False

Review thread on src/sources/util/http_scrape.rs (outdated, resolved)
Review thread on src/sources/util/http_scrape.rs (resolved)
Member

@bruceg bruceg left a comment

LGTM once tests are passing.

@github-actions

Soak Test Results

Baseline: 6572842
Comparison: db122fd
Total Vector CPUs: 4

No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:

Fine details of change detection per experiment.
experiment Δ mean Δ mean % confidence baseline mean baseline stdev baseline stderr baseline outlier % baseline CoV comparison mean comparison stdev comparison stderr comparison outlier % comparison CoV erratic declared erratic
http_to_http_acks 290.71KiB 1.64 81.30% 17.28MiB 7.37MiB 154.02KiB 0 0.426022 17.57MiB 7.55MiB 157.51KiB 0 0.429885 True True
syslog_regex_logs2metric_ddmetrics 111.64KiB 0.89 100.00% 12.31MiB 660.09KiB 13.43KiB 0 0.0523628 12.42MiB 498.54KiB 10.17KiB 0 0.0392005 False False
http_pipelines_blackhole_acks 8.14KiB 0.67 99.15% 1.19MiB 123.1KiB 2.5KiB 0 0.101108 1.2MiB 89.32KiB 1.82KiB 0 0.0728758 False False
splunk_hec_route_s3 68.55KiB 0.37 67.39% 18.09MiB 2.41MiB 50.07KiB 0 0.132944 18.16MiB 2.33MiB 48.65KiB 0 0.128063 False False
syslog_log2metric_humio_metrics 37.01KiB 0.28 99.94% 12.88MiB 274.89KiB 5.61KiB 0 0.0208315 12.92MiB 452.26KiB 9.21KiB 0 0.0341775 False False
splunk_hec_to_splunk_hec_logs_noack 18.95KiB 0.08 87.62% 23.82MiB 502.65KiB 10.25KiB 0 0.0206039 23.84MiB 333.45KiB 6.81KiB 0 0.0136574 False False
splunk_hec_to_splunk_hec_logs_acks 11.51KiB 0.05 37.28% 23.75MiB 846.09KiB 17.21KiB 0 0.0347786 23.76MiB 800.08KiB 16.28KiB 0 0.0328721 False False
splunk_hec_indexer_ack_blackhole 12.51KiB 0.05 38.50% 23.76MiB 889.58KiB 18.1KiB 0 0.0365589 23.77MiB 838.14KiB 17.06KiB 0 0.0344273 False False
syslog_log2metric_splunk_hec_metrics 1.41KiB 0.01 6.51% 17.79MiB 455.29KiB 9.28KiB 0 0.0249835 17.79MiB 711.91KiB 14.49KiB 0 0.0390622 False False
enterprise_http_to_http -3.25KiB -0.01 33.66% 23.85MiB 260.41KiB 5.32KiB 0 0.0106615 23.84MiB 256.7KiB 5.25KiB 0 0.010511 False False
http_pipelines_no_grok_blackhole -2.08KiB -0.02 7.03% 10.96MiB 174.16KiB 3.55KiB 0 0.0155179 10.96MiB 1.12MiB 23.24KiB 0 0.101895 False False
file_to_blackhole -59.96KiB -0.06 47.56% 95.36MiB 3.0MiB 62.11KiB 0 0.0314123 95.3MiB 3.4MiB 70.81KiB 0 0.0357085 False False
syslog_humio_logs -21.21KiB -0.13 99.89% 16.39MiB 223.18KiB 4.56KiB 0 0.0132979 16.37MiB 224.57KiB 4.6KiB 0 0.0133976 False False
http_to_http_json -32.31KiB -0.13 98.95% 23.84MiB 346.25KiB 7.07KiB 0 0.014179 23.81MiB 512.03KiB 10.45KiB 0 0.0209955 False False
datadog_agent_remap_blackhole_acks -125.5KiB -0.19 76.76% 63.72MiB 4.23MiB 88.02KiB 0 0.0663088 63.59MiB 2.75MiB 57.4KiB 0 0.043163 False False
syslog_splunk_hec_logs -32.97KiB -0.2 94.13% 16.41MiB 627.62KiB 12.79KiB 0 0.0373524 16.37MiB 580.89KiB 11.85KiB 0 0.0346392 False False
fluent_elasticsearch -201.54KiB -0.25 100.00% 79.47MiB 52.69KiB 1.07KiB 0 0.000647305 79.28MiB 1.68MiB 34.57KiB 0 0.0211981 False False
http_to_http_noack -87.21KiB -0.36 99.98% 23.84MiB 407.45KiB 8.33KiB 0 0.0166891 23.75MiB 1.06MiB 22.15KiB 0 0.0447141 False False
http_pipelines_blackhole -14.87KiB -0.88 100.00% 1.64MiB 29.15KiB 609.99B 0 0.0173203 1.63MiB 112.43KiB 2.29KiB 0 0.0674056 False False
http_text_to_http_json -799.68KiB -2 100.00% 39.0MiB 831.75KiB 16.98KiB 0 0.0208226 38.22MiB 886.11KiB 18.1KiB 0 0.0226366 False False
datadog_agent_remap_blackhole -1.29MiB -2.2 100.00% 58.44MiB 6.83MiB 142.35KiB 0 0.116922 57.15MiB 6.64MiB 138.49KiB 0 0.116122 False False
datadog_agent_remap_datadog_logs_acks -1.73MiB -2.74 100.00% 63.3MiB 2.27MiB 47.45KiB 0 0.0357794 61.57MiB 4.34MiB 90.44KiB 0 0.0705541 False False
datadog_agent_remap_datadog_logs -1.89MiB -3.06 100.00% 61.85MiB 1.66MiB 34.73KiB 0 0.0267914 59.96MiB 4.39MiB 91.4KiB 0 0.0731887 False False
socket_to_socket_blackhole -851.88KiB -3.46 100.00% 24.06MiB 279.2KiB 5.7KiB 0 0.011332 23.22MiB 150.05KiB 3.06KiB 0 0.00630854 False False
syslog_loki -666.48KiB -4.47 100.00% 14.56MiB 419.61KiB 8.59KiB 0 0.0281303 13.91MiB 728.6KiB 14.81KiB 0 0.0511301 False False

@neuronull
Contributor Author

Hey @hhromic, just checking if you have any further thoughts on this before it gets merged in!

@hhromic
Contributor

hhromic commented Aug 26, 2022

> Hey @hhromic , just checking if you have any further thoughts on this before it gets merged in!

Hi @neuronull, appreciate the consideration!
From my side I think there are no more comments at this time! When this feature gets released for sure we will test it more with real-world use cases and will probably be able to give more feedback if necessary!

Thanks for this fantastic addition!

@github-actions

Soak Test Results

Baseline: a00a3f9
Comparison: 33c2364
Total Vector CPUs: 4

No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:

Fine details of change detection per experiment.
experiment Δ mean Δ mean % confidence baseline mean baseline stdev baseline stderr baseline outlier % baseline CoV comparison mean comparison stdev comparison stderr comparison outlier % comparison CoV erratic declared erratic
datadog_agent_remap_blackhole 831.06KiB 1.33 100.00% 61.16MiB 4.45MiB 92.6KiB 0 0.072701 61.97MiB 3.68MiB 76.8KiB 0 0.0594234 False False
syslog_splunk_hec_logs 143.17KiB 0.87 100.00% 15.99MiB 1004.99KiB 20.46KiB 0 0.0613738 16.13MiB 938.84KiB 19.12KiB 0 0.0568374 False False
http_pipelines_blackhole_acks 6.49KiB 0.55 98.91% 1.15MiB 102.53KiB 2.09KiB 0 0.0872622 1.15MiB 71.68KiB 1.46KiB 0 0.06067 False False
datadog_agent_remap_datadog_logs_acks 165.63KiB 0.27 78.41% 60.69MiB 4.19MiB 87.51KiB 0 0.0690046 60.85MiB 4.86MiB 101.26KiB 0 0.0799263 False False
datadog_agent_remap_blackhole_acks 156.63KiB 0.25 74.06% 62.39MiB 4.88MiB 101.58KiB 0 0.078149 62.55MiB 4.53MiB 94.68KiB 0 0.0724523 False False
syslog_log2metric_humio_metrics 28.3KiB 0.22 95.54% 12.84MiB 341.27KiB 6.97KiB 0 0.0259599 12.86MiB 601.4KiB 12.24KiB 0 0.0456498 False False
splunk_hec_to_splunk_hec_logs_noack 24.41KiB 0.1 94.94% 23.82MiB 510.66KiB 10.42KiB 0 0.0209348 23.84MiB 336.66KiB 6.87KiB 0 0.0137879 False False
splunk_hec_indexer_ack_blackhole 18.36KiB 0.08 53.34% 23.75MiB 915.13KiB 18.61KiB 0 0.0376252 23.77MiB 836.27KiB 17.02KiB 0 0.034357 False False
splunk_hec_to_splunk_hec_logs_acks 6.32KiB 0.03 21.27% 23.76MiB 811.85KiB 16.52KiB 0 0.0333628 23.77MiB 814.72KiB 16.58KiB 0 0.0334718 False False
enterprise_http_to_http -3.47KiB -0.01 35.92% 23.85MiB 258.56KiB 5.28KiB 0 0.0105862 23.84MiB 256.15KiB 5.25KiB 0 0.0104889 False False
splunk_hec_route_s3 -6.81KiB -0.03 8.22% 19.02MiB 2.3MiB 47.97KiB 0 0.121093 19.02MiB 2.17MiB 45.3KiB 0 0.113909 False False
file_to_blackhole -48.13KiB -0.05 38.49% 95.34MiB 3.06MiB 63.39KiB 0 0.0320666 95.29MiB 3.45MiB 71.72KiB 0 0.0361808 False False
http_to_http_json -30.55KiB -0.13 98.80% 23.85MiB 332.11KiB 6.78KiB 0 0.0135978 23.82MiB 493.76KiB 10.09KiB 0 0.0202414 False False
http_to_http_noack -67.48KiB -0.28 99.84% 23.84MiB 411.42KiB 8.42KiB 0 0.016852 23.77MiB 966.6KiB 19.69KiB 0 0.0397025 False False
fluent_elasticsearch -450.61KiB -0.55 100.00% 79.47MiB 53.85KiB 1.09KiB 0 0.000661587 79.03MiB 4.45MiB 91.29KiB 0 0.0562315 False False
http_pipelines_blackhole -11.18KiB -0.65 100.00% 1.68MiB 10.95KiB 229.27B 0 0.00636283 1.67MiB 117.17KiB 2.39KiB 0 0.0685443 False False
syslog_regex_logs2metric_ddmetrics -107.62KiB -0.9 100.00% 11.63MiB 739.82KiB 15.06KiB 0 0.0621225 11.52MiB 713.17KiB 14.54KiB 0 0.0604312 False False
http_pipelines_no_grok_blackhole -109.9KiB -0.97 100.00% 11.11MiB 102.69KiB 2.1KiB 0 0.00902142 11.01MiB 1.03MiB 21.54KiB 0 0.0939479 False False
datadog_agent_remap_datadog_logs -934.7KiB -1.44 100.00% 63.53MiB 937.47KiB 19.2KiB 0 0.0144065 62.62MiB 4.42MiB 92.01KiB 0 0.0705596 False False
syslog_log2metric_splunk_hec_metrics -262.51KiB -1.46 100.00% 17.57MiB 787.08KiB 16.04KiB 0 0.0437285 17.32MiB 987.55KiB 20.1KiB 0 0.0556781 False False
syslog_humio_logs -266.33KiB -1.57 100.00% 16.58MiB 680.87KiB 13.9KiB 0 0.0401066 16.32MiB 910.89KiB 18.65KiB 0 0.0545111 False False
http_to_http_acks -337.91KiB -1.89 87.05% 17.42MiB 7.32MiB 153.0KiB 0 0.419937 17.09MiB 7.76MiB 162.0KiB 0 0.453981 True True
syslog_loki -409.11KiB -2.74 100.00% 14.56MiB 513.06KiB 10.51KiB 0 0.0343951 14.16MiB 845.19KiB 17.18KiB 0 0.0582589 False False
http_text_to_http_json -1.43MiB -3.52 100.00% 40.53MiB 675.88KiB 13.8KiB 0 0.0162835 39.1MiB 791.98KiB 16.18KiB 0 0.0197778 False False
socket_to_socket_blackhole -945.64KiB -3.91 100.00% 23.62MiB 254.86KiB 5.2KiB 0 0.0105335 22.7MiB 347.02KiB 7.09KiB 0 0.0149263 False False

@neuronull neuronull merged commit 0e0d746 into master Aug 26, 2022
@neuronull neuronull deleted the neuronull/source_http_scrape branch August 26, 2022 16:33
Labels
ci-condition: integration tests enable - Run integration tests on this PR
domain: ci - Anything related to Vector's CI environment
domain: codecs - Anything related to Vector's codecs (encoding/decoding)
domain: external docs - Anything related to Vector's external, public documentation
domain: sources - Anything related to the Vector's sources
source: http_server - Anything `http_server` source related
Development

Successfully merging this pull request may close these issues.

Refactor sources that scrape an endpoint on an interval
New http_scrape source
5 participants