Proposal for an experiment to include native histograms in OpenMetrics #247

beorn7 · 2022-06-29T17:43:53Z

Prometheus's new Native Histograms AKA Sparse Histograms somehow need to be represented in OM 2.x.

I have described the various trade-offs here. As an outsider, it's hard for me to sketch out a concrete design for dealing with all of them, but I would like to propose an experiment: Let's create a makeshift way of including a Native Histogram representation in OpenMetrics that is very easy to generate and parse but ignores, for now, efficiency concerns, OM design philosophies, etc. It will, however, allow instrumentation libraries to expose Native Histograms in an experimental way and Prometheus to ingest those. We can then study exposition and ingestion in practice, iterate on it, and get a better idea about the trade-offs for the actual specification of Native Histograms in OM 2.x.

This experiment should be hidden behind a feature flag or even in a separate branch, depending on the release philosophy of the affected repository.

Here's the idea:

Add JSON tags to the histogram.Histogram type in prometheus/prometheus.
Instead of the floating point number for a regular sample, use a one-line JSON snippet for a Native Histogram sample that unmarshals into the histogram.Histogram type as above.

Example for the exposition of a (fairly complex) pure Native Histogram including a timestamp:

# TYPE foo histogram
foo {"schema":0,"zero_threshold":0.001,"zero_count":4,"count":24,"sum":100,"positive_spans":[{"offset":0,"length":2},{"offset":1,"length":2}],"negative_spans":[{"offset":0,"length":2},{"offset":1,"length":2}],"positive_buckets":[2,1,-2,3],"negative_buckets":[2,1,-2,3]} 1520430042.123
foo_created 1520430000.123
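
To give a feel for how little parsing code such a line needs, here is a minimal sketch in Python. The helper name parse_native_sample is made up for illustration, and the sketch assumes an unlabeled sample of the form "name {json} [timestamp]" as in the example above; labels and the # TYPE line are ignored entirely.

```python
import json

def parse_native_sample(line):
    """Split 'name {json} [timestamp]' into its three parts (no label support)."""
    name, rest = line.split(" ", 1)
    # raw_decode parses exactly one JSON value and reports where it ended,
    # so whatever follows can be treated as the optional timestamp.
    hist, end = json.JSONDecoder().raw_decode(rest)
    tail = rest[end:].strip()
    timestamp = float(tail) if tail else None
    return name, hist, timestamp

line = (
    'foo {"schema":0,"zero_threshold":0.001,"zero_count":4,"count":24,"sum":100,'
    '"positive_spans":[{"offset":0,"length":2},{"offset":1,"length":2}],'
    '"negative_spans":[{"offset":0,"length":2},{"offset":1,"length":2}],'
    '"positive_buckets":[2,1,-2,3],"negative_buckets":[2,1,-2,3]} 1520430042.123'
)
name, hist, ts = parse_native_sample(line)
print(name, hist["count"], ts)
```

A real parser would of course also have to deal with labels, exemplars, and malformed input.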

Note that there is no name collision with any of the conventional Histogram fields. Therefore, a conventional and a Native Histogram representation can be exposed side by side:

# TYPE foo histogram
foo {"schema":0,"zero_threshold":0.001,"zero_count":2,"count":12,"sum":123.4,"positive_spans":[{"offset":0,"length":2},{"offset":1,"length":2}],"positive_buckets":[2,1,-2,3]}
foo_bucket{le="0.0001"} 2
foo_bucket{le="1.0"} 2
foo_bucket{le="2.0"} 5
foo_bucket{le="4.0"} 6
foo_bucket{le="8.0"} 10
foo_bucket{le="+Inf"} 12
foo_count 12
foo_sum 123.4
foo_created 1520430000.123
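
As a rough illustration of how the spans and delta-encoded buckets in the snippet fit together, the following Python sketch expands them into absolute counts per bucket index and checks that they add up to the count field (and thus to foo_count). The function name expand_buckets is hypothetical. The sketch assumes that the first span offset is the starting bucket index, that later offsets are gaps to the previous span, and that each bucket value is a delta to the previous bucket.

```python
def expand_buckets(spans, deltas):
    """Turn spans plus delta-encoded counts into (bucket_index, count) pairs."""
    out = []
    index = 0   # current bucket index
    count = 0   # running absolute count, rebuilt from the deltas
    pos = 0     # position in the deltas list
    for span in spans:
        # First offset: starting index. Later offsets: gap to the previous span.
        index += span["offset"]
        for _ in range(span["length"]):
            count += deltas[pos]
            out.append((index, count))
            index += 1
            pos += 1
    return out

hist = {
    "zero_count": 2,
    "count": 12,
    "positive_spans": [{"offset": 0, "length": 2}, {"offset": 1, "length": 2}],
    "positive_buckets": [2, 1, -2, 3],
}
buckets = expand_buckets(hist["positive_spans"], hist["positive_buckets"])
total = hist["zero_count"] + sum(c for _, c in buckets)
print(total == hist["count"])
```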

Following thoughts:

The JSON snippet makes it easy to quickly create generators and parsers.
To test ideal text parsing performance, a hand-coded highly-optimized parser can still be written.
The layout corresponds to what the Prometheus server has to create when ingesting a Native Histogram into TSDB. Therefore, this experiment will illustrate the "best case" (from the server's perspective) of doing minimal work for decoding.
The format is also fairly compact, avoiding spelling out bucket boundaries explicitly (which are, in the general case, very long floating point numbers, e.g. 0.0008955117609420616) and putting all buckets on one row without repeating labels for each bucket.
With a bit of squinting (mostly removing the double quotes), the format is close to what a "machine-friendly" OM format could actually look like. It is, of course, not very "human-friendly", but a human-readable format would be both much more verbose and much more expensive to decode. See the trade-offs mentioned earlier.
While this implicitly also covers the case of a Gauge Histogram, it does not work for a Float Histogram (which looks significantly different Prometheus-internally). I think that's fine for this experiment. Exposing Float Histograms is a very rare use case (currently not even covered by OM for conventional Histograms).
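
To illustrate why the boundaries do not have to be spelled out: with the exponential bucketing of Native Histograms, a boundary follows from the schema and the bucket index alone. The Python sketch below assumes the usual formula, upper bound = 2^(index * 2^-schema). The function name upper_bound and the choice of schema 3 with index -81 are my own picks to reproduce the long number quoted above, and plain floating point exponentiation may differ in the last digit from what Prometheus computes internally.

```python
import math

def upper_bound(schema, index):
    """Upper boundary of the positive bucket with the given index."""
    # Each power of two is split into 2**schema buckets.
    return 2.0 ** (index * 2.0 ** -schema)

b = upper_bound(3, -81)
print(b)
print(math.isclose(b, 0.0008955117609420616, rel_tol=1e-12))
```

So a parser only needs the schema and a small integer per bucket, not a 19-digit float.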

WRT a human-readable representation, you might want to have a look at the String method of the histogram.Histogram type. For the 2nd example above, the string representation is {count:12, sum:123.4, [-0.001,0.001]:2, (0.5,1]:2, (1,2]:3, (2,4]:1, (4,8]:4}. This looks very benign, but it's also a very simplistic example with only a few buckets and very simple bucket boundaries. For reference, I paste a more typical histogram below. In addition to the verbosity, Prometheus has to "guess" a schema from the representation, sort all the buckets into it, generate the span descriptions, and calculate the deltas between buckets. In total, that is quite a decoding effort.
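
To make the last two decoding steps a bit more tangible, here is a simplified Python sketch of generating span descriptions and deltas from absolute counts per bucket index. It assumes the schema has already been guessed and every bucket already mapped to an index, which is the expensive part not shown here. The name encode_buckets is hypothetical, and the sketch naively starts a new span at every gap, which is not necessarily what a real implementation would do.

```python
def encode_buckets(buckets):
    """buckets: dict of bucket index -> absolute count (populated buckets only)."""
    spans, deltas = [], []
    prev_index = None
    prev_count = 0
    for index in sorted(buckets):
        if prev_index is None:
            # First span: the offset is the starting bucket index.
            spans.append({"offset": index, "length": 1})
        elif index == prev_index + 1:
            # Adjacent bucket: extend the current span.
            spans[-1]["length"] += 1
        else:
            # Gap: start a new span, the offset is the number of skipped buckets.
            spans.append({"offset": index - prev_index - 1, "length": 1})
        # Each count is stored as a delta to the previous bucket.
        deltas.append(buckets[index] - prev_count)
        prev_index, prev_count = index, buckets[index]
    return spans, deltas

# Hypothetical input: four populated buckets with a gap at index 2.
spans, deltas = encode_buckets({0: 2, 1: 3, 3: 1, 4: 4})
print(spans, deltas)
```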

Here is the string representation of a "normal" Native Histogram:

{ count:252719 sum:2.588417777086236 [-0.0011613350732448448,-0.0010649489576809157):1 [-0.0008211879055212055,-0.0007530326295937211):9 [-0.0007530326295937211,-0.0006905339660024878):50 [-0.0006905339660024878,-0.0006332224387944383):121 [-0.0006332224387944383,-0.0005806675366224224):213 [-0.0005806675366224224,-0.0005324744788404579):392 [-0.0005324744788404579,-0.00048828125):727 [-0.00048828125,-0.0004477558804710308):1171 [-0.0004477558804710308,-0.00041059395276060273):1672 [-0.00041059395276060273,-0.00037651631479686053):2193 [-0.00037651631479686053,-0.0003452669830012439):2903 [-0.0003452669830012439,-0.00031661121939721915):3469 [-0.00031661121939721915,-0.0002903337683112112):3785 [-0.0002903337683112112,-0.00026623723942022893):4245 [-0.00026623723942022893,-0.000244140625):4615 [-0.000244140625,-0.0002238779402355154):4920 [-0.0002238779402355154,-0.00020529697638030136):4996 [-0.00020529697638030136,-0.00018825815739843027):5091 [-0.00018825815739843027,-0.00017263349150062194):5122 [-0.00017263349150062194,-0.00015830560969860958):4900 [-0.00015830560969860958,-0.0001451668841556056):4876 [-0.0001451668841556056,-0.00013311861971011446):4707 [-0.00013311861971011446,-0.0001220703125):4435 [-0.0001220703125,-0.0001119389701177577):4087 [-0.0001119389701177577,-0.00010264848819015068):3994 [-0.00010264848819015068,-0.00009412907869921513):3730 [-0.00009412907869921513,-0.00008631674575031097):3425 [-0.00008631674575031097,-0.00007915280484930479):3157 [-0.00007915280484930479,-0.0000725834420778028):2944 [-0.0000725834420778028,-0.00006655930985505723):2782 [-0.00006655930985505723,-0.00006103515625):2536 [-0.00006103515625,-0.00005596948505887885):2440 [-0.00005596948505887885,-0.00005132424409507534):2224 [-0.00005132424409507534,-0.00004706453934960757):2042 [-0.00004706453934960757,-0.000043158372875155485):1948 [-0.000043158372875155485,-0.000039576402424652394):1759 [-0.000039576402424652394,-0.0000362917210389014):1628 
[-0.0000362917210389014,-0.000033279654927528616):1439 [-0.000033279654927528616,-0.000030517578125):1282 [-0.000030517578125,-0.000027984742529439426):1184 [-0.000027984742529439426,-0.00002566212204753767):1185 [-0.00002566212204753767,-0.000023532269674803783):1064 [-0.000023532269674803783,-0.000021579186437577742):966 [-0.000021579186437577742,-0.000019788201212326197):889 [-0.000019788201212326197,-0.0000181458605194507):788 [-0.0000181458605194507,-0.000016639827463764308):732 [-0.000016639827463764308,-0.0000152587890625):638 [-0.0000152587890625,-0.000013992371264719713):602 [-0.000013992371264719713,-0.000012831061023768835):587 [-0.000012831061023768835,-0.000011766134837401892):534 [-0.000011766134837401892,-0.000010789593218788871):494 [-0.000010789593218788871,-0.000009894100606163098):420 [-0.000009894100606163098,-0.00000907293025972535):398 [-0.00000907293025972535,-0.000008319913731882154):359 [-0.000008319913731882154,-0.00000762939453125):332 [-0.00000762939453125,-0.0000069961856323598564):338 [-0.0000069961856323598564,-0.000006415530511884418):280 [-0.000006415530511884418,-0.000005883067418700946):255 [-0.000005883067418700946,-0.000005394796609394436):232 [-0.000005394796609394436,-0.000004947050303081549):228 [-0.000004947050303081549,-0.000004536465129862675):208 [-0.000004536465129862675,-0.000004159956865941077):179 [-0.000004159956865941077,-0.000003814697265625):166 [-0.000003814697265625,-0.0000034980928161799282):177 [-0.0000034980928161799282,-0.000003207765255942209):138 [-0.000003207765255942209,-0.000002941533709350473):149 [-0.000002941533709350473,-0.000002697398304697218):99 [-0.000002697398304697218,-0.0000024735251515407746):101 [-0.0000024735251515407746,-0.0000022682325649313374):96 [-0.0000022682325649313374,-0.0000020799784329705385):101 [-0.0000020799784329705385,-0.0000019073486328125):94 [-0.0000019073486328125,-0.0000017490464080899641):74 [-0.0000017490464080899641,-0.0000016038826279711044):79 
[-0.0000016038826279711044,-0.0000014707668546752365):62 [-0.0000014707668546752365,-0.000001348699152348609):66 [-0.000001348699152348609,-0.0000012367625757703873):73 [-0.0000012367625757703873,-0.0000011341162824656687):53 [-0.0000011341162824656687,-0.0000010399892164852693):50 [-0.0000010399892164852693,-9.5367431640625e-07):37 [-9.5367431640625e-07,-8.745232040449821e-07):47 [-8.745232040449821e-07,-8.019413139855522e-07):32 [-8.019413139855522e-07,-7.353834273376182e-07):30 [-7.353834273376182e-07,-6.743495761743044e-07):28 [-6.743495761743044e-07,-6.183812878851937e-07):25 [-6.183812878851937e-07,-5.670581412328344e-07):25 [-5.670581412328344e-07,-5.199946082426346e-07):29 [-5.199946082426346e-07,-4.76837158203125e-07):18 [-4.76837158203125e-07,-4.3726160202249103e-07):15 [-4.3726160202249103e-07,-4.009706569927761e-07):17 [-4.009706569927761e-07,-3.676917136688091e-07):20 [-3.676917136688091e-07,-3.371747880871522e-07):15 [-3.371747880871522e-07,-3.0919064394259683e-07):17 [-3.0919064394259683e-07,-2.835290706164172e-07):15 [-2.835290706164172e-07,-2.599973041213173e-07):9 [-2.599973041213173e-07,-2.384185791015625e-07):12 [-2.384185791015625e-07,-2.1863080101124551e-07):15 [-2.1863080101124551e-07,-2.0048532849638805e-07):9 [-2.0048532849638805e-07,-1.8384585683440456e-07):4 [-1.8384585683440456e-07,-1.685873940435761e-07):8 [-1.685873940435761e-07,-1.5459532197129841e-07):8 [-1.5459532197129841e-07,-1.417645353082086e-07):2 [-1.417645353082086e-07,-1.2999865206065866e-07):4 [-1.1920928955078125e-07,-1.0931540050562276e-07):2 [-1.0024266424819403e-07,-9.192292841720228e-08):3 [-9.192292841720228e-08,-8.429369702178806e-08):3 [-8.429369702178806e-08,-7.729766098564921e-08):2 [-7.08822676541043e-08,-6.499932603032933e-08):4 [-6.499932603032933e-08,-5.960464477539063e-08):1 [-5.960464477539063e-08,-5.465770025281138e-08):3 [-5.465770025281138e-08,-5.012133212409701e-08):4 [-5.012133212409701e-08,-4.596146420860114e-08):1 
[-4.596146420860114e-08,-4.214684851089403e-08):2 [-4.214684851089403e-08,-3.8648830492824603e-08):1 [-3.8648830492824603e-08,-3.544113382705215e-08):1 [-3.544113382705215e-08,-3.2499663015164664e-08):1 [-3.2499663015164664e-08,-2.9802322387695312e-08):2 [-2.732885012640569e-08,-2.5060666062048506e-08):2 [-1.9324415246412302e-08,-1.7720566913526073e-08):2 [-1.6249831507582332e-08,-1.4901161193847656e-08):1 [-1.4901161193847656e-08,-1.3664425063202845e-08):1 [-8.860283456763037e-09,-8.124915753791166e-09):1 [-7.450580596923828e-09,-6.832212531601422e-09):1 [-6.265166515512127e-09,-5.7451830260751424e-09):1 [-5.7451830260751424e-09,-5.2683560638617535e-09):1 [-5.2683560638617535e-09,-4.8311038116030754e-09):2 [-2.215070864190759e-09,-2.0312289384477915e-09):1 [-1.4362957565187856e-09,-1.3170890159654384e-09):1 (1.7080531329003556e-09,1.862645149230957e-09]:1 (2.6341780319308768e-09,2.8725915130375712e-09]:2 (5.7451830260751424e-09,6.265166515512127e-09]:1 (6.265166515512127e-09,6.832212531601422e-09]:1 (1.1490366052150285e-08,1.2530333031024253e-08]:1 (1.3664425063202845e-08,1.4901161193847656e-08]:1 (1.6249831507582332e-08,1.7720566913526073e-08]:1 (1.7720566913526073e-08,1.9324415246412302e-08]:2 (1.9324415246412302e-08,2.1073424255447014e-08]:3 (2.9802322387695312e-08,3.2499663015164664e-08]:1 (3.2499663015164664e-08,3.544113382705215e-08]:3 (3.544113382705215e-08,3.8648830492824603e-08]:4 (3.8648830492824603e-08,4.214684851089403e-08]:2 (4.596146420860114e-08,5.012133212409701e-08]:4 (5.012133212409701e-08,5.465770025281138e-08]:1 (5.465770025281138e-08,5.960464477539063e-08]:8 (5.960464477539063e-08,6.499932603032933e-08]:5 (6.499932603032933e-08,7.08822676541043e-08]:3 (7.08822676541043e-08,7.729766098564921e-08]:2 (7.729766098564921e-08,8.429369702178806e-08]:2 (8.429369702178806e-08,9.192292841720228e-08]:2 (9.192292841720228e-08,1.0024266424819403e-07]:4 (1.0024266424819403e-07,1.0931540050562276e-07]:6 (1.0931540050562276e-07,1.1920928955078125e-07]:8 
(1.1920928955078125e-07,1.2999865206065866e-07]:9 (1.2999865206065866e-07,1.417645353082086e-07]:6 (1.417645353082086e-07,1.5459532197129841e-07]:10 (1.5459532197129841e-07,1.685873940435761e-07]:10 (1.685873940435761e-07,1.8384585683440456e-07]:7 (1.8384585683440456e-07,2.0048532849638805e-07]:8 (2.0048532849638805e-07,2.1863080101124551e-07]:7 (2.1863080101124551e-07,2.384185791015625e-07]:14 (2.384185791015625e-07,2.599973041213173e-07]:9 (2.599973041213173e-07,2.835290706164172e-07]:14 (2.835290706164172e-07,3.0919064394259683e-07]:9 (3.0919064394259683e-07,3.371747880871522e-07]:10 (3.371747880871522e-07,3.676917136688091e-07]:11 (3.676917136688091e-07,4.009706569927761e-07]:21 (4.009706569927761e-07,4.3726160202249103e-07]:18 (4.3726160202249103e-07,4.76837158203125e-07]:22 (4.76837158203125e-07,5.199946082426346e-07]:24 (5.199946082426346e-07,5.670581412328344e-07]:23 (5.670581412328344e-07,6.183812878851937e-07]:20 (6.183812878851937e-07,6.743495761743044e-07]:22 (6.743495761743044e-07,7.353834273376182e-07]:26 (7.353834273376182e-07,8.019413139855522e-07]:30 (8.019413139855522e-07,8.745232040449821e-07]:34 (8.745232040449821e-07,9.5367431640625e-07]:35 (9.5367431640625e-07,0.0000010399892164852693]:35 (0.0000010399892164852693,0.0000011341162824656687]:66 (0.0000011341162824656687,0.0000012367625757703873]:51 (0.0000012367625757703873,0.000001348699152348609]:60 (0.000001348699152348609,0.0000014707668546752365]:54 (0.0000014707668546752365,0.0000016038826279711044]:74 (0.0000016038826279711044,0.0000017490464080899641]:67 (0.0000017490464080899641,0.0000019073486328125]:74 (0.0000019073486328125,0.0000020799784329705385]:74 (0.0000020799784329705385,0.0000022682325649313374]:87 (0.0000022682325649313374,0.0000024735251515407746]:110 (0.0000024735251515407746,0.000002697398304697218]:107 (0.000002697398304697218,0.000002941533709350473]:113 (0.000002941533709350473,0.000003207765255942209]:133 (0.000003207765255942209,0.0000034980928161799282]:144 
(0.0000034980928161799282,0.000003814697265625]:176 (0.000003814697265625,0.000004159956865941077]:195 (0.000004159956865941077,0.000004536465129862675]:189 (0.000004536465129862675,0.000004947050303081549]:206 (0.000004947050303081549,0.000005394796609394436]:221 (0.000005394796609394436,0.000005883067418700946]:234 (0.000005883067418700946,0.000006415530511884418]:307 (0.000006415530511884418,0.0000069961856323598564]:266 (0.0000069961856323598564,0.00000762939453125]:309 (0.00000762939453125,0.000008319913731882154]:341 (0.000008319913731882154,0.00000907293025972535]:373 (0.00000907293025972535,0.000009894100606163098]:408 (0.000009894100606163098,0.000010789593218788871]:450 (0.000010789593218788871,0.000011766134837401892]:512 (0.000011766134837401892,0.000012831061023768835]:569 (0.000012831061023768835,0.000013992371264719713]:564 (0.000013992371264719713,0.0000152587890625]:649 (0.0000152587890625,0.000016639827463764308]:692 (0.000016639827463764308,0.0000181458605194507]:729 (0.0000181458605194507,0.000019788201212326197]:847 (0.000019788201212326197,0.000021579186437577742]:890 (0.000021579186437577742,0.000023532269674803783]:942 (0.000023532269674803783,0.00002566212204753767]:1069 (0.00002566212204753767,0.000027984742529439426]:1193 (0.000027984742529439426,0.000030517578125]:1294 (0.000030517578125,0.000033279654927528616]:1334 (0.000033279654927528616,0.0000362917210389014]:1494 (0.0000362917210389014,0.000039576402424652394]:1571 (0.000039576402424652394,0.000043158372875155485]:1817 (0.000043158372875155485,0.00004706453934960757]:1969 (0.00004706453934960757,0.00005132424409507534]:2043 (0.00005132424409507534,0.00005596948505887885]:2286 (0.00005596948505887885,0.00006103515625]:2459 (0.00006103515625,0.00006655930985505723]:2670 (0.00006655930985505723,0.0000725834420778028]:2950 (0.0000725834420778028,0.00007915280484930479]:3145 (0.00007915280484930479,0.00008631674575031097]:3372 (0.00008631674575031097,0.00009412907869921513]:3538 
(0.00009412907869921513,0.00010264848819015068]:3968 (0.00010264848819015068,0.0001119389701177577]:4280 (0.0001119389701177577,0.0001220703125]:4478 (0.0001220703125,0.00013311861971011446]:4828 (0.00013311861971011446,0.0001451668841556056]:5014 (0.0001451668841556056,0.00015830560969860958]:5149 (0.00015830560969860958,0.00017263349150062194]:5350 (0.00017263349150062194,0.00018825815739843027]:5540 (0.00018825815739843027,0.00020529697638030136]:5457 (0.00020529697638030136,0.0002238779402355154]:5503 (0.0002238779402355154,0.000244140625]:5523 (0.000244140625,0.00026623723942022893]:5350 (0.00026623723942022893,0.0002903337683112112]:5075 (0.0002903337683112112,0.00031661121939721915]:4607 (0.00031661121939721915,0.0003452669830012439]:4015 (0.0003452669830012439,0.00037651631479686053]:3309 (0.00037651631479686053,0.00041059395276060273]:2770 (0.00041059395276060273,0.0004477558804710308]:1996 (0.0004477558804710308,0.00048828125]:1457 (0.00048828125,0.0005324744788404579]:989 (0.0005324744788404579,0.0005806675366224224]:577 (0.0005806675366224224,0.0006332224387944383]:298 (0.0006332224387944383,0.0006905339660024878]:132 (0.0006905339660024878,0.0007530326295937211]:53 (0.0007530326295937211,0.0008211879055212055]:20 (0.0008211879055212055,0.0008955117609420616]:7 }

brian-brazil · 2022-06-29T18:17:58Z

OM 2.x doesn't exist yet in any form, so I think any experiment should be done under a different name and content type for now to avoid any potential future confusion. We already have enough people thinking OM and Prometheus text format are the same thing.

From an OM 1.x standpoint, as long as it can always gracefully negotiate and degrade to OM 1.0, it's still compliant with OM. That is, it must produce at least a +Inf bucket, and any other buckets must be essentially static.

In terms of the minutiae of the format itself I do have some thoughts, though with 2.x we can be less constrained than for 1.x considering that a 2.x implementation would still have to be able to produce a degraded 1.0. So for example that your proposal requires parsers to notice that "foo" is associated with the TYPE just above is reasonable here, whereas it isn't for 1.0.
Without the double quotes would be my main thought, and if you want a JSON parser to be able to handle it ensure that you have a plan for NaN/Inf.

beorn7 · 2022-06-29T18:54:30Z

WRT content type: Yes, sure, there should be a very specific content type just for the experiment.

> Without the double quotes would be my main thought,

To clarify: If we want to use a JSON parser for the experiment, we need the double quotes during the experiment.

> ensure that you have a plan for NaN/Inf.

Ah right. My thought here was that, for the experiment, we require instrumentation libraries to never emit NaN/Inf. That's anyway a weird corner case. We need to handle it for the real thing, of course. But for the real thing, we won't use a JSON snippet in the first place.
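
As a sketch of how a generator could enforce the "never emit NaN/Inf" rule during the experiment: standard JSON has no tokens for NaN or infinity, and Python's json module emits the non-standard NaN/Infinity by default unless told otherwise. The helper name encode_histogram and the reduced set of fields are just for illustration.

```python
import json
import math

def encode_histogram(hist):
    """Serialize to one-line JSON, refusing NaN/Inf values."""
    # allow_nan=False makes json.dumps raise ValueError instead of
    # writing the non-standard tokens NaN, Infinity, or -Infinity.
    return json.dumps(hist, separators=(",", ":"), allow_nan=False)

ok = encode_histogram({"schema": 0, "count": 12, "sum": 123.4})
print(ok)

try:
    encode_histogram({"schema": 0, "count": 1, "sum": math.nan})
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

An instrumentation library could then decide whether to drop such a sample or to fall back to the conventional representation.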

brian-brazil · 2022-06-29T18:58:21Z

> To clarify: If we want to use a JSON parser for the experiment, we need the double quotes during the experiment.

The whole point is to experiment, so that sounds fine to me anyway.

beorn7 · 2022-06-29T19:01:52Z

/cc @fstab Would this match your expectation for an experiment in client_java?

beorn7 · 2022-09-06T12:25:07Z

FYI: @fstab has now added protobuf support to client_java temporarily. One of the reasons for this experiment (to add native histogram support to client_java in a simple way) is therefore not relevant anymore. This might still be useful to play with a text representation of native histograms and how it behaves during generation and parsing etc.

beorn7 · 2022-10-11T11:59:29Z

Also note #256 for a draft of Native Histogram support in the OpenMetrics protobuf format.

beorn7 · 2023-03-14T14:54:00Z

Given the reaction to a brainstorming doc, I think we should not pursue this "embedded JSON" idea any longer (but we can, of course, change our minds again). Of all the ideas discussed, "embedded JSON" (idea 1 in the doc) was the least liked, notably also by @csmarchbanks, who maintains client_python, which will probably be the first instrumentation library to implement a text format for native histograms.

Therefore, I'm retracting this proposal (for now).

beorn7 mentioned this issue Jun 29, 2022

OpenMetrics SparseHistogram/NativeHistograms #237

Open

This was referenced Aug 16, 2022

scrape (histograms): Implement OpenMetrics histogram experiment prometheus/prometheus#11172

Closed

promql (histograms): Extend PromQL testing framework to include native histograms prometheus/prometheus#11170

Closed

beorn7 mentioned this issue Oct 13, 2022

histograms: Implement OpenMetrics protobuf parsing prometheus/prometheus#11264

Open

beorn7 closed this as completed Mar 14, 2023

hdost mentioned this issue Dec 26, 2023

Add support for native histograms prometheus/prom2json#125

Closed
