-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IPv6 only hosts support #306
Comments
It's a pretty big change to work around it, not that I wouldn't welcome a On Tue, Oct 8, 2013 at 3:07 PM, Nikita Vetoshkin
|
@bmdhacks yes, I understand that :) |
Hi, I have some idea, can you judge it?
Another useful step:
What do you think? Will it significantly increase the amount of data? Do you have any proposal of how to do it more naturally? |
Is anyone working on this? My preference would be either: Matt |
This, issue came up again in armeria from @trustin. I wonder if we want to consider a tactical workaround or not? Right now, there's no special indexing in zipkin for the ip address anyway, so it is more or less how we handle the union concern of ipv4, ipv6 or I don't know. This would permeate all sorts of tracers, who probably already came onto this problem. For example, I know opentracing expresses the ipv6 of a peer as a "tag". However, whatever we come up with needs to be able to work for both the local endpoint (the one making annotations), and also the address binary annotations (which designate client vs server) cc @kristofa @eirslett @abesto @yurishkuro @jcarres-mdsol @basvanbeek for ideas. |
Agree we need to support ipv6 somehow. How about the naïve approach, just adding a new field "ipv6" to the annotations? The only problem I see is enforcing that at least one of ipv[4,6] is set. Or we could just say that we don't really care whether it's an ipv4 or an ipv6 address, anyone who cares can decide about an address trivially which one it is. So, idea: let's introduce a new field "ip" or "ipAddress" in the relevant annotations, and deprecate "ipv4". The new field can be either an ipv4 or an ipv6 address. |
Thx for the feedback!
at least in thrift I don't think we'll be able to effectively deprecate
ipv4. This is used a lot, and will be hard to ever remove (in this model)
We could probably support ipv6 as a peer field with not much impact. For
example, someone may have both types of addresses to log, and if they don't
have ipv4, they could log 0 (which I've noticed some do anyway). 128 bits
are annoying as you can't stuff them into a number, we'd have to use bytes
in the thrift field (even if we do fully spelled out in json).
we could do the either/or (union type), just it is a little trickier for
implementors than a separate ipv6 field, and requires us to do some
prioritization if someone set both ipv4 and ip of type ipv4, if that makes
sense.
|
ps by "ipv4 .. This is used a lot" I don't mean zipkin uses it, rather
there's a lot of instrumentation that use it. Easier to add than
remove.
|
the zipkin-go-opentracing implementation can easily be adjusted without consumer API breakage. So what ever we come up with will be fine |
How about this: since Zipkin itself doesn't semantically use the IP, just displays it (is this true?), we can create a permissive data structure - separate ipv[4,6] fields, and let users decide what they want to put there. Let them come up with their own convention, and make this explicit, maybe providing best practices. |
We can't change the type of ipv4, as that would be incompatible with thrift
def. Adding a new string field to hold the ipv4 value would add confusion
and also bloat the field.
https://github.com/openzipkin/zipkin-api/blob/master/thrift/zipkinCore.thrift#L249
We can have flexibility of adding ipv6, to the thrift, but it seems
confusing to have a different encoding process than what's used for ipv4.
(remember thrifts are binary anyway. people don't look at them)
Keep in mind that choice affects the highest count of things in zipkin:
annotation and binary annotations. We have to be conscious about
efficiency, knowing that using string representation of ipv6 is up to 3x
the size as binary.
That said, remember that in json we use stringified fields anyway. Most of
this talk is about the thrift encoding, which is the most sensitive one.
TL;DR; is since there are only two types of ip addresses, and we already
support the former in native form, and can't change it without adding
confusion, the simplest and most consistent way to implement ipv6 would be
to add a field in its binary form.
This keeps this sort of change low-hanging, least confusion and bears in
mind that our next model may not even have an Endpoint type! (ex if it is
single-host spans, we can simply make things like this tags)
|
ps here's a link about model problems, and behind my desire for this issue to result in a tactical change vs an incompatible one. It is choosing battles.. Ex I have time to help with a tactical change like this, and prefer to punt incompatible changes to something that packs more punch #939 I'm thinking about thrift implications (cassandra), json implications (elasticsearch), and schema implications (another column in mysql), as well how to make this a "add this here" step for instrumentors. It is a lot of steps already, and adding something incompatible or inconsistent makes it less inviting for someone to jump in and do. |
so here's a summary of compatible options that I think are possible and relatively equivalent work unless called out otherwise. I'm ok with either, actually, and happy to have folks vote for whatever they prefer and implement accordingly. Add Endpoint.address[String]This would hold an opaque value and deprecates ipv4. We could say that this overrides any value held in ipv4, and perhaps backfill it at query time (as we do with Span.timestamp). ThriftWe'd mark ipv4 deprecated and add the following: struct Endpoint {
--snip--
/**
* Holds the text representation of a IPv4 or IPv6 address associated with this endpoint
*
* IPv4 address: Ex. "1.2.3.4" per RFC 2673, section 3.2.
* IPv6 address: Ex. "2001:0db8:85a3:0000:0000:8a2e:0370:7334" per RFC 2373, section 2.2.
*/
4: optional string address JsonI'm not aware of a json-schema type that describes both ipv4 and ipv6. Add Endpoint.ipv6[bytes]This would be optional and hold the 128 bits for the ipv6 address and doesn't affect ipv4. ThriftWe'd the following: struct Endpoint {
--snip--
/**
* IPv6 host address packed into 16 bytes. Ex Inet6Address.getBytes()
*/
4: optional binary ipv6 JsonUses text format, just like the "ipv4" currently does. Document "ipv6" as the the json schema type of the same name. Pros and Consaddress[String] will require a larger thrift field and row, although in both cases it is bounded by largest text form of an IP (50bytes). Some may object to the rows * 50 bytes enlarging of annotation/binary annotation namespace, and can choose not to store this. ipv6[bytes] is separate from ipv4, which means an endpoint could have both an ipv6 and ipv4 association. It is more consistent in approach, but the current approach of ipv4 in thrift has proven annoying to some. This choice still needs a column even if it only needs to be 16bytes long. |
@mosesn if you don't mind.. could you give a "twitter answer" on this topic? |
I'm in favor of Endpoint.ipv6 |
We could base64-encode the ipv6 address, in JSON it's a pain to work with binary data... |
in json, I'm not suggesting binary, rather the normal string encoding (like
we do for ipv4)
|
@adriancole let me try to find someone from Twitter who has more background on this. |
You convinced me Endpoint.ipv6 is the way to go (with binary in thrift, string in json). |
Thinking out loud here... a related change we could do is leniently parse ipv4 in thrift, defaulting to 0.0.0.0. For example, I've noticed people stuff '0' when they don't have an ipv4. We couldn't return null from Endpoint.ipv4, as that would break people, but we could default it to zero and document the practice. This would keep those with ipv6 endpoints a little smaller as they don't have to encode a fake ipv4 address. |
Endpoint.ipv6 as a binary seems like the best approach to us (chiming in for Twitter's zipkin team). Our rational being: the utility of Endpoint.address' is uncertain but comes at a guaranteed cost for ipv6. The world is definitely moving to ipv6, it's practically inevitable, and we can expect ipv6 to be the dominant case in the near future, so let's optimize for it. FWIW, a default of 0 for ipv4, that seems sane, but I don't have any useful thoughts one way or the other about it. |
Thanks for the feedback!
|
This adds `Endpoint.ipv6` as a fixed-width byte array (16 bytes). This formalizes `Endpoint.ipv4 == 0` implying there's no ipv4 address. In thrift, this remains a byte array (String) at field 4. In json, this is normal string formatting. * normal java utilities are used for codec (Inet6Address) In the UI, the ipv6 address is preferred and bracketed when present. Ex. [2001:db8:0:0:0:0:0:c002]:8080 In MySQL, this is mapped to a `BINARY(16)` field named `zipkin_annotations.endpoint_ipv6`. When this column is missing a warning like below is printed: ``` Jul 11, 2016 4:08:53 PM zipkin.storage.mysql.HasIpv6 compute WARNING: zipkin_annotations.ipv6 doesn't exist, so Endpoint.ipv6 is not supported. Execute: alter table zipkin_annotations add `endpoint_ipv6` BINARY(16) ``` Fixes #306
here's the impl. when merged I can update zipkin-api's repo (to update the thrift and json model definition) #1178 |
here's the api definition change openzipkin/zipkin-api#20 |
This adds Endpoint.ipv6 as a fixed-width byte array (16 bytes). This formalizes Endpoint.ipv4 == 0 implying there's no ipv4 address. These are supported in recent versions of Zipkin with the same semantics. You can use this via the Endpoint constructor of Brave, or via tracer hooks like ClientTracer.setClientSent or ServerTracer.setServerReceived This also deprecates Endpoint.create(serviceName,ipv4,port) as it leads to NullPointerExceptions See openzipkin/zipkin#306 See openzipkin/zipkin#1309
If I'm right host identifier in trace is currently named
ipv4
and encoded asint32
and used as integer throughout the system (column types, etc).This automaticaly disables
ipv6
only hosts from being compatible with zipkin. Are there any plans or thoughts how one can add such support or maybe workaround can be found?The text was updated successfully, but these errors were encountered: