Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unifying Jaeger clients design #1269

Closed
yurishkuro opened this issue Dec 28, 2018 · 6 comments
Closed

Unifying Jaeger clients design #1269

yurishkuro opened this issue Dec 28, 2018 · 6 comments
Labels
meta-issue An tracking issue that requires work in other repos

Comments

@yurishkuro
Copy link
Member

There are many discrepancies in the Jaeger clients today about how their internal components are implemented. It would be good to come up with an abstract reference design and gradually ensure all clients follow the pattern. Specifically, the most divergence is in the implementation of span reporting. I think the following is the type system that we want to use going forward:

  • Reporter: an abstract interface for doing something with finished spans. Examples may include:
    • logging span ID,
    • no-op
    • keeping spans in memory in a ring buffer (for inspection in z-pages equivalent)
    • sending spans out of process (to agent or collector)
  • RemoteReporter implements Reporter: responsible for taking the finished spans off the hot path of the request, e.g. by putting them in a queue consumed by a background thread, or some other form of async processing.
    • Requires a Sender to actually send the spans out of process.
    • Guarantees that the Sender is invoked on a single thread
    • Calls sender.append() on new span in the queue
    • Calls sender.flush() when reporter.close() is called in order to fully drain the queue
    • Emits metrics on the number of spans sent out (or failed)
  • Sender: an abstract interface with two methods, append and flush. Implementations may accept Encoder and Transport as dependencies.
    • append() may buffer new spans for sending out later in a batch, or may flush them immediately (e.g. if its internal buffer is full or reached the max packet size). Returns the number of spans flushed, if any.
    • flush() immediately sends all spans from the internal buffer, and returns the number of spans successfully send.
  • Transport is used by the Sender to send a batch of spans to the target destination, e.g. the agent or the collector. API of the Transport may be language-specific. It accepts the batch already encoded to a series of bytes, to allow for decoupling of the encoding format (Thrift, JSON, proto) from the communication channel (HTTP, UDP, Kafka).
  • Encoder converts a batch of spans into a sequence of bytes.
    • TBD: UDP transport requires that the final size of the byte array representing the batch of spans does not exceed max packet size supported by the agent (default: 65000 bytes). If Sender is the one responsible for accumulating spans in a batch, then there needs to be an interface between Sender and Encoder to incrementally calculate the size of each encoded span or the total size of the batch. More fine-grained access to the encoder allows for more efficient implementation.

Sample Sender interface:

// Sender abstracts the method of sending spans out of process.
// Implementations are NOT required to be thread-safe; the RemoteReporter
// is expected to only call methods on the Sender from the same go-routine.
type Sender interface {
	// Append converts the span to the wire representation and adds it
	// to sender's internal buffer.  If the buffer exceeds its designated
	// size, the transport should call Flush() and return the number of spans
	// flushed, otherwise return 0. If error is returned, the returned number
	// of spans is treated as failed span, and reported to metrics accordingly.
	Append(span *Span) (int, error)

	// Flush submits the internal buffer to the remote server. It returns the
	// number of spans flushed. If error is returned, the returned number of
	// spans is treated as failed spans, and reported to metrics accordingly.
	Flush() (int, error)
}
@rmfitzpatrick
Copy link

rmfitzpatrick commented Jan 22, 2019

What about error conditions in Transport and Sender? Should a multiple return value be standardized for marshaling up errors like in go's Flush() and Append()? This wouldn't be idiomatic in python or java, where exceptions should be clearly thrown, but the metrics factories require a fair amount of awareness in the RemoteReporter that can't be easily obtained otherwise (e.g. keeping a counter of appended spans).

@yurishkuro
Copy link
Member Author

I think returning multiple values is fine in Python, I've seen it done. In Java, we could throw a custom exception, or use an observer, or ... open to suggestions

@Falco20019
Copy link

Falco20019 commented Apr 16, 2020

What's the status here? I stumbled over it again the last couple days when refactoring some stuff. In C#, we already have senders, but the Transport and Encoder levels are included in it. That lead to some duplication between gRPC's GrpcSender and Thrift's UdpSender (especially the max size and append logic). So I would be looking into it soon.

Encoder could be a bit problematic, as Thrift and gRPC both use specific implementations that include the encoding. So maybe it's easier to do this by another approach, similar to what Thrift does (where Transport is used by Protocol).

I would like an approach, where the data would be encapsulated in a way that allows us to write a more generic Sender.

The approach that I will look into next is having an Encoder that can generate Process, Batch and Span defined and using a common interface with getSize(). This way, I'm not as generic as having an Encoder that only knows about byte[], but am still able to create Transport instances for Thrift and gRPC that can be used by the same Sender.

@Falco20019
Copy link

@yurishkuro I played around with this idea and got something working over at https://github.com/jaegertracing/jaeger-client-csharp/tree/Falco20019/encoder-transport

I ended up with Senders, Encoders and Transports. Senders are encapsulating the logic what happens at append and flush. The Encoder knows the Transport and only has a method to get an encode span. The Transport has a method to get the size of an object. The GRPC and Thrift Encoders are based on the SizedBatch encoder and implement an encoder that knows GrpcTransport or ThriftTransport (base class / interface for all transports working with this encoder). The encoder also is responsible for encoding the data from Span to IEncodedData which contains the size.

The Transport itself is responsible for the connection, for finding out the real data size (delegated from the encoder and enhanced if needed) and the real writing to the connection. I sadly found no more generic way that still allows us to use Thrift and GRPC with generated code.

What do you think about this approach? I like it to the extend that it abstracts the logic for the size stuff and append-flush-cycles away. But I don't like that I was not able to find a more generic and fitting way.

@yurishkuro
Copy link
Member Author

I think the concept of Encoder works ok when dealing with encoding-agnostic transports like UDP or HTTP, but I am not sure how easy it is to use byte[] payloads (or any other fixed type) with gRPC, which usually uses strongly-typed auto-generated clients.

@yurishkuro
Copy link
Member Author

Per #3362, we're sunsetting Jaeger clients.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
meta-issue An tracking issue that requires work in other repos
Projects
None yet
Development

No branches or pull requests

3 participants