This is also because Google's Protobuf implementations aren't doing a very good job of avoiding unnecessary allocations. Gogoproto does better, and it is possible to do better still; here is a prototype I have put together for Go (even if you do not use the laziness part, it is still much faster than Google's implementation): https://github.com/splunk/exp-lazyproto
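For illustration only (this is not how exp-lazyproto works, and the message type here is just the well-known Struct type so the snippet compiles): one small mitigation available with Google's own Go module is pooling and reusing decoded message objects. It only trims the top-level per-payload allocation; nested messages are still reallocated, which is where different generated code (gogoproto) or lazy decoding buys much more.

```go
// Minimal sketch: reuse a decoded message across payloads instead of
// allocating a fresh one per Unmarshal call.
package main

import (
	"fmt"
	"sync"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/types/known/structpb"
)

// pool holds previously decoded messages so their top-level object can be reused.
var pool = sync.Pool{
	New: func() any { return &structpb.Struct{} },
}

func decode(payload []byte) (*structpb.Struct, error) {
	msg := pool.Get().(*structpb.Struct)
	proto.Reset(msg) // clear anything left over from the previous use of this pooled object
	if err := proto.Unmarshal(payload, msg); err != nil {
		pool.Put(msg)
		return nil, err
	}
	return msg, nil
}

func release(msg *structpb.Struct) { pool.Put(msg) }

func main() {
	// Build a sample payload to decode repeatedly.
	src, _ := structpb.NewStruct(map[string]any{"service": "checkout", "latency_ms": 12.5})
	payload, _ := proto.Marshal(src)

	for i := 0; i < 3; i++ {
		msg, err := decode(payload)
		if err != nil {
			panic(err)
		}
		fmt.Println(msg.Fields["service"].GetStringValue())
		release(msg)
	}
}
```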
Otel logs aim to record the execution context alongside each log record.
In languages where the context is passed implicitly (e.g. via thread-local storage / MDC in Java), Otel automatically injects the trace id and span id into the logs emitted by your regular logging library (e.g. log4j). Then in your log backend you can make queries like "show me all log records, from all services in my distributed system, that were part of this particular user request" (a Go sketch of the same idea, with the context passed explicitly, is below).
Disclosure: I am an Otel contributor, working on logs (work-in-progress, not for production use yet).
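Just to make the correlation concrete, here is a minimal Go sketch of the same idea, where the context is explicit. `logWithTrace` is a hypothetical helper written for this example; in practice an OTel log bridge or SDK integration attaches these IDs for you.

```go
// Pull the trace/span IDs out of the active span carried in ctx and attach
// them to every log record so the backend can correlate logs with traces.
package main

import (
	"context"
	"log/slog"
	"os"

	"go.opentelemetry.io/otel/trace"
)

// logWithTrace decorates a log call with trace_id and span_id taken from ctx.
func logWithTrace(ctx context.Context, logger *slog.Logger, msg string, args ...any) {
	if sc := trace.SpanContextFromContext(ctx); sc.IsValid() {
		args = append(args,
			slog.String("trace_id", sc.TraceID().String()),
			slog.String("span_id", sc.SpanID().String()),
		)
	}
	logger.InfoContext(ctx, msg, args...)
}

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	// In a real service ctx would carry a span started by the OTel SDK;
	// here it is just the background context, so no IDs are attached.
	logWithTrace(context.Background(), logger, "order processed", slog.Int("items", 3))
}
```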
This. The statelessness of OTLP is by design. I did consider stateful designs, e.g. with shared-state dictionary compression, but eventually chose not to go that way, so that intermediaries can remain stateless.
An extension to OTLP that uses shared state (and a columnar encoding) to achieve a more compact representation, suitable for the last network leg of the data delivery path, has been proposed and may become a reality in the future: https://github.com/open-telemetry/oteps/pull/171
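For illustration, a toy sketch of the general idea (not OTLP and not the encoding in that proposal): attribute strings are replaced by indices into a dictionary that is built up incrementally across batches, which is exactly the per-stream state any receiver or intermediary would have to keep in order to read the data.

```go
// Shared-state dictionary compression, reduced to a toy: the sender ships only
// new dictionary entries plus indices; the receiver can resolve index 3 in
// batch N only if it kept the dictionary accumulated from batches 1..N-1.
package main

import "fmt"

// DictEncoder lives on the sender and assigns a stable index to each new string.
type DictEncoder struct {
	index map[string]uint32
}

// Batch carries only the dictionary entries that are new in this batch,
// plus attribute references expressed as indices into the full dictionary.
type Batch struct {
	NewEntries []string
	AttrRefs   []uint32
}

func (e *DictEncoder) Encode(attrs []string) Batch {
	if e.index == nil {
		e.index = map[string]uint32{}
	}
	var b Batch
	for _, s := range attrs {
		idx, ok := e.index[s]
		if !ok {
			idx = uint32(len(e.index))
			e.index[s] = idx
			b.NewEntries = append(b.NewEntries, s)
		}
		b.AttrRefs = append(b.AttrRefs, idx)
	}
	return b
}

// DictDecoder is the state a receiver (or intermediary) must hold for the
// lifetime of the stream.
type DictDecoder struct {
	table []string
}

func (d *DictDecoder) Decode(b Batch) []string {
	d.table = append(d.table, b.NewEntries...)
	out := make([]string, len(b.AttrRefs))
	for i, ref := range b.AttrRefs {
		out[i] = d.table[ref]
	}
	return out
}

func main() {
	enc, dec := &DictEncoder{}, &DictDecoder{}
	b1 := enc.Encode([]string{"service.name", "checkout", "http.method", "GET"})
	b2 := enc.Encode([]string{"service.name", "checkout", "http.method", "POST"})
	fmt.Println(dec.Decode(b1)) // first batch carries four new dictionary entries
	fmt.Println(dec.Decode(b2)) // second batch only adds "POST"; the rest are references
}
```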
Windows has something like 15,000 performance counters and error metrics that can be collected. There isn’t a system on earth that can even approach this. At scale, I have to pick and choose maybe 20-100 counters for fear of overloading a cluster(!) of servers collecting the data… once a minute.
That’s because the protocol overheads cause “write multiplication” of a hundred-to-one or worse: every byte of metric data ends up as nearly a kilobyte on the wire.
Meanwhile, I did some experiments showing that, with even a tiny bit of crude data-oriented design and delta compression, a single box could collect 10K metrics across 10K endpoints every second without breaking a sweat (a rough sketch of the idea follows below).
The modern REST / RPC approach is fine for business apps but is an unmitigated disaster for collecting tiny metrics.
Set your goals higher than collecting a selected subset of 1% of the available metrics 60x less frequently than admins would like…
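A rough sketch of what that looks like (my own illustration, not the parent commenter's experiment): lay the samples for one counter out contiguously across endpoints and varint-encode the deltas against the previous scrape, so a slowly moving counter costs a byte or two per sample instead of a full request/response pair.

```go
// Delta + varint encoding of counter samples stored in a data-oriented layout:
// one contiguous slice of values per counter across all endpoints, encoded as
// differences from the previous scrape. Monotonic counters change little
// between scrapes, so most samples shrink to one or two bytes.
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeDeltas appends the varint-encoded differences between the current and
// previous scrape of one counter across all endpoints.
func encodeDeltas(prev, curr []uint64, out []byte) []byte {
	var buf [binary.MaxVarintLen64]byte
	for i := range curr {
		delta := int64(curr[i]) - int64(prev[i])
		n := binary.PutVarint(buf[:], delta)
		out = append(out, buf[:n]...)
	}
	return out
}

func main() {
	// 10,000 endpoints reporting one counter; values drift slightly each second.
	const endpoints = 10000
	prev := make([]uint64, endpoints)
	curr := make([]uint64, endpoints)
	for i := range prev {
		prev[i] = uint64(1_000_000 + i)
		curr[i] = prev[i] + uint64(i%3) // small per-second increments
	}

	encoded := encodeDeltas(prev, curr, nil)
	fmt.Printf("%d samples encoded in %d bytes (%.2f bytes/sample)\n",
		endpoints, len(encoded), float64(len(encoded))/endpoints)
}
```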
Article author here, good to see it on HN; someone else submitted it (thanks :-)).
If you are interested in the topic you may also be interested in a research library I wrote recently: https://github.com/splunk/exp-lazyproto, which among other things exploits the partial (de)serialization technique. It is just a prototype for now; one day I may do a production-quality implementation.
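To give a flavour of the technique, here is a toy illustration (not the exp-lazyproto API): scan the outer message once, remember the raw byte range of each embedded sub-message, and only decode a sub-message's fields when something actually reads them.

```go
// Partial deserialization, reduced to a toy: split a serialized message into
// the raw payloads of its length-delimited sub-messages without decoding them,
// then decode a single sub-message lazily on access.
package main

import (
	"encoding/binary"
	"fmt"
)

// lazyField holds the undecoded bytes of one length-delimited sub-message.
type lazyField struct {
	raw []byte
}

// splitRepeated scans a serialized message and returns the raw payloads of
// every occurrence of the given length-delimited field (wire type 2),
// without decoding their contents.
func splitRepeated(msg []byte, fieldNum uint64) ([]lazyField, error) {
	var out []lazyField
	for len(msg) > 0 {
		tag, n := binary.Uvarint(msg)
		if n <= 0 {
			return nil, fmt.Errorf("bad tag")
		}
		msg = msg[n:]
		if tag&7 != 2 { // this sketch only handles length-delimited fields
			return nil, fmt.Errorf("unsupported wire type %d", tag&7)
		}
		length, n := binary.Uvarint(msg)
		if n <= 0 || uint64(len(msg[n:])) < length {
			return nil, fmt.Errorf("bad length")
		}
		payload := msg[n : n+int(length)]
		msg = msg[n+int(length):]
		if tag>>3 == fieldNum {
			out = append(out, lazyField{raw: payload})
		}
	}
	return out, nil
}

// appendField is a tiny test helper that writes one length-delimited field.
func appendField(dst []byte, fieldNum uint64, payload []byte) []byte {
	dst = binary.AppendUvarint(dst, fieldNum<<3|2)
	dst = binary.AppendUvarint(dst, uint64(len(payload)))
	return append(dst, payload...)
}

func main() {
	// Outer message: repeated field 1 = "record"; each record has field 1 = string body.
	var outer []byte
	for _, body := range []string{"first record", "second record", "third record"} {
		record := appendField(nil, 1, []byte(body))
		outer = appendField(outer, 1, record)
	}

	records, err := splitRepeated(outer, 1)
	if err != nil {
		panic(err)
	}
	fmt.Println("records found:", len(records)) // nothing decoded yet

	// Decode only the one record we care about, on demand.
	fields, _ := splitRepeated(records[1].raw, 1)
	fmt.Println("lazily decoded body:", string(fields[0].raw))
}
```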
BigBrotherBird (now OpenZipkin... thanks legal, sigh) used 128b trace_ids when we first built it at Twitter. I don’t recall the reasoning, but that’s the first system I know of that chose that size.
Dapper used 64b IDs for span and trace, but being locked inside the Googleplex probably limited its influence on compatibility issues.
My point is that 128b is the common standard now, and that’s all I really care about: that the standard exists and that APM systems conform to it. To that end, I am very pro-Otel.
Disclosure: I am the author.