With distributed tracing, you observe requests as they move from one service to another in a distributed system. It’s practical for many reasons, such as understanding how your services connect and diagnosing latency issues.
However, if the majority of your requests are successful 200s that finish with acceptable latency and no errors, do you really need all of that data? Here’s the thing: you don’t always need a ton of data to find the right insights. You just need the right sampling of data.
The idea behind sampling is to control the spans you send to your observability backend, resulting in lower ingest costs. Different organizations will have their own reasons for not just why they want to sample, but also what they want to sample. You might want to customize your sampling strategy to:
It’s important to use consistent terminology when discussing sampling. A trace or span is considered “sampled” or “not sampled”:

- Sampled: a trace or span is processed and exported. Because it is chosen by the sampler, it is considered sampled.
- Not sampled: a trace or span is not processed or exported. Because it is not chosen by the sampler, it is considered not sampled.
Sometimes, the definitions of these terms get mixed up. You might hear someone say that they are “sampling out data”, or describe data that is not processed or exported as “sampled”. Both statements are incorrect.
Head sampling is a sampling technique used to make a sampling decision as early as possible. A decision to sample or drop a span or trace is not made by inspecting the trace as a whole.
For example, the most common form of head sampling is Consistent Probability Sampling. It may also be referred to as Deterministic Sampling. In this case, a sampling decision is made based on the trace ID and a desired percentage of traces to sample. This ensures that whole traces are sampled - no missing spans - at a consistent rate, such as 5% of all traces.
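For example, here is a minimal sketch of configuring head sampling with the OpenTelemetry Python SDK. It uses the SDK’s built-in TraceIdRatioBased sampler, which derives its decision from the trace ID and a configured ratio, wrapped in ParentBased so child spans follow their root’s decision; the 5% ratio, tracer name, and span name are placeholders chosen for this example.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

# Keep roughly 5% of traces. TraceIdRatioBased computes the decision from
# the trace ID, so the same trace ID always gets the same decision.
# ParentBased makes child spans inherit the root's decision, which keeps
# traces whole (no missing spans).
sampler = ParentBased(root=TraceIdRatioBased(0.05))

provider = TracerProvider(sampler=sampler)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("example.head-sampling")

# Roughly 5 of every 100 of these root spans (and their children) are
# recorded; the rest are dropped at creation time, before export.
for _ in range(100):
    with tracer.start_as_current_span("handle-request"):
        pass
```

Because the decision depends only on the trace ID and the ratio, it can be made the moment the root span starts, which is what makes this a head sampling technique.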
The upsides to head sampling are:
The primary downside to head sampling is that it is not possible to make a sampling decision based on data from the entire trace. This means head sampling is effective as a blunt instrument, but it is insufficient for sampling strategies that must take whole-system information into account. For example, head sampling cannot guarantee that every trace containing an error is sampled. For that, you need Tail Sampling.
Tail sampling is where the decision to sample a trace takes place by considering all or most of the spans within the trace. Tail Sampling gives you the option to sample your traces based on specific criteria derived from different parts of a trace, which isn’t an option with Head Sampling.
Some examples of how you can use Tail Sampling include:
As you can see, tail sampling allows for a much higher degree of sophistication. For larger systems that must sample telemetry, it is almost always necessary to use Tail Sampling to balance data volume with the usefulness of that data.
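To make those criteria concrete, here is an illustrative sketch of a tail sampling decision in plain Python. This is not an OpenTelemetry API: the Span shape, the 2-second latency threshold, and the 5% baseline rate are assumptions made up for the example. A real tail sampler (such as the OpenTelemetry Collector’s tail sampling processor) buffers the spans of a trace until the trace is considered complete and then applies policies like these.

```python
import random
from dataclasses import dataclass

@dataclass
class Span:
    trace_id: str
    name: str
    duration_ms: float
    is_error: bool

def keep_trace(spans: list[Span],
               latency_threshold_ms: float = 2000.0,
               baseline_rate: float = 0.05) -> bool:
    """Decide whether to keep a completed trace, using whole-trace data."""
    # Always keep traces that contain an error anywhere in them.
    if any(s.is_error for s in spans):
        return True
    # Always keep traces that contain an unusually slow span.
    if any(s.duration_ms > latency_threshold_ms for s in spans):
        return True
    # Keep only a small baseline percentage of ordinary traces.
    return random.random() < baseline_rate

# A fast, error-free trace is usually dropped; a trace with an error is always kept.
ok_trace = [Span("trace-a", "GET /health", 12.0, False)]
error_trace = [Span("trace-b", "GET /checkout", 250.0, False),
               Span("trace-b", "charge-card", 80.0, True)]
print(keep_trace(ok_trace), keep_trace(error_trace))
```

None of these checks would be possible with head sampling alone, because the error flag and the span durations are only known after the spans have finished.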
There are three primary downsides to tail sampling today:
Finally, for some systems, tail sampling may be used in conjunction with Head Sampling. For example, a set of services that produce an extremely high volume of trace data may first use head sampling to only sample a small percentage of traces, and then later in the telemetry pipeline use tail sampling to make more sophisticated sampling decisions before exporting to a backend. This is often done in the interest of protecting the telemetry pipeline from being overloaded.
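As a rough sketch of that combination, and again only illustrative Python rather than an OpenTelemetry API, the snippet below reuses the hypothetical keep_trace policy from the tail sampling example: a deterministic head decision based on the trace ID cheaply discards most traces first, and the tail policy then runs only on the traces that survive. The 25% head rate is an assumption for the example.

```python
import hashlib

HEAD_SAMPLE_RATE = 0.25  # assumed for this example: keep 25% of traces at the head

def head_sampled(trace_id: str, rate: float = HEAD_SAMPLE_RATE) -> bool:
    """Deterministic head decision: hash the trace ID against a threshold."""
    digest = int(hashlib.sha256(trace_id.encode()).hexdigest(), 16)
    return (digest % 10_000) < rate * 10_000

def pipeline_keeps(spans: list[Span]) -> bool:
    """Head sampling first (cheap, trace ID only), then tail sampling."""
    if not head_sampled(spans[0].trace_id):
        return False          # dropped early; never buffered for tail sampling
    return keep_trace(spans)  # the tail policy sees the whole surviving trace
```

In a real deployment the head decision is made in each service’s SDK (as in the earlier head sampling example) and the tail decision in a later stage of the telemetry pipeline; the single function here only illustrates the order in which the two decisions apply.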