Segment7 Blog

OpenTelemetry service graph and Rust


The service graph connector builds a map of the relationships between the services in a system. It analyses trace data and generates metrics describing those relationships. Data visualization apps (e.g. Grafana) can use these metrics to draw a service graph.

Service Graph Connector

The service graph includes request and error rates between the various systems that are being traced. You can also include edges to systems that you don't collect traces for.
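To make the idea concrete, here is a toy sketch of how request and error counts per edge can be derived from spans: each server span is paired with the client span that is its parent. The `SpanRecord`, `Kind`, `EdgeStats`, and `build_edges` names are hypothetical and heavily simplified; this is not the connector's actual implementation.

```rust
use std::collections::HashMap;

// Hypothetical, simplified span record; real OTLP spans carry far more fields.
#[derive(Debug, Clone, PartialEq)]
enum Kind {
    Client,
    Server,
}

#[derive(Debug, Clone)]
struct SpanRecord {
    service: &'static str,
    span_id: u64,
    parent_id: Option<u64>,
    kind: Kind,
    error: bool,
}

#[derive(Debug, Default, PartialEq)]
struct EdgeStats {
    requests: u64,
    errors: u64,
}

/// Pair each server span with the client span that is its parent and
/// count requests and errors per (client service, server service) edge.
fn build_edges(spans: &[SpanRecord]) -> HashMap<(String, String), EdgeStats> {
    // Index client spans by span id so server spans can find their caller.
    let clients: HashMap<u64, &SpanRecord> = spans
        .iter()
        .filter(|s| s.kind == Kind::Client)
        .map(|s| (s.span_id, s))
        .collect();

    let mut edges: HashMap<(String, String), EdgeStats> = HashMap::new();
    for server in spans.iter().filter(|s| s.kind == Kind::Server) {
        let Some(client) = server.parent_id.and_then(|id| clients.get(&id)) else {
            continue; // no matching client span: not an edge in this sketch
        };
        let stats = edges
            .entry((client.service.to_string(), server.service.to_string()))
            .or_default();
        stats.requests += 1;
        if client.error || server.error {
            stats.errors += 1;
        }
    }
    edges
}
```

In this sketch a server span whose parent was never seen produces no edge at all, which is why context propagation between services matters so much for the graph.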

Here's a rendering of a graph of a set of services I run that annotate a picture of the sky with current weather and air quality conditions, including Prometheus for monitoring metrics. The camera, PurpleAir sensor, and pushgateway services don't export OpenTelemetry traces but can still be included in the graph.

---
config:
  theme: base
  themeVariables:
    primaryColor: "#552ba9"
    primaryTextColor: "#f9f7f0"
  fontFamily: Atkinson Hyperlegible Next
---

flowchart TD
  A([sky-camera])
  B([ambient-weather-local])
  C([camera])
  D([make-timelapse])
  E([prometheus])
  F([purpleair-exporter])
  G([purpleair sensor])
  H([pushgateway])
  I([unknown])
  J([user])

  A --> B
  A --> C
  A --> E
  A --> F
  B --> H
  E --> D
  E --> F
  E --> I
  F --> G
  J --> E
  J --> B

OpenTelemetry context

The OpenTelemetry context connects traces across your systems. The context is propagated once you have configured each system to extract it from inbound requests and inject it into outbound requests.
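With the W3C Trace Context format, the propagated context travels in a `traceparent` HTTP header of the form `00-<trace id>-<parent span id>-<flags>`. The sketch below shows what extracting and injecting that header amounts to, using only the standard library. The `TraceParent` type and its methods are hypothetical names for illustration; in a real service the opentelemetry crates do this for you, and this sketch handles version `00` only and ignores `tracestate`.

```rust
/// Fields carried by a W3C `traceparent` header (version 00 only).
#[derive(Debug, Clone, PartialEq)]
struct TraceParent {
    trace_id: u128,
    parent_span_id: u64,
    sampled: bool,
}

/// True when `s` is exactly `len` lowercase hex digits.
fn is_lower_hex(s: &str, len: usize) -> bool {
    s.len() == len && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

impl TraceParent {
    /// Extract the context from an inbound `traceparent` header value.
    fn extract(header: &str) -> Option<Self> {
        let parts: Vec<&str> = header.trim().split('-').collect();
        let [version, trace_id, span_id, flags] = parts.as_slice() else {
            return None;
        };
        if *version != "00"
            || !is_lower_hex(trace_id, 32)
            || !is_lower_hex(span_id, 16)
            || !is_lower_hex(flags, 2)
        {
            return None;
        }
        let trace_id = u128::from_str_radix(trace_id, 16).ok()?;
        let parent_span_id = u64::from_str_radix(span_id, 16).ok()?;
        let flags = u8::from_str_radix(flags, 16).ok()?;
        // All-zero trace and span ids are invalid.
        if trace_id == 0 || parent_span_id == 0 {
            return None;
        }
        Some(Self {
            trace_id,
            parent_span_id,
            sampled: (flags & 0x01) == 0x01,
        })
    }

    /// Build the header value for an outbound request made from the span
    /// `current_span_id`: same trace id, new parent span id.
    fn inject(&self, current_span_id: u64) -> String {
        format!(
            "00-{:032x}-{:016x}-{:02x}",
            self.trace_id,
            current_span_id,
            u8::from(self.sampled)
        )
    }
}
```

The trace id stays the same from hop to hop while the parent span id changes, which is what lets a collector stitch spans from different services into one trace.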

For Rust projects I use the tracing family of crates for tracing internal to a service, along with the opentelemetry family of crates for exporting those traces to a collector.

axum-tracing-opentelemetry

I built my services using axum, so I use axum-tracing-opentelemetry to propagate traces across systems for inbound requests. The crate provides two middleware layers that you add to your router.

let app = Router::new()
    .route_service(/* ... */)
    /* other services ... */
    .layer(OtelInResponseLayer)
    .layer(OtelAxumLayer::default())
    // other middleware, or untraced services like health checks

If you create your own tracing setup, be sure to call set_text_map_propagator. Without it, the OtelInResponseLayer will not inject response headers and your exported traces will not connect.

let propagator =
    opentelemetry_sdk::propagation::TraceContextPropagator::new();

opentelemetry::global::set_text_map_propagator(propagator);

If you can't be bothered, init-tracing-opentelemetry will do this for you.

reqwest-tracing

I use reqwest for outbound requests to other services with reqwest-tracing (and reqwest-middleware) to inject the tracing context.

Crate features

You need to enable one of the opentelemetry features in reqwest-tracing:

[dependencies]
reqwest-tracing = { version = "0.5", features = ["opentelemetry_0_30"] }
# …

Outbound requests

If all the services you call also emit OpenTelemetry traces you can use the default configuration, perhaps with a customized span name:

let client = reqwest::Client::builder().build().unwrap();
let client = reqwest_middleware::ClientBuilder::new(client)
   .with_init(Extension(OtelName("my-client".into())))
   .with(TracingMiddleware::default())
   .build();

Then build and submit a request:

client
    .get(url)
    .header(ACCEPT, "text/json")
    .send()
    .await

The above request will propagate the trace context to the service being called.

I ran into an issue where the span name was dropped (not recorded) when using reqwest_middleware::RequestBuilder::build(). This function returns a reqwest::Request which does not include the Extensions you set on the request. When you eventually call ClientWithMiddleware::execute() the span information has disappeared. Instead, call RequestBuilder::send().

Untraced peers

If your target service does not export OTEL traces, like the camera or PurpleAir sensor mentioned above, it can still be included in the service graph by setting the peer.service attribute. This attribute is most often used with databases, but non-OTEL HTTP services can also count.

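Using this with reqwest-tracing requires creating your own span backend (an implementation of ReqwestOtelSpanBackend), which you register with TracingMiddleware::<SpanBackend>::new() in place of TracingMiddleware::default(). I chose to make the peer service optional. I use one ClientWithMiddleware with an OtelName extension and set the PeerService extension when making a request to an untraced peer.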

use http::Extensions;
use reqwest::{Request, Response};
use reqwest_middleware::Result;
use reqwest_tracing::{
    ReqwestOtelSpanBackend, default_on_request_end, default_span_name, reqwest_otel_span,
};
use tracing::Span;

#[derive(Clone)]
pub struct PeerService(pub &'static str);

#[derive(Default)]
pub struct SpanBackend;

impl ReqwestOtelSpanBackend for SpanBackend {
    fn on_request_start(req: &Request, extensions: &mut Extensions) -> Span {
        let name = default_span_name(req, extensions);

        if let Some(peer_service) = peer_service(extensions) {
            reqwest_otel_span!(name = name, req, peer.service = %peer_service)
        } else {
            reqwest_otel_span!(name = name, req)
        }
    }

    fn on_request_end(span: &Span, outcome: &Result<Response>, _: &mut Extensions) {
        default_on_request_end(span, outcome)
    }
}

fn peer_service(extensions: &Extensions) -> Option<&str> {
    extensions
        .get::<PeerService>()
        .map(|peer_service| peer_service.0)
}

Setting the peer service is similar to the above request:

self.client
    .get(camera_url)
    .with_extension(crate::PeerService("camera"))
    .send()
    .await

Collector configuration

The OpenTelemetry service graph connector calls untraced peers that receive (or send) requests "virtual nodes". By default, requests to them are not considered connections, so you need to update your configuration to include them.

For the Service Graph Connector, add peer.service to the virtual_node_peer_attributes list in the configuration.
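As I understand the connector's documentation, virtual_node_peer_attributes is a priority list: attributes nearer the front win when a span carries more than one of them. The sketch below illustrates that lookup for a client span with no matching server span. The `virtual_node_name` function is a hypothetical name for illustration and not the connector's code.

```rust
use std::collections::HashMap;

/// Pick the name of a virtual node for a client span that has no matching
/// server span: the first configured attribute present on the span wins.
fn virtual_node_name<'a>(
    span_attributes: &'a HashMap<String, String>,
    peer_attributes: &[&str],
) -> Option<&'a str> {
    peer_attributes
        .iter()
        .find_map(|key| span_attributes.get(*key).map(String::as_str))
}
```

A span carrying only peer.service = "camera" resolves to a node named camera when peer.service is in the list, and to no node at all when it is not, which is the case the configuration change addresses.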

For Grafana Alloy add peer.service to the database_name_attribute in the otelcol.connector.servicegraph arguments.
