Open Telemetry

Introduction

This is a quick page about OpenTelemetry, which I came across while working with Effect_TS.

Four things

Four things must ye know about Open Telemetry

  • Traces - Spans
  • Metrics - Performances, counts etc
  • Logs - No Explanation required
  • Baggage - contextual information we want to pass from span to span

This is an example of a trace with spans.

Trace: "User logs in"
├── Span: "Frontend sends login request"
├── Span: "API receives request"
│   ├── Span: "Validate credentials"
│   └── Span: "Query user DB"
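
The tree above can be sketched as plain data. This is only a toy model with made-up types and IDs, not the OpenTelemetry SDK, and it assumes the trace is represented by a root span called "User logs in". The point is that every span carries the same trace ID plus the ID of its parent, which is what lets a backend rebuild the tree.

```typescript
// Toy model of a trace: every span shares the trace ID and points at its parent.
// These types are illustrative only; they are not the OpenTelemetry SDK types.
interface Span {
  traceId: string;
  spanId: string;
  parentSpanId: string | undefined; // undefined for the root span
  name: string;
}

let nextId = 0;

// Create a span, inheriting the trace ID from the parent when there is one.
function startSpan(name: string, parentSpan?: Span): Span {
  nextId += 1;
  return {
    traceId: parentSpan ? parentSpan.traceId : "trace-" + nextId,
    spanId: "span-" + nextId,
    parentSpanId: parentSpan ? parentSpan.spanId : undefined,
    name,
  };
}

// Rebuild the indented tree purely from the parent links.
function renderTree(spans: Span[], parentSpanId?: string, depth = 0): string[] {
  const lines: string[] = [];
  for (const span of spans) {
    if (span.parentSpanId === parentSpanId) {
      lines.push("  ".repeat(depth) + span.name);
      for (const line of renderTree(spans, span.spanId, depth + 1)) {
        lines.push(line);
      }
    }
  }
  return lines;
}

const root = startSpan("User logs in");
const frontend = startSpan("Frontend sends login request", root);
const api = startSpan("API receives request", root);
const validate = startSpan("Validate credentials", api);
const queryDb = startSpan("Query user DB", api);

const spans = [root, frontend, api, validate, queryDb];
const tree = renderTree(spans);
```

Here tree holds five lines, with the two database-related spans indented under "API receives request".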

Other Terms

  • OTLP - OpenTelemetry Protocol, sent over gRPC or HTTP.
  • Collector - A standalone service that accepts telemetry data and pushes it out
  • Instrumentation - How you produce the data. Some frameworks like expressJS and effectTS have automatic instrumentation, but you can also use an OpenTelemetry SDK

Configuring Collector

There are six things you configure in a collector

  • Receivers - How your collector takes in data
  • Processors - Cleansing and transforming data
  • Extensions - Extra things, e.g. a health monitor
  • Exporters - Push data out, e.g. Loki, Prometheus
  • Pipelines - The path that telemetry data (traces, metrics, or logs) follows from ingestion to export
  • Connectors - Can export data from one pipeline and feed it into another

For example, the spanmetrics connector is the exporter of the traces pipeline and the receiver of the metrics pipeline.

connectors:
  spanmetrics:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]

    metrics:
      receivers: [spanmetrics]
      exporters: [prometheus]
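
To make the connector idea concrete, here is a toy model of the data flow in the config above. It is not how the collector is implemented, and the metric name and attribute key are placeholders I picked for illustration. It only shows that the connector sits at the end of the traces pipeline and at the start of the metrics pipeline, turning spans into counts.

```typescript
// Toy model of a connector: the exporter of one pipeline and the receiver of another.
// This is NOT the collector's implementation, only an illustration of the data flow.
interface SpanRecord {
  name: string;
  durationMs: number;
}

interface MetricPoint {
  name: string; // placeholder metric name
  attributes: { [key: string]: string };
  value: number;
}

// Mimics the idea behind spanmetrics: derive call counts from the spans passing through.
function spanMetricsConnector(spans: SpanRecord[]): MetricPoint[] {
  const counts = new Map<string, number>();
  for (const span of spans) {
    counts.set(span.name, (counts.get(span.name) ?? 0) + 1);
  }
  const points: MetricPoint[] = [];
  counts.forEach((value, spanName) => {
    points.push({ name: "calls", attributes: { "span.name": spanName }, value });
  });
  return points;
}

// traces pipeline: receivers [otlp] -> exporters [spanmetrics]
const receivedSpans: SpanRecord[] = [
  { name: "Validate credentials", durationMs: 12 },
  { name: "Query user DB", durationMs: 30 },
  { name: "Query user DB", durationMs: 28 },
];

// metrics pipeline: receivers [spanmetrics] -> exporters [prometheus]
const derivedMetrics = spanMetricsConnector(receivedSpans);
```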

Example with Node

npm install @opentelemetry/sdk-node \
  @opentelemetry/api \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/sdk-metrics \
  @opentelemetry/sdk-trace-node

We can then run the app, loading the instrumentation file before the app code, with

npx tsx --import ./src/opentelemetry/instrumentation.ts ./src/app.ts

Example with CSharp

Configure OpenTelemetry Resource/Tracing/Metrics

So we need to configure these three: the resource, tracing and metrics.

public static class IServiceCollectionExtension
{
    public static IServiceCollection AddTelemetry(
        this IServiceCollection serviceCollection,
        ITelemetryConfiguration telemetryConfig,
        IHoneycombConfiguration honeycombConfig)
    {
        // Register telemetry service first
        serviceCollection
            .AddSingleton<ITelemetryServiceFactory, TelemetryServiceFactory>()
            .AddSingleton<ITelemetryService>(sp =>
            {
                try
                {
                    return sp.GetRequiredService<ITelemetryServiceFactory>().CreateTelemetryService();
                }
                catch (Exception ex)
                {
                    Console.WriteLine($"Error creating telemetry service: {ex.Message}");
                    throw; // Re-throw to prevent silent failures
                }
            });

        // Configure the OpenTelemetry resource, tracing and metrics
        serviceCollection
            .AddOpenTelemetry()
            .ConfigureResource(resourceBuilder =>
                resourceBuilder.ConfigurableResource(telemetryConfig)
            )
            .WithTracing(tracerProviderBuilder =>
                tracerProviderBuilder.ConfigureTracing(telemetryConfig, honeycombConfig)
            )
            .WithMetrics(metricsBuilder =>
                metricsBuilder.ConfigureMetrics(telemetryConfig, honeycombConfig)
            );

        return serviceCollection;
    }
}

Configure Resources

In CSharp you have, and I expect in the others too, Resources. These are attributes you configure once and that are attached to the OTel data. They are only sent once per batch. To set these up, add an extension method to build them.

    public static ResourceBuilder ConfigurableResource(
        this ResourceBuilder resourceBuilder,
        ITelemetryConfiguration telemetryConfig)
    {
        resourceBuilder
            .AddService(
                    telemetryConfig.ServiceName,
                    telemetryConfig.ServiceNamespace,
                    telemetryConfig.ServiceVersion)
            .AddAttributes(BuildAttributes(telemetryConfig));

        return resourceBuilder;
    }
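
As a rough picture of "configured once, sent once per batch", here is a sketch of an export payload. The shape is simplified and the type names are my own, not the OTLP schema. The service.* keys are the standard resource attribute names; the namespace value is a made-up example.

```typescript
// Simplified picture of an export batch: one resource, many spans.
// Type names are illustrative and do not match the OTLP schema exactly.
interface OtelResource {
  attributes: { [key: string]: string };
}

interface SpanData {
  name: string;
}

interface ExportBatch {
  resource: OtelResource; // attached once for the whole batch
  spans: SpanData[];
}

// Configured once at startup, like the ResourceBuilder extension above.
const resource: OtelResource = {
  attributes: {
    "service.name": "GrpcServer",
    "service.namespace": "demo", // hypothetical value
    "service.version": "1.0.0",
  },
};

function buildBatch(spans: SpanData[]): ExportBatch {
  return { resource, spans };
}

const batch = buildBatch([
  { name: "GetActorAsync" },
  { name: "GetActorAsync" },
  { name: "GetActorAsync" },
]);

// The resource attributes appear once in the payload, however many spans there are.
const payload = JSON.stringify(batch);
```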


Configure Tracing

This is where you configure the traces sent to OTel. The dataset is sent in the header. I had to remove the HTTP client instrumentation as it prevented the gRPC spans from being tied to the other events.

    public static TracerProviderBuilder ConfigureTracing(
        this TracerProviderBuilder tracerProviderBuilder,
        ITelemetryConfiguration telemetryConfig,
        IHoneycombConfiguration honeycombConfig)
    {
        // Configure basic tracing without external exporters first
        tracerProviderBuilder
            .AddSamplerWhen(telemetryConfig.Environment == "Development")
            .AddAspNetCoreInstrumentation()
            .AddGrpcClientInstrumentation(options =>
            {
                options.SuppressDownstreamInstrumentation = true;
            })
            // .AddHttpClientInstrumentation()
            .AddSource(telemetryConfig.ServiceName)
            .AddConsoleExporterWhen(telemetryConfig.EnableConsoleExporter == true && telemetryConfig.Environment == "Development")
            .AddOtlpExporterWhen(honeycombConfig, telemetryConfig.Enabled);

        return tracerProviderBuilder;
    }

Configure Metrics

This is straightforward but here for completeness.

    public static MeterProviderBuilder ConfigureMetrics(
            this MeterProviderBuilder meterProviderBuilder,
            ITelemetryConfiguration telemetryConfig,
            IHoneycombConfiguration honeycombConfig)
    {
        // Configure metrics instrumentation and exporters
        meterProviderBuilder
            // Add ASP.NET Core instrumentation
            .AddAspNetCoreInstrumentation()
            // Add HTTP client instrumentation
            .AddHttpClientInstrumentation()
            // Add runtime instrumentation
            .AddRuntimeInstrumentation()
            // Add SQL client instrumentation if needed
            .AddSqlClientInstrumentation()
            // Add process instrumentation (CPU, memory etc.)
            .AddProcessInstrumentation()
            // Add the ASP.NET Core meters and our own service meter
            .AddMeter(
                "Microsoft.AspNetCore.Hosting",
                "Microsoft.AspNetCore.Server.Kestrel",
                "Microsoft.AspNetCore.Http.Connections",
                telemetryConfig.ServiceName)
            .AddConsoleExporterWhen(telemetryConfig.EnableConsoleExporter == true && telemetryConfig.Environment == "Development")
            .AddOtlpExporter(opts =>
            {
                // Set the endpoint explicitly
                opts.Endpoint = new Uri(honeycombConfig.MetricsEndpoint);
                // Include dataset and environment in headers
                opts.Headers = $"x-honeycomb-team={honeycombConfig.ApiKey},x-honeycomb-dataset={honeycombConfig.Dataset},x-honeycomb-environment={honeycombConfig.Environment}";
                // Set the correct protocol
                opts.Protocol = OtlpExportProtocol.HttpProtobuf;
            });

        return meterProviderBuilder;
    }

Instrumentation

Many of the technologies come with their own instrumentation libraries. These include

  • AspNetCore
  • Http
  • SqlClient
  • Grpc

So we can add these

dotnet add package OpenTelemetry.Instrumentation.AspNetCore
dotnet add package OpenTelemetry.Instrumentation.Http
dotnet add package OpenTelemetry.Instrumentation.SqlClient --prerelease
dotnet add package OpenTelemetry.Instrumentation.GrpcNetClient --prerelease

We can add the tracing and metrics for these out of the box

    private static void ConfigureTracing(TracerProviderBuilder builder)
    {
        builder
            .AddAspNetCoreInstrumentation()
            .AddHttpClientInstrumentation()
            .AddSqlClientInstrumentation()
            .AddGrpcClientInstrumentation();
    }

    private static void ConfigureMetrics(MeterProviderBuilder builder)
    {
        builder
            .AddAspNetCoreInstrumentation()
            .AddHttpClientInstrumentation()
            .AddSqlClientInstrumentation();
            // No gRPC metrics instrumentation available yet
    }

We can add our own metrics by building our own meter.

Originally I started with this but soon found out it needs to be a bit more sophisticated.

using System.Diagnostics;
using System.Diagnostics.Metrics;

public static class DiagnostConfig
{
    public const string ServiceName = "GrpcServer";
    public const string ServiceVersion = "1.0.0";
    
    public static readonly Meter Meter = new(ServiceName, ServiceVersion);
    
    public static readonly Counter<long> RequestCounter = Meter.CreateCounter<long>(
        name: "grpc.server.requests",
        unit: "requests",
        description: "Number of gRPC requests received");
        
    public static readonly ActivitySource ActivitySource = new(ServiceName, ServiceVersion);
}

In the end it amounted to creating the following. You can find them in the C# gRPC project.

  • ServiceInfo - Holds basic service information: name, version etc
  • MetricsProvider - Defines and exposes application metrics
  • OperationTracker - Combines metrics and tracing for a single operation
  • TelemetryService - The central orchestration point for all telemetry operations
  • Tracing Provider - Manages distributed tracing across service boundaries

You generally make one set of these per application and, once done, you add them to your Tracing Provider.

private static void ConfigureTracing(TracerProviderBuilder builder)
    {
        builder
           .AddSource(MetricsProvider.ActivitySourceName)
...

Here is an example of the usage in the Grpc Server.

    public async Task<Actor> GetActorAsync(string actorId, CancellationToken cancellationToken = default)
    {
        // Track both request count and duration with one call
        using var tracker = _telemetryService.TrackOperation("GetActorAsync",
            new Dictionary<string, object?> { ["request.actorId"] = actorId });

        try
        {
            var actorEntity = await _actorRepository.GetActorAsync(actorId, cancellationToken) ??
                throw new ArgumentException($"Actor with ID {actorId} not found.");
            return _actorMapper.MapFromEntity(actorEntity);
        }
        catch (Exception ex)
        {
            _telemetryService.Metrics.RecordError("GetActorAsync", ex);
            throw;
        }
    }
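
The tracker pattern used above (one call that counts the request, times it and records failures) can be sketched outside C# as well. This is a synchronous simplification with hypothetical names (InMemoryMetrics, trackOperation) and an in-memory store standing in for real OpenTelemetry instruments; it is not the code from the gRPC project.

```typescript
// Hypothetical in-memory store; a real service would use OpenTelemetry instruments.
class InMemoryMetrics {
  readonly counts = new Map<string, number>();
  readonly errors = new Map<string, number>();
  readonly durationsMs = new Map<string, number[]>();

  increment(target: Map<string, number>, operation: string): void {
    target.set(operation, (target.get(operation) ?? 0) + 1);
  }

  recordDuration(operation: string, elapsedMs: number): void {
    const samples = this.durationsMs.get(operation) ?? [];
    samples.push(elapsedMs);
    this.durationsMs.set(operation, samples);
  }
}

// Count the call, time it, and record an error before rethrowing on failure.
function trackOperation<T>(
  metrics: InMemoryMetrics,
  operation: string,
  work: () => T,
  now: () => number = Date.now, // injectable clock, handy for tests
): T {
  const startedAt = now();
  metrics.increment(metrics.counts, operation);
  try {
    return work();
  } catch (error) {
    metrics.increment(metrics.errors, operation);
    throw error; // rethrow to prevent silent failures
  } finally {
    metrics.recordDuration(operation, now() - startedAt);
  }
}

// Usage with a fake clock that advances 5 ms on every reading.
let fakeTimeMs = 0;
const fakeClock = () => (fakeTimeMs += 5);
const metrics = new InMemoryMetrics();
const actor = trackOperation(metrics, "GetActor", () => ({ actorId: "42" }), fakeClock);
```

The finally block plays the role of the using statement in the C# version: the duration is recorded whether the work succeeds or throws.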

Span And Activity

In the C# world a Span is an Activity. A span is

  • Time bound analysis
  • A scoped piece of work, e.g. an HTTP request, gRPC request or SQL call
  • Causality - what caused this to happen, e.g. an HTTP request might cause a SQL call
  • Useful for measuring the timings of complex code

Exporters

Console

So let's start with the console exporter

dotnet add package OpenTelemetry.Exporter.Console

And add it to ConfigureTracing

private static void ConfigureTracing(TracerProviderBuilder builder)
{
    builder
        // Add instrumentations
...
        
        // Add our custom ActivitySource
        .AddSource(MetricsProvider.ActivitySourceName)
        
        // Console exporter - great for development
        .AddConsoleExporter(options => {
            options.Targets = ConsoleExporterOutputTargets.Console;
        });
}

And to Metrics

private static void ConfigureMetrics(MeterProviderBuilder builder)
{
    builder
        // Add instrumentations
...        
        // Add our custom Meter
        .AddMeter(MetricsProvider.MeterName)
        
        // Console exporter - great for development
        .AddConsoleExporter();
}

Honeycomb

Add the OTLP exporter package

dotnet add package OpenTelemetry.Exporter.OpenTelemetryProtocol

I spent a lot of time getting this set up because the robot misled me. Traces are routed to Honeycomb based on their service name, whereas metrics do respect the x-honeycomb-dataset header. The robot did not know this. Anyway, here it is. First, metrics

    private static void ConfigureMetrics(
        MeterProviderBuilder builder,
        TelemetryConfig telemetryConfig,
        HoneycombConfig honeycombConfig)
    {
        Console.WriteLine("Configuring Honeycomb metrics...");

        builder
            // .AddAspNetCoreInstrumentation()
            // .AddHttpClientInstrumentation()
            // .AddSqlClientInstrumentation()

            // Add custom meters
            .AddMeter(telemetryConfig.ServiceName)

            // Add exporters
            .AddConsoleExporter()
            .AddOtlpExporter(opts =>
            {
                // Set the endpoint explicitly
                opts.Endpoint = new Uri("https://api.honeycomb.io/v1/metrics");

                // Include dataset and environment in headers
                opts.Headers = $"x-honeycomb-team={honeycombConfig.ApiKey},x-honeycomb-dataset={honeycombConfig.Dataset},x-honeycomb-environment={honeycombConfig.Environment}";

                // Set the correct protocol
                opts.Protocol = OtlpExportProtocol.HttpProtobuf;
            });
    }
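
The routing difference described above comes down to which headers each signal sends. Here is a small sketch that builds the comma-separated header string per signal. The helper name and the settings type are hypothetical; the header names and the key=value,key=value format are the ones used in the C# code on this page.

```typescript
type TelemetrySignal = "traces" | "metrics";

// Hypothetical settings shape, mirroring the fields used from HoneycombConfig.
interface HoneycombSettings {
  apiKey: string;
  dataset: string;
  environment: string;
}

// Build the "key=value,key=value" string expected by the OTLP exporter Headers option.
function buildOtlpHeaders(signal: TelemetrySignal, settings: HoneycombSettings): string {
  const pairs: [string, string][] = [["x-honeycomb-team", settings.apiKey]];
  if (signal === "metrics") {
    // Only metrics respect the dataset header; traces are routed by service name.
    pairs.push(["x-honeycomb-dataset", settings.dataset]);
  }
  pairs.push(["x-honeycomb-environment", settings.environment]);
  return pairs.map(([key, value]) => key + "=" + value).join(",");
}

// Placeholder values, not real credentials.
const settings: HoneycombSettings = {
  apiKey: "YOUR_API_KEY",
  dataset: "grpc-metrics",
  environment: "dev",
};

const metricsHeaders = buildOtlpHeaders("metrics", settings);
const tracesHeaders = buildOtlpHeaders("traces", settings);
```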

And now Tracing

    private static void ConfigureTracing(
        TracerProviderBuilder builder,
        TelemetryConfig telemetryConfig,
        HoneycombConfig honeycombConfig)
    {
        builder
            .SetSampler(new AlwaysOnSampler())
            // .AddAspNetCoreInstrumentation()
            // .AddHttpClientInstrumentation()
            // .AddSqlClientInstrumentation(options =>
            // {
            //     // Enable capturing exceptions
            //     options.RecordException = true;
            // })
            // .AddGrpcClientInstrumentation()
            .AddSource(telemetryConfig.ServiceName)
            .AddConsoleExporter()
            .AddOtlpExporter(opts =>
            {
                // Set the endpoint explicitly
                opts.Endpoint = new Uri("https://api.honeycomb.io/v1/traces");

                // Include dataset and environment in headers
                // Note: x-honeycomb-dataset is ignored for tracing. Instead it uses the service name.
                opts.Headers = $"x-honeycomb-team={honeycombConfig.ApiKey},x-honeycomb-environment={honeycombConfig.Environment}";

                // Set the correct protocol
                opts.Protocol = OtlpExportProtocol.HttpProtobuf;

                opts.BatchExportProcessorOptions = new BatchExportProcessorOptions<Activity>
                {
                    MaxExportBatchSize = 512,
                    ScheduledDelayMilliseconds = 1000,
                    ExporterTimeoutMilliseconds = 30000,
                    MaxQueueSize = 2048,
                };
            });
    }
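
To get a feel for what the batch options above control, here is a much simplified model of a batching queue. It is my own sketch, not the OpenTelemetry processor: there is no timer (flush is called by hand where the real one would wait for the scheduled delay) and no export timeout. It only shows that the queue size caps what is buffered and the batch size caps what goes out in one export call.

```typescript
interface BatchOptions {
  maxQueueSize: number;
  maxExportBatchSize: number;
}

// Simplified batching queue: bounded buffer, exported in fixed-size chunks.
class BatchQueue<T> {
  private readonly items: T[] = [];
  private readonly options: BatchOptions;
  private readonly exportBatch: (batch: T[]) => void;
  droppedCount = 0;

  constructor(options: BatchOptions, exportBatch: (batch: T[]) => void) {
    this.options = options;
    this.exportBatch = exportBatch;
  }

  // Returns false when the item is dropped because the queue is full.
  enqueue(item: T): boolean {
    if (this.items.length >= this.options.maxQueueSize) {
      this.droppedCount += 1;
      return false;
    }
    this.items.push(item);
    return true;
  }

  // A real processor runs this on a timer; here it is called manually.
  flush(): void {
    while (this.items.length > 0) {
      this.exportBatch(this.items.splice(0, this.options.maxExportBatchSize));
    }
  }
}

// Same numbers as the options above: queue of 2048, batches of 512.
const batchSizes: number[] = [];
const spanQueue = new BatchQueue<number>(
  { maxQueueSize: 2048, maxExportBatchSize: 512 },
  (batch) => batchSizes.push(batch.length),
);

for (let i = 0; i < 3000; i += 1) {
  spanQueue.enqueue(i);
}
spanQueue.flush();
```

With 3000 items arriving before any flush, 2048 are buffered and the rest are dropped, and the flush sends the buffered items in four batches of 512.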