Open Telemetry

From bibbleWiki

Introduction

This is a quick page about OpenTelemetry, which I came across while working with Effect_TS.

Four things

Four things must ye know about OpenTelemetry

  • Traces - Spans
  • Metrics - Performance measurements, counts, etc.
  • Logs - No explanation required
  • Baggage - Contextual key-value information we want to pass from span to span

This is an example of a trace with spans.

Trace: "User logs in"
├── Span: "Frontend sends login request"
└── Span: "API receives request"
    ├── Span: "Validate credentials"
    └── Span: "Query user DB"
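
To make the parent/child relationship concrete, here is a minimal TypeScript sketch that models the trace above as flat span records linked by a parent id, which is roughly how spans are reported. The SpanRecord type and renderTrace helper are illustrative names, not part of any OpenTelemetry package, and real spans also carry timestamps, attributes and a trace id.

```typescript
// Hypothetical, simplified span record (illustration only).
interface SpanRecord {
  spanId: string;
  parentSpanId?: string; // undefined for a root span
  name: string;
}

const spans: SpanRecord[] = [
  { spanId: "a", name: "Frontend sends login request" },
  { spanId: "b", name: "API receives request" },
  { spanId: "c", parentSpanId: "b", name: "Validate credentials" },
  { spanId: "d", parentSpanId: "b", name: "Query user DB" },
];

// Rebuild the tree from the parent links and return one indented line per span.
function renderTrace(all: SpanRecord[], parentId?: string, depth = 0): string[] {
  return all
    .filter((s) => s.parentSpanId === parentId)
    .flatMap((s) => [
      `${"  ".repeat(depth)}${s.name}`,
      ...renderTrace(all, s.spanId, depth + 1),
    ]);
}

console.log(renderTrace(spans).join("\n"));
```

The two child spans come out indented under "API receives request" because their parentSpanId points at it; that link is what lets a backend draw the tree.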

Other Terms

  • OTLP - OpenTelemetry Protocol, sent over gRPC or HTTP.
  • Collector - A standalone service that accepts telemetry data and pushes it out to backends
  • Instrumentation - How you produce the data. Some frameworks, such as Express and Effect, have automatic instrumentation, but you can also instrument your code yourself with an OpenTelemetry SDK

Configuring Collector

There are six things you configure in a collector

  • Receivers - How your collector takes in data
  • Processors - Cleansing and transforming the data
  • Extensions - Extra things, e.g. health monitor
  • Exporters - Push data out, e.g. Loki, Prometheus
  • Pipelines - The path that telemetry data (traces, metrics, or logs) follows from ingestion to export
  • Connectors - Act as an exporter for one pipeline and a receiver for another, so data can flow from one pipeline into the next
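
For example, the spanmetrics connector acts as the exporter of the traces pipeline and the receiver of the metrics pipeline. The otlp receiver and prometheus exporter settings shown here are typical values assumed for illustration.

receivers:
  otlp:
    protocols:
      grpc:
      http:

connectors:
  spanmetrics:

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"  # assumed scrape endpoint

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]

    metrics:
      receivers: [spanmetrics]
      exporters: [prometheus]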
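
Conceptually, a span-to-metrics connector turns finished spans into call-count, error-count and duration metrics. The TypeScript sketch below imitates that idea in a few lines; it is not the collector's implementation, and the FinishedSpan type and toSpanMetrics function are made-up names for illustration.

```typescript
// Hypothetical finished span, reduced to the fields needed here.
interface FinishedSpan {
  name: string;
  durationMs: number;
  error: boolean;
}

interface SpanMetric {
  calls: number;
  errors: number;
  totalDurationMs: number;
}

// Aggregate spans by name into call, error and duration totals.
function toSpanMetrics(input: FinishedSpan[]): Map<string, SpanMetric> {
  const result = new Map<string, SpanMetric>();
  for (const span of input) {
    const m = result.get(span.name) ?? { calls: 0, errors: 0, totalDurationMs: 0 };
    m.calls += 1;
    if (span.error) m.errors += 1;
    m.totalDurationMs += span.durationMs;
    result.set(span.name, m);
  }
  return result;
}

const spanMetrics = toSpanMetrics([
  { name: "Query user DB", durationMs: 12, error: false },
  { name: "Query user DB", durationMs: 30, error: true },
  { name: "Validate credentials", durationMs: 3, error: false },
]);
console.log(Object.fromEntries(spanMetrics));
```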

Example with Node

npm install @opentelemetry/sdk-node \
  @opentelemetry/api \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/sdk-metrics \
  @opentelemetry/sdk-trace-node

We can run this with tsx, preloading the instrumentation file so that it runs before the app:

npx tsx --import ./src/opentelemetry/instrumentation.ts ./src/app.ts
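
The contents of instrumentation.ts are not shown on this page. A minimal sketch of what such a file could contain, using only the packages installed above, is given here. The console exporters are an assumption chosen so the output is visible during development; the original project may well export elsewhere.

```typescript
// src/opentelemetry/instrumentation.ts (sketch, console output only)
import { NodeSDK } from "@opentelemetry/sdk-node";
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";
import { ConsoleSpanExporter } from "@opentelemetry/sdk-trace-node";
import {
  ConsoleMetricExporter,
  PeriodicExportingMetricReader,
} from "@opentelemetry/sdk-metrics";

const sdk = new NodeSDK({
  // Print finished spans to stdout
  traceExporter: new ConsoleSpanExporter(),
  // Periodically collect metrics and print them to stdout
  metricReader: new PeriodicExportingMetricReader({
    exporter: new ConsoleMetricExporter(),
  }),
  // Patch supported libraries (http, express, ...) automatically
  instrumentations: [getNodeAutoInstrumentations()],
});

// Must run before the application imports the libraries to be instrumented,
// which is why the file is loaded with --import.
sdk.start();
```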

Example with CSharp

Configure Resources

In C# (and, I expect, in the other SDKs too) you have Resources. These are attributes you configure once and that are attached to all the OTel data. They are only sent once per batch. To set these up, first add the NuGet packages

dotnet add package OpenTelemetry
dotnet add package OpenTelemetry.Extensions.Hosting

And then add an extension method to build them.

    // Needs: using OpenTelemetry.Resources;
    public static IServiceCollection AddAppConfiguration(this IServiceCollection services, IConfiguration configuration)
    {
        services.AddOpenTelemetry()
            .ConfigureResource(ConfigureResources);
        return services;
    }

    private static void ConfigureResources(ResourceBuilder resourceBuilder)
    {
        resourceBuilder
            // AddService sets service.name, service.namespace, service.version and service.instance.id
            .AddService(
                serviceName: "GrpcServerCs",
                serviceNamespace: "GrpcServer",
                serviceVersion: "1.0.0",
                serviceInstanceId: Environment.MachineName)
            // Anything that is not a service.* attribute goes in as a plain attribute
            .AddAttributes(new Dictionary<string, object>
            {
                { "deployment.environment", "development" }
            });
    }

Instrumentation

Many technologies come with their own instrumentation libraries. These include

  • AspNetCore
  • Http
  • SqlClient
  • Grpc

So we can add these

dotnet add package OpenTelemetry.Instrumentation.AspNetCore
dotnet add package OpenTelemetry.Instrumentation.Http
dotnet add package OpenTelemetry.Instrumentation.SqlClient --prerelease
dotnet add package OpenTelemetry.Instrumentation.GrpcNetClient --prerelease

We can add the tracing and metrics for these out of the box

    private static void ConfigureTracing(TracerProviderBuilder builder)
    {
        builder
            .AddAspNetCoreInstrumentation()
            .AddHttpClientInstrumentation()
            .AddSqlClientInstrumentation()
            .AddGrpcClientInstrumentation();
    }

    private static void ConfigureMetrics(MeterProviderBuilder builder)
    {
        builder
            .AddAspNetCoreInstrumentation()
            .AddHttpClientInstrumentation()
            .AddSqlClientInstrumentation();
            // No gRPC metrics instrumentation available yet
    }

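We can add our own metrics by creating our own Meter and instruments.

Originally I started with this, but soon found out it needs to be a bit more sophisticated.

using System.Diagnostics;          // ActivitySource
using System.Diagnostics.Metrics;  // Meter, Counter<T>

public static class DiagnostConfig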
{
    public const string ServiceName = "GrpcServer";
    public const string ServiceVersion = "1.0.0";
    
    public static readonly Meter Meter = new(ServiceName, ServiceVersion);
    
    public static readonly Counter<long> RequestCounter = Meter.CreateCounter<long>(
        name: "grpc.server.requests",
        unit: "requests",
        description: "Number of gRPC requests received");
        
    public static readonly ActivitySource ActivitySource = new(ServiceName, ServiceVersion);
}

In the end it amounted to creating the following classes. You can find them in the C# gRPC project.

  • ServiceInfo - Holds basic Service information Name, Version etc
  • MetricsProvider - Defines and exposes application metrics
  • OperationTracker - Combines metrics and tracing for a single operation
  • TelemetryService - The central orchestration point for all telemetry operations
  • Tracing Provider - Manages distributed tracing across service boundaries

You generally make one set of these per application and, once done, you register them with your tracer provider.

    private static void ConfigureTracing(TracerProviderBuilder builder)
    {
        builder
            .AddSource(MetricsProvider.ActivitySourceName)
...

Here is an example of the usage in the gRPC server.

    public async Task<Actor> GetActorAsync(string actorId, CancellationToken cancellationToken = default)
    {
        // Track both request count and duration with one call
        using var tracker = _telemetryService.TrackOperation("GetActorAsync",
            new Dictionary<string, object?> { ["request.actorId"] = actorId });

        try
        {
            var actorEntity = await _actorRepository.GetActorAsync(actorId, cancellationToken) ??
                throw new ArgumentException($"Actor with ID {actorId} not found.");
            return _actorMapper.MapFromEntity(actorEntity);
        }
        catch (Exception ex)
        {
            _telemetryService.Metrics.RecordError("GetActorAsync", ex);
            throw;
        }
    }

Span And Activity

In the C# world a Span is an Activity. A span is

  • Time-bound - It has a start and an end time
  • Scoped - It covers one piece of work, e.g. an HTTP request, a gRPC request or a SQL call
  • Causal - It records what caused this to happen, e.g. an HTTP request might cause a SQL call
  • Useful for measuring the timing of complex code
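
These properties are language-neutral, so here is a small TypeScript sketch of them that uses no OpenTelemetry package at all. The withSpan helper and TimedSpan type are hypothetical names for illustration; a real SDK tracks the current span through its context mechanism rather than a simple array, and this version only handles synchronous work.

```typescript
// Hypothetical record of a finished span.
interface TimedSpan {
  name: string;
  parent?: string; // name of the span that caused this one
  startMs: number;
  endMs: number;
}

const finished: TimedSpan[] = [];
const active: string[] = []; // stack of currently open spans

// Run `work` inside a span: time-bound (start/end), scoped (try/finally),
// and causal (parent is whatever span was open when we started).
function withSpan<T>(name: string, work: () => T): T {
  const parent = active[active.length - 1];
  const startMs = Date.now();
  active.push(name);
  try {
    return work();
  } finally {
    active.pop();
    finished.push({ name, parent, startMs, endMs: Date.now() });
  }
}

const answer = withSpan("http request", () => withSpan("sql call", () => 42));
console.log(answer, finished);
```

The inner span finishes first and names the outer one as its parent, which is the same relationship an Activity has with its Parent in C#.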

Exporters

Console

So let's start with the console exporter.

dotnet add package OpenTelemetry.Exporter.Console

And add it to ConfigureTracing

private static void ConfigureTracing(TracerProviderBuilder builder)
{
    builder
        // Add instrumentations
...
        
        // Add our custom ActivitySource
        .AddSource(MetricsProvider.ActivitySourceName)
        
        // Console exporter - great for development
        .AddConsoleExporter(options => {
            options.Targets = ConsoleExporterOutputTargets.Console;
        });
}

And to Metrics

private static void ConfigureMetrics(MeterProviderBuilder builder)
{
    builder
        // Add instrumentations
...        
        // Add our custom Meter
        .AddMeter(MetricsProvider.MeterName)
        
        // Console exporter - great for development
        .AddConsoleExporter();
}

Honeycomb

Add the OTLP exporter package

dotnet add package OpenTelemetry.Exporter.OpenTelemetryProtocol

I spent a lot of time getting this set up because the robot misled me. Traces are routed to a Honeycomb dataset based on their service name, whereas metrics do respect the x-honeycomb-dataset header. The robot did not know this. Anyway, here it is. First, metrics

    private static void ConfigureMetrics(
        MeterProviderBuilder builder,
        TelemetryConfig telemetryConfig,
        HoneycombConfig honeycombConfig)
    {
        Console.WriteLine("Configuring Honeycomb metrics...");

        builder
            // .AddAspNetCoreInstrumentation()
            // .AddHttpClientInstrumentation()
            // .AddSqlClientInstrumentation()

            // Add custom meters
            .AddMeter(telemetryConfig.ServiceName)

            // Add exporters
            .AddConsoleExporter()
            .AddOtlpExporter(opts =>
            {
                // Set the endpoint explicitly
                opts.Endpoint = new Uri("https://api.honeycomb.io/v1/metrics");

                // Include dataset and environment in headers
                opts.Headers = $"x-honeycomb-team={honeycombConfig.ApiKey},x-honeycomb-dataset={honeycombConfig.Dataset},x-honeycomb-environment={honeycombConfig.Environment}";

                // Set the correct protocol
                opts.Protocol = OtlpExportProtocol.HttpProtobuf;
            });
    }

And now Tracing

    private static void ConfigureTracing(
        TracerProviderBuilder builder,
        TelemetryConfig telemetryConfig,
        HoneycombConfig honeycombConfig)
    {     
        builder
            .SetSampler(new AlwaysOnSampler())
            // .AddAspNetCoreInstrumentation()
            // .AddHttpClientInstrumentation()
            // .AddSqlClientInstrumentation(options =>
            // {
            //     // Enable capturing exceptions
            //     options.RecordException = true;
            // })
            // .AddGrpcClientInstrumentation()
            .AddSource(telemetryConfig.ServiceName)
            .AddConsoleExporter()
            .AddOtlpExporter(opts =>
            {
                // Set the endpoint explicitly
                opts.Endpoint = new Uri("https://api.honeycomb.io/v1/traces");

                // Include dataset and environment in headers
                // Note: x-honeycomb-dataset is ignored for tracing. Instead it uses the service name.
                opts.Headers = $"x-honeycomb-team={honeycombConfig.ApiKey},x-honeycomb-environment={honeycombConfig.Environment}";

                // Set the correct protocol
                opts.Protocol = OtlpExportProtocol.HttpProtobuf;

                opts.BatchExportProcessorOptions = new BatchExportProcessorOptions<Activity>
                {
                    MaxExportBatchSize = 512,
                    ScheduledDelayMilliseconds = 1000,
                    ExporterTimeoutMilliseconds = 30000,
                    MaxQueueSize = 2048,
                };
            });
    }