Open Telemetry

From bibbleWiki

Introduction

This is a quick page about OpenTelemetry, which I came across while working with Effect_TS.

Four things

Four things must ye know about Open Telemetry

  • Traces - Spans
  • Metrics - Performances, counts etc
  • Logs - No Explanation required
  • Baggage - contextual information we want to pass from span to span

This is an example of a trace with spans.

Trace: "User logs in"
├── Span: "Frontend sends login request"
├── Span: "API receives request"
│   ├── Span: "Validate credentials"
│   └── Span: "Query user DB"
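
As a rough sketch of how that tree hangs together: every span carries the same trace ID and child spans point at their parent's span ID. This is plain TypeScript rather than the OpenTelemetry SDK, and the `Span` shape, the IDs and `renderTrace` are made up for illustration.

```typescript
// Plain data model for illustration; this is not the OpenTelemetry SDK.
interface Span {
  traceId: string;        // shared by every span in the trace
  spanId: string;         // unique per span
  parentSpanId?: string;  // left out for a root span
  name: string;
}

// Render the spans of one trace as an indented tree by following parentSpanId links.
function renderTrace(title: string, spans: Span[]): string {
  const lines: string[] = [`Trace: "${title}"`];

  const walk = (parentId: string | undefined, depth: number): void => {
    for (const span of spans.filter((s) => s.parentSpanId === parentId)) {
      lines.push(`${"  ".repeat(depth)}Span: "${span.name}"`);
      walk(span.spanId, depth + 1);
    }
  };

  walk(undefined, 1); // start with the root spans (no parent)
  return lines.join("\n");
}

const traceId = "trace-1"; // hypothetical ID, real trace IDs are random bytes
const spans: Span[] = [
  { traceId, spanId: "a", name: "Frontend sends login request" },
  { traceId, spanId: "b", name: "API receives request" },
  { traceId, spanId: "c", parentSpanId: "b", name: "Validate credentials" },
  { traceId, spanId: "d", parentSpanId: "b", name: "Query user DB" },
];

console.log(renderTrace("User logs in", spans));
```

The two database-related spans end up nested under "API receives request" purely because of their `parentSpanId`.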

Other Terms

  • OTLP - OpenTelemetry Protocol. Sent over gRPC or HTTP.
  • Collector - A standalone service that accepts telemetry data and pushes it out
  • Instrumentation - How you produce the data. Some frameworks like expressJS and effectTS have automatic instrumentation, but you can also use an OpenTelemetry SDK directly

Configuring Collector

There are six things you configure in a collector

  • Receivers - This is how your collector takes in data
  • Processors - Cleansing, transforming
  • Extensions - Extra things, e.g. health monitor
  • Exporters - Push data out, e.g. Loki, Prometheus
  • Pipelines - The path that telemetry data (traces, metrics, or logs) follows from ingestion to export
  • Connectors - Can export data from one pipeline and feed it into another

For example, the spanmetrics connector sits at the end of the traces pipeline and at the start of the metrics pipeline.

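receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"  # assumed address for Prometheus to scrape

connectors:
  spanmetrics:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]

    metrics:
      receivers: [spanmetrics]
      exporters: [prometheus]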
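
To get a feel for what a connector like spanmetrics does, here is a conceptual sketch: it consumes finished spans (the exporter side) and hands derived metrics to the next pipeline (the receiver side). This is only a mental model in plain TypeScript. The real connector is configured in YAML as above, and `FinishedSpan`, `SpanMetric` and `spansToMetrics` are names I made up.

```typescript
// Conceptual model only; the real spanmetrics connector is configured, not coded like this.
interface FinishedSpan {
  name: string;
  durationMs: number;
}

interface SpanMetric {
  calls: number;           // how many spans with this name were seen
  totalDurationMs: number; // sum of their durations
}

// Consume spans from the traces pipeline and derive per-name metrics
// that a metrics pipeline could then export.
function spansToMetrics(spans: FinishedSpan[]): Map<string, SpanMetric> {
  const metrics = new Map<string, SpanMetric>();
  for (const span of spans) {
    const current = metrics.get(span.name) ?? { calls: 0, totalDurationMs: 0 };
    current.calls += 1;
    current.totalDurationMs += span.durationMs;
    metrics.set(span.name, current);
  }
  return metrics;
}

// Hypothetical sample spans
const derived = spansToMetrics([
  { name: "GetActor", durationMs: 12 },
  { name: "GetActor", durationMs: 8 },
  { name: "GetFilm", durationMs: 30 },
]);

for (const [name, metric] of derived) {
  console.log(`${name}: calls=${metric.calls} totalMs=${metric.totalDurationMs}`);
}
```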

Example with Node

npm install @opentelemetry/sdk-node \
  @opentelemetry/api \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/sdk-metrics \
  @opentelemetry/sdk-trace-node

We can run this with

npx  tsx --import  ./src/opentelemetry/instrumentation.ts  ./src/app.ts
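
The instrumentation.ts file itself is not shown on this page. A minimal sketch of what it could look like, using the packages installed above, is below. The console exporters are my assumption to keep it self-contained; swap them for OTLP exporters when sending to a collector.

```typescript
// src/opentelemetry/instrumentation.ts (assumed contents, not the original file)
import { NodeSDK } from "@opentelemetry/sdk-node";
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";
import { ConsoleSpanExporter } from "@opentelemetry/sdk-trace-node";
import {
  ConsoleMetricExporter,
  PeriodicExportingMetricReader,
} from "@opentelemetry/sdk-metrics";

const sdk = new NodeSDK({
  // Print finished spans to the console
  traceExporter: new ConsoleSpanExporter(),
  // Periodically print collected metrics to the console
  metricReader: new PeriodicExportingMetricReader({
    exporter: new ConsoleMetricExporter(),
  }),
  // Automatic instrumentation for common Node libraries
  instrumentations: [getNodeAutoInstrumentations()],
});

// Must run before the application code is loaded, hence the --import flag
sdk.start();
```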

Example with CSharp

Configure OpenTelemetry Resource/Tracing/Metrics

We need to configure three things: the resource, tracing and metrics.

public static class IServiceCollectionExtension
{
    public static IServiceCollection AddTelemetry(
        this IServiceCollection serviceCollection,
        ITelemetryConfiguration telemetryConfig,
        IHoneycombConfiguration honeycombConfig)
    {
        // Register telemetry service first
        serviceCollection
            .AddSingleton<ITelemetryServiceFactory, TelemetryServiceFactory>()
            .AddSingleton<ITelemetryService>(sp =>
            {
                try
                {
                    return sp.GetRequiredService<ITelemetryServiceFactory>().CreateTelemetryService();
                }
                catch (Exception ex)
                {
                    Console.WriteLine($"Error creating telemetry service: {ex.Message}");
                    throw; // Re-throw to prevent silent failures
                }
            });

        // Configure the resource, tracing and metrics
        serviceCollection
            .AddOpenTelemetry()
            .ConfigureResource(resourceBuilder =>
                resourceBuilder.ConfigurableResource(telemetryConfig)
            )
            .WithTracing(tracerProviderBuilder =>
                tracerProviderBuilder.ConfigureTracing(telemetryConfig, honeycombConfig)
            )
            .WithMetrics(metricsBuilder =>
                metricsBuilder.ConfigureMetrics(telemetryConfig, honeycombConfig)
            );

        return serviceCollection;
    }
}

Configure Resources

In CSharp you have Resources, and I expect the other languages do too. These are things you configure once and they are attached to the OTel data. They are only sent once per batch. To set these up, add an extension method to build them.

    public static ResourceBuilder ConfigurableResource(
        this ResourceBuilder resourceBuilder,
        ITelemetryConfiguration telemetryConfig)
    {
        resourceBuilder
            .AddService(
                    telemetryConfig.ServiceName,
                    telemetryConfig.ServiceNamespace,
                    telemetryConfig.ServiceVersion)
            .AddAttributes(BuildAttributes(telemetryConfig));

        return resourceBuilder;
    }


Configure Tracing

This is where you configure the traces sent to OTel. The dataset is sent in the header. I had to remove the HTTP client instrumentation as it prevented the gRPC spans being tied to the other events.

    public static TracerProviderBuilder ConfigureTracing(
        this TracerProviderBuilder tracerProviderBuilder,
        ITelemetryConfiguration telemetryConfig,
        IHoneycombConfiguration honeycombConfig)
    {
        // Configure basic tracing without external exporters first
        tracerProviderBuilder
            .AddSamplerWhen(telemetryConfig.Environment == "Development")
            .AddAspNetCoreInstrumentation()
            .AddGrpcClientInstrumentation(options =>
            {
                options.SuppressDownstreamInstrumentation = true;
            })
            // .AddHttpClientInstrumentation()
            .AddSource(telemetryConfig.ServiceName)
            .AddConsoleExporterWhen(telemetryConfig.EnableConsoleExporter == true && telemetryConfig.Environment == "Development")
            .AddOtlpExporterWhen(honeycombConfig, telemetryConfig.Enabled);

        return tracerProviderBuilder;
    }

Configure Metrics

This is straightforward but here for completeness.

    public static MeterProviderBuilder ConfigureMetrics(
            this MeterProviderBuilder meterProviderBuilder,
            ITelemetryConfiguration telemetryConfig,
            IHoneycombConfiguration honeycombConfig)
    {
        // Configure basic metrics
        meterProviderBuilder
            // Add ASP.NET Core instrumentation
            .AddAspNetCoreInstrumentation()
            // Add HTTP client instrumentation
            .AddHttpClientInstrumentation()
            // Add runtime instrumentation
            .AddRuntimeInstrumentation()
            // Add SQL client instrumentation if needed
            .AddSqlClientInstrumentation()
            // Add process instrumentation (CPU, memory etc.)
            .AddProcessInstrumentation()
            // Add custom meters
            .AddMeter(
                "Microsoft.AspNetCore.Hosting",
                "Microsoft.AspNetCore.Server.Kestrel",
                "Microsoft.AspNetCore.Http.Connections",
                telemetryConfig.ServiceName)
            .AddConsoleExporterWhen(telemetryConfig.EnableConsoleExporter == true && telemetryConfig.Environment == "Development")
            .AddOtlpExporter(opts =>
            {
                // Set the endpoint explicitly
                opts.Endpoint = new Uri(honeycombConfig.MetricsEndpoint);
                // Include dataset and environment in headers
                opts.Headers = $"x-honeycomb-team={honeycombConfig.ApiKey},x-honeycomb-dataset={honeycombConfig.Dataset},x-honeycomb-environment={honeycombConfig.Environment}";
                // Set the correct protocol
                opts.Protocol = OtlpExportProtocol.HttpProtobuf;
            });

        return meterProviderBuilder;
    }

Activities

Overview

So I struggled with this as the robot took over and hid how it works. What I ended up doing was wrapping my creation of an activity inside a helper class. This had two constructors, one which takes a parent context and one that does not. The idea is that you start an activity and you want to track the stack, e.g.

foo1()
  foo2()
    foo3()

You create an Activity with a default parent context for the first call, but for the subsequent calls you look in something called Activity.Current, check it is not null and use that for the parent context, which should carry the previous call's span ID.

    public OperationTracker(
        OperationType type,
        string operationName,
        ActivityContext parentContext,
        TagList? tags,
        TracingProvider tracing,
        MetricsProvider metrics)
    {
        _type = type;
        _operationName = operationName ?? throw new ArgumentNullException(nameof(operationName));
        _parentContext = parentContext;
        _tracing = tracing ?? throw new ArgumentNullException(nameof(tracing));
        _metrics = metrics ?? throw new ArgumentNullException(nameof(metrics));

        // Check for listeners before creating a new activity
        if (!_tracing.ActivitySource.HasListeners())
        {
            Console.WriteLine($"Warning: ActivitySource '{_tracing.ActivitySource.Name}' has no listeners. Activity '{_operationName}' will not be created.");
        }

        _activity = _tracing.ActivitySource.StartActivityWithParentAndTags(
            operationName,
            _parentContext,
            ActivityKind.Server
            );

        if (_activity != null && tags != null)
        {
            _activity.AddTags(tags);
            Console.WriteLine($"{_activity.DisplayName}/{operationName}/{_activity.SpanId}: Making new Activity with Kind {_activity.Kind} and with Parent: Parent SpanId {parentContext.SpanId}");
        }

        // Count the request regardless of sampling
        _metrics.RecordRequest(_type, operationName);
    }
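
The same "look at the current activity and use it as the parent" idea can be sketched on the Node side. This is plain TypeScript, not the OpenTelemetry or .NET API: `AsyncLocalStorage` plays the role of Activity.Current, and `SketchSpan`, `withSpan` and the foo functions are hypothetical names.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomBytes } from "node:crypto";

// Illustrative stand-in for a span/Activity.
interface SketchSpan {
  name: string;
  spanId: string;
  parentSpanId: string | undefined; // undefined for the first call
}

// Plays the role of Activity.Current: the span active in the current call chain.
const currentSpan = new AsyncLocalStorage<SketchSpan>();
const started: SketchSpan[] = [];

function withSpan<T>(name: string, work: () => T): T {
  const parent = currentSpan.getStore(); // undefined when nothing is active yet
  const span: SketchSpan = {
    name,
    spanId: randomBytes(8).toString("hex"),
    parentSpanId: parent?.spanId,
  };
  started.push(span);
  // Everything called inside work() sees this span as the current one.
  return currentSpan.run(span, work);
}

const foo3 = (): string => withSpan("foo3", () => "done");
const foo2 = (): string => withSpan("foo2", foo3);
const foo1 = (): string => withSpan("foo1", foo2);

foo1();
for (const span of started) {
  console.log(`${span.name} id=${span.spanId} parent=${span.parentSpanId ?? "none"}`);
}
```

Each nested call picks up the caller's span ID as its parent without it being passed as an argument, which is the behaviour the helper class above gets from Activity.Current.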

Interceptor

For my gRPC service I used interceptors to wrap the calls. This way I could create the initial activity for all gRPC endpoints, in my case GetActor, GetActors, GetFilm and GetFilms. This got me used to interceptors again on .NET and reduced the code. GetActor just does the following

Get Actor
  Get Actor From Database

So there is an Actor repository

    public async Task<ActorEntity?> GetActorAsync(int actorId, CancellationToken cancellationToken = default)
    {
        _logger.LogDebug("Getting actor with ID {ActorId} from database", actorId);

        using var dbTracker = _telemetryService.TrackOperation(
            OperationType.DatabaseOperation,
            DiagnosticNames.DatabaseGetActor);

        try
        {
            const string selectQuery = "SELECT actor_id, first_name, last_name, last_update FROM actor WHERE actor_id = @actor_id";

            // Parameters passed to the SQL validator
            var parameters = new Dictionary<string, object> { { "@actor_id", actorId } };

            // Validate SQL statement
            var validationResult = await _sqlValidator.ValidateAsync(connection, selectQuery, parameters, cancellationToken);
            if (!validationResult.IsValid)
            {
                _logger.LogError("SQL Validation Error: {ErrorMessage}", validationResult.ErrorMessage);
                throw new SqlValidationException($"Invalid SQL statement: {validationResult.ErrorMessage}", 
                    selectQuery);
            }

            // Ensure connection is opened only if it's closed
            if (connection.State != System.Data.ConnectionState.Open)
            {
                _logger.LogInformation("Opening connection for GetActorAsync");
                await connection.OpenAsync(cancellationToken);
            }

            await using var cmd = connection.CreateCommand();
            cmd.CommandText = selectQuery;
            cmd.Parameters.AddWithValue("@actor_id", actorId);

            // Execute reader
            await using var reader = await cmd.ExecuteReaderAsync(cancellationToken);

            ActorEntity? actorEntity = null;

            if (await reader.ReadAsync(cancellationToken))
            {
                actorEntity = BuildActorEntity(reader);
            }

            return actorEntity;
        }
        catch (SqlValidationException ex)
        {
            _logger.LogError(ex, "Error SQL Validation error");
            dbTracker.TrackException(ex);
            throw new RpcException(new Status(StatusCode.Internal, $"Invalid SQL statement: {ex.Message}"));
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Error retrieving actor with ID {ActorId}", actorId);
            dbTracker.TrackException(ex);
            throw;
        }
    }

Instrumentation

Many of the technologies come with their own instrumentation libraries. This includes

  • AspNetCore
  • Http
  • SqlClient
  • Grpc

So we can add these

dotnet add package OpenTelemetry.Instrumentation.AspNetCore
dotnet add package OpenTelemetry.Instrumentation.Http
dotnet add package OpenTelemetry.Instrumentation.SqlClient --prerelease
dotnet add package OpenTelemetry.Instrumentation.GrpcNetClient --prerelease

We can add the tracing and metrics for these out of the box

    private static void ConfigureTracing(TracerProviderBuilder builder)
    {
        builder
            .AddAspNetCoreInstrumentation()
            .AddHttpClientInstrumentation()
            .AddSqlClientInstrumentation()
            .AddGrpcClientInstrumentation();
    }

    private static void ConfigureMetrics(MeterProviderBuilder builder)
    {
        builder
            .AddAspNetCoreInstrumentation()
            .AddHttpClientInstrumentation()
            .AddSqlClientInstrumentation();
            // No gRPC metrics instrumentation available yet
    }

We can add our own metrics by building them ourselves.

Originally I started with this but soon found out it needs to be a bit more sophisticated.

public static class DiagnostConfig
{
    public const string ServiceName = "GrpcServer";
    public const string ServiceVersion = "1.0.0";
    
    public static readonly Meter Meter = new(ServiceName, ServiceVersion);
    
    public static readonly Counter<long> RequestCounter = Meter.CreateCounter<long>(
        name: "grpc.server.requests",
        unit: "requests",
        description: "Number of gRPC requests received");
        
    public static readonly ActivitySource ActivitySource = new(ServiceName, ServiceVersion);
}

In the end it amounted to creating these. You can find them in the C# grpc project.

  • ServiceInfo - Holds basic service information: name, version etc
  • MetricsProvider - Defines and exposes application metrics
  • OperationTracker - Combines metrics and tracing for a single operation
  • TelemetryService - The central orchestration point for all telemetry operations
  • TracingProvider - Manages distributed tracing across service boundaries

You generally make one set of these per application and, once done, you add the activity source to your TracerProviderBuilder.

private static void ConfigureTracing(TracerProviderBuilder builder)
    {
        builder
           .AddSource(MetricsProvider.ActivitySourceName)
...

Here is an example of the usage in the Grpc Server.

    public async Task<Actor> GetActorAsync(string actorId, CancellationToken cancellationToken = default)
    {
        // Track both request count and duration with one call
        using var tracker = _telemetryService.TrackOperation("GetActorAsync",
            new Dictionary<string, object?> { ["request.actorId"] = actorId });

        try
        {
            var actorEntity = await _actorRepository.GetActorAsync(actorId, cancellationToken) ??
                throw new ArgumentException($"Actor with ID {actorId} not found.");
            return _actorMapper.MapFromEntity(actorEntity);
        }
        catch (Exception ex)
        {
            _telemetryService.Metrics.RecordError("GetActorAsync", ex);
            throw;
        }
    }
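
To show the idea behind the tracker without the C# plumbing, here is a small sketch of "count it, time it, record the error, re-throw". It uses in-memory maps instead of real OpenTelemetry counters, it is synchronous for brevity, and `trackOperation` and the sample data are hypothetical.

```typescript
// Minimal in-memory stand-ins; a real app would use OpenTelemetry counters and spans.
const requestCounts = new Map<string, number>();
const errorCounts = new Map<string, number>();
const durationsMs = new Map<string, number[]>();

function increment(counter: Map<string, number>, key: string): void {
  counter.set(key, (counter.get(key) ?? 0) + 1);
}

// Wrap one operation: count it, time it, and record any error before re-throwing.
function trackOperation<T>(name: string, work: () => T): T {
  increment(requestCounts, name); // counted regardless of outcome
  const start = Date.now();
  try {
    return work();
  } catch (error) {
    increment(errorCounts, name);
    throw error; // re-throw so the caller still sees the failure
  } finally {
    const samples = durationsMs.get(name) ?? [];
    samples.push(Date.now() - start);
    durationsMs.set(name, samples);
  }
}

// One successful call and one failing call (hypothetical data)
const actor = trackOperation("GetActorAsync", () => ({ actorId: 1 }));
try {
  trackOperation("GetActorAsync", () => {
    throw new Error("Actor not found");
  });
} catch {
  // expected for this example
}

console.log(actor, requestCounts.get("GetActorAsync"), errorCounts.get("GetActorAsync"));
```

Putting the bookkeeping in one wrapper is what keeps the service method above down to a single tracker call and a try/catch.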

Span And Activity

In the C# world a Span is an Activity. A span is

  • Time-bound analysis
  • A scoped piece of work, e.g. http request, grpc request, sql call
  • Causality - what caused this to happen, e.g. a http request might cause a sql call
  • Useful for measuring complex code timings

Exporters

Console

So let's start with the console one first

dotnet add package OpenTelemetry.Exporter.Console

And add to the ConfigureTracing

private static void ConfigureTracing(TracerProviderBuilder builder)
{
    builder
        // Add instrumentations
...
        
        // Add our custom ActivitySource
        .AddSource(MetricsProvider.ActivitySourceName)
        
        // Console exporter - great for development
        .AddConsoleExporter(options => {
            options.Targets = ConsoleExporterOutputTargets.Console;
        });
}

And to Metrics

private static void ConfigureMetrics(MeterProviderBuilder builder)
{
    builder
        // Add instrumentations
...        
        // Add our custom Meter
        .AddMeter(MetricsProvider.MeterName)
        
        // Console exporter - great for development
        .AddConsoleExporter();
}

Honeycomb

Add the OTLP exporter package

dotnet add package OpenTelemetry.Exporter.OpenTelemetryProtocol

I spent a lot of time getting this set up due to the robot misleading me. Traces are routed in Honeycomb based on their service name, whereas metrics do respect the x-honeycomb-dataset header. This was not known by the robot. Anyway, here it is. First, metrics

    private static void ConfigureMetrics(
        MeterProviderBuilder builder,
        TelemetryConfig telemetryConfig,
        HoneycombConfig honeycombConfig)
    {
        Console.WriteLine("Configuring Honeycomb metrics...");

        builder
            // .AddAspNetCoreInstrumentation()
            // .AddHttpClientInstrumentation()
            // .AddSqlClientInstrumentation()

            // Add custom meters
            .AddMeter(telemetryConfig.ServiceName)

            // Add exporters
            .AddConsoleExporter()
            .AddOtlpExporter(opts =>
            {
                // Set the endpoint explicitly
                opts.Endpoint = new Uri("https://api.honeycomb.io/v1/metrics");

                // Include dataset and environment in headers
                opts.Headers = $"x-honeycomb-team={honeycombConfig.ApiKey},x-honeycomb-dataset={honeycombConfig.Dataset},x-honeycomb-environment={honeycombConfig.Environment}";

                // Set the correct protocol
                opts.Protocol = OtlpExportProtocol.HttpProtobuf;
            });
    }

And now Tracing

private static void ConfigureTracing(
        TracerProviderBuilder builder,
        TelemetryConfig telemetryConfig,
        HoneycombConfig honeycombConfig)
    {     
        builder
            .SetSampler(new AlwaysOnSampler())
            // .AddAspNetCoreInstrumentation()
            // .AddHttpClientInstrumentation()
            // .AddSqlClientInstrumentation(options =>
            // {
            //     // Enable capturing exceptions
            //     options.RecordException = true;
            // })
            // .AddGrpcClientInstrumentation()
            .AddSource(telemetryConfig.ServiceName)
            .AddConsoleExporter()
            .AddOtlpExporter(opts =>
            {
                // Set the endpoint explicitly
                opts.Endpoint = new Uri("https://api.honeycomb.io/v1/traces");

                // Include dataset and environment in headers
                // Note: x-honeycomb-dataset is ignored for tracing. Instead it uses the service name.
                opts.Headers = $"x-honeycomb-team={honeycombConfig.ApiKey},x-honeycomb-environment={honeycombConfig.Environment}";

                // Set the correct protocol
                opts.Protocol = OtlpExportProtocol.HttpProtobuf;

                opts.BatchExportProcessorOptions = new BatchExportProcessorOptions<Activity>
                {
                    MaxExportBatchSize = 512,
                    ScheduledDelayMilliseconds = 1000,
                    ExporterTimeoutMilliseconds = 30000,
                    MaxQueueSize = 2048,
                };
            });
    }
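
The header difference between the two signals is easy to get wrong, so here it is as a small sketch. It only builds the same header strings used in the C# above. `HoneycombSettings` and `buildOtlpHeaders` are hypothetical names and the values are placeholders.

```typescript
interface HoneycombSettings {
  apiKey: string;
  dataset: string;
  environment: string;
}

type Signal = "traces" | "metrics";

// Build the comma-separated OTLP header string for one signal.
function buildOtlpHeaders(signal: Signal, cfg: HoneycombSettings): string {
  const headers: string[] = [`x-honeycomb-team=${cfg.apiKey}`];
  if (signal === "metrics") {
    // Only metrics use the dataset header; traces are routed by service name.
    headers.push(`x-honeycomb-dataset=${cfg.dataset}`);
  }
  headers.push(`x-honeycomb-environment=${cfg.environment}`);
  return headers.join(",");
}

// Placeholder values, not real credentials
const settings: HoneycombSettings = {
  apiKey: "YOUR_API_KEY",
  dataset: "my-metrics",
  environment: "Development",
};

console.log(buildOtlpHeaders("metrics", settings));
console.log(buildOtlpHeaders("traces", settings));
```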