
Observability

Wallaby is instrumented with OpenTelemetry metrics and traces using the built-in .NET primitives (System.Diagnostics.Metrics.Meter and System.Diagnostics.ActivitySource). You can configure OTEL for Wallaby by adding its meter and activity source to your telemetry pipeline.

Enabling

csharp
services.AddOpenTelemetry()
    .WithMetrics(m => m
        .AddMeter("Wallaby"))
    .WithTracing(t => t
        .AddSource("Wallaby")
        .AddNpgsql());                 // nests enrichment queries under Wallaby's transform spans

The meter and source names are also exposed as constants: WallabyInstrumentation.MeterName and WallabyInstrumentation.ActivitySourceName.
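Referencing the constants keeps the registration in sync with the library and avoids typos in string literals. A minimal sketch, assuming the same `services` collection as in the snippet above and that the namespace containing WallabyInstrumentation is already imported:

```csharp
// Same registration as above, using the exported constants instead of string literals.
services.AddOpenTelemetry()
    .WithMetrics(m => m.AddMeter(WallabyInstrumentation.MeterName))
    .WithTracing(t => t.AddSource(WallabyInstrumentation.ActivitySourceName));
```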

Metrics

Durations are in seconds (OpenTelemetry convention).

| Metric | Type | Attributes | Description |
| --- | --- | --- | --- |
| wallaby.changes.received | Counter | wallaby.slot, wallaby.action, wallaby.source | Materialized change events received (live and backfill). |
| wallaby.ingestion.lag | Histogram (s) | wallaby.slot | Delay between a source transaction's commit and Wallaby receiving it. |
| wallaby.dependent.synthetic | Counter | wallaby.table | Synthetic parent changes emitted inline by dependent-table fan-out (a wide fan-out's offloaded tail is counted by the backfill.* metrics instead). |
| wallaby.transform.duration | Histogram (s) | wallaby.entity | Time spent invoking a mapping's transform for a batch. |
| wallaby.sink.delivery.duration | Histogram (s) | wallaby.sink, wallaby.delivery.outcome | Duration of a single sink delivery attempt (its count by outcome gives attempts and retries). |
| wallaby.sink.records.delivered | Counter | wallaby.sink | Records accepted by a sink. |
| wallaby.sink.delivery.failures | Counter | wallaby.sink, wallaby.delivery.outcome | Failed deliveries (retryable/permanent/dead_letter). |
| wallaby.backfill.rows | Counter | wallaby.table | Rows copied during backfill. |
| wallaby.backfill.active | UpDownCounter | (none) | Tables currently being backfilled. |
| wallaby.backfill.chunk.duration | Histogram (s) | wallaby.table | Time to read and emit one backfill chunk. |
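Because the histograms record seconds, bucket boundaries sized for milliseconds would place most measurements in the lowest buckets. If your pipeline uses explicit-bucket histograms, one option is an OpenTelemetry SDK view that sets boundaries per instrument. The sketch below assumes the OpenTelemetry .NET SDK's `AddView` and `ExplicitBucketHistogramConfiguration`; the boundary values are illustrative only and are not recommendations from Wallaby.

```csharp
using OpenTelemetry.Metrics;

services.AddOpenTelemetry()
    .WithMetrics(m => m
        .AddMeter("Wallaby")
        // Illustrative boundaries in seconds; tune them to your own latency profile.
        .AddView(
            "wallaby.sink.delivery.duration",
            new ExplicitBucketHistogramConfiguration
            {
                Boundaries = new double[] { 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10 }
            }));
```

The same pattern can be repeated for the other histogram instruments in the table if their defaults do not suit your workload.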

The main questions you'll want to answer are:

  • What's our throughput? See rate(wallaby.changes.received).
  • Are we keeping up? Track wallaby.ingestion.lag.

You should also monitor the .NET runtime metrics to confirm that CPU and memory usage stay within acceptable limits.
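One way to collect runtime and process metrics alongside Wallaby's is to register the OpenTelemetry contrib instrumentation in the same pipeline. This is a sketch, assuming the OpenTelemetry.Instrumentation.Runtime and OpenTelemetry.Instrumentation.Process packages are installed; neither is part of Wallaby.

```csharp
using OpenTelemetry.Metrics;

services.AddOpenTelemetry()
    .WithMetrics(m => m
        .AddMeter("Wallaby")
        .AddRuntimeInstrumentation()    // GC, JIT, and thread pool metrics
        .AddProcessInstrumentation());  // process CPU time and memory usage
```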

Traces

The Wallaby activity source emits one span per unit of work:

| Span | Kind | Notable attributes |
| --- | --- | --- |
| transaction | Consumer | wallaby.slot, wallaby.txn.lsn.commit, wallaby.txn.lsn.end, wallaby.txn.size, wallaby.ingestion.lag_s |
| dependent.resolve | Internal | wallaby.table, wallaby.dependent.count |
| route | Internal | wallaby.batch.size |
| transform | Internal | wallaby.entity, wallaby.batch.size |
| sink.deliver | Producer | wallaby.sink, wallaby.destination, wallaby.batch.size (retries recorded as span events; status Error on terminal failure) |
| backfill.chunk | Internal | wallaby.table, wallaby.chunk.size |
| ack | Internal | wallaby.slot, wallaby.txn.lsn.end |

Spans nest under the transaction root, so a single trace shows a committed transaction flowing through routing, each transform, and each sink delivery. If you also enable Npgsql tracing, the EF Core queries your transforms run appear nested under the transform (and dependent.resolve) spans.
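To inspect this nesting locally without an exporter, you can attach an in-process listener using the built-in System.Diagnostics API. The sketch below is a debugging aid, not part of Wallaby. It assumes the source name "Wallaby" from the Enabling section and prints each span's trace id, name, parent span id, and duration when the span stops.

```csharp
using System;
using System.Diagnostics;

// Listen only to the Wallaby activity source and record full data for every span.
var listener = new ActivityListener
{
    ShouldListenTo = source => source.Name == "Wallaby",
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        ActivitySamplingResult.AllData,
    // Called when a span ends; spans sharing a TraceId belong to the same transaction.
    ActivityStopped = activity => Console.WriteLine(
        $"{activity.TraceId} {activity.DisplayName} parent={activity.ParentSpanId} " +
        $"{activity.Duration.TotalMilliseconds:F1} ms"),
};

ActivitySource.AddActivityListener(listener);
```

Child spans stop before their parents, so the transaction root is printed last for each trace. Dispose the listener when you no longer need the output.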

Cardinality

Metric attributes are deliberately low-cardinality: wallaby.slot, wallaby.sink, wallaby.entity, wallaby.table, wallaby.action, wallaby.source, and wallaby.delivery.outcome.

Per-row values - tenant/scope keys, document ids, and per-tenant destinations - are never used as metric attributes (they would explode cardinality). wallaby.destination appears only as a span attribute, where sampling keeps the cost bounded.