Add OpenTelemetry trace context propagation#99

Open
patrickleet wants to merge 1 commit into main from codex/opentelemetry-tracing-compatibility

@patrickleet patrickleet commented Jun 25, 2026

Collaborator

Summary

  • add canonical W3C trace-context metadata helpers and carrier APIs for Message, Entity, EventRecord, and OutboxMessage
  • preserve traceparent/tracestate through Knative ingress, outbox-to-message mapping, direct dispatch, durable event storage, and broker metadata round-trips
  • add an optional distributed otel feature that emits framework-owned tracing spans without requiring an OpenTelemetry SDK in the default build
  • extend ServiceManifest plus dsvc scaffold --metrics --tracing/--otel GitOps output for OTLP env values and HTTP ServiceMonitor generation
  • document the Distributed/Hops boundary for OTLP export to ObserveStack Alloy/Tempo
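A minimal sketch of what carrying W3C trace context in carrier metadata could look like. The `TraceContext` name and the `TRACEPARENT`/`TRACESTATE` constants are mentioned in the review below, but the fields, the `inject`/`extract` signatures, and the `Vec<(String, String)>` metadata shape used here are assumptions for illustration and are not taken from `src/trace_context.rs`.

```rust
// Hypothetical sketch: the real helpers in src/trace_context.rs may differ.
pub const TRACEPARENT: &str = "traceparent";
pub const TRACESTATE: &str = "tracestate";

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TraceContext {
    pub traceparent: String,
    pub tracestate: Option<String>,
}

impl TraceContext {
    /// Read trace context from key/value metadata; None when no traceparent is stored.
    pub fn extract(metadata: &[(String, String)]) -> Option<Self> {
        let get = |key: &str| {
            metadata
                .iter()
                .find(|(k, _)| k.eq_ignore_ascii_case(key))
                .map(|(_, v)| v.clone())
        };
        Some(Self {
            traceparent: get(TRACEPARENT)?,
            tracestate: get(TRACESTATE),
        })
    }

    /// Write trace context into metadata, replacing any previously stored values.
    pub fn inject(&self, metadata: &mut Vec<(String, String)>) {
        metadata.retain(|(k, _)| {
            !k.eq_ignore_ascii_case(TRACEPARENT) && !k.eq_ignore_ascii_case(TRACESTATE)
        });
        metadata.push((TRACEPARENT.to_string(), self.traceparent.clone()));
        if let Some(state) = &self.tracestate {
            metadata.push((TRACESTATE.to_string(), state.clone()));
        }
    }
}

fn main() {
    let ctx = TraceContext {
        traceparent: "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01".to_string(),
        tracestate: Some("vendor=value".to_string()),
    };
    // Unrelated metadata (here a correlation id) is left untouched by inject.
    let mut outbox_metadata = vec![("correlation_id".to_string(), "abc".to_string())];
    ctx.inject(&mut outbox_metadata);
    // Round-trip: what was injected is what a downstream consumer extracts.
    assert_eq!(TraceContext::extract(&outbox_metadata), Some(ctx));
}
```

Keeping the context as plain string pairs is one way to let the same values survive an outbox-to-message mapping or a broker round-trip without any OpenTelemetry SDK types in the default build.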

Verification

  • cargo fmt --check
  • git diff --check
  • cargo test --all-features
  • cargo clippy --lib --all-features -- -D warnings
  • cargo test -p distributed_cli
  • cargo clippy -p distributed_cli --all-targets -- -D warnings
  • cargo run -p distributed_cli --bin dsvc -- scaffold observability-orders --path target/tmp/scaffold-observability --store in-memory --transport http --metrics --tracing --gitops --force --distributed-path .
  • cargo check --manifest-path target/tmp/scaffold-observability/Cargo.toml

Implements [[tasks/opentelemetry-tracing-compatibility]].

Summary by CodeRabbit

  • New Features
    • Added optional observability for service scaffolding and generated manifests: metrics (HTTP /metrics + Prometheus scrape artifacts) and tracing/otel (OTLP-enabled spans + Helm/GitOps env templating).
    • Introduced W3C trace-context support (traceparent/tracestate) across messages, entities, event records, and outbox messages.
  • Bug Fixes
    • Improved trace metadata propagation for transport round trips and CloudEvents (HTTP headers take precedence where applicable).
  • Documentation
    • Added and expanded observability and transport guidance, including GitOps observability behavior for HTTP vs Knative.
  • Tests
    • Extended coverage for metrics/tracing scaffolding and trace-context round-trip validation.

coderabbitai Bot commented Jun 25, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8e7a9706-848c-4024-a1d1-cdbfb132b880

📥 Commits

Reviewing files that changed from the base of the PR and between 6d43624 and 258708c.

📒 Files selected for processing (24)
  • Cargo.toml
  • distributed_cli/README.md
  • distributed_cli/src/cli.rs
  • distributed_cli/src/generate/gitops.rs
  • distributed_cli/src/generate/mod.rs
  • distributed_cli/src/generate/service_crate.rs
  • distributed_cli/src/lib.rs
  • docs/observability.md
  • docs/transports.md
  • src/bus/message.rs
  • src/bus/runner.rs
  • src/entity/entity.rs
  • src/entity/event_record.rs
  • src/lib.rs
  • src/manifest.rs
  • src/microsvc/knative_ingress.rs
  • src/microsvc/service.rs
  • src/outbox/message.rs
  • src/outbox_worker/outbox_dispatch.rs
  • src/trace_context.rs
  • tests/kafka_transport/main.rs
  • tests/nats_transport/main.rs
  • tests/persistent_repository_conformance/scenario.rs
  • tests/rabbitmq_transport/main.rs
📝 Walkthrough

The PR adds trace-context propagation helpers, observability manifest types, runtime tracing spans, and scaffold support for metrics and tracing. It also updates generated GitOps output, docs, and transport tests to carry and validate trace metadata.

Changes

Observability and scaffolded tracing

| Layer | File(s) | Summary |
| --- | --- | --- |
| Trace context core | Cargo.toml, src/trace_context.rs, src/lib.rs | Adds the optional otel feature, defines trace-context helpers and validation, and re-exports trace-context types and constants at the crate root. |
| Service observability manifest | src/manifest.rs | Adds service observability fields, metrics and tracing manifest types, trace propagation/export enums, and serialization tests. |
| Metadata carriers | src/bus/message.rs, src/entity/entity.rs, src/entity/event_record.rs, src/outbox/message.rs, docs/transports.md, docs/observability.md | Adds trace-context injection and extraction across message carriers, switches correlation and causation keys to shared constants, and documents metadata propagation. |
| CloudEvents trace injection | src/microsvc/knative_ingress.rs | Injects HTTP trace headers into parsed CloudEvents messages and updates binary and structured precedence tests. |
| Dispatch and publish spans | src/bus/runner.rs, src/microsvc/service.rs, src/outbox_worker/outbox_dispatch.rs, docs/observability.md | Adds tracing spans around transport receive, microsvc dispatch, handler invocation, and outbox publish paths, with tracing documentation and a dispatch metadata test. |
| Scaffold inputs and spec wiring | distributed_cli/src/lib.rs, distributed_cli/src/cli.rs, distributed_cli/src/generate/mod.rs | Adds scaffold flags, extends the public scaffold spec, and threads metrics and tracing into generator state. |
| Generated service and GitOps output | distributed_cli/src/generate/service_crate.rs, distributed_cli/src/generate/gitops.rs, distributed_cli/src/generate/mod.rs, distributed_cli/README.md, docs/observability.md | Generates metrics-enabled service entrypoints, tracing-aware manifests and GitOps templates, updated generator tests, and scaffold documentation. |
| Transport and persistence tests | tests/kafka_transport/main.rs, tests/nats_transport/main.rs, tests/persistent_repository_conformance/scenario.rs, tests/rabbitmq_transport/main.rs | Verifies traceparent propagation across Kafka, NATS, RabbitMQ, and repository persistence round-trips. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • hops-ops/distributed#43: Shares the microsvc dispatch/invoke path and tracing instrumentation changes in src/microsvc/service.rs.

Poem

I hopped through traceparent trails so fine,
and tucked tracestate into the line.
My whiskers twitched at spans apace,
while metrics sprouted in their place.
🐇 The carrots all said: “signal bright!”

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the main change: adding OpenTelemetry trace-context propagation support across the framework. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 83.33%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (2)
distributed_cli/src/generate/service_crate.rs (1)

82-132: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Optional: collapse the duplicated metrics main branches.

The `Http if self.metrics` and `Knative if self.metrics` match arms are identical except for `router` vs `cloud_events_router`. The body (listener bind, `/metrics` route, `axum::serve`, `metrics()` handler) could be generated once, with the router constructor selected by transport.
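One way the suggested refactor could look, as a rough sketch only. The `ServiceTransport` enum here is a stand-in for the generator's own type, `router_constructor` and `metrics_main_rs` are hypothetical helper names, and the emitted body is a placeholder rather than the template in `service_crate.rs`; only the two router paths are taken from the review text.

```rust
// Hypothetical stand-in for the generator's transport enum.
#[derive(Clone, Copy)]
enum ServiceTransport {
    Http,
    Knative,
}

/// Pick the router constructor path; this is the only part that varies by transport.
fn router_constructor(transport: ServiceTransport) -> &'static str {
    match transport {
        ServiceTransport::Http => "distributed::microsvc::router",
        ServiceTransport::Knative => "distributed::microsvc::cloud_events_router",
    }
}

/// Render one metrics-enabled entrypoint body shared by both transports.
/// The generated lines are illustrative placeholders, not the real template.
fn metrics_main_rs(transport: ServiceTransport) -> String {
    format!(
        "let app = {ctor}(service).route(\"/metrics\", axum::routing::get(metrics));\n\
         let listener = tokio::net::TcpListener::bind(addr).await?;\n\
         axum::serve(listener, app).await?;\n",
        ctor = router_constructor(transport)
    )
}

fn main() {
    // Both transports go through the same template function.
    for transport in [ServiceTransport::Http, ServiceTransport::Knative] {
        println!("{}", metrics_main_rs(transport));
    }
}
```

With this shape, a change to the listener bind or the `/metrics` route is made in one place and cannot drift between the two transports.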

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distributed_cli/src/generate/service_crate.rs` around lines 82 - 132, The
metrics-enabled HTTP and Knative branches in service_crate::generate are
duplicating the same main/metrics handler logic. Refactor the
ServiceTransport::Http if self.metrics and ServiceTransport::Knative if
self.metrics arms to share one template, and make only the router constructor
differ between distributed::microsvc::router and
distributed::microsvc::cloud_events_router while keeping the listener bind,
/metrics route, axum::serve call, and metrics() output in one generated body.
distributed_cli/src/cli.rs (1)

74-74: 📐 Maintainability & Code Quality | 🔵 Trivial

Use visible_alias for the --otel flag to ensure it appears in --help.

The `alias = "otel"` attribute creates a functional alias that works at runtime but is hidden from the help output. Since the README documents `--otel` as a supported flag, users who follow the documentation and then run `--help` will not find it listed.

-    #[arg(long, alias = "otel")]
+    #[arg(long, visible_alias = "otel")]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distributed_cli/src/cli.rs` at line 74, The `--otel` flag alias on the CLI
argument is currently hidden from help output because `alias = "otel"` is used
in the `cli.rs` argument definition. Update the `#[arg(...)]` on the affected
option in `distributed_cli::cli::Cli` to use `visible_alias = "otel"` instead,
so the flag remains functional and is shown in `--help`.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@distributed_cli/src/generate/gitops.rs`:
- Around line 119-131: The OTEL_SERVICE_NAME env var in the gitops template is
using the deploy chart name from .Chart.Name, which produces the wrong service
identity for traces. Update the env block in the generator that builds the Helm
snippet to inject the package/service name already used elsewhere in this
codepath (matching ServiceManifest::new and the service label), instead of
hardcoding .Chart.Name, so trace and metric identities stay aligned.
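To illustrate the suggestion above, here is a minimal sketch of a generator helper that bakes the service name into the env block instead of deferring to the chart name. The function name `otel_env_block`, its parameters, and the `{{ .Values.otel.endpoint }}` expression are assumptions made for this example; the actual template in `gitops.rs` is not shown in this thread.

```rust
/// Render OTLP env entries for a generated Helm template.
/// `service_name` is the package/service name the generator already knows;
/// `endpoint_expr` is a hypothetical Helm expression for the collector endpoint.
fn otel_env_block(service_name: &str, endpoint_expr: &str) -> String {
    format!(
        "env:\n  - name: OTEL_SERVICE_NAME\n    value: \"{service_name}\"\n  - name: OTEL_EXPORTER_OTLP_ENDPOINT\n    value: \"{endpoint_expr}\"\n"
    )
}

fn main() {
    // The service name is fixed at scaffold time, so it matches the manifest
    // and the service label regardless of how the deploy chart is named.
    let block = otel_env_block("observability-orders", "{{ .Values.otel.endpoint }}");
    print!("{block}");
}
```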

In `@distributed_cli/src/generate/service_crate.rs`:
- Around line 53-55: The generated main entrypoint is missing tracing setup, so
spans are never exported even when the otel feature is enabled. Update
service_crate.rs in main_rs to conditionally emit the opentelemetry OTLP
exporter initialization from OTEL_EXPORTER_OTLP_ENDPOINT and install the
matching tracing_subscriber layer when self.tracing is true. Make sure the
generated main function performs this setup before calling
distributed::microsvc::serve, and keep the logic gated so non-tracing builds
remain unchanged.

In `@src/microsvc/knative_ingress.rs`:
- Around line 96-113: The trace context update in
parse_cloud_event/inject_http_trace_context is unconditional and can wipe
existing metadata from parse_binary or parse_structured when HTTP trace headers
are absent. Add a guard in inject_http_trace_context so it only calls
TraceContext::inject_vec when at least one of TRACEPARENT or TRACESTATE is
present in the HeaderMap, otherwise leave the Message.metadata unchanged.
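A small sketch of the guard described above, under stated assumptions: a `HashMap<String, String>` with lowercase names stands in for the HTTP `HeaderMap`, metadata is modeled as `Vec<(String, String)>`, and the function signature is hypothetical. This variant guards per key, so an absent header never touches the value already parsed from the CloudEvent.

```rust
use std::collections::HashMap;

const TRACEPARENT: &str = "traceparent";
const TRACESTATE: &str = "tracestate";

/// Copy HTTP trace headers into message metadata only when they are present.
/// `headers` is a simplified stand-in for an HTTP header map (an assumption).
fn inject_http_trace_context(
    headers: &HashMap<String, String>,
    metadata: &mut Vec<(String, String)>,
) {
    for key in [TRACEPARENT, TRACESTATE] {
        // Absent header: keep whatever the CloudEvent parser already stored.
        if let Some(value) = headers.get(key) {
            // Present header: it takes precedence over the parsed value.
            metadata.retain(|(k, _)| k.as_str() != key);
            metadata.push((key.to_string(), value.clone()));
        }
    }
}

fn main() {
    let mut metadata = vec![(TRACEPARENT.to_string(), "from-event".to_string())];

    // No trace headers on the request: metadata must stay unchanged.
    inject_http_trace_context(&HashMap::new(), &mut metadata);
    assert_eq!(metadata[0].1, "from-event");

    // A traceparent header on the request overrides the parsed value.
    let mut headers = HashMap::new();
    headers.insert(TRACEPARENT.to_string(), "from-http".to_string());
    inject_http_trace_context(&headers, &mut metadata);
    assert_eq!(metadata, vec![(TRACEPARENT.to_string(), "from-http".to_string())]);
}
```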

In `@src/trace_context.rs`:
- Around line 101-109: The is_valid_traceparent validator currently accepts any
non-ff version and uppercase hex, but it should enforce the documented
version-00 shape. Update the checks in is_valid_traceparent (and the related
logic around the traceparent parsing/validation path) to require version == "00"
and to only accept lowercase hex for version-00 fields, while keeping the
existing length and format validation for trace_id, parent_id, and flags.
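For reference, a strict version-00 check along the lines requested above might look like the following. This is a sketch, not the code in `src/trace_context.rs`: the helper `is_lower_hex` is a hypothetical name, and the all-zero rejection follows the W3C Trace Context rule that an all-zero trace-id or parent-id is invalid.

```rust
/// Lowercase hex only: digits 0-9 and letters a-f.
fn is_lower_hex(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Accept only the version-00 shape: 00-<32 hex>-<16 hex>-<2 hex>.
fn is_valid_traceparent(value: &str) -> bool {
    let parts: Vec<&str> = value.split('-').collect();
    // Version 00 has exactly four dash-separated fields.
    let [version, trace_id, parent_id, flags] = parts.as_slice() else {
        return false;
    };
    *version == "00"
        && trace_id.len() == 32
        && parent_id.len() == 16
        && flags.len() == 2
        && is_lower_hex(trace_id)
        && is_lower_hex(parent_id)
        && is_lower_hex(flags)
        // All-zero ids are reserved as invalid by the specification.
        && trace_id.bytes().any(|b| b != b'0')
        && parent_id.bytes().any(|b| b != b'0')
}

fn main() {
    let valid = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
    assert!(is_valid_traceparent(valid));
    // Uppercase hex and non-00 versions are rejected by this strict variant.
    assert!(!is_valid_traceparent(&valid.to_uppercase()));
    assert!(!is_valid_traceparent(&valid.replacen("00-", "01-", 1)));
}
```

Rejecting every version other than `00` is the stricter choice the review asks for; a parser meant to be forward-compatible with later versions would need a different policy.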

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: aacf9019-2f39-41e1-8653-f91da7554bfa

📥 Commits

Reviewing files that changed from the base of the PR and between 2e83175 and b07d1eb.


Comment thread distributed_cli/src/generate/gitops.rs Outdated
Comment thread distributed_cli/src/generate/service_crate.rs
Comment thread src/microsvc/knative_ingress.rs
Comment thread src/trace_context.rs Outdated
@patrickleet patrickleet force-pushed the codex/opentelemetry-tracing-compatibility branch from b07d1eb to 6d43624 on June 25, 2026, 02:40

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@distributed_cli/README.md`:
- Around line 25-30: Remove the duplicated `--gitops` entry from the options
list in the README help text so the `dctl scaffold` flags are listed only once.
Update the sentence that enumerates common flags around `--gitops`, `--metrics`,
and `--tracing` to keep the list accurate and avoid implying multiple distinct
`--gitops` switches.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ad3470a8-d2c1-499f-8dc1-f77e2428607e

📥 Commits

Reviewing files that changed from the base of the PR and between b07d1eb and 6d43624.

✅ Files skipped from review due to trivial changes (3)
  • docs/observability.md
  • docs/transports.md
  • Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (20)
  • tests/kafka_transport/main.rs
  • distributed_cli/src/cli.rs
  • src/outbox_worker/outbox_dispatch.rs
  • tests/persistent_repository_conformance/scenario.rs
  • src/lib.rs
  • distributed_cli/src/lib.rs
  • tests/rabbitmq_transport/main.rs
  • src/outbox/message.rs
  • distributed_cli/src/generate/gitops.rs
  • src/entity/entity.rs
  • src/microsvc/service.rs
  • tests/nats_transport/main.rs
  • src/microsvc/knative_ingress.rs
  • src/trace_context.rs
  • src/bus/runner.rs
  • src/bus/message.rs
  • distributed_cli/src/generate/service_crate.rs
  • distributed_cli/src/generate/mod.rs
  • src/entity/event_record.rs
  • src/manifest.rs

Comment thread distributed_cli/README.md
@patrickleet patrickleet force-pushed the codex/opentelemetry-tracing-compatibility branch from 6d43624 to 711ce33 on June 25, 2026, 02:50
Implements [[tasks/opentelemetry-tracing-compatibility]]
@patrickleet patrickleet force-pushed the codex/opentelemetry-tracing-compatibility branch from 711ce33 to 258708c on June 25, 2026, 03:01
@patrickleet

Collaborator Author

CodeRabbit summary nitpick follow-up (--otel alias): fixed in 258708c. --tracing now uses visible_alias = "otel", and scaffold_help_lists_otel_alias asserts generated help shows both flags.

@patrickleet

Collaborator Author

CodeRabbit summary nitpick follow-up (duplicated metrics main branches): fixed in 258708c. Generated metrics entrypoints now share one template and select the appropriate router by transport.
