feat: native-spans wasm support (pipeline, capabilities, trace exporter) #151

Open
bengl wants to merge 12 commits into main from bengl/native-spans-attempt-3-ish

@bengl bengl commented Jun 29, 2026

Collaborator

Summary

Adds wasm-only native-spans support to libdatadog-nodejs: the Rust/wasm
bindings that let dd-trace-js run its span pipeline and trace export through
libdatadog instead of JS. Pairs with the companion dd-trace-js PR
(DataDog/dd-trace-js#9139, bengl/native-spans-attempt-3), which consumes
these bindings.

What's here (10 commits)

  • build: cargo workspace + wasm build tooling (scripts/build-wasm.js).
  • capabilities crate: the wasm capability bundle — HTTP transport (incl.
    unix-socket / Windows named-pipe), sleep, etc., routing I/O back into JS.
  • pipeline crate (WasmSpanState): change-buffer span pipeline (span_id
    protocol), string table, stats collector, and the span-data bindings:
    • meta_struct (setMetaStruct/getMetaStruct)
    • top-level span_events (addSpanEvent, typed attributes)
    • v0.5 output selection (setUseV05) — single exporter; in v0.5 mode
      meta_struct/span_events are dropped by the protocol, by design.
  • Toolchain + libdatadog dependency bumps to make the crates build for wasm.
  • Removed the standalone trace_exporter crate (JsTraceExporter, the
    earlier pre-encoded-bytes path): the pipeline crate's WasmSpanState
    builds its own internal TraceExporter and supersedes it; nothing consumes
    the standalone binding.

Testing

  • node --test suites: pipeline (incl. meta_struct/span_events round-trips,
    v0.5 vs v0.4 endpoint routing against a mock agent, bounds/overflow
    rejection), http_transport (incl. a real AF_UNIX socket), trace_exporter,
    process_discovery, crashtracker.
  • cargo check / clippy clean on the wasm target.
  • Each feature went through a multi-perspective review-until-green loop before
    landing.

Notes / caveats

  • wasm-only: the unsound native NAPI pipeline path was removed; all crates
    target wasm32-unknown-unknown.
  • Real end-to-end verification against a live agent is CI-gated (prebuilds
    are gitignored / not published here).
  • libdatadog is consumed as a git dependency; no libdatadog source changes.

Draft for review; not yet rebased for release.

Comment thread test/crashtracker/package-lock.json Outdated
Comment thread crates/process_discovery/src/lib.rs Outdated
bengl and others added 10 commits June 29, 2026 14:14

Establish the Rust workspace, pin the toolchain, and add the npm scripts
and shell tooling that build the wasm modules and the native (napi)
addons and run their test suites.

Co-authored-by: Jules Wiriath <jules.wiriath@datadoghq.com>
Co-authored-by: paullegranddc <paul.legranddescloizeaux@datadoghq.com>
Co-authored-by: Gyuheon Oh <102937919+gyuheon0h@users.noreply.github.com>

Implement the portable libdatadog capability traits for the wasm runtime:
an HTTP client backed by Node's http.request, a setTimeout-based sleep,
and a response-header observer hook. This bundle is the generic parameter
the data-pipeline and trace-exporter crates are instantiated with.

Co-authored-by: Jules Wiriath <jules.wiriath@datadoghq.com>
Co-authored-by: paullegranddc <paul.legranddescloizeaux@datadoghq.com>
Co-authored-by: Gyuheon Oh <102937919+gyuheon0h@users.noreply.github.com>

Wasm binding over libdatadog's span-id-addressed change buffer: span
creation and mutation via a binary op protocol, a deduplicated string
table, segment (trace-level) attributes, batched export through
prepareChunk/sendPreparedChunk, and optional client-side stats.
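
To make the "deduplicated string table" idea concrete, here is a minimal sketch. It is not the crate's code: `StringTable`, `intern`, and `get` are hypothetical names, and the only point illustrated is that each distinct string is stored once and referenced afterwards by a small integer id.

```rust
use std::collections::HashMap;

/// Hypothetical string table: each distinct string is stored once and
/// addressed by a stable u32 id.
#[derive(Default)]
struct StringTable {
    ids: HashMap<String, u32>,
    strings: Vec<String>,
}

impl StringTable {
    /// Returns the id for `s`, inserting it only if it was not seen before.
    fn intern(&mut self, s: &str) -> u32 {
        if let Some(&id) = self.ids.get(s) {
            return id;
        }
        let id = self.strings.len() as u32;
        self.strings.push(s.to_owned());
        self.ids.insert(s.to_owned(), id);
        id
    }

    /// Resolves an id back to its string, or None for an unknown id.
    fn get(&self, id: u32) -> Option<&str> {
        self.strings.get(id as usize).map(String::as_str)
    }
}
```

Interning the same tag key twice yields the same id, so repeated keys would cross the JS/wasm boundary as integers rather than as repeated string payloads.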

Co-authored-by: Jules Wiriath <jules.wiriath@datadoghq.com>
Co-authored-by: paullegranddc <paul.legranddescloizeaux@datadoghq.com>
Co-authored-by: Gyuheon Oh <102937919+gyuheon0h@users.noreply.github.com>

Expose libdatadog's TraceExporter to JS via wasm-bindgen, pinned to the
wasm capability bundle. The exporter is built lazily on first send, since
the blocking build path is unavailable on wasm. Includes an integration
test that drives it against an in-process mock agent.

Co-authored-by: Jules Wiriath <jules.wiriath@datadoghq.com>
Co-authored-by: paullegranddc <paul.legranddescloizeaux@datadoghq.com>
Co-authored-by: Gyuheon Oh <102937919+gyuheon0h@users.noreply.github.com>

Update library_config, process_discovery, and datadog-js-zstd for the
workspace's pinned Rust toolchain and libdatadog dependency versions.

Co-authored-by: Jules Wiriath <jules.wiriath@datadoghq.com>
Co-authored-by: paullegranddc <paul.legranddescloizeaux@datadoghq.com>
Co-authored-by: Gyuheon Oh <102937919+gyuheon0h@users.noreply.github.com>

libdatadog's Span already carries meta_struct (VecMap<Text, Bytes>) and
the exporter serializes it, but no JS binding existed, so structured
per-span data (AppSec, Code Origin, Dynamic Instrumentation) could not be
sent on the native path.

There is no change-buffer opcode for meta_struct, so setMetaStruct writes
the value directly onto the span via span_mut() after draining the change
queue. meta_struct depends on no other queued op, so bypassing the queue
ordering is safe: subsequent ops are applied on the next flush and never
touch meta_struct. getMetaStruct mirrors the existing per-span getters for
round-trip coverage.
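
The drain-then-write order described above can be sketched roughly as follows. Every type and method name here (`State`, `SetMeta`, `drain`, `set_meta_struct`) is a hypothetical stand-in and not libdatadog's API. In this sketch, draining first guarantees that a span created by a queued op already exists when the direct write happens.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical queued operation: set a string tag on a span.
struct SetMeta {
    span_id: u64,
    key: String,
    value: String,
}

#[derive(Default)]
struct Span {
    meta: HashMap<String, String>,
    meta_struct: HashMap<String, Vec<u8>>,
}

#[derive(Default)]
struct State {
    queue: VecDeque<SetMeta>,
    spans: HashMap<u64, Span>,
}

impl State {
    /// Applies every queued op in order, creating spans on first use.
    fn drain(&mut self) {
        while let Some(op) = self.queue.pop_front() {
            self.spans
                .entry(op.span_id)
                .or_default()
                .meta
                .insert(op.key, op.value);
        }
    }

    /// Drains the queue, then writes meta_struct directly onto the span.
    fn set_meta_struct(&mut self, span_id: u64, key: &str, bytes: Vec<u8>) -> Result<(), String> {
        self.drain();
        let span = self
            .spans
            .get_mut(&span_id)
            .ok_or_else(|| format!("unknown span {span_id}"))?;
        span.meta_struct.insert(key.to_owned(), bytes);
        Ok(())
    }
}
```

Because no queued op in this model reads or writes `meta_struct`, ops queued after the direct write can be applied later without changing the stored value.
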
The wasm HTTP transport only spoke http/https to a host:port, so a
`unix://` or `windows:` agent URL could not be reached: request() treated
the hex-encoded socket path (ddcommon's parse_uri stores it in the URI
authority) as a TCP host.

Detect the unix/windows scheme in request(), hex-decode the socket path
from the authority, and pass it to the JS transport, which now uses
Node's { socketPath } (covering Windows named pipes too). TCP requests
are unchanged; socket requests send a localhost Host header with no port.
Adds a unix-socket case to the transport tests.
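
As an illustration of the hex-decoding step, the sketch below turns a hex-encoded authority back into a socket path. `hex_decode` and `socket_path` are hypothetical helpers written with the standard library only; they are not the functions in the capabilities crate or in ddcommon.

```rust
/// Decodes a hex string (such as a URI authority) into raw bytes.
/// Returns None on odd length or on any non-hex digit.
fn hex_decode(input: &str) -> Option<Vec<u8>> {
    let bytes = input.as_bytes();
    if bytes.len() % 2 != 0 {
        return None;
    }
    bytes
        .chunks(2)
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}

/// Hypothetical helper: yields a socket path for `unix` / `windows` schemes
/// and None for TCP schemes (or for an undecodable authority).
fn socket_path(scheme: &str, authority: &str) -> Option<String> {
    match scheme {
        "unix" | "windows" => String::from_utf8(hex_decode(authority)?).ok(),
        _ => None,
    }
}
```

For example, the authority `2f746d702f612e736f636b` decodes to `/tmp/a.sock`. A production version would likely report a malformed authority as an error rather than folding it into `None`.
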
Add `addSpanEvent` to append OpenTelemetry-style span events onto the
top-level v0.4 `span_events` field that libdatadog already serializes.
Like meta_struct there is no change-buffer opcode, so the event is
appended directly to the span after draining the queue (span_events do
not depend on any other queued op, so bypassing queue ordering is safe).

Attributes arrive as a flat little-endian buffer with per-value type
tags (String=0, Boolean=1, Integer=2, Double=3, Array=4) matching
libdatadog's AttributeArrayValue discriminants; every read is bounded
against the buffer so a malformed/truncated buffer errors instead of
panicking. A `getSpanEventsJson` helper serializes events via the same
serde impl used for the msgpack wire format, exercised by new
round-trip tests covering each scalar type, arrays, and bounds.
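
A bounded reader for such a tagged buffer might look like the sketch below. The tag numbers come from the description above; everything else is an assumption made for illustration, including the `Reader` and `Value` names and the payload widths (a u32 little-endian length prefix for strings, one byte for booleans, eight bytes for integers and doubles). Arrays are left out to keep it short.

```rust
#[derive(Debug, PartialEq)]
enum Value {
    Str(String),
    Bool(bool),
    Int(i64),
    Double(f64),
}

/// Cursor whose every read is checked against the remaining bytes.
struct Reader<'a> {
    buf: &'a [u8],
    pos: usize,
}

impl<'a> Reader<'a> {
    fn new(buf: &'a [u8]) -> Self {
        Reader { buf, pos: 0 }
    }

    /// Takes exactly `n` bytes or fails; never indexes out of range.
    fn take(&mut self, n: usize) -> Result<&'a [u8], String> {
        let buf: &'a [u8] = self.buf;
        let end = self.pos.checked_add(n).ok_or("length overflow")?;
        let slice = buf.get(self.pos..end).ok_or("truncated buffer")?;
        self.pos = end;
        Ok(slice)
    }

    /// Reads a fixed-size byte array using the same bounds check.
    fn array<const N: usize>(&mut self) -> Result<[u8; N], String> {
        let mut out = [0u8; N];
        out.copy_from_slice(self.take(N)?);
        Ok(out)
    }

    /// Reads one tagged value (layout is an assumption, see lead-in).
    fn value(&mut self) -> Result<Value, String> {
        let tag = self.array::<1>()?[0];
        match tag {
            0 => {
                let len = u32::from_le_bytes(self.array::<4>()?) as usize;
                let bytes = self.take(len)?;
                String::from_utf8(bytes.to_vec())
                    .map(Value::Str)
                    .map_err(|e| e.to_string())
            }
            1 => Ok(Value::Bool(self.array::<1>()?[0] != 0)),
            2 => Ok(Value::Int(i64::from_le_bytes(self.array::<8>()?))),
            3 => Ok(Value::Double(f64::from_le_bytes(self.array::<8>()?))),
            4 => Err("arrays are omitted from this sketch".to_string()),
            other => Err(format!("unknown type tag {other}")),
        }
    }
}
```

A string whose declared length exceeds the remaining bytes comes back as `Err("truncated buffer")` instead of panicking, which is the property the bounds tests are described as covering.
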
Add setUseV05() to WasmSpanState so the single trace exporter can emit the
v0.5 wire format (/v0.5/traces) instead of the default v0.4. The flag is read
once, at the lazy exporter build on first send, then fixed; callers must set
it before the first flush.

v0.5 uses a fixed 12-field schema with no slots for meta_struct, span_events,
or span_links, so libdatadog's v0.5 serializer silently drops them. This
mirrors dd-trace-js master's v0.5 encoder and is intentional: there is no
guard and no dual exporter. libdatadog does not downgrade V05 (unlike V1), so
the caller (dd-trace-js) is responsible for only enabling this after the agent
advertises /v0.5/traces via /info.
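
The "read once at the lazy build, then fixed" behaviour can be sketched as below. `SpanState`, `Exporter`, and the method names are hypothetical stand-ins and not the actual WasmSpanState implementation; the stand-in exporter only remembers which endpoint it was built for.

```rust
/// Stand-in for the real exporter; it only remembers its endpoint.
struct Exporter {
    endpoint: &'static str,
}

#[derive(Default)]
struct SpanState {
    use_v05: bool,
    exporter: Option<Exporter>, // built lazily on first send
}

impl SpanState {
    fn set_use_v05(&mut self, on: bool) {
        self.use_v05 = on;
    }

    /// Builds the exporter on the first call, reading the flag exactly once.
    fn exporter(&mut self) -> &Exporter {
        let use_v05 = self.use_v05;
        self.exporter.get_or_insert_with(|| Exporter {
            endpoint: if use_v05 { "/v0.5/traces" } else { "/v0.4/traces" },
        })
    }

    /// Returns the endpoint a send would target.
    fn send(&mut self) -> &'static str {
        self.exporter().endpoint
    }
}
```

In this model, calling `set_use_v05` after the first `send` has no effect, which is why the flag has to be set before the first flush.
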
The pipeline crate's WasmSpanState builds its own internal TraceExporter and
owns serialization + send (sendPreparedChunk), superseding the standalone
trace_exporter binding (JsTraceExporter, the earlier pre-encoded-v0.4-bytes
path). Nothing consumes it: dd-trace-js (native-spans and master) loads only
the pipeline crate, and no other tracked code references it. Drop the crate,
its wasm test, and the build/test wiring.

@bengl bengl force-pushed the bengl/native-spans-attempt-3-ish branch from 0bf80b1 to 7b2d50b Compare June 29, 2026 18:16
@bengl bengl marked this pull request as ready for review June 29, 2026 18:23
@bengl bengl requested review from a team as code owners June 29, 2026 18:23

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b2d50b74a

Comment thread crates/pipeline/src/lib.rs Outdated
Comment thread package.json Outdated
bengl added 2 commits June 29, 2026 14:39

The branch's build-wasm dropped the leading `yarn -s install-wasm-pack`
that main still runs, so a clean checkout without wasm-pack on PATH fails
with `wasm-pack: not found` before building any module. Restore it.

flushStats() took the StatsCollector out of its RefCell for the whole
async send, so a prepareChunk() during that await saw None and skipped
add_spans — permanently dropping those successfully-sent spans from
client-side stats. Split StatsCollector::flush into a synchronous
prepare_request (drain + encode under a brief borrow) and an async
send_request (no borrow). flushStats now builds the request, releases the
collector, then awaits the send, so concurrent add_spans is counted.
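
The difference between the two shapes can be modelled synchronously, as in the sketch below. It is a simplification under stated assumptions: the `during_send` callback stands in for the await point, `State` and the `flush_*` method names are hypothetical, and the collector just stores span durations.

```rust
use std::cell::RefCell;

#[derive(Default)]
struct StatsCollector {
    pending: Vec<u64>,
}

impl StatsCollector {
    fn add_spans(&mut self, durations: &[u64]) {
        self.pending.extend_from_slice(durations);
    }

    /// Synchronous half: drain what was collected into a request payload.
    fn prepare_request(&mut self) -> Vec<u64> {
        std::mem::take(&mut self.pending)
    }
}

struct State {
    stats: RefCell<Option<StatsCollector>>,
}

impl State {
    /// Stats step of a chunk: skipped (returns false) when the slot is empty.
    fn add_spans(&self, durations: &[u64]) -> bool {
        match self.stats.borrow_mut().as_mut() {
            Some(collector) => {
                collector.add_spans(durations);
                true
            }
            None => false,
        }
    }

    /// Old shape: the collector is out of the cell for the whole send.
    fn flush_holding<F: FnOnce(&State)>(&self, during_send: F) -> Vec<u64> {
        let mut collector = self.stats.borrow_mut().take();
        let request = collector
            .as_mut()
            .map(|c| c.prepare_request())
            .unwrap_or_default();
        during_send(self); // stands in for the await point
        *self.stats.borrow_mut() = collector;
        request
    }

    /// New shape: brief borrow to build the request, then send with no borrow.
    fn flush_split<F: FnOnce(&State)>(&self, during_send: F) -> Vec<u64> {
        let request = self
            .stats
            .borrow_mut()
            .as_mut()
            .map(|c| c.prepare_request())
            .unwrap_or_default();
        during_send(self); // the collector is still in the cell here
        request
    }
}
```

With `flush_holding`, a call to `add_spans` from inside `during_send` finds `None` and the spans never reach the collector; with `flush_split` the same call succeeds and the spans show up in the next flush.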