
feat: GenAI agent observability with server-side cost enrichment #1550

Open

Debanitrkl wants to merge 1 commit into parseablehq:main from Debanitrkl:feat/genai-agent-observability

Conversation


@Debanitrkl Debanitrkl commented Feb 19, 2026

Summary

Adds server-side GenAI trace processing for streams tagged with X-P-Dataset-Tag: agent-observability. When OTel traces arrive with this tag, Parseable automatically:

  • Coerces types — converts OTel's string-encoded IntValue and float attributes to native numeric JSON types
  • Enriches with computed columns — appends p_genai_cost_usd, p_genai_tokens_total, p_genai_tokens_per_sec, and p_genai_duration_ms to every span before storage
  • Pre-registers schema — declares all 31 GenAI fields (27 OTel semantic convention + 4 enriched) with correct types at stream creation

This makes GenAI trace data immediately SQL-queryable — no CAST(), no external pricing lookups, no post-processing.

Problem

OTel GenAI traces (from opentelemetry-instrumentation-openai-v2 and similar instrumentors) have several pain points when stored as-is:

  1. Type mismatch: OTel's protobuf-to-JSON serialization turns IntValue(1250) into the string "1250". Without correction, gen_ai.usage.input_tokens becomes a Utf8 column and SUM() fails.

  2. No cost visibility: Token counts arrive raw. To answer "how much did this agent run cost?", users must maintain a separate model pricing table and join it in every query.

  3. Missing derived metrics: Output throughput (tokens/sec) requires combining a span attribute with the span's timing metadata — awkward in SQL. Duration is stored in nanoseconds, but humans think in milliseconds.

  4. Schema drift: Without pre-registered fields, the first event defines column types. If the first trace has a string token count, the column is permanently Utf8.
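The type-mismatch problem above comes down to a guarded string-to-integer parse. A minimal, dependency-free sketch (illustrative, not the PR's exact code):

```rust
// OTel's protobuf-to-JSON encoding serializes 64-bit integers as strings,
// so IntValue(1250) arrives as "1250". Coercion is a guarded parse back to
// a native integer; anything unparseable is left alone upstream.
fn coerce_int(raw: &str) -> Option<i64> {
    raw.parse::<i64>().ok()
}

fn main() {
    assert_eq!(coerce_int("1250"), Some(1250)); // string-encoded token count
    assert_eq!(coerce_int("oops"), None);       // not numeric; left as-is
    println!("ok");
}
```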

What changed

New: src/otel/genai.rs (564 lines)

The core GenAI module with four parts:

Field definitions — GENAI_KNOWN_FIELD_LIST (31 fields with Arrow types), GENAI_INT_FIELDS (6 fields), GENAI_FLOAT_FIELDS (5 fields). These are the source of truth for schema creation and type coercion.

Type coercion — coerce_genai_field_types() converts string-encoded integers and floats to native JSON numeric types. It only touches known GenAI fields and safely skips values that are already numeric or unparseable.
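A sketch of how such a coercion pass might look, using a tiny stand-in value enum instead of serde_json::Value so the example stays dependency-free (field list and function name are illustrative, not the PR's exact API):

```rust
use std::collections::HashMap;

// Simplified stand-in for serde_json::Value.
#[derive(Debug, PartialEq, Clone)]
enum Val {
    Str(String),
    Int(i64),
}

// A subset of the known integer-typed GenAI fields, for illustration.
const INT_FIELDS: &[&str] = &["gen_ai.usage.input_tokens", "gen_ai.usage.output_tokens"];

// Coerce string-encoded integers to native ints; skip anything already
// numeric or unparseable, and never touch fields outside the known list.
fn coerce_fields(record: &mut HashMap<String, Val>) {
    for &field in INT_FIELDS {
        if let Some(Val::Str(s)) = record.get(field) {
            if let Ok(n) = s.parse::<i64>() {
                record.insert(field.to_string(), Val::Int(n));
            }
        }
    }
}

fn main() {
    let mut rec = HashMap::new();
    rec.insert("gen_ai.usage.input_tokens".to_string(), Val::Str("1250".into()));
    rec.insert("other.field".to_string(), Val::Str("42".into()));
    coerce_fields(&mut rec);
    assert_eq!(rec["gen_ai.usage.input_tokens"], Val::Int(1250));
    assert_eq!(rec["other.field"], Val::Str("42".into())); // untouched: not a GenAI field
}
```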

Cost enrichment — enrich_genai_record() computes four columns per span:

p_genai_cost_usd — (input_tokens × input_price) + (output_tokens × output_price), using the embedded pricing table
p_genai_tokens_total — input_tokens + output_tokens
p_genai_tokens_per_sec — output_tokens / (span_duration_ns / 1e9)
p_genai_duration_ms — span_duration_ns / 1e6
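The arithmetic behind those four columns can be sketched as follows (the function name and the per-token price convention are assumptions; the PR's actual signatures may differ):

```rust
// Computes the four enrichment columns described above.
// Prices are assumed to be USD per token.
fn enrich(
    input_tokens: u64,
    output_tokens: u64,
    span_duration_ns: u64,
    input_price: f64,
    output_price: f64,
) -> (f64, u64, f64, f64) {
    let cost_usd = input_tokens as f64 * input_price + output_tokens as f64 * output_price;
    let tokens_total = input_tokens + output_tokens;
    let secs = span_duration_ns as f64 / 1e9;
    let tokens_per_sec = if secs > 0.0 { output_tokens as f64 / secs } else { 0.0 };
    let duration_ms = span_duration_ns as f64 / 1e6;
    (cost_usd, tokens_total, tokens_per_sec, duration_ms)
}

fn main() {
    // 1000 input / 500 output tokens over a 2-second span,
    // at $2.5e-6 per input token and $1e-5 per output token.
    let (cost, total, tps, ms) = enrich(1000, 500, 2_000_000_000, 2.5e-6, 1e-5);
    assert!((cost - 0.0075).abs() < 1e-12); // 0.0025 + 0.0050
    assert_eq!(total, 1500);
    assert!((tps - 250.0).abs() < 1e-9);    // 500 tokens / 2 s
    assert!((ms - 2000.0).abs() < 1e-9);
}
```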

Pricing table — Embedded per-token pricing for 30+ models across OpenAI, Anthropic, Google, Mistral, Cohere, Meta, and Groq. Uses LazyLock for zero per-request allocation. Supports:

  • Exact match (gpt-4o → gpt-4o)
  • Prefix match (gpt-4o-2024-11-20 → gpt-4o, longest prefix wins)
  • User overrides via genai-pricing.json (searched in $PARSEABLE_CONFIG_DIR, cwd, ~/.parseable/)
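The "longest prefix wins" resolution can be sketched like this (table keys and prices are placeholders, not the PR's embedded values):

```rust
// Exact match takes priority; otherwise the longest table key that is a
// prefix of the model id wins. Returns (input_price, output_price).
fn lookup_pricing(table: &[(&str, (f64, f64))], model: &str) -> Option<(f64, f64)> {
    if let Some(&(_, p)) = table.iter().find(|(k, _)| *k == model) {
        return Some(p);
    }
    table
        .iter()
        .filter(|(k, _)| model.starts_with(k))
        .max_by_key(|(k, _)| k.len())
        .map(|&(_, p)| p)
}

fn main() {
    let table = [
        ("gpt-4o", (2.5e-6, 1.0e-5)),
        ("gpt-4o-mini", (1.5e-7, 6.0e-7)),
    ];
    // A dated variant resolves to its base model...
    assert_eq!(lookup_pricing(&table, "gpt-4o-2024-11-20"), Some((2.5e-6, 1.0e-5)));
    // ...and the longer "gpt-4o-mini" prefix beats "gpt-4o".
    assert_eq!(lookup_pricing(&table, "gpt-4o-mini-2024-07-18"), Some((1.5e-7, 6.0e-7)));
    // Unknown models get no cost attribution.
    assert_eq!(lookup_pricing(&table, "llama-3-70b"), None);
}
```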

Unit tests — 10 tests covering type coercion (int, float, skip non-string, handle invalid), enrichment (basic, response-model-over-request-model, unknown model, missing tokens), pricing prefix match, and field count validation.

Modified: src/handlers/http/ingest.rs

In setup_otel_stream():

  • Parses X-P-Dataset-Tag from request headers
  • When tag is AgentObservability, merges GENAI_KNOWN_FIELD_LIST into the known fields set before stream creation — this ensures all 31 fields are in the schema with correct types from day one
  • Passes dataset_tag (instead of None) to create_stream_if_not_exists()

Modified: src/handlers/http/modal/utils/ingest_utils.rs

In flatten_and_push_logs(), in the OtelTraces branch:

  • Looks up the stream's stored dataset_tag via get_dataset_tag()
  • If AgentObservability, applies coerce_genai_field_types() then enrich_genai_record() to each flattened trace record before pushing to storage
  • The is_genai check is done once per batch (not per-record), so non-GenAI streams have zero overhead

Modified: src/otel.rs

Added pub mod genai; to register the new module.

New: resources/parseable-genai-collector.yaml

Canonical OTel Collector configuration template for GenAI traces. Uses ${PARSEABLE_URL}, ${PARSEABLE_AUTH}, ${STREAM_NAME} variables. Includes the critical X-P-Dataset-Tag: agent-observability header.

New: resources/genai-pricing-example.json

Example custom pricing override file showing the expected format.
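A plausible shape for such an override file might map model names to per-token input/output prices — note this is a hypothetical sketch; the authoritative schema is whatever resources/genai-pricing-example.json in the PR actually contains:

```json
{
  "gpt-4o": { "input": 2.5e-6, "output": 1.0e-5 },
  "my-custom-finetune": { "input": 5.0e-6, "output": 1.5e-5 }
}
```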

Design decisions

Why enrich at ingest, not query time?
Computing cost at query time would require a UDF or a JOIN against a pricing table for every query. Ingest-time enrichment means SELECT SUM(p_genai_cost_usd) FROM "genai-traces" just works. The tradeoff is that if pricing changes, historical data keeps the old price — but that's actually correct (you paid the old price for old calls).

Why embedded pricing, not config-only?
Zero-config is important for adoption. The embedded table covers the most common models. The genai-pricing.json override exists for custom/fine-tuned models or price updates between Parseable releases.

Why prefix matching for model names?
Model IDs include date suffixes (gpt-4o-2024-11-20, claude-3-5-sonnet-20241022). Listing every dated variant would be unmaintainable. Prefix match with "longest wins" handles this cleanly — a gpt-4o-mini variant won't accidentally resolve to gpt-4o's pricing, because gpt-4o-mini is itself a table entry and the longer matching prefix.

Why gate on DatasetTag, not auto-detect?
Auto-detecting GenAI fields would add overhead to every OTel trace ingestion. The explicit X-P-Dataset-Tag: agent-observability header is opt-in — only streams that declare themselves as GenAI get the processing. This keeps non-GenAI trace ingestion at zero additional cost.
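The opt-in gate can be sketched as a simple header match (the DatasetTag name follows the PR's description; the exact enum variants are assumptions):

```rust
// Only streams that explicitly declare the agent-observability tag take the
// GenAI processing path; everything else is passed through untouched.
#[derive(Debug, PartialEq)]
enum DatasetTag {
    AgentObservability,
}

fn parse_dataset_tag(header: Option<&str>) -> Option<DatasetTag> {
    match header {
        Some("agent-observability") => Some(DatasetTag::AgentObservability),
        _ => None, // unknown or absent tag: no GenAI processing
    }
}

fn main() {
    assert_eq!(
        parse_dataset_tag(Some("agent-observability")),
        Some(DatasetTag::AgentObservability)
    );
    assert_eq!(parse_dataset_tag(None), None); // non-GenAI stream: zero extra work
}
```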

How to test

1. Start Parseable

parseable local-store

2. Start OTel Collector with the included config

export PARSEABLE_URL=http://localhost:8000
export PARSEABLE_AUTH=$(echo -n 'admin:admin' | base64)
export STREAM_NAME=genai-traces
otelcol-contrib --config resources/parseable-genai-collector.yaml

3. Send instrumented GenAI traces

pip install openai opentelemetry-distro opentelemetry-exporter-otlp opentelemetry-instrumentation-openai-v2

OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
OTEL_SERVICE_NAME=test-agent \
opentelemetry-instrument python your_app.py

4. Query enriched data

SELECT "gen_ai.request.model", "gen_ai.usage.input_tokens",
       "gen_ai.usage.output_tokens", p_genai_cost_usd,
       p_genai_tokens_total, p_genai_tokens_per_sec, p_genai_duration_ms
FROM "genai-traces"

Token counts should be native integers (not strings), and the p_genai_* columns should be populated with correct values.

Test plan

  • Unit tests pass (cargo test — 10 tests in otel::genai::tests)
  • Manual end-to-end: OTel Collector → Parseable → SQL query shows enriched columns
  • Verify non-GenAI OTel trace streams are unaffected (no p_genai_* columns, no type coercion)
  • Verify stream restart preserves dataset tag (tag persisted in metadata)
  • Verify custom pricing override via genai-pricing.json

Summary by CodeRabbit

Release Notes

  • New Features
    • Added GenAI observability support with automatic metric calculation (tokens, duration, cost).
    • Introduced dynamic pricing configuration to track GenAI costs across multiple models.
    • Added OpenTelemetry Collector configuration template for GenAI observability pipelines.
    • Enabled dataset tagging for enhanced stream management and metadata handling.

…e coercion for OTel traces

Add server-side processing for GenAI/LLM traces tagged with
`X-P-Dataset-Tag: agent-observability`. When traces arrive with this tag,
Parseable automatically coerces OTel string-encoded numeric fields to native
types and enriches each span with computed cost, throughput, and duration
columns — making GenAI data SQL-queryable without manual CAST() or joins
against external pricing tables.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot commented Feb 19, 2026

Walkthrough

Introduces comprehensive GenAI observability support with field type coercion, token/cost enrichment, dynamic pricing lookup, and dataset-tag-driven conditional processing. Integrates GenAI-specific transformations into OTEL ingestion pipeline when dataset tag indicates agent observability workloads.

Changes

Cohort / File(s) Summary
Configuration & Examples
resources/genai-pricing-example.json, resources/parseable-genai-collector.yaml
Added example pricing configuration for custom models and canonical OpenTelemetry Collector configuration for Parseable GenAI observability pipeline with OTLP receiver, batch processor, and HTTP exporter.
GenAI Observability Core
src/otel/genai.rs
New module providing GenAI-specific data handling: field type coercion for integers/floats, record enrichment with computed metrics (token totals, duration, throughput, cost), dynamic pricing lookup with user overrides from JSON config, and comprehensive test coverage.
OTEL Module Integration
src/otel.rs
Exposed new genai module as public submodule to make GenAI observability tooling available within OTEL namespace.
OTEL Ingestion Pipeline
src/handlers/http/ingest.rs, src/handlers/http/modal/utils/ingest_utils.rs
Extended OTEL stream setup to parse X-P-Dataset-Tag header and conditionally apply GenAI field augmentation when dataset tag indicates agent observability; traces processing now coerces field types and enriches records before ingestion.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Handler as OTEL<br/>Ingestion
    participant GenAI as GenAI<br/>Processor
    participant Pricing as Pricing<br/>Lookup
    participant Stream as Stream<br/>Storage

    Client->>Handler: POST /otlp/traces<br/>(X-P-Dataset-Tag: agent-observability)
    Handler->>Handler: Parse dataset tag
    alt Dataset Tag = AgentObservability
        Handler->>GenAI: Coerce field types
        GenAI->>GenAI: Parse ints/floats
        Handler->>GenAI: Enrich record
        GenAI->>Pricing: lookup_pricing(model)
        Pricing->>GenAI: Return input_price, output_price
        GenAI->>GenAI: Compute total tokens, duration, cost
        GenAI->>Stream: Push enriched record
    else
        Handler->>Stream: Push record as-is
    end
    Stream-->>Client: 200 OK

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • #1540: Builds on dataset_tag/DATASET_TAG_KEY and DatasetTag plumbing introduced here; depends on header parsing and stream creation metadata handling.
  • #1406: Concurrently modifies OTEL ingestion in src/handlers/http/ingest.rs with changes to setup_otel_stream and known-field processing behavior.
  • #1391: Earlier PR that introduced the OTEL ingestion flow now extended with dataset-tag-aware GenAI coercion and enrichment logic.

Poem

🐰 Hops of joy through token trails,
Where pricing lookups never fail,
GenAI streams enriched with care,
Dataset tags float through the air,
Observability's here to stay!

🚥 Pre-merge checks — 3 passed

  • Title check — Passed. The title clearly and specifically describes the main change: adding GenAI agent observability with server-side cost enrichment, which is the primary focus of the changeset across all modified files.
  • Description check — Passed. The PR description is comprehensive, addressing problem statement, changes made, design decisions, testing instructions, and test plan. However, the required checklist items (tested log ingestion/query, comments added, documentation added) are only partially marked — unit tests pass but manual end-to-end testing is not yet complete.
  • Docstring coverage — Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.



coderabbitai bot left a comment

🧹 Nitpick comments (4)
src/otel/genai.rs (4)

211-233: Prefix matching in lookup_pricing can produce false-positive cost attribution.

A model like "command-r-plus-online" (hypothetical Cohere model with different pricing) would silently inherit "command-r-plus" pricing. Similarly, any third-party model whose name happens to start with a known prefix (e.g., "gpt-4o-my-finetune") would get attributed costs. This is documented behavior and works well for version-suffixed model names, but consider adding a trace-level log when a prefix (non-exact) match is used, so operators can detect misattribution.

💡 Optional: log prefix matches for observability
     // Prefix match: find the longest matching prefix
     let mut best_match: Option<(&str, (f64, f64))> = None;
     for (key, &pricing) in PRICING_TABLE.iter() {
         if model.starts_with(key.as_str()) {
             match best_match {
                 Some((best_key, _)) if key.len() > best_key.len() => {
                     best_match = Some((key, pricing));
                 }
                 None => {
                     best_match = Some((key, pricing));
                 }
                 _ => {}
             }
         }
     }
-    best_match.map(|(_, pricing)| pricing)
+    best_match.map(|(matched_key, pricing)| {
+        tracing::trace!(
+            "GenAI pricing: prefix match '{}' -> '{}'",
+            model,
+            matched_key
+        );
+        pricing
+    })
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/otel/genai.rs` around lines 211 - 233, The lookup_pricing function
silently returns a prefix match from PRICING_TABLE which can misattribute costs;
update lookup_pricing to emit a trace-level log whenever a non-exact (prefix)
match is used by logging the input model, the matched key, and the resolved
pricing tuple before returning (i.e., when best_match is chosen but the initial
PRICING_TABLE.get(model) was None); keep exact matches unchanged and ensure the
log is at trace/debug level so operators can detect potential false-positive
attributions without noise.

95-140: Hardcoded pricing will silently drift from actual provider pricing.

The embedded pricing table is a snapshot; model prices change frequently (OpenAI alone has changed GPT-4o pricing multiple times). The override mechanism mitigates this, but operators who don't configure genai-pricing.json will get stale cost estimates with no warning. Consider logging a startup notice that default pricing is in use, or adding a comment noting the date these prices were last verified.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/otel/genai.rs` around lines 95 - 140, DEFAULT_PRICING is a hardcoded
snapshot that can become stale; update the code to (1) add an inline comment
near DEFAULT_PRICING noting the date these values were last verified, and (2)
detect at startup when the override file (genai-pricing.json) is not present or
not loaded and emit a clear startup log/warning that default pricing is in use
(include the verification date and advise operators to provide
genai-pricing.json to override). Reference DEFAULT_PRICING and the override
filename genai-pricing.json when implementing the detection and log message so
the warning appears whenever the code falls back to the hardcoded table.

170-185: CWD fallback in find_pricing_config can be surprising in containerized/service deployments.

When running Parseable as a service or in a container, the current working directory is often / or an arbitrary path, making the "genai-pricing.json" candidate unlikely to resolve as intended. Consider documenting that PARSEABLE_CONFIG_DIR is the recommended mechanism, or adding a log message when the CWD fallback is checked.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/otel/genai.rs` around lines 170 - 185, The CWD fallback in
find_pricing_config (the "genai-pricing.json" candidate) can be misleading in
containers; update find_pricing_config to log when the CWD candidate is
considered (use the existing logger/tracing crate) and recommend
PARSEABLE_CONFIG_DIR in the message, or remove the CWD candidate entirely;
specifically, modify the candidates logic in find_pricing_config to either (a)
replace the plain Some("genai-pricing.json") with code that checks
Path::new("genai-pricing.json").exists() and emits a debug/warn like "checked
CWD for genai-pricing.json; prefer PARSEABLE_CONFIG_DIR" when hit, or (b) drop
that candidate and add a log explaining that PARSEABLE_CONFIG_DIR is the
recommended mechanism, referencing PARSEABLE_CONFIG_DIR and the filename
"genai-pricing.json" so callers can find or set the expected config location.

266-322: p_genai_tokens_total and p_genai_cost_usd silently skipped when only one of input/output tokens is present.

Both p_genai_tokens_total (Line 276) and p_genai_cost_usd (Line 313) require both input_tokens and output_tokens to be Some. If a span reports only output tokens (e.g., streaming completions where input isn't re-counted), no total or cost is computed. This is a defensible choice but could lose enrichment for legitimate partial-token spans.

Consider computing partial values when at least one token count is available — treating the missing one as 0.

💡 Compute enrichment with partial token data
-    if let (Some(inp), Some(out)) = (input_tokens, output_tokens) {
+    let inp = input_tokens.unwrap_or(0);
+    let out = output_tokens.unwrap_or(0);
+    if input_tokens.is_some() || output_tokens.is_some() {
         record.insert(
             "p_genai_tokens_total".to_string(),
             Value::Number(Number::from(inp + out)),
         );
     }

Apply the same pattern to the cost calculation block.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/otel/genai.rs` around lines 266 - 322, In enrich_genai_record,
p_genai_tokens_total and p_genai_cost_usd currently require both input_tokens
and output_tokens to be Some; change them to compute using partial data by
treating missing token counts as 0 (use input_tokens.unwrap_or(0) and
output_tokens.unwrap_or(0)). For p_genai_tokens_total replace the (Some(inp),
Some(out)) check with at least one present or simply compute let total =
input_tokens.unwrap_or(0) + output_tokens.unwrap_or(0) and insert if total > 0;
for p_genai_cost_usd, if model_name is Some compute cost using inp.unwrap_or(0)
and out.unwrap_or(0) and then insert the p_genai_cost_usd value when the
lookup_pricing(model_name) returns pricing.
