feat: GenAI agent observability with server-side cost enrichment #1550
Debanitrkl wants to merge 1 commit into parseablehq:main
…e coercion for OTel traces

Add server-side processing for GenAI/LLM traces tagged with `X-P-Dataset-Tag: agent-observability`. When traces arrive with this tag, Parseable automatically coerces OTel string-encoded numeric fields to native types and enriches each span with computed cost, throughput, and duration columns, making GenAI data SQL-queryable without manual CAST() or joins against external pricing tables.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Walkthrough
Introduces comprehensive GenAI observability support with field type coercion, token/cost enrichment, dynamic pricing lookup, and dataset-tag-driven conditional processing. Integrates GenAI-specific transformations into the OTEL ingestion pipeline when the dataset tag indicates agent observability workloads.
Sequence Diagram
sequenceDiagram
participant Client
participant Handler as OTEL<br/>Ingestion
participant GenAI as GenAI<br/>Processor
participant Pricing as Pricing<br/>Lookup
participant Stream as Stream<br/>Storage
Client->>Handler: POST /otlp/traces<br/>(X-P-Dataset-Tag: agent-observability)
Handler->>Handler: Parse dataset tag
alt Dataset Tag = AgentObservability
Handler->>GenAI: Coerce field types
GenAI->>GenAI: Parse ints/floats
Handler->>GenAI: Enrich record
GenAI->>Pricing: lookup_pricing(model)
Pricing->>GenAI: Return input_price, output_price
GenAI->>GenAI: Compute total tokens, duration, cost
GenAI->>Stream: Push enriched record
else
Handler->>Stream: Push record as-is
end
Stream-->>Client: 200 OK
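The branching in the diagram can be sketched in Rust roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: `DatasetTag`, `parse_dataset_tag`, `process_batch`, and the string-only `Record` type are hypothetical stand-ins, and the GenAI step is passed in as a closure so the sketch stays self-contained.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the tag parsed from the `X-P-Dataset-Tag` header.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DatasetTag {
    AgentObservability,
    Other,
}

/// Simplified record for illustration: field name -> string value.
pub type Record = HashMap<String, String>;

/// Map the raw header value to a tag; anything unrecognised is `Other`.
pub fn parse_dataset_tag(header: Option<&str>) -> DatasetTag {
    match header.map(str::trim) {
        Some("agent-observability") => DatasetTag::AgentObservability,
        _ => DatasetTag::Other,
    }
}

/// Apply the GenAI step (coercion + enrichment in the PR) only when the
/// stream is tagged. The tag is checked once per batch, not once per record.
pub fn process_batch(
    tag: DatasetTag,
    mut batch: Vec<Record>,
    genai_step: impl Fn(&mut Record),
) -> Vec<Record> {
    let is_genai = tag == DatasetTag::AgentObservability;
    if is_genai {
        for record in batch.iter_mut() {
            genai_step(record);
        }
    }
    // Untagged batches are returned unchanged ("push record as-is").
    batch
}
```

Calling `process_batch(parse_dataset_tag(None), batch, step)` returns the batch untouched, which mirrors the `else` branch of the diagram.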
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🧹 Nitpick comments (4)
src/otel/genai.rs (4)
211-233: Prefix matching in `lookup_pricing` can produce false-positive cost attribution.
A model like `"command-r-plus-online"` (hypothetical Cohere model with different pricing) would silently inherit `"command-r-plus"` pricing. Similarly, any third-party model whose name happens to start with a known prefix (e.g., `"gpt-4o-my-finetune"`) would get attributed costs. This is documented behavior and works well for version-suffixed model names, but consider adding a trace-level log when a prefix (non-exact) match is used, so operators can detect misattribution.
💡 Optional: log prefix matches for observability

         // Prefix match: find the longest matching prefix
         let mut best_match: Option<(&str, (f64, f64))> = None;
         for (key, &pricing) in PRICING_TABLE.iter() {
             if model.starts_with(key.as_str()) {
                 match best_match {
                     Some((best_key, _)) if key.len() > best_key.len() => {
                         best_match = Some((key, pricing));
                     }
                     None => {
                         best_match = Some((key, pricing));
                     }
                     _ => {}
                 }
             }
         }
    -    best_match.map(|(_, pricing)| pricing)
    +    best_match.map(|(matched_key, pricing)| {
    +        tracing::trace!(
    +            "GenAI pricing: prefix match '{}' -> '{}'",
    +            model,
    +            matched_key
    +        );
    +        pricing
    +    })
     }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/otel/genai.rs` around lines 211 - 233, The lookup_pricing function silently returns a prefix match from PRICING_TABLE which can misattribute costs; update lookup_pricing to emit a trace-level log whenever a non-exact (prefix) match is used by logging the input model, the matched key, and the resolved pricing tuple before returning (i.e., when best_match is chosen but the initial PRICING_TABLE.get(model) was None); keep exact matches unchanged and ensure the log is at trace/debug level so operators can detect potential false-positive attributions without noise.
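One way to make the exact-versus-prefix distinction visible to callers is to return the kind of match alongside the price, so the caller decides whether to log it. The sketch below uses only the standard library; `PriceMatch` and `lookup_with_kind` are hypothetical names, the table is passed in rather than read from `PRICING_TABLE`, and any prices used with it are placeholders, not real provider pricing.

```rust
use std::collections::HashMap;

/// Whether the price came from an exact key or from a prefix of the model name.
/// Hypothetical type for illustration; the PR returns a bare `(f64, f64)`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PriceMatch {
    Exact((f64, f64)),
    Prefix((f64, f64)),
}

/// Look up `(input_price, output_price)` for `model`.
/// An exact key wins; otherwise the longest key that is a prefix of `model`.
/// Returns the matched key so a caller can log possible misattribution.
pub fn lookup_with_kind<'a>(
    table: &'a HashMap<String, (f64, f64)>,
    model: &str,
) -> Option<(&'a str, PriceMatch)> {
    if let Some((key, &pricing)) = table.get_key_value(model) {
        return Some((key.as_str(), PriceMatch::Exact(pricing)));
    }
    table
        .iter()
        .filter(|(key, _)| model.starts_with(key.as_str()))
        // Two distinct keys of equal length cannot both prefix the same
        // model name, so the maximum is unique.
        .max_by_key(|(key, _)| key.len())
        .map(|(key, &pricing)| (key.as_str(), PriceMatch::Prefix(pricing)))
}
```

With a table containing `gpt-4o` and `gpt-4o-mini`, the name `gpt-4o-my-finetune` comes back as a `Prefix` match on `gpt-4o`, which is exactly the case the comment above suggests logging.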
95-140: Hardcoded pricing will silently drift from actual provider pricing.
The embedded pricing table is a snapshot; model prices change frequently (OpenAI alone has changed GPT-4o pricing multiple times). The override mechanism mitigates this, but operators who don't configure `genai-pricing.json` will get stale cost estimates with no warning. Consider logging a startup notice that default pricing is in use, or adding a comment noting the date these prices were last verified.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/otel/genai.rs` around lines 95 - 140, DEFAULT_PRICING is a hardcoded snapshot that can become stale; update the code to (1) add an inline comment near DEFAULT_PRICING noting the date these values were last verified, and (2) detect at startup when the override file (genai-pricing.json) is not present or not loaded and emit a clear startup log/warning that default pricing is in use (include the verification date and advise operators to provide genai-pricing.json to override). Reference DEFAULT_PRICING and the override filename genai-pricing.json when implementing the detection and log message so the warning appears whenever the code falls back to the hardcoded table.
170-185: CWD fallback in `find_pricing_config` can be surprising in containerized/service deployments.
When running Parseable as a service or in a container, the current working directory is often `/` or an arbitrary path, making the `"genai-pricing.json"` candidate unlikely to resolve as intended. Consider documenting that `PARSEABLE_CONFIG_DIR` is the recommended mechanism, or adding a log message when the CWD fallback is checked.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/otel/genai.rs` around lines 170 - 185, The CWD fallback in find_pricing_config (the "genai-pricing.json" candidate) can be misleading in containers; update find_pricing_config to log when the CWD candidate is considered (use the existing logger/tracing crate) and recommend PARSEABLE_CONFIG_DIR in the message, or remove the CWD candidate entirely; specifically, modify the candidates logic in find_pricing_config to either (a) replace the plain Some("genai-pricing.json") with code that checks Path::new("genai-pricing.json").exists() and emits a debug/warn like "checked CWD for genai-pricing.json; prefer PARSEABLE_CONFIG_DIR" when hit, or (b) drop that candidate and add a log explaining that PARSEABLE_CONFIG_DIR is the recommended mechanism, referencing PARSEABLE_CONFIG_DIR and the filename "genai-pricing.json" so callers can find or set the expected config location.
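A sketch of option (a) follows: build the candidate list with a label for where each path came from, so the caller can log when the CWD entry is the one that resolved. The PR description gives the search order as `$PARSEABLE_CONFIG_DIR`, cwd, then `~/.parseable/`; everything else here is an assumption. `pricing_candidates` and `first_existing` are hypothetical helpers, and the environment values and the existence check are passed in as arguments so the logic can be exercised without touching the filesystem.

```rust
use std::path::{Path, PathBuf};

const PRICING_FILE: &str = "genai-pricing.json";

/// Candidate locations in search order, each tagged with its origin.
/// `config_dir` would come from `PARSEABLE_CONFIG_DIR` and `home` from the
/// user's home directory; both are plain arguments in this sketch.
pub fn pricing_candidates(
    config_dir: Option<&str>,
    home: Option<&str>,
) -> Vec<(PathBuf, &'static str)> {
    let mut out = Vec::new();
    if let Some(dir) = config_dir {
        out.push((Path::new(dir).join(PRICING_FILE), "PARSEABLE_CONFIG_DIR"));
    }
    // Relative path: resolved against the process's current working directory.
    out.push((PathBuf::from(PRICING_FILE), "cwd"));
    if let Some(home) = home {
        out.push((Path::new(home).join(".parseable").join(PRICING_FILE), "home"));
    }
    out
}

/// Return the first candidate for which `exists` is true, with its origin
/// label. A caller could emit a warning when the label is "cwd".
pub fn first_existing<'a>(
    candidates: &'a [(PathBuf, &'static str)],
    exists: impl Fn(&Path) -> bool,
) -> Option<(&'a Path, &'static str)> {
    candidates
        .iter()
        .find(|(path, _)| exists(path))
        .map(|(path, origin)| (path.as_path(), *origin))
}
```

In real use the `exists` argument would be something like `|p| p.exists()`; injecting it keeps the search order testable in isolation.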
266-322: `p_genai_tokens_total` and `p_genai_cost_usd` silently skipped when only one of input/output tokens is present.
Both `p_genai_tokens_total` (Line 276) and `p_genai_cost_usd` (Line 313) require both `input_tokens` and `output_tokens` to be `Some`. If a span reports only output tokens (e.g., streaming completions where input isn't re-counted), no total or cost is computed. This is a defensible choice but could lose enrichment for legitimate partial-token spans.
Consider computing partial values when at least one token count is available, treating the missing one as 0.
💡 Compute enrichment with partial token data

    -    if let (Some(inp), Some(out)) = (input_tokens, output_tokens) {
    +    let inp = input_tokens.unwrap_or(0);
    +    let out = output_tokens.unwrap_or(0);
    +    if input_tokens.is_some() || output_tokens.is_some() {
             record.insert(
                 "p_genai_tokens_total".to_string(),
                 Value::Number(Number::from(inp + out)),
             );
         }

Apply the same pattern to the cost calculation block.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/otel/genai.rs` around lines 266 - 322, In enrich_genai_record, p_genai_tokens_total and p_genai_cost_usd currently require both input_tokens and output_tokens to be Some; change them to compute using partial data by treating missing token counts as 0 (use input_tokens.unwrap_or(0) and output_tokens.unwrap_or(0)). For p_genai_tokens_total replace the (Some(inp), Some(out)) check with at least one present or simply compute let total = input_tokens.unwrap_or(0) + output_tokens.unwrap_or(0) and insert if total > 0; for p_genai_cost_usd, if model_name is Some compute cost using inp.unwrap_or(0) and out.unwrap_or(0) and then insert the p_genai_cost_usd value when the lookup_pricing(model_name) returns pricing.
Summary
Adds server-side GenAI trace processing for streams tagged with `X-P-Dataset-Tag: agent-observability`. When OTel traces arrive with this tag, Parseable automatically:
- Coerces string-encoded `IntValue` and float attributes to native numeric JSON types
- Adds `p_genai_cost_usd`, `p_genai_tokens_total`, `p_genai_tokens_per_sec`, and `p_genai_duration_ms` to every span before storage

This makes GenAI trace data immediately SQL-queryable: no `CAST()`, no external pricing lookups, no post-processing.
Problem
OTel GenAI traces (from `opentelemetry-instrumentation-openai-v2` and similar instrumentors) have several pain points when stored as-is:
- Type mismatch: OTel's protobuf-to-JSON serialization turns `IntValue(1250)` into the string `"1250"`. Without correction, `gen_ai.usage.input_tokens` becomes a `Utf8` column and `SUM()` fails.
- No cost visibility: Token counts arrive raw. To answer "how much did this agent run cost?", users must maintain a separate model pricing table and join it in every query.
- Missing derived metrics: Output throughput (tokens/sec) requires combining a span attribute with the span's timing metadata, which is awkward in SQL. Duration is stored in nanoseconds, but humans think in milliseconds.
- Schema drift: Without pre-registered fields, the first event defines column types. If the first trace has a string token count, the column is permanently `Utf8`.

What changed
New: `src/otel/genai.rs` (564 lines)
The core GenAI module with four parts:
- Field definitions: `GENAI_KNOWN_FIELD_LIST` (31 fields with Arrow types), `GENAI_INT_FIELDS` (6 fields), `GENAI_FLOAT_FIELDS` (5 fields). These are the source of truth for schema creation and type coercion.
- Type coercion: `coerce_genai_field_types()` converts string-encoded integers and floats to native JSON numeric types. It only touches known GenAI fields and safely skips values that are already numeric or unparseable.
- Cost enrichment: `enrich_genai_record()` computes four columns per span:

| Column | Formula |
|---|---|
| `p_genai_cost_usd` | `(input_tokens × input_price) + (output_tokens × output_price)` using embedded pricing table |
| `p_genai_tokens_total` | `input_tokens + output_tokens` |
| `p_genai_tokens_per_sec` | `output_tokens / (span_duration_ns / 1e9)` |
| `p_genai_duration_ms` | `span_duration_ns / 1e6` |

- Pricing table: Embedded per-token pricing for 30+ models across OpenAI, Anthropic, Google, Mistral, Cohere, Meta, and Groq. Uses `LazyLock` for zero per-request allocation. Supports:
  - Exact match (`gpt-4o` → `gpt-4o`)
  - Prefix match (`gpt-4o-2024-11-20` → `gpt-4o`, longest prefix wins)
  - Custom overrides via `genai-pricing.json` (searched in `$PARSEABLE_CONFIG_DIR`, cwd, `~/.parseable/`)

Unit tests: 10 tests covering type coercion (int, float, skip non-string, handle invalid), enrichment (basic, response-model-over-request-model, unknown model, missing tokens), pricing prefix match, and field count validation.
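The coercion and enrichment steps described above can be illustrated with a standard-library-only sketch. It is not the PR's implementation: the review snippets suggest the real code works on `serde_json` values, whereas this sketch uses a small hypothetical `Value` enum; `coerce_ints` and `enrich` are illustrative names; the span duration and the per-token price are passed in as arguments; and the attribute names follow the OTel GenAI conventions, which may differ from the flattened column names Parseable actually stores.

```rust
use std::collections::HashMap;

/// Minimal stand-in for a JSON value (hypothetical; not `serde_json::Value`).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Str(String),
    Int(i64),
    Float(f64),
}

pub type Record = HashMap<String, Value>;

// Assumed attribute names, following the OTel GenAI semantic conventions.
const INPUT_TOKENS: &str = "gen_ai.usage.input_tokens";
const OUTPUT_TOKENS: &str = "gen_ai.usage.output_tokens";
const INT_FIELDS: &[&str] = &[INPUT_TOKENS, OUTPUT_TOKENS];

/// Convert string-encoded integers in known fields to native integers.
/// Values that are already numeric, or that fail to parse, are left as-is.
pub fn coerce_ints(record: &mut Record) {
    for field in INT_FIELDS {
        let parsed = match record.get(*field) {
            Some(Value::Str(s)) => s.trim().parse::<i64>().ok(),
            _ => None,
        };
        if let Some(n) = parsed {
            record.insert(field.to_string(), Value::Int(n));
        }
    }
}

fn get_int(record: &Record, key: &str) -> Option<i64> {
    match record.get(key) {
        Some(Value::Int(n)) => Some(*n),
        _ => None,
    }
}

/// Add the four derived columns using the formulas from the table above.
/// `price` is `(input_price, output_price)` per token, if the model is known.
pub fn enrich(record: &mut Record, span_duration_ns: i64, price: Option<(f64, f64)>) {
    let input = get_int(record, INPUT_TOKENS);
    let output = get_int(record, OUTPUT_TOKENS);

    // Total and cost need both token counts, as in the PR under review.
    if let (Some(inp), Some(out)) = (input, output) {
        record.insert("p_genai_tokens_total".to_string(), Value::Int(inp + out));
        if let Some((in_price, out_price)) = price {
            let cost = inp as f64 * in_price + out as f64 * out_price;
            record.insert("p_genai_cost_usd".to_string(), Value::Float(cost));
        }
    }

    // Guard against zero or negative durations before dividing (assumption).
    if span_duration_ns > 0 {
        let ns = span_duration_ns as f64;
        record.insert("p_genai_duration_ms".to_string(), Value::Float(ns / 1e6));
        if let Some(out) = output {
            let per_sec = out as f64 / (ns / 1e9);
            record.insert("p_genai_tokens_per_sec".to_string(), Value::Float(per_sec));
        }
    }
}
```

For a span with 1250 input tokens, 500 output tokens, and a 2-second duration, this yields a total of 1750 tokens, 2000 ms, and 250 tokens per second; the cost depends entirely on the price pair supplied.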
Modified: `src/handlers/http/ingest.rs`
In `setup_otel_stream()`:
- Reads `X-P-Dataset-Tag` from request headers
- When the tag is `AgentObservability`, merges `GENAI_KNOWN_FIELD_LIST` into the known fields set before stream creation; this ensures all 31 fields are in the schema with correct types from day one
- Passes `dataset_tag` (instead of `None`) to `create_stream_if_not_exists()`

Modified: `src/handlers/http/modal/utils/ingest_utils.rs`
In `flatten_and_push_logs()`, in the `OtelTraces` branch:
- Reads `dataset_tag` via `get_dataset_tag()`
- When the tag is `AgentObservability`, applies `coerce_genai_field_types()` then `enrich_genai_record()` to each flattened trace record before pushing to storage
- The `is_genai` check is done once per batch (not per-record), so non-GenAI streams have zero overhead

Modified: `src/otel.rs`
Added `pub mod genai;` to register the new module.
New: `resources/parseable-genai-collector.yaml`
Canonical OTel Collector configuration template for GenAI traces. Uses `${PARSEABLE_URL}`, `${PARSEABLE_AUTH}`, `${STREAM_NAME}` variables. Includes the critical `X-P-Dataset-Tag: agent-observability` header.
New: `resources/genai-pricing-example.json`
Example custom pricing override file showing the expected format.
Design decisions
Why enrich at ingest, not query time?
Computing cost at query time would require a UDF or a JOIN against a pricing table for every query. Ingest-time enrichment means `SELECT SUM(p_genai_cost_usd) FROM "genai-traces"` just works. The tradeoff is that if pricing changes, historical data keeps the old price, but that is actually correct (you paid the old price for old calls).
Why embedded pricing, not config-only?
Zero-config is important for adoption. The embedded table covers the most common models. The `genai-pricing.json` override exists for custom/fine-tuned models or price updates between Parseable releases.
Why prefix matching for model names?
Model IDs include date suffixes (`gpt-4o-2024-11-20`, `claude-3-5-sonnet-20241022`). Listing every dated variant would be unmaintainable. Prefix match with "longest wins" handles this cleanly: `gpt-4o-mini` resolves to its own entry rather than to `gpt-4o`, because `gpt-4o-mini` is the longer matching prefix in the table.
Why gate on DatasetTag, not auto-detect?
Auto-detecting GenAI fields would add overhead to every OTel trace ingestion. The explicit `X-P-Dataset-Tag: agent-observability` header is opt-in: only streams that declare themselves as GenAI get the processing. This keeps non-GenAI trace ingestion at zero additional cost.
How to test
1. Start Parseable
2. Start OTel Collector with the included config
3. Send instrumented GenAI traces
4. Query enriched data
Token counts should be native integers (not strings), and the `p_genai_*` columns should be populated with correct values.
Test plan
- `cargo test`: 10 tests in `otel::genai::tests`
- … (no `p_genai_*` columns, no type coercion)
- … `genai-pricing.json`

Summary by CodeRabbit
Release Notes