
fix(openai-agents): capture response.instructions as system prompt on generation spans#3739

Open
ritiztambi wants to merge 1 commit into traceloop:main from ritiztambi:fix/openai-agents-capture-instructions

Conversation


@ritiztambi ritiztambi commented Feb 27, 2026

Fixes #3738

Problem

The opentelemetry-instrumentation-openai-agents package does not capture response.instructions from the OpenAI Responses API. When an agent has instructions set (the system prompt), generation spans have no gen_ai.prompt entry with role: system — the system prompt is silently dropped.

The _extract_response_attributes() function reads temperature, max_output_tokens, top_p, model, frequency_penalty, output, and usage — but never response.instructions.

Fix

In on_span_end, when handling GenerationSpanData / ResponseSpanData, prepend response.instructions as a system message to the input data before calling _extract_prompt_attributes. This produces:

  • gen_ai.prompt.0.role = "system"
  • gen_ai.prompt.0.content = <agent instructions>

Followed by the conversation history at indices 1, 2, 3, etc.

Applied to both the primary code path (ResponseSpanData / GenerationSpanData) and the legacy fallback path.
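The prepend step described above can be sketched roughly as follows (a simplified sketch; the helper name and exact call sites in _hooks.py are illustrative and may differ from the actual change):

```python
def prepend_system_instructions(input_data, response, trace_content):
    """Prepend response.instructions as a system message when content
    tracing is enabled. Illustrative sketch, not the exact code."""
    instructions = getattr(response, "instructions", None)
    if not (trace_content and instructions):
        return input_data
    system_msg = {"role": "system", "content": instructions}
    # Existing conversation history shifts to indices 1, 2, 3, ...
    return [system_msg] + list(input_data or [])
```

With content tracing disabled, or with no instructions on the response, the input is returned unchanged, so existing prompt indices are unaffected.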

Reference

This matches exactly how the vanilla opentelemetry-instrumentation-openai package handles instructions in responses_wrappers.py:

if traced_response.instructions:
    _set_span_attribute(
        span,
        f"{GenAIAttributes.GEN_AI_PROMPT}.{prompt_index}.content",
        traced_response.instructions,
    )
    _set_span_attribute(
        span,
        f"{GenAIAttributes.GEN_AI_PROMPT}.{prompt_index}.role",
        "system",
    )

Testing

Tested with an OpenAI Agents SDK agent that has instructions set. Before: no system prompt in generation spans. After: system prompt appears as gen_ai.prompt.0 with role system.
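The resulting attribute layout can be illustrated with a small flattening sketch (the helper name here is hypothetical; in the instrumentation this work is done by _extract_prompt_attributes):

```python
def flatten_prompt_attributes(messages):
    """Flatten a message list into gen_ai.prompt.N.* span attributes.
    Hypothetical helper for illustration only."""
    attrs = {}
    for i, msg in enumerate(messages):
        attrs[f"gen_ai.prompt.{i}.role"] = msg["role"]
        attrs[f"gen_ai.prompt.{i}.content"] = msg["content"]
    return attrs

attrs = flatten_prompt_attributes([
    {"role": "system", "content": "You are a terse assistant."},
    {"role": "user", "content": "Hello"},
])
# attrs["gen_ai.prompt.0.role"] == "system"
# attrs["gen_ai.prompt.1.role"] == "user"
```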


Important

Capture response.instructions as a system prompt on generation spans, handled in on_span_end in _hooks.py.

  • Behavior:
    • In on_span_end in _hooks.py, prepend response.instructions as a system message to input data for GenerationSpanData and ResponseSpanData.
    • Sets gen_ai.prompt.0.role to "system" and gen_ai.prompt.0.content to <agent instructions>.
    • Applied to both primary and legacy code paths.
  • Reference:
    • Matches behavior of vanilla opentelemetry-instrumentation-openai in handling instructions.
  • Testing:
    • Verified with an agent having instructions set; system prompt now appears in generation spans.

This description was created by Ellipsis for 9b3bdd2.

Summary by CodeRabbit

  • Bug Fixes
    • System instruction messages are now prepended to prompt data when content tracing is enabled, ensuring they are included in telemetry for OpenAI agent operations and improving observability and tracing accuracy.


CLAassistant commented Feb 27, 2026

CLA assistant check
All committers have signed the CLA.


@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed everything up to 9b3bdd2 in 12 seconds.
  • Reviewed 47 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments. View those below.

Workflow ID: wflow_ZgSQMc8t1OACSjmc



coderabbitai bot commented Feb 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9b3bdd2 and 4f2e321.

📒 Files selected for processing (1)
  • packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py

📝 Walkthrough

Walkthrough

Prepend response.instructions as a system-role message to the input data when extracting prompt attributes during span end handling, so the instrumentor includes agent instructions in generated span attributes.

Changes

  • System Message Extraction (packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py): Added helper to prepend response.instructions as a {"role": "system", "content": ...} message to input_data when trace_content is enabled; updated span end paths (ResponseSpanData/GenerationSpanData and legacy fallback) to call the helper before _extract_prompt_attributes.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 I nudged a system prompt out of the mist,
Pushed it first in the prompt list.
Spans now see the instruction's art,
A tiny change, a clearer chart. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description Check ✅ Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: The title accurately describes the main change: capturing response.instructions as a system prompt in generation spans for the OpenAI Agents instrumentation.
  • Linked Issues Check ✅ Passed: The pull request implements all coding requirements from issue #3738: prepending response.instructions as a system message to input_data before prompt extraction, resulting in correct gen_ai.prompt attributes.
  • Out of Scope Changes Check ✅ Passed: All changes are directly related to the stated objective of capturing response.instructions as a system prompt on generation spans; no unrelated modifications detected.
  • Docstring Coverage ✅ Passed: Docstring coverage is 100.00%, which meets the required threshold of 80.00%.



@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py (1)

626-637: Deduplicate the instruction-prepend block to avoid behavior drift.

The same logic is duplicated in both branches. A small helper keeps both paths consistent and easier to maintain.

Refactor proposal
@@
 def _extract_response_attributes(otel_span, response, trace_content: bool):
@@
     return model_settings
+
+
+def _prepend_system_instruction(input_data, response, trace_content: bool):
+    if not (
+        trace_content
+        and response
+        and hasattr(response, "instructions")
+        and response.instructions
+    ):
+        return input_data
+    system_msg = {"role": "system", "content": response.instructions}
+    if not input_data:
+        return [system_msg]
+    return [system_msg] + (input_data if isinstance(input_data, list) else list(input_data))
@@
-                if (
-                    trace_content
-                    and response
-                    and hasattr(response, "instructions")
-                    and response.instructions
-                ):
-                    system_msg = {"role": "system", "content": response.instructions}
-                    input_data = [system_msg] + (input_data if input_data else [])
+                input_data = _prepend_system_instruction(
+                    input_data, response, trace_content
+                )
@@
-                if (
-                    trace_content
-                    and response
-                    and hasattr(response, "instructions")
-                    and response.instructions
-                ):
-                    system_msg = {"role": "system", "content": response.instructions}
-                    input_data = [system_msg] + (input_data if input_data else [])
+                input_data = _prepend_system_instruction(
+                    input_data, response, trace_content
+                )

Also applies to: 687-695

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py`
around lines 626 - 637, The duplicated block that prepends a system instruction
to input_data (checking span_data.response.instructions and creating system_msg)
should be extracted into a single helper function (e.g.,
_prepend_system_instructions(span_data, input_data) or
_maybe_prepend_system_instructions) and invoked from both places in _hooks.py
where the logic currently appears (the block around the response =
getattr(span_data, "response", None) and the duplicate at 687-695); implement
the helper to return the potentially modified input_data, handle None input_data
correctly, and update the two call sites (instead of duplicating the condition
and list construction) so behavior remains identical and easier to maintain.


📥 Commits

Reviewing files that changed from the base of the PR and between a78de64 and 9b3bdd2.

📒 Files selected for processing (1)
  • packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py

@ritiztambi ritiztambi force-pushed the fix/openai-agents-capture-instructions branch from 9b3bdd2 to 4f2e321 on February 27, 2026 at 14:16


Development

Successfully merging this pull request may close these issues.

🐛 Bug Report: OpenAI Agents instrumentor does not capture response.instructions (system prompt)

2 participants