Skip to content

Fix XPIA binary_path incompatibility for model targets (#5058420)#45527

Open
slister1001 wants to merge 1 commit intoAzure:mainfrom
slister1001:fix/bugbash-xpia-binary-path-5058420
Open

Fix XPIA binary_path incompatibility for model targets (#5058420)#45527
slister1001 wants to merge 1 commit intoAzure:mainfrom
slister1001:fix/bugbash-xpia-binary-path-5058420

Conversation

@slister1001
Copy link
Member

Bug Fix: #5058420

Problem: The indirect jailbreak (XPIA) strategy creates prompts with binary_path data type when injecting attack content into file-based contexts. When targeting a model (not an agent), PyRIT OpenAIChatTarget rejects binary_path with: ValueError: This target only supports text, image_path, and audio_path. Received: binary_path.

Fix: In _CallbackChatTarget._send_prompt_impl, when the request data type is binary_path, the file content is now read and inlined as text before invoking the callback. This prevents the ValueError from downstream targets that do not support binary_path.

Changes

  • Added import os at module level
  • Added binary_path-to-text conversion in _send_prompt_impl before message construction
  • The conversion reads the file (which contains injected context text) using UTF-8 with error replacement

When the indirect jailbreak (XPIA) strategy creates file-based context
prompts with binary_path data type, the callback chat target now reads
the file content and converts to text before invoking the callback.
This prevents ValueError from targets that don't support binary_path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 5, 2026 01:17
@slister1001 slister1001 requested a review from a team as a code owner March 5, 2026 01:17
@github-actions github-actions bot added the Evaluation Issues related to the client library for Azure AI Evaluation label Mar 5, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses an incompatibility where XPIA/indirect-jailbreak prompts can be generated with binary_path data type, which can be rejected by downstream model targets that only accept text-like inputs. The proposed fix in _CallbackChatTarget inlines the referenced file contents as text before invoking the callback.

Changes:

  • Add os import to support filesystem checks.
  • Inline binary_path file content as UTF-8 text (with replacement) before constructing the outgoing chat message payload.
  • Apply formatting cleanups for readability (multi-line conditions, logging, call sites).

Comment on lines 125 to 136
messages: List[Dict[str, str]] = []
for msg in conversation_history:
for piece in msg.message_pieces:
messages.append(
{
"role": (piece.api_role if hasattr(piece, "api_role") else str(piece.role)),
"role": (
piece.api_role
if hasattr(piece, "api_role")
else str(piece.role)
),
"content": piece.converted_value or piece.original_value or "",
}
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The binary_path-to-text conversion is only applied to the current request. Conversation history pieces are still appended using piece.converted_value or piece.original_value, so any prior binary_path messages will be sent as a filesystem path string (not the file contents). This can break multi-turn flows and reintroduce the same incompatibility if earlier turns used binary_path. Consider applying the same conversion logic when building messages from conversation_history (ideally via a small helper).

Copilot uses AI. Check for mistakes.
Comment on lines +115 to +119
if request.converted_value_data_type == "binary_path" and os.path.isfile(
request_content
):
with open(request_content, "r", encoding="utf-8", errors="replace") as f:
request_content = f.read()
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This async method reads the file synchronously (open(...).read()), which can block the event loop if the file is large or on a slow filesystem. Consider performing the read in a thread (e.g., asyncio.to_thread) or otherwise ensuring the read is non-blocking to avoid latency spikes during parallel executions.

Copilot uses AI. Check for mistakes.
Comment on lines +111 to +119
# Handle binary_path by reading file content and converting to text.
# XPIA (indirect jailbreak) strategy creates file-based context prompts with
# binary_path data type, but model targets only support text content.
request_content = request.converted_value or request.original_value or ""
if request.converted_value_data_type == "binary_path" and os.path.isfile(
request_content
):
with open(request_content, "r", encoding="utf-8", errors="replace") as f:
request_content = f.read()
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new binary_path handling reads and inlines local file contents based solely on the prompt value. If the file is unreadable (permissions/encoding issues) this will raise and fail the request, and if an untrusted caller can influence the path it could lead to unintended local file disclosure via the callback. Consider wrapping the file read in try/except with a clear error (or a safe fallback), and (if feasible in this layer) constraining reads to expected temp/context directories and/or imposing a max read size.

Copilot uses AI. Check for mistakes.
Comment on lines +27 to +29
callback: Callable[
[List[Dict], bool, Optional[str], Optional[Dict[str, Any]]], Dict
],
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The callback type is annotated as returning Dict, but _send_prompt_impl always awaits it (response = await self._callback(...)). This is a typing mismatch and also masks a real runtime footgun: a synchronous callback with the expected parameter names would pass get_chat_target's signature check but then fail here when awaited. Consider annotating as Callable[..., Awaitable[Dict[str, Any]]] and/or validating/wrapping sync callables in __init__ (or in get_chat_target).

Copilot uses AI. Check for mistakes.
Comment on lines +111 to +119
# Handle binary_path by reading file content and converting to text.
# XPIA (indirect jailbreak) strategy creates file-based context prompts with
# binary_path data type, but model targets only support text content.
request_content = request.converted_value or request.original_value or ""
if request.converted_value_data_type == "binary_path" and os.path.isfile(
request_content
):
with open(request_content, "r", encoding="utf-8", errors="replace") as f:
request_content = f.read()
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are extensive unit tests for _CallbackChatTarget, but the new binary_path-to-text conversion path isn’t covered. Adding a unit test that sets converted_value_data_type="binary_path" with a temp file (and asserts the callback receives the file contents, not the path) would prevent regressions; consider also a test for missing/unreadable files.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Evaluation Issues related to the client library for Azure AI Evaluation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants