Preserve multimodal media in saved eval results by d42me · Pull Request #1015 · PrimeIntellect-ai/verifiers

d42me · 2026-03-13T02:33:53Z

Summary

preserve multimodal images and audio in saved eval results instead of collapsing them to placeholders
normalize image data URLs into base64-backed input_image records and preserve structured input_audio payloads when valid
add regression coverage for data URL parsing, audio alias handling, typed message serialization, and prompt/completion save paths

Note

Medium Risk
Changes how prompts/completions are serialized when saving eval results, which can affect downstream consumers expecting prior placeholder text formats. New parsing/normalization logic for multimodal parts could drop or transform malformed media payloads differently than before.

Overview
Saved eval outputs now preserve structured multimodal content in prompt/completion (including image_url and input_audio) by switching states_to_outputs to use new serialize_messages_for_output rather than messages_to_printable placeholders.

message_utils adds multimodal-safe serialization helpers (including audio alias support and whitespace-compacting normalization) plus _parse_data_url validation utilities, and tests are expanded/renamed to cover data-url parsing, typed-message serialization, audio fallback behavior, and save-path regression for multimodal prompts/completions. A small reliability tweak also updates RLM sandbox cleanup to use shutil.rmtree(..., ignore_errors=True).
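To make the data-URL validation described above concrete, here is a minimal sketch of the idea. It is not the PR's `_parse_data_url`: the function name `parse_image_data_url` and all three regex patterns are assumptions written for illustration, since the real patterns are not shown in this excerpt. Only the final whitespace-compacting check and the `media_type.lower(), compact_data` return shape follow the diff hunk quoted in the review.

```python
import re
from typing import Optional

# Hypothetical patterns: the PR's actual _DATA_URL_RE, _IMAGE_MEDIA_TYPE_RE and
# _BASE64_DATA_RE are not visible here, so these are illustrative stand-ins.
DATA_URL_RE = re.compile(
    r"data:(?P<media_type>[^;,]+);base64,(?P<data>.*)",
    re.DOTALL | re.IGNORECASE,
)
IMAGE_MEDIA_TYPE_RE = re.compile(r"image/[a-z0-9.+-]+", re.IGNORECASE)
BASE64_DATA_RE = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def parse_image_data_url(url: str) -> Optional[tuple[str, str]]:
    """Return (media_type, base64_data) for a valid image data URL, else None."""
    match = DATA_URL_RE.fullmatch(url.strip())
    if match is None:
        return None

    media_type = match.group("media_type").strip()
    if not IMAGE_MEDIA_TYPE_RE.fullmatch(media_type):
        return None

    # Compact the payload by dropping all whitespace (line breaks, spaces).
    compact_data = "".join(match.group("data").split())
    if not BASE64_DATA_RE.fullmatch(compact_data):
        return None

    return media_type.lower(), compact_data
```

Under these assumed patterns, a wrapped payload such as `"data:image/PNG;base64,aGVs\nbG8="` is returned as `("image/png", "aGVsbG8=")`, while non-image media types and non-base64 payloads yield `None`.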

Written by Cursor Bugbot for commit f4dfaf1. This will update automatically on new commits.

verifiers/utils/message_utils.py

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


cursor · 2026-03-14T01:51:25Z

verifiers/utils/message_utils.py

+    if not _BASE64_DATA_RE.fullmatch(compact_data):
+        return None
+
+    return media_type.lower(), compact_data


Unused production function with supporting regex constants

Low Severity

_parse_data_url and its three module-level compiled regexes (_DATA_URL_RE, _IMAGE_MEDIA_TYPE_RE, _BASE64_DATA_RE) are defined in production code but never called from any production code path. The function is only imported and invoked in the test file test_message_utils_multimodal.py. Neither _extract_image_part_for_output nor serialize_message_for_output calls it. Additionally, verifiers/clients/anthropic_messages_client.py already contains a parse_data_url function that appears to serve the same purpose, making this a potential duplication as well.

Additional Locations (1)

verifiers/utils/message_utils.py#L22-L28

hallerite

left some comments, but lgtm! pre-approved.

hallerite · 2026-03-13T05:29:26Z

verifiers/envs/experimental/rlm_env.py


-        await asyncio.to_thread(shutil.rmtree, session.local_rollout_dir, True)
+        await asyncio.to_thread(
+            lambda: shutil.rmtree(session.local_rollout_dir, ignore_errors=True)


I don't like the lambda
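One lambda-free way to write this, offered as a sketch and not as what the PR ended up doing: `asyncio.to_thread` forwards keyword arguments to the callable, so `ignore_errors=True` can be passed directly. The helper name `cleanup_rollout_dir` and the `main` driver are hypothetical and exist only to make the example runnable.

```python
import asyncio
import shutil
import tempfile
from pathlib import Path


async def cleanup_rollout_dir(path: str) -> None:
    # asyncio.to_thread(func, /, *args, **kwargs) passes kwargs through to func,
    # so no lambda or functools.partial is needed to set ignore_errors.
    await asyncio.to_thread(shutil.rmtree, path, ignore_errors=True)


async def main() -> bool:
    tmp = tempfile.mkdtemp()
    Path(tmp, "scratch.txt").write_text("x")
    await cleanup_rollout_dir(tmp)
    # A second call on the now-missing directory must not raise.
    await cleanup_rollout_dir(tmp)
    return Path(tmp).exists()
```

Applied to the hunk above, the call would read `await asyncio.to_thread(shutil.rmtree, session.local_rollout_dir, ignore_errors=True)`.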

hallerite · 2026-03-14T08:14:36Z

verifiers/utils/message_utils.py

+    return media_type.lower(), compact_data
+
+
+def _extract_image_part_for_output(part: Mapping[str, Any]) -> dict[str, Any] | None:


all this dict matching is annoying and it would be better to have an image type that we can use internally for better readability (which I have a PR for), but I think it's fine temporarily
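To illustrate the kind of internal image type this comment has in mind, here is a small sketch. It is purely hypothetical: the class name `ImagePart`, its fields, and `from_part` are assumptions and do not come from the reviewer's PR, which is not shown here. The only thing assumed about the input is the OpenAI-style `image_url` content part shape.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ImagePart:
    """Hypothetical typed wrapper for an OpenAI-style image_url content part."""

    url: str
    detail: Optional[str] = None

    @classmethod
    def from_part(cls, part: Mapping[str, Any]) -> Optional["ImagePart"]:
        # Do the dict matching once, here, so callers work with attributes.
        if part.get("type") != "image_url":
            return None
        image_url = part.get("image_url")
        if isinstance(image_url, str):
            return cls(url=image_url)
        if isinstance(image_url, Mapping) and isinstance(image_url.get("url"), str):
            detail = image_url.get("detail")
            return cls(
                url=image_url["url"],
                detail=detail if isinstance(detail, str) else None,
            )
        return None
```

With a type like this, serialization code could branch on `isinstance(part, ImagePart)` instead of repeating nested key checks at each call site.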

d42me requested a review from hallerite March 13, 2026 02:36

Preserve multimodal media in saved eval results

ce3efb6

d42me requested review from mikasenghaas and snimu March 13, 2026 02:36

d42me force-pushed the fix/save-multimodal-base64-results branch from 2a656f0 to ce3efb6 Compare March 13, 2026 02:36

cursor bot reviewed Mar 13, 2026


Preserve OpenAI-style multimodal content in saved outputs

f4dfaf1

cursor bot reviewed Mar 14, 2026


hallerite approved these changes Mar 14, 2026

d42me wants to merge 2 commits into PrimeIntellect-ai:main from d42me:fix/save-multimodal-base64-results
