Skip to content

Fix crescendo JSON parsing crash producing zero results (#5058399)#45526

Open
slister1001 wants to merge 1 commit intoAzure:mainfrom
slister1001:fix/bugbash-crescendo-json-5058399
Open

Fix crescendo JSON parsing crash producing zero results (#5058399)#45526
slister1001 wants to merge 1 commit intoAzure:mainfrom
slister1001:fix/bugbash-crescendo-json-5058399

Conversation

@slister1001
Copy link
Member

Bug Fix: #5058399

When Foundry crescendo orchestration returns non-JSON responses (e.g., raw markdown), the ScenarioOrchestrator now catches the error gracefully instead of propagating it. Partial results from successful attack strategies (e.g., baseline) are preserved even when other strategies fail.

Changes

  • Wrapped run_async() in try/except in ScenarioOrchestrator.execute()
  • Logs warnings for attack execution errors
  • Attempts to retrieve partial results from the scenario on failure

When Foundry's crescendo orchestration returns non-JSON responses,
the ScenarioOrchestrator now catches the error gracefully instead
of propagating it. Partial results from successful attack strategies
(e.g., baseline) are preserved even when other strategies fail.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 5, 2026 01:16
@slister1001 slister1001 requested a review from a team as a code owner March 5, 2026 01:16
@github-actions github-actions bot added the Evaluation Issues related to the client library for Azure AI Evaluation label Mar 5, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a crash that occurs when the Foundry crescendo orchestration returns non-JSON responses (e.g., raw markdown), causing run_async() to throw a JSON parsing exception and producing zero results. The fix wraps run_async() in a try/except inside ScenarioOrchestrator.execute() to catch such errors gracefully and attempt to recover any partial results from FoundryScenario._result.

Changes:

  • Added a try/except around self._scenario.run_async() in ScenarioOrchestrator.execute() to catch exceptions and log a warning instead of propagating
  • Added a nested try/except to retrieve partial results from self._scenario._result if available after a failure

)
# Try to retrieve partial results from the scenario
try:
if self._scenario and hasattr(self._scenario, "_result"):
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if self._scenario guard on line 124 is always True at this point. self._scenario is unconditionally assigned on line 97 before the try block, and no code path between lines 97 and 124 can set it back to None. The redundant check could be misleading to future maintainers by implying _scenario could be None here. It can be removed to simplify the code.

Suggested change
if self._scenario and hasattr(self._scenario, "_result"):
if hasattr(self._scenario, "_result"):

Copilot uses AI. Check for mistakes.
Comment on lines +117 to +127
except Exception as e:
self.logger.warning(
f"Error during attack execution for {self.risk_category}: {str(e)}. "
f"Partial results may still be available."
)
# Try to retrieve partial results from the scenario
try:
if self._scenario and hasattr(self._scenario, "_result"):
self._scenario_result = self._scenario._result
except Exception:
pass
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When run_async() throws, this catch block swallows the exception and causes execute() to return normally. However, _execution_manager.py (lines 158-175) also wraps orchestrator.execute() in a try/except that, on failure, records a "status": "failed" entry for the risk category and skips result processing. With this change, the exception is consumed here inside execute(), so that outer handler is never reached. Instead, execution proceeds to the post-processing code (lines 177-205 in _execution_manager.py) with _scenario_result set to None or a partial value.

The intended behavior (preserving partial results) may be achievable, but the side effect is that the "status": "failed" error tracking in _execution_manager.py is bypassed. If _scenario._result is also None or missing, the caller gets no results and no error indication — the failure is silently hidden. Consider whether a warning-level log is sufficient, or whether the orchestrator should surface a distinct return value or status to indicate partial/failed execution to its callers.

Copilot uses AI. Check for mistakes.
Comment on lines +126 to +127
except Exception:
pass
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No unit test covers the new exception-handling behavior introduced in execute() — specifically the case where run_async() raises an exception. The test suite has comprehensive coverage of ScenarioOrchestrator, and other error paths are tested (e.g., test_partial_execution_failure_preserves_successful_categories tests the outer _execution_manager.py handler). A test for the inner catch should verify that:

  1. When run_async() raises, execute() does not propagate the exception.
  2. If _scenario._result is populated, it is used as _scenario_result.
  3. If _scenario._result is absent, _scenario_result remains None.
Suggested change
except Exception:
pass
except Exception as retrieval_error:
self.logger.debug(
"Failed to retrieve partial scenario result after run_async error: %s",
retrieval_error,
exc_info=True,
)

Copilot uses AI. Check for mistakes.
Comment on lines +122 to +127
# Try to retrieve partial results from the scenario
try:
if self._scenario and hasattr(self._scenario, "_result"):
self._scenario_result = self._scenario._result
except Exception:
pass
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accessing self._scenario._result relies on an undocumented private attribute of the third-party PyRIT FoundryScenario class. There is no public API guarantee that this attribute exists, stores a ScenarioResult, or is populated with partial results when run_async() raises. Even if it exists today, its name, type, or semantics could change in any PyRIT version without notice.

The hasattr guard prevents a crash if the attribute is absent, but if it does exist with a different type or structure than ScenarioResult, assigning it to self._scenario_result will cause a downstream failure (e.g., in get_attack_results() which calls self._scenario_result.attack_results). Consider documenting a PyRIT version dependency or using the public API if one is available to retrieve partial results.

Suggested change
# Try to retrieve partial results from the scenario
try:
if self._scenario and hasattr(self._scenario, "_result"):
self._scenario_result = self._scenario._result
except Exception:
pass

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Evaluation Issues related to the client library for Azure AI Evaluation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants