[Evaluation] Fix red team status tracking, cache key mismatch, and evaluation error handling #45517

Open
slister1001 wants to merge 2 commits into main from fix/redteam-bugbash-status-scoring-cache

@slister1001
Member

Fixes three bugs discovered during the red team SDK bug bash:

Bug 1 - Run status stuck at in_progress: _determine_run_status() now treats leftover pending and running entries as failed instead of in_progress. By the time this method runs the scan is finished, so pending entries (from skipped risk categories or Foundry execution failures) indicate failure, not ongoing work.
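The status rule described for Bug 1 can be sketched as follows. This is a minimal illustration, not the SDK's implementation: the standalone function `determine_run_status`, the `FAILED_STATES` set, and the flat list-of-strings input are all assumptions made for the example, since the actual signature of `_determine_run_status()` is not shown in this PR description.

```python
from typing import Iterable

# Assumed status names for illustration. After the fix, "pending" and
# "running" count as failures because the scan has already finished.
FAILED_STATES = {"failed", "pending", "running"}


def determine_run_status(statuses: Iterable[str]) -> str:
    """Collapse per-task statuses into one run status once the scan is done.

    Hypothetical stand-in for _determine_run_status(); the real method's
    input shape is not shown in the PR.
    """
    if any(status.lower() in FAILED_STATES for status in statuses):
        return "failed"
    return "completed"


# A leftover "pending" entry no longer keeps the run at in_progress.
print(determine_run_status(["completed", "pending"]))
print(determine_run_status(["completed", "completed"]))
```

The design point is that the function has no in_progress branch at all: it is only called after the scan ends, so any non-terminal entry is evidence of work that never ran.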

Bug 2 - ungrounded_attributes silently skipped: _execute_attacks_with_foundry() now uses get_attack_objective_from_risk_category() to build the cache lookup key, matching the caching logic in _get_attack_objectives(). Previously, objectives were cached under isa but looked up under ungrounded_attributes, causing the category to appear to have 0 objectives despite the API returning 100.

Bug 3 - ServiceInvocationException inflating ASR: RAIServiceScorer now detects when the RAI evaluation service returns an error response (properties.outcome == 'error') and raises RuntimeError, causing PyRIT to treat the score as UNDETERMINED. Previously, the erroneous passed=False from error responses was incorrectly treated as attack success, inflating the protected_material ASR from 0% to 50%.
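The check described for Bug 3 might look roughly like the sketch below. The response shape (a mapping with a `passed` field and a nested `properties` mapping) and the helper name `passed_or_raise` are assumptions for illustration; only the `properties.outcome == 'error'` condition and the `RuntimeError` come from the PR description.

```python
from typing import Any, Mapping


def passed_or_raise(result: Mapping[str, Any]) -> bool:
    """Return the evaluation's passed flag, or raise if the service errored.

    Hypothetical helper; the real check lives inside RAIServiceScorer and
    its exact response schema is not shown in the PR.
    """
    properties = result.get("properties") or {}
    if properties.get("outcome") == "error":
        # An error response still carries passed=False, which must not be
        # read as a real verdict. Raising lets the caller mark it undetermined.
        raise RuntimeError("RAI evaluation service returned an error response")
    return bool(result.get("passed", False))


error_response = {"passed": False, "properties": {"outcome": "error"}}
try:
    passed_or_raise(error_response)
except RuntimeError as exc:
    print(f"score left undetermined: {exc}")
```

Raising rather than returning a sentinel keeps the error case out of the boolean path entirely, so a service failure cannot be counted toward ASR.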

slister1001 and others added 2 commits March 4, 2026 16:05
…r handling

Bug 1 - Status tracking: _determine_run_status now treats 'pending' and
'running' entries as 'failed' instead of 'in_progress'. By the time this
method runs the scan is finished, so leftover 'pending' entries (from
skipped risk categories or Foundry execution failures) indicate failure,
not ongoing work.

Bug 2 - Cache key mismatch: _execute_attacks_with_foundry now uses
get_attack_objective_from_risk_category() to build the cache lookup key,
matching the caching logic in _get_attack_objectives. Previously,
ungrounded_attributes objectives were cached under 'isa' but looked up
under 'ungrounded_attributes', causing them to be silently skipped.

Bug 3 - Evaluation error handling: RAIServiceScorer now detects when the
RAI evaluation service returns an error response (properties.outcome ==
'error', e.g. ServiceInvocationException) and raises RuntimeError. This
causes PyRIT to treat the score as UNDETERMINED instead of using the
erroneous passed=False to incorrectly mark the attack as successful,
which was inflating ASR.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 4, 2026 21:11
@slister1001 slister1001 requested a review from a team as a code owner March 4, 2026 21:11
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes three bugs found during the red team SDK bug bash:

  1. Run status stuck at in_progress: Treats leftover pending and running statuses as failed since the scan has already finished.
  2. ungrounded_attributes silently skipped: Fixes a cache key mismatch by using get_attack_objective_from_risk_category() instead of the raw risk value for the baseline cache lookup key.
  3. ServiceInvocationException inflating ASR: Detects error responses from the RAI evaluation service and raises RuntimeError so scores are marked as UNDETERMINED rather than being incorrectly treated as attack success.

Changes:

  • Updated _determine_run_status() to collapse pending/running into the failure set
  • Fixed cache key construction in _execute_attacks_with_foundry() to match the caching logic
  • Added error-outcome detection in RAIServiceScorer._score_piece_async() to prevent false attack-success counts

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| _result_processor.py | Treats pending/running as terminal failures in _determine_run_status() |
| _red_team.py | Uses get_attack_objective_from_risk_category() for consistent cache key lookup |
| _rai_scorer.py | Detects properties.outcome == "error" and raises RuntimeError for undetermined scoring |
| CHANGELOG.md | Documents the three bug fixes |

@github-actions github-actions bot added the Evaluation label (Issues related to the client library for Azure AI Evaluation) Mar 4, 2026