feat(eval): decouple evaluation execution with remote eval support #1317

Draft
mjnovice wants to merge 4 commits into main from feat/decouple-eval-execution-remote-backend

@mjnovice
Contributor

Summary

  • Add SerializableSpan and ReconstructedSpan models for serializing/deserializing OpenTelemetry trace spans to JSON, enabling trace data to be sent to the backend for remote evaluation
  • Introduce LocalEvaluationStrategy and RemoteEvaluationStrategy via a strategy pattern, decoupling evaluator execution from the CLI process
  • Add RemoteEvaluationClient that submits evaluation payloads to the C# Agents backend (POST /evaluate) and polls for results (GET /evaluate/status/{id}) with exponential backoff
  • Add --remote-eval CLI flag and UIPATH_REMOTE_EVAL env var to opt into remote evaluation
  • Add skip_studio_web_reporting flag to avoid duplicate Studio Web reporting when the backend handles it
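A minimal sketch of how the strategy selection described above could look. The class names come from this PR; the `evaluate` method, the `select_strategy` helper, the returned dictionaries, and the set of truthy values accepted for `UIPATH_REMOTE_EVAL` are assumptions made for illustration and may differ from the actual implementation.

```python
import os
from abc import ABC, abstractmethod
from typing import Any, Dict, Mapping, Optional


class EvaluationStrategy(ABC):
    """Decides where evaluators run; the CLI only talks to this interface."""

    @abstractmethod
    def evaluate(self, payload: Mapping[str, Any]) -> Dict[str, Any]:
        """Run the evaluators for one payload (hypothetical method name)."""


class LocalEvaluationStrategy(EvaluationStrategy):
    def evaluate(self, payload: Mapping[str, Any]) -> Dict[str, Any]:
        # Placeholder: a real strategy would run the evaluators in-process.
        return {"mode": "local", "skip_studio_web_reporting": False}


class RemoteEvaluationStrategy(EvaluationStrategy):
    def evaluate(self, payload: Mapping[str, Any]) -> Dict[str, Any]:
        # Placeholder: a real strategy would submit to the backend and poll.
        # The backend reports to Studio Web, so the CLI skips its own report.
        return {"mode": "remote", "skip_studio_web_reporting": True}


def select_strategy(
    remote_eval_flag: bool, env: Optional[Mapping[str, str]] = None
) -> EvaluationStrategy:
    """Pick remote when --remote-eval or UIPATH_REMOTE_EVAL opts in."""
    env = os.environ if env is None else env
    # Assumption: these strings count as "enabled"; the PR does not say.
    raw = env.get("UIPATH_REMOTE_EVAL", "").strip().lower()
    env_enabled = raw in {"1", "true", "yes"}
    if remote_eval_flag or env_enabled:
        return RemoteEvaluationStrategy()
    return LocalEvaluationStrategy()
```

Keeping local execution as the default means an unset flag and an unset variable leave today's behavior untouched.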

Test plan

  • Verify uipath eval agent.json (without --remote-eval) behaves identically to the current behavior
  • Verify uipath eval agent.json --remote-eval submits to backend, polls, and displays results
  • Verify SerializableSpan round-trip: ReadableSpan → serialize → deserialize → ReconstructedSpan
  • Verify fallback behavior when backend is unreachable
  • Verify skip_studio_web_reporting prevents duplicate API calls
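The round-trip check in the test plan could be exercised along these lines. This is a sketch only: the field names, the `to_json` and `from_json` helpers, and the use of plain dataclasses are assumptions, and a real test would start from an OpenTelemetry `ReadableSpan` rather than hand-written values.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SerializableSpan:
    """JSON-friendly view of a span (field set is hypothetical)."""

    name: str
    trace_id: str  # hex string, so the 128-bit ID survives JSON
    span_id: str
    parent_span_id: Optional[str]
    start_time_ns: int
    end_time_ns: int
    attributes: Dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        return json.dumps(asdict(self))


@dataclass
class ReconstructedSpan:
    """Span rebuilt on the receiving side from the JSON payload."""

    name: str
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]
    start_time_ns: int
    end_time_ns: int
    attributes: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_json(cls, raw: str) -> "ReconstructedSpan":
        return cls(**json.loads(raw))


original = SerializableSpan(
    name="agent.run",
    trace_id="0af7651916cd43dd8448eb211c80319c",
    span_id="b7ad6b7169203331",
    parent_span_id=None,
    start_time_ns=1_700_000_000_000_000_000,
    end_time_ns=1_700_000_000_500_000_000,
    attributes={"llm.model": "example-model", "tokens": 42},
)
rebuilt = ReconstructedSpan.from_json(original.to_json())
round_trip_ok = asdict(rebuilt) == asdict(original)
```

Comparing the two `asdict` results checks every field at once, which makes it easy to spot a field that the serializer drops or coerces.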

🤖 Generated with Claude Code

mjnovice and others added 4 commits February 6, 2026 14:21
…rter

Move the service prefix into _get_base_url() so that localhost URLs use
/llmops_ while all other URLs use /llmopstenant_. This allows local
development to route to the correct service endpoint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…porter

When UIPATH_TRACE_BASE_URL is set, use it directly as the base URL
instead of deriving it from UIPATH_URL. This allows full control over
the trace endpoint without relying on the localhost/llmops_ heuristic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Simplify _get_base_url to only two paths: use UIPATH_TRACE_BASE_URL
verbatim if set, otherwise derive from UIPATH_URL with llmopstenant_
appended. The localhost/llmops_ heuristic is no longer needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
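The two-path resolution described in the commit above might be sketched as follows. The function here is a standalone stand-in for `_get_base_url`; the `env` parameter, the trailing-slash handling, and the behavior when `UIPATH_URL` is unset are assumptions, not details confirmed by the commit message.

```python
import os
from typing import Mapping, Optional


def get_base_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Resolve the trace base URL using the two paths from the commit."""
    env = os.environ if env is None else env

    # Path 1: an explicit override is used verbatim, with no suffix added.
    override = env.get("UIPATH_TRACE_BASE_URL")
    if override:
        return override

    # Path 2: derive from UIPATH_URL and append the service prefix.
    # Assumption: a single "/" joins the two parts.
    base = env.get("UIPATH_URL", "").rstrip("/")
    return f"{base}/llmopstenant_"
```

For local development, setting `UIPATH_TRACE_BASE_URL` to the full local endpoint replaces the earlier localhost heuristic.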
… eval support

Add strategy pattern to support running evaluators either locally (default)
or on a remote C# Agents backend via --remote-eval flag / UIPATH_REMOTE_EVAL
env var. When remote, the CLI serializes traces and agent output, POSTs to
/api/evaluate, polls for results, and skips duplicate Studio Web reporting.

New files:
- SerializableSpan/ReconstructedSpan models for trace serialization
- RemoteEvaluationClient for backend communication
- EvaluationStrategy with Local and Remote implementations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
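The poll-for-results step with exponential backoff could look roughly like this. It is a sketch under stated assumptions: `poll_with_backoff`, its parameters, the `status` key, and the terminal values `completed` and `failed` are all hypothetical, and the HTTP call is injected as a callable so the loop stays independent of any particular client library.

```python
import time
from typing import Any, Callable, Dict


def poll_with_backoff(
    fetch_status: Callable[[], Dict[str, Any]],
    *,
    initial_delay: float = 1.0,
    factor: float = 2.0,
    max_delay: float = 30.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until it reports a terminal state.

    The wait between attempts grows by `factor` each time and is
    capped at `max_delay`.
    """
    delay = initial_delay
    for attempt in range(max_attempts):
        result = fetch_status()
        # Assumption: the backend reports these terminal status values.
        if result.get("status") in {"completed", "failed"}:
            return result
        if attempt < max_attempts - 1:
            sleep(delay)
            delay = min(delay * factor, max_delay)
    raise TimeoutError(f"no terminal status after {max_attempts} attempts")
```

Capping the delay keeps long evaluations from drifting into very sparse polling, and the attempt limit gives the CLI a defined point at which to apply whatever fallback it chooses.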
@github-actions github-actions bot added labels on Feb 14, 2026: test:uipath-langchain (triggers tests in the uipath-langchain-python repository) and test:uipath-llamaindex (triggers tests in the uipath-llamaindex-python repository)