[fix] Resolve failing local web tests (oss / ee) by jp-agenta · Pull Request #3950 · Agenta-AI/agenta

jp-agenta · 2026-03-10T15:52:24Z

No description provided.

vercel · 2026-03-10T15:52:30Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Mar 12, 2026 8:39am

github-actions · 2026-03-10T16:08:43Z

Railway Preview Environment


Preview URL	https://gateway-production-1082.up.railway.app/w
Project	`agenta-oss-pr-3950`
Image tag	`pr-3950-31ad3df`
Status	Deployed
Railway logs	Open logs
Workflow logs	View workflow run
Updated at 2026-03-12T08:47:22.767Z

Copilot

Pull request overview

This PR addresses stability and reliability issues across local/CI test execution (web Playwright + Python pytest) and Railway preview deployments, with supporting refactors to standardize runtime paths and environment handling across OSS/EE.

Changes:

Standardize Playwright runtime outputs/paths (results/reports/storage state/project metadata) and harden OTP/Testmail handling for web tests.
Expand/adjust CI workflows for preview environments and add/adjust test runners + reporting for API/SDK.
Improve Railway deployment scripts (compose-sourced infra images, safer secret defaults) and update docker-compose baselines (Postgres 17).

Reviewed changes

Copilot reviewed 80 out of 91 changed files in this pull request and generated 2 comments.

File	Description
web/turbo.json	Include `DISABLE_PRETTIER` in Turbo global env for cache correctness.
web/tests/utils/testmail/index.ts	Refactor Testmail email/tag generation + timeout/logging improvements.
web/tests/tests/fixtures/user.fixture/authHelpers/utilities.ts	Switch user email generation to runtime-aware helper.
web/tests/tests/fixtures/user.fixture/authHelpers/index.ts	Make OTP UI automation more resilient and add flow logging.
web/tests/tests/fixtures/session.fixture/index.ts	Use runtime Chromium launch options (allowed ports).
web/tests/tests/fixtures/base.fixture/providerHelpers/index.ts	Read project metadata via runtime path helper.
web/tests/tests/fixtures/base.fixture/apiHelpers/index.ts	Read project metadata via runtime path helper; minor formatting.
web/tests/playwright/scripts/run-tests.ts	Formatting-only change for dimension flag regex.
web/tests/playwright/global-teardown.ts	Use runtime paths; rename destructive teardown env var; update messaging.
web/tests/playwright/config/testTags.ts	Collapse re-export type formatting.
web/tests/playwright/config/runtime.ts	New centralized runtime path + Chromium launch option helpers.
web/tests/playwright.config.ts	Route report/output/storageState/launchOptions through runtime helpers.
web/tests/README.md	Update auth + teardown env var guidance; document email format.
web/tests/.gitignore	Ignore new `results/` and `reports/` directories.
web/packages/eslint.config.mjs	Allow disabling prettier rule/plugin via `DISABLE_PRETTIER`.
web/oss/tests/playwright/acceptance/testsset/index.ts	Formatting-only change.
web/oss/tests/playwright/acceptance/smoke.spec.ts	Formatting-only change.
web/oss/tests/playwright/acceptance/prompt-registry/index.ts	Formatting-only change.
web/oss/tests/playwright/acceptance/playground/tests.ts	Disable `networkidle` wait (commented) and formatting tweaks.
web/oss/tests/playwright/acceptance/app/test.ts	Simplify response predicate formatting.
web/oss/tests/playwright/acceptance/.gitkeep	Add placeholder file.
web/oss/tests/manual/cell-renderers/test-extract-chat-messages.ts	Formatting-only change.
web/oss/src/lib/helpers/auth/turnstile.ts	Add commented bypass test site key for Turnstile.
web/eslint.config.mjs	Allow disabling prettier rule/plugin via `DISABLE_PRETTIER`; minor quoting changes.
web/ee/tests/playwright/acceptance/.gitkeep	Add placeholder file.
sdk/oss/tests/pytest/unit/.gitkeep	Add placeholder file.
sdk/oss/tests/pytest/acceptance/integrations/test_vault_secrets.py	Add eventual-consistency polling helper for secrets list assertions.
sdk/oss/tests/pytest/acceptance/.gitkeep	Add placeholder file.
sdk/agenta/sdk/assets.py	Remove older model IDs from built-in model lists.
hosting/railway/oss/scripts/preview-resolve-env.sh	New shared env-resolution script for preview deploys.
hosting/railway/oss/scripts/preview-create-or-update.sh	Refactor to source shared env-resolution script; adjust variable usage.
hosting/railway/oss/scripts/lib.sh	Add compose-image resolution helpers (service + redis).
hosting/railway/oss/scripts/deploy-from-images.sh	Resolve Redis image from compose; remove baked-in placeholder auth/crypt envs.
hosting/railway/oss/scripts/configure.sh	Replace placeholder key defaults with `replace-me`; refactor Postgres password resolution; allow optional Daytona key.
hosting/railway/oss/scripts/bootstrap.sh	Resolve infra images from compose baseline via new helpers.
hosting/railway/oss/README.md	Document compose-baseline image resolution and update workflow references.
hosting/docker-compose/oss/docker-compose.gh.yml	Bump Postgres image from 16 to 17.
hosting/docker-compose/oss/docker-compose.gh.ssl.yml	Bump Postgres image from 16 to 17.
hosting/docker-compose/oss/docker-compose.gh.local.yml	Bump Postgres image from 16 to 17.
hosting/docker-compose/oss/docker-compose.dev.yml	Bump Postgres image from 16 to 17.
hosting/docker-compose/ee/docker-compose.gh.local.yml	Bump Postgres image from 16 to 17.
hosting/docker-compose/ee/docker-compose.dev.yml	Bump Postgres image from 16 to 17.
docs/designs/web-tests/per-test-user-isolation.md	New design doc for worker/test-scoped user/project isolation strategy.
docs/designs/testing/testing.running.specs.md	Update workflow references for styling + unit checks.
docs/design/railway-preview-environments/status.md	Update workflow inventory for preview automation.
docs/design/railway-preview-environments/plan.md	Update plan to reference new workflow structure.
docs/design/playwright-oss-stabilization/status.md	Rename teardown safety env var in guidance.
docs/design/playwright-oss-stabilization/research.md	Rename teardown safety env var in guidance.
docs/design/playwright-oss-stabilization/qa.md	Update auth guidance and teardown safety env var.
docs/design/playwright-oss-stabilization/context.md	Rename teardown safety env var in guidance.
docs/design/playwright-oss-stabilization/backlog.md	Rename teardown safety env var in backlog item text.
docs/community-topics.md	Update Railway workflow references.
api/run-tests.py	Add `AGENTA_LICENSE` envvar support and ensure junit/html reports are generated when not provided.
api/pytest.ini	Remove default junit/html outputs from addopts (now handled by runner).
api/oss/tests/pytest/acceptance/workflows/test_workflow_embeds_security.py	New acceptance tests for embeds security/archived behavior (with TODO notes).
api/oss/tests/pytest/acceptance/workflows/test_workflow_embeds_retrieve_resolve.py	New acceptance tests for `resolve=True` on retrieve/query endpoints.
api/oss/tests/pytest/acceptance/workflows/test_workflow_embeds_legacy.py	New acceptance tests for legacy adapter embed resolution paths.
api/oss/tests/pytest/acceptance/workflows/test_workflow_embeds_errors.py	New acceptance tests covering embed resolution error policies and limits.
api/oss/tests/pytest/acceptance/workflows/test_workflow_embeds_cross_entity.py	New acceptance tests for cross-entity embeds (environments/workflows).
api/oss/tests/pytest/acceptance/workflows/test_workflow_embeds.py	New baseline acceptance tests for embed resolution.
api/oss/tests/pytest/acceptance/.gitkeep	Add placeholder file.
api/oss/src/utils/env.py	Add commented Turnstile bypass site key for tests.
api/oss/src/utils/caching.py	Adjust cache TTL constants/comments (L1 shorter).
api/oss/src/routers/projects_router.py	Harden org lookup when fetching project auth context.
api/ee/tests/pytest/acceptance/.gitkeep	Add placeholder file.
.windsurf/workflows/record-and-refactor-e2e-2.md	Remove workflow doc.
.gitignore	Align ignored test artifact dirs; add web/tests/reports; remove some generic results ignores.
.github/workflows/45-railway-cleanup.yml	Rename workflow display name.
.github/workflows/44-railway-tests.yml	New reusable workflow to run API/SDK/web tests against Railway preview deploys.
.github/workflows/43-railway-deploy.yml	Convert to reusable deploy workflow with additional secrets and outputs.
.github/workflows/42-railway-build.yml	Convert to reusable build workflow; support “skip build” when tag provided; remove direct deploy chaining.
.github/workflows/41-railway-setup.yml	New reusable workflow to bootstrap Railway preview project/env.
.github/workflows/40-railway.yml	Add grouping “header” workflow.
.github/workflows/33-update-api-docs.yml	Rename workflow display name.
.github/workflows/32-generate-demo-traces.yml	Rename workflow display name.
.github/workflows/31-sync-github-labels.yml	Change to scheduled/dispatch run; remove PR/push triggers; disable dry-run.
.github/workflows/30-crons.yml	Add grouping “header” workflow.
.github/workflows/14-check-pr-preview.yml	New orchestration workflow chaining build → setup → deploy → tests.
.github/workflows/12-check-unit-tests.yml	New workflow scaffolding for unit checks (SDK/API wired; web/services currently fail if tests exist).
.github/workflows/11-check-code-styling.yml	New unified Ruff/Prettier/ESLint styling workflow.
.github/workflows/10-playwright-oss-tests.yml	Remove old Playwright OSS workflow (replaced by new preview test workflow).
.github/workflows/10-checks.yml	Add grouping “header” workflow.
.github/workflows/04-check-frontend-linting.yml	Remove old frontend linting workflow (replaced by unified styling workflow).
.github/workflows/03-check-python-linting.yml	Remove old python lint workflow (replaced by unified styling workflow).
.github/workflows/02-check-python-formatting.yml	Remove old python format workflow (replaced by unified styling workflow).
.github/workflows/00-releases.yml	Add grouping “header” workflow.

.github/workflows/44-railway-tests.yml

Copilot

Pull request overview

Copilot reviewed 81 out of 92 changed files in this pull request and generated 4 comments.

Copilot · 2026-03-12T00:16:19Z

.github/workflows/11-check-code-styling.yml

+        run: cd web && pnpm install --frozen-lockfile
+
+      - name: Run Prettier formatting fix
+        run: cd web && pnpm run format-fix


This workflow is labeled as a code styling check, but it runs `pnpm run format-fix` (Prettier `--write`). That can silently modify the workspace and still exit 0, so the job may pass even when the PR is not formatted. Consider switching to a check-only command (e.g. `pnpm run format` / `prettier --check`) so CI fails when formatting is needed, or commit any formatting changes within the workflow and fail if the working tree becomes dirty.

Suggested change

-        run: cd web && pnpm run format-fix
+        run: |
+          cd web
+          pnpm run format-fix
+          # Fail if formatting changes are needed
+          git diff --quiet || { echo 'Prettier formatting changes detected. Please run "pnpm run format-fix" in the web directory and commit the changes.'; exit 1; }

Copilot · 2026-03-12T00:16:20Z

.github/workflows/11-check-code-styling.yml

+      - name: Run ESLint fix
+        run: cd web && pnpm run lint-fix


This job runs `pnpm run lint-fix` (ESLint/Next lint with `--fix`), which can allow lint issues to be auto-fixed without failing the workflow or reflecting changes in the PR. For a CI check, prefer a non-fixing lint command so PRs fail when they introduce lint violations.

Suggested change

-      - name: Run ESLint fix
-        run: cd web && pnpm run lint-fix
+      - name: Run ESLint check
+        run: cd web && pnpm run lint

Copilot · 2026-03-12T00:16:20Z

web/tests/tests/fixtures/user.fixture/authHelpers/utilities.ts

 export function createInitialUserState(project: Partial<WorkerInfo["project"]>): UserState {
-    const testmail = getTestmailClient()
-
-    // Create email with structured tag
-    const email = testmail.generateTestEmail({
+    const email = generateRuntimeTestEmail({
        scope: project.name,
        branch: process.env.BRANCH_NAME,
    })


`generateRuntimeTestEmail()` can generate either a Testmail inbox address or a fallback `@test.agenta.ai` address depending on env. The docstring example above still shows the old Testmail format (`...@namespace.testmail.app`), which no longer matches the actual output; updating the example will avoid confusion when debugging auth flows.
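
A minimal sketch of the env-dependent behavior this comment describes, which could serve as a starting point for an updated docstring example. The function name `sketchRuntimeTestEmail`, the `TESTMAIL_NAMESPACE` variable, and the tag layout are assumptions made for illustration; only the two address domains come from the comment above, and the real helper may differ.

```typescript
// Hypothetical sketch: not the actual generateRuntimeTestEmail() implementation.
type EmailOptions = {scope?: string; branch?: string}
type Env = Record<string, string | undefined>

// Lowercase the value and replace runs of characters that are awkward in an
// email local part with a single dash; fall back when nothing usable remains.
function slug(value: string | undefined, fallback: string): string {
    const cleaned = (value ?? "")
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, "-")
        .replace(/^-+|-+$/g, "")
    return cleaned.length > 0 ? cleaned : fallback
}

function sketchRuntimeTestEmail(options: EmailOptions, env: Env): string {
    const tag = `${slug(options.scope, "scope")}.${slug(options.branch, "local")}`
    // Assumed env var name; the real helper may key off different settings.
    const namespace = env["TESTMAIL_NAMESPACE"]
    if (namespace) {
        // Testmail-style address, in the format quoted in the comment above.
        return `${tag}@${namespace}.testmail.app`
    }
    // Fallback address when no Testmail configuration is present.
    return `${tag}@test.agenta.ai`
}
```

Passing the environment as a parameter (rather than reading `process.env` inside the function) keeps both output formats easy to exercise in a unit test.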

Copilot · 2026-03-12T00:16:20Z

web/tests/tests/fixtures/user.fixture/authHelpers/index.ts

+                    const verifyEmailText = page.getByText("Verify your email")
+                    const continueWithOtpButton = page.getByRole("button", {
+                        name: "Continue with OTP",
+                    })
+                    const resendOtpLink = page.getByText("Resend one-time password")


There is a block-scoped `const continueWithOtpButton` declared earlier in this function, and then another `const continueWithOtpButton` declared again inside the OTP branch. The shadowing makes it easy to accidentally reference the wrong locator when editing this flow; consider renaming the inner locator (or reusing the outer one) to avoid shadowing.
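
To show the shadowing concern in isolation, here is a small self-contained sketch. The `Locator` type and both functions are stand-ins invented for the example, not Playwright or repository code; they only illustrate how an inner `const` hides the outer one, and how distinct names keep each reference unambiguous.

```typescript
// Stand-in for a Playwright locator; hypothetical, for illustration only.
type Locator = {description: string}

// Shadowing: inside the branch, `button` silently refers to the inner locator.
function withShadowing(inOtpBranch: boolean): string[] {
    const seen: string[] = []
    const button: Locator = {description: "outer"}
    if (inOtpBranch) {
        const button: Locator = {description: "inner"}
        seen.push(button.description) // resolves to the inner declaration
    }
    seen.push(button.description) // resolves to the outer declaration
    return seen
}

// Renamed: both locators stay reachable from inside the branch.
function withDistinctNames(inOtpBranch: boolean): string[] {
    const seen: string[] = []
    const signInButton: Locator = {description: "outer"}
    if (inOtpBranch) {
        const otpStepButton: Locator = {description: "inner"}
        seen.push(otpStepButton.description, signInButton.description)
    }
    seen.push(signInButton.description)
    return seen
}
```

In the shadowed version the outer locator cannot be referenced from inside the branch at all, which is the editing hazard the comment points at.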

Copilot

Pull request overview

Copilot reviewed 81 out of 92 changed files in this pull request and generated 2 comments.

Copilot · 2026-03-12T01:17:32Z

web/tests/tests/fixtures/user.fixture/authHelpers/index.ts

                const timestamp = Date.now()
                await uiHelpers.typeWithDelay('input[type="email"]', email)


`const timestamp = Date.now()` is now unused in this helper. This will likely trigger lint/TS unused-variable checks; remove it or use it consistently (e.g., for `timestamp_from`).

Copilot · 2026-03-12T01:17:32Z

.github/workflows/11-check-code-styling.yml

+      - name: Install dependencies
+        run: cd web && pnpm install --frozen-lockfile
+
+      - name: Run Prettier formatting fix
+        run: cd web && pnpm run format-fix


This workflow runs `format-fix`, which will auto-modify files and still exit 0. That makes the job ineffective as a PR gate (formatting problems won’t fail CI). Prefer a check-only command (e.g., `prettier --check` / `pnpm run format-check`) and fail if formatting is off.

jp-agenta added 3 commits March 10, 2026 13:30

fix playwright setup

f1841fe

fix turnstile in web/ee

681a55a

Add per-test-user-isolation docs

5989cbc

jp-agenta marked this pull request as ready for review March 10, 2026 15:52

dosubot bot added the size:XXL label (This PR changes 1000+ lines, ignoring generated files.) Mar 10, 2026

vercel bot deployed to Preview March 10, 2026 15:52

Merge branch 'main' into fix/local-web-tests

8977bf9

jp-agenta changed the base branch from main to fix/turnstile-loopholes March 10, 2026 15:54

dosubot bot added the bug label (Something isn't working) Mar 10, 2026

vercel bot deployed to Preview March 10, 2026 15:55

jp-agenta added 2 commits March 10, 2026 16:55

Merge branch 'fix/turnstile-loopholes' into fix/local-web-tests

54352c8

Merge branches 'fix/local-web-tests' and 'fix/local-web-tests' of git…

815d7b6

…hub.com:Agenta-AI/agenta into fix/local-web-tests

vercel bot deployed to Preview March 10, 2026 15:57

jp-agenta added 15 commits March 11, 2026 10:37

add fast checks, cleanup e2e

4a13f5d

Merge branch 'main' into chore/cleanup-ci-tests

ef4a3f8

Break down railway, reorganize workflows, add unit tests as checks

a74d2b0

fix permissions

2a9822d

merge code styling

3b2c38f

clean up unit tests order

cc78ea8

run pterrier check

6a7ed4a

fix devin issues

5117728

fix health check

3e38d79

postgres:16 > 17

da8f118

fix missing secrets in tests

e0de468

no failure on missing tests

533c707

Merge branch 'fix/local-web-tests' into ci/cleanup-workflows

f3f92c6

always run unit tests

98a59e0

fix styling

bba6a78

Copilot AI review requested due to automatic review settings March 11, 2026 23:11

Copilot started reviewing on behalf of junaway March 11, 2026 23:11

dosubot bot added the Bug Report label (Something isn't working) Mar 11, 2026

Copilot AI reviewed Mar 11, 2026

.github/workflows/44-railway-tests.yml (resolved)

.github/workflows/44-railway-tests.yml (outdated, resolved)

initial oss setup

e96cc33

junaway marked this pull request as draft March 12, 2026 00:10

fix test results path

e0812a1

junaway marked this pull request as ready for review March 12, 2026 00:11

Copilot AI review requested due to automatic review settings March 12, 2026 00:11

Copilot started reviewing on behalf of junaway March 12, 2026 00:11

vercel bot deployed to Preview March 12, 2026 00:12

dosubot bot added the ci/cd label Mar 12, 2026

junaway marked this pull request as draft March 12, 2026 00:15

junaway marked this pull request as ready for review March 12, 2026 00:15

Copilot AI reviewed Mar 12, 2026

fix readiness

a2d50fd

vercel bot deployed to Preview March 12, 2026 00:17

fix L1 cache removed

ab7c396

vercel bot deployed to Preview March 12, 2026 00:35

fix invites and always honour oss admin info

1075535

vercel bot deployed to Preview March 12, 2026 00:43

fix linting

93f5181

vercel bot deployed to Preview March 12, 2026 00:59

junaway marked this pull request as draft March 12, 2026 01:04

junaway marked this pull request as ready for review March 12, 2026 01:12

Copilot AI review requested due to automatic review settings March 12, 2026 01:12

Copilot started reviewing on behalf of junaway March 12, 2026 01:12

Copilot AI reviewed Mar 12, 2026

Merge branch 'fix/turnstile-loopholes' into fix/local-web-tests

94c5206

vercel bot deployed to Preview March 12, 2026 08:39

jp-agenta wants to merge 43 commits into fix/turnstile-loopholes from fix/local-web-tests

5 participants
