[fix] Resolve failing local web tests (oss / ee) #3950
jp-agenta wants to merge 43 commits into fix/turnstile-loopholes
Pull request overview
This PR addresses stability and reliability issues across local/CI test execution (web Playwright + Python pytest) and Railway preview deployments, with supporting refactors to standardize runtime paths and environment handling across OSS/EE.
Changes:
- Standardize Playwright runtime outputs/paths (results/reports/storage state/project metadata) and harden OTP/Testmail handling for web tests.
- Expand/adjust CI workflows for preview environments and add/adjust test runners + reporting for API/SDK.
- Improve Railway deployment scripts (compose-sourced infra images, safer secret defaults) and update docker-compose baselines (Postgres 17).
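The first bullet (standardized runtime outputs/paths) can be pictured with a minimal sketch. This is not the PR's actual `runtime.ts`: the function name `resolveRuntimePaths`, the per-license subdirectories, and the file names `state.json` and `test-project.json` are assumptions made for illustration. Only the `results/` and `reports/` directory names come from the file summary.

```typescript
// Hypothetical shape; the real helper in the PR may expose different names.
interface RuntimePaths {
    resultsDir: string
    reportsDir: string
    storageStatePath: string
    projectMetadataPath: string
}

// Derive every Playwright artifact location from one root so that the config,
// the fixtures, and the teardown script cannot drift apart.
function resolveRuntimePaths(rootDir: string, license: "oss" | "ee"): RuntimePaths {
    const root = rootDir.replace(/\/+$/, "") // drop trailing slashes
    const resultsDir = `${root}/results/${license}`
    return {
        resultsDir,
        reportsDir: `${root}/reports/${license}`,
        storageStatePath: `${resultsDir}/state.json`, // assumed file name
        projectMetadataPath: `${resultsDir}/test-project.json`, // assumed file name
    }
}

const runtimePaths = resolveRuntimePaths("web/tests/", "oss")
console.log(runtimePaths.storageStatePath)
```

Keeping the derivation in one pure function means a path change is made once, and callers such as the Playwright config or global teardown only read the resolved values.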
Reviewed changes
Copilot reviewed 80 out of 91 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| web/turbo.json | Include DISABLE_PRETTIER in Turbo global env for cache correctness. |
| web/tests/utils/testmail/index.ts | Refactor Testmail email/tag generation + timeout/logging improvements. |
| web/tests/tests/fixtures/user.fixture/authHelpers/utilities.ts | Switch user email generation to runtime-aware helper. |
| web/tests/tests/fixtures/user.fixture/authHelpers/index.ts | Make OTP UI automation more resilient and add flow logging. |
| web/tests/tests/fixtures/session.fixture/index.ts | Use runtime Chromium launch options (allowed ports). |
| web/tests/tests/fixtures/base.fixture/providerHelpers/index.ts | Read project metadata via runtime path helper. |
| web/tests/tests/fixtures/base.fixture/apiHelpers/index.ts | Read project metadata via runtime path helper; minor formatting. |
| web/tests/playwright/scripts/run-tests.ts | Formatting-only change for dimension flag regex. |
| web/tests/playwright/global-teardown.ts | Use runtime paths; rename destructive teardown env var; update messaging. |
| web/tests/playwright/config/testTags.ts | Collapse re-export type formatting. |
| web/tests/playwright/config/runtime.ts | New centralized runtime path + Chromium launch option helpers. |
| web/tests/playwright.config.ts | Route report/output/storageState/launchOptions through runtime helpers. |
| web/tests/README.md | Update auth + teardown env var guidance; document email format. |
| web/tests/.gitignore | Ignore new results/ and reports/ directories. |
| web/packages/eslint.config.mjs | Allow disabling prettier rule/plugin via DISABLE_PRETTIER. |
| web/oss/tests/playwright/acceptance/testsset/index.ts | Formatting-only change. |
| web/oss/tests/playwright/acceptance/smoke.spec.ts | Formatting-only change. |
| web/oss/tests/playwright/acceptance/prompt-registry/index.ts | Formatting-only change. |
| web/oss/tests/playwright/acceptance/playground/tests.ts | Disable networkidle wait (commented) and formatting tweaks. |
| web/oss/tests/playwright/acceptance/app/test.ts | Simplify response predicate formatting. |
| web/oss/tests/playwright/acceptance/.gitkeep | Add placeholder file. |
| web/oss/tests/manual/cell-renderers/test-extract-chat-messages.ts | Formatting-only change. |
| web/oss/src/lib/helpers/auth/turnstile.ts | Add commented bypass test site key for Turnstile. |
| web/eslint.config.mjs | Allow disabling prettier rule/plugin via DISABLE_PRETTIER; minor quoting changes. |
| web/ee/tests/playwright/acceptance/.gitkeep | Add placeholder file. |
| sdk/oss/tests/pytest/unit/.gitkeep | Add placeholder file. |
| sdk/oss/tests/pytest/acceptance/integrations/test_vault_secrets.py | Add eventual-consistency polling helper for secrets list assertions. |
| sdk/oss/tests/pytest/acceptance/.gitkeep | Add placeholder file. |
| sdk/agenta/sdk/assets.py | Remove older model IDs from built-in model lists. |
| hosting/railway/oss/scripts/preview-resolve-env.sh | New shared env-resolution script for preview deploys. |
| hosting/railway/oss/scripts/preview-create-or-update.sh | Refactor to source shared env-resolution script; adjust variable usage. |
| hosting/railway/oss/scripts/lib.sh | Add compose-image resolution helpers (service + redis). |
| hosting/railway/oss/scripts/deploy-from-images.sh | Resolve Redis image from compose; remove baked-in placeholder auth/crypt envs. |
| hosting/railway/oss/scripts/configure.sh | Replace placeholder key defaults with replace-me; refactor Postgres password resolution; allow optional Daytona key. |
| hosting/railway/oss/scripts/bootstrap.sh | Resolve infra images from compose baseline via new helpers. |
| hosting/railway/oss/README.md | Document compose-baseline image resolution and update workflow references. |
| hosting/docker-compose/oss/docker-compose.gh.yml | Bump Postgres image from 16 to 17. |
| hosting/docker-compose/oss/docker-compose.gh.ssl.yml | Bump Postgres image from 16 to 17. |
| hosting/docker-compose/oss/docker-compose.gh.local.yml | Bump Postgres image from 16 to 17. |
| hosting/docker-compose/oss/docker-compose.dev.yml | Bump Postgres image from 16 to 17. |
| hosting/docker-compose/ee/docker-compose.gh.local.yml | Bump Postgres image from 16 to 17. |
| hosting/docker-compose/ee/docker-compose.dev.yml | Bump Postgres image from 16 to 17. |
| docs/designs/web-tests/per-test-user-isolation.md | New design doc for worker/test-scoped user/project isolation strategy. |
| docs/designs/testing/testing.running.specs.md | Update workflow references for styling + unit checks. |
| docs/design/railway-preview-environments/status.md | Update workflow inventory for preview automation. |
| docs/design/railway-preview-environments/plan.md | Update plan to reference new workflow structure. |
| docs/design/playwright-oss-stabilization/status.md | Rename teardown safety env var in guidance. |
| docs/design/playwright-oss-stabilization/research.md | Rename teardown safety env var in guidance. |
| docs/design/playwright-oss-stabilization/qa.md | Update auth guidance and teardown safety env var. |
| docs/design/playwright-oss-stabilization/context.md | Rename teardown safety env var in guidance. |
| docs/design/playwright-oss-stabilization/backlog.md | Rename teardown safety env var in backlog item text. |
| docs/community-topics.md | Update Railway workflow references. |
| api/run-tests.py | Add AGENTA_LICENSE envvar support and ensure junit/html reports are generated when not provided. |
| api/pytest.ini | Remove default junit/html outputs from addopts (now handled by runner). |
| api/oss/tests/pytest/acceptance/workflows/test_workflow_embeds_security.py | New acceptance tests for embeds security/archived behavior (with TODO notes). |
| api/oss/tests/pytest/acceptance/workflows/test_workflow_embeds_retrieve_resolve.py | New acceptance tests for resolve=True on retrieve/query endpoints. |
| api/oss/tests/pytest/acceptance/workflows/test_workflow_embeds_legacy.py | New acceptance tests for legacy adapter embed resolution paths. |
| api/oss/tests/pytest/acceptance/workflows/test_workflow_embeds_errors.py | New acceptance tests covering embed resolution error policies and limits. |
| api/oss/tests/pytest/acceptance/workflows/test_workflow_embeds_cross_entity.py | New acceptance tests for cross-entity embeds (environments/workflows). |
| api/oss/tests/pytest/acceptance/workflows/test_workflow_embeds.py | New baseline acceptance tests for embed resolution. |
| api/oss/tests/pytest/acceptance/.gitkeep | Add placeholder file. |
| api/oss/src/utils/env.py | Add commented Turnstile bypass site key for tests. |
| api/oss/src/utils/caching.py | Adjust cache TTL constants/comments (L1 shorter). |
| api/oss/src/routers/projects_router.py | Harden org lookup when fetching project auth context. |
| api/ee/tests/pytest/acceptance/.gitkeep | Add placeholder file. |
| .windsurf/workflows/record-and-refactor-e2e-2.md | Remove workflow doc. |
| .gitignore | Align ignored test artifact dirs; add web/tests/reports; remove some generic results ignores. |
| .github/workflows/45-railway-cleanup.yml | Rename workflow display name. |
| .github/workflows/44-railway-tests.yml | New reusable workflow to run API/SDK/web tests against Railway preview deploys. |
| .github/workflows/43-railway-deploy.yml | Convert to reusable deploy workflow with additional secrets and outputs. |
| .github/workflows/42-railway-build.yml | Convert to reusable build workflow; support “skip build” when tag provided; remove direct deploy chaining. |
| .github/workflows/41-railway-setup.yml | New reusable workflow to bootstrap Railway preview project/env. |
| .github/workflows/40-railway.yml | Add grouping “header” workflow. |
| .github/workflows/33-update-api-docs.yml | Rename workflow display name. |
| .github/workflows/32-generate-demo-traces.yml | Rename workflow display name. |
| .github/workflows/31-sync-github-labels.yml | Change to scheduled/dispatch run; remove PR/push triggers; disable dry-run. |
| .github/workflows/30-crons.yml | Add grouping “header” workflow. |
| .github/workflows/14-check-pr-preview.yml | New orchestration workflow chaining build → setup → deploy → tests. |
| .github/workflows/12-check-unit-tests.yml | New workflow scaffolding for unit checks (SDK/API wired; web/services currently fail if tests exist). |
| .github/workflows/11-check-code-styling.yml | New unified Ruff/Prettier/ESLint styling workflow. |
| .github/workflows/10-playwright-oss-tests.yml | Remove old Playwright OSS workflow (replaced by new preview test workflow). |
| .github/workflows/10-checks.yml | Add grouping “header” workflow. |
| .github/workflows/04-check-frontend-linting.yml | Remove old frontend linting workflow (replaced by unified styling workflow). |
| .github/workflows/03-check-python-linting.yml | Remove old python lint workflow (replaced by unified styling workflow). |
| .github/workflows/02-check-python-formatting.yml | Remove old python format workflow (replaced by unified styling workflow). |
| .github/workflows/00-releases.yml | Add grouping “header” workflow. |
Pull request overview
Copilot reviewed 81 out of 92 changed files in this pull request and generated 4 comments.
```yaml
  run: cd web && pnpm install --frozen-lockfile

- name: Run Prettier formatting fix
  run: cd web && pnpm run format-fix
```
This workflow is labeled as a code styling check, but it runs pnpm run format-fix (Prettier --write). That can silently modify the workspace and still exit 0, so the job may pass even when the PR is not formatted. Consider switching to a check-only command (e.g. pnpm run format / prettier --check) so CI fails when formatting is needed, or keep the fix step and fail the job if it leaves the working tree dirty.
Suggested change:

```diff
-  run: cd web && pnpm run format-fix
+  run: |
+    cd web
+    pnpm run format-fix
+    # Fail if formatting changes are needed
+    git diff --quiet || { echo 'Prettier formatting changes detected. Please run "pnpm run format-fix" in the web directory and commit the changes.'; exit 1; }
```
```yaml
- name: Run ESLint fix
  run: cd web && pnpm run lint-fix
```
This job runs pnpm run lint-fix (ESLint/Next lint with --fix), which can allow lint issues to be auto-fixed without failing the workflow or reflecting changes in the PR. For a CI check, prefer a non-fixing lint command so PRs fail when they introduce lint violations.
Suggested change:

```diff
-- name: Run ESLint fix
-  run: cd web && pnpm run lint-fix
+- name: Run ESLint check
+  run: cd web && pnpm run lint
```
```diff
 export function createInitialUserState(project: Partial<WorkerInfo["project"]>): UserState {
     const testmail = getTestmailClient()

     // Create email with structured tag
-    const email = testmail.generateTestEmail({
+    const email = generateRuntimeTestEmail({
         scope: project.name,
         branch: process.env.BRANCH_NAME,
     })
```
generateRuntimeTestEmail() can generate either a Testmail inbox address or a fallback @test.agenta.ai address depending on env. The docstring example above still shows the old Testmail format (...@namespace.testmail.app), which no longer matches the actual output; updating the example will avoid confusion when debugging auth flows.
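To make the two possible outputs concrete, here is a minimal sketch of an env-dependent email generator. It is not the PR's `generateRuntimeTestEmail()`: the function name `sketchRuntimeTestEmail`, the `TESTMAIL_NAMESPACE` variable, the tag layout, and the `@inbox.testmail.app` address form are all assumptions. Only the `@test.agenta.ai` fallback domain and the `scope`/`branch` options come from the comment and snippet above.

```typescript
type Env = Record<string, string | undefined>

interface EmailOptions {
    scope?: string
    branch?: string
}

// Illustrative only: env var name and tag layout are assumptions.
function sketchRuntimeTestEmail(options: EmailOptions, env: Env, now: number): string {
    // Build a tag from the parts that are present, keeping email-safe characters.
    const tag = [options.scope, options.branch, String(now)]
        .filter((part): part is string => Boolean(part))
        .map((part) => part.toLowerCase().replace(/[^a-z0-9]+/g, "-"))
        .join(".")

    const namespace = env.TESTMAIL_NAMESPACE // hypothetical variable name
    return namespace
        ? `${namespace}.${tag}@inbox.testmail.app` // assumed Testmail inbox form
        : `${tag}@test.agenta.ai` // fallback domain named in the review comment
}

console.log(sketchRuntimeTestEmail({scope: "My Project", branch: "fix/local-web-tests"}, {}, 1700000000000))
```

Passing `env` and `now` as arguments instead of reading `process.env` and `Date.now()` inside the function keeps the sketch deterministic, which also makes it easy to show both address forms in a docstring example.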
```typescript
const verifyEmailText = page.getByText("Verify your email")
const continueWithOtpButton = page.getByRole("button", {
    name: "Continue with OTP",
})
const resendOtpLink = page.getByText("Resend one-time password")
```
There is a block-scoped const continueWithOtpButton declared earlier in this function, and then another const continueWithOtpButton declared again inside the OTP branch. The shadowing makes it easy to accidentally reference the wrong locator when editing this flow; consider renaming the inner locator (or reusing the outer one) to avoid shadowing.
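A minimal sketch of the "reuse the outer one" option follows. It uses a stand-in `FakeLocator` type and a local `getByRole` function so it runs without Playwright; both are hypothetical, as is the function name `describeOtpFlow`. The point is only that the inner branch refers to the locator declared once in the outer scope.

```typescript
// Stand-in for a Playwright locator so the sketch runs without a browser.
interface FakeLocator {
    label: string
}

const getByRole = (_role: string, options: {name: string}): FakeLocator => ({
    label: options.name,
})

// Returns the labels of the locators the flow would touch, in order.
function describeOtpFlow(otpRequired: boolean): string[] {
    const used: string[] = []
    const continueWithOtpButton = getByRole("button", {name: "Continue with OTP"})
    used.push(continueWithOtpButton.label)

    if (otpRequired) {
        // Reuse the outer locator here instead of declaring a second
        // `const continueWithOtpButton`, which would shadow it.
        const resendOtpLink = getByRole("link", {name: "Resend one-time password"})
        used.push(continueWithOtpButton.label, resendOtpLink.label)
    }
    return used
}

console.log(describeOtpFlow(true))
```

If the inner locator genuinely needs different options, a distinct name such as `otpBranchContinueButton` (hypothetical) would avoid the shadowing just as well.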
Pull request overview
Copilot reviewed 81 out of 92 changed files in this pull request and generated 2 comments.
```typescript
const timestamp = Date.now()
await uiHelpers.typeWithDelay('input[type="email"]', email)
```
const timestamp = Date.now() is now unused in this helper. This will likely trigger lint/TS unused-variable checks; remove it or use it consistently (e.g., for timestamp_from).
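One way the captured timestamp could be used consistently is as a lower bound when picking the OTP email, so that messages from earlier runs are ignored. The sketch below assumes a hypothetical `ReceivedEmail` shape and helper name `newestEmailSince`; it does not reflect the PR's Testmail client.

```typescript
// Hypothetical email shape; timestamps are milliseconds since the epoch.
interface ReceivedEmail {
    subject: string
    timestamp: number
}

// Return the newest email received at or after `timestampFrom`,
// or undefined when no email is recent enough.
function newestEmailSince(
    emails: ReceivedEmail[],
    timestampFrom: number,
): ReceivedEmail | undefined {
    return emails
        .filter((email) => email.timestamp >= timestampFrom) // filter() copies, so the input is not mutated
        .sort((a, b) => b.timestamp - a.timestamp)[0]
}

const inbox: ReceivedEmail[] = [
    {subject: "stale OTP", timestamp: 1000},
    {subject: "fresh OTP", timestamp: 3000},
    {subject: "older fresh OTP", timestamp: 2500},
]
console.log(newestEmailSince(inbox, 2000)?.subject)
```

In the helper under review, the value from `const timestamp = Date.now()` taken before the email is typed would play the role of `timestampFrom`.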
```yaml
- name: Install dependencies
  run: cd web && pnpm install --frozen-lockfile

- name: Run Prettier formatting fix
  run: cd web && pnpm run format-fix
```
This workflow runs format-fix, which will auto-modify files and still exit 0. That makes the job ineffective as a PR gate (formatting problems won’t fail CI). Prefer a check-only command (e.g., prettier --check / pnpm run format-check) and fail if formatting is off.