feat(server): New TTL system, enforce max queue length limits, lazy waitpoint creation #2980

Draft
ericallam wants to merge 4 commits into main from ea-branch-117

Conversation

@ericallam
Member

@ericallam ericallam commented Jan 30, 2026

Summary

This PR implements environment-level queue size limits to prevent runaway queues from overwhelming Redis, along with several related improvements.

Environment Queue Limits

Prevents any single environment from overwhelming the shared Redis cluster by enforcing queue size limits at trigger time.

  • Added MAXIMUM_ENV_QUEUE_SIZE environment variable with sensible defaults
  • Queue size is checked at trigger time: a run is rejected if it would push the queue past the limit
  • Dashboard UI shows queue limits on both the Queues page and a new Limits page
  • In-memory caching for queue size checks to reduce Redis load
[Screenshots: Queues page showing limit; Limits page]
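
A minimal sketch of the trigger-time check combined with a short-lived in-memory cache. All names here (`TtlCache`, `checkQueueSize`) are hypothetical and the clock is injected for illustration; the PR's actual implementation reads queue lengths from Redis and may be structured differently.

```typescript
// Hypothetical sketch: not the PR's actual code.
type QueueSizeCheck =
  | { ok: true }
  | { ok: false; currentSize: number; limit: number };

// Tiny TTL cache so repeated triggers do not hit Redis for every check.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Entry is stale: drop it so the caller re-reads the real queue length.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Reject when adding `incoming` runs would push the queue past the limit.
// An undefined limit is treated as "no cap" (an assumption of this sketch).
function checkQueueSize(
  currentSize: number,
  incoming: number,
  limit: number | undefined
): QueueSizeCheck {
  if (limit === undefined) return { ok: true };
  if (currentSize + incoming > limit) return { ok: false, currentSize, limit };
  return { ok: true };
}
```

A cached size can lag the real queue by up to the TTL, so a check like this trades strict accuracy for lower Redis load.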

Fix: batchTriggerAndWait stuck when items skipped

Fixed a bug where batchTriggerAndWait would get stuck forever when some items were skipped due to queue limits.

Root cause: When items were skipped during batch streaming:

  1. Phase 1 (createBatch) returned original runCount to SDK
  2. Phase 2 (streamBatchItems) skipped items due to limits, updated DB with new count
  3. SDK still waited for original count but fewer runs existed
  4. Parent run stuck forever

Fix: The server now returns the actual runCount in the Phase 2 response, and the SDK uses that count for waitForBatch.
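
The idea behind the fix can be sketched as follows. The response shapes and the `resolveExpectedRunCount` helper are assumptions made for illustration; only the `runCount` and `itemsSkipped` field names come from this PR's discussion, and the fallback to the Phase 1 count is a guess at how an SDK might tolerate a server that omits the field.

```typescript
// Hypothetical response shapes: not the SDK's real types.
type CreateBatchResponse = { batchId: string; runCount: number };
type StreamBatchItemsResponse = { itemsSkipped?: number; runCount?: number };

// Pick the number of runs the parent should wait for.
// Prefer the Phase 2 count, which reflects items skipped due to queue limits.
function resolveExpectedRunCount(
  phase1: CreateBatchResponse,
  phase2: StreamBatchItemsResponse
): number {
  return phase2.runCount ?? phase1.runCount;
}
```

With 10 items requested and 3 skipped, waiting on the Phase 2 count of 7 lets the parent run resume once the 7 existing runs complete.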

New TTL System

Improved TTL (time-to-live) handling for run expiration with better test coverage.
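
The description does not spell out how expiration works, so the following is only a generic sketch of TTL-based expiry processed in batches. The `QueuedRun` shape, the `findExpiredRuns` name, and the millisecond units are all assumptions, not the PR's implementation.

```typescript
// Hypothetical shape: a run that is still waiting in the queue.
type QueuedRun = { id: string; queuedAt: number; ttlMs?: number };

// Collect up to `batchSize` runs whose TTL has elapsed at time `now`.
// Runs without a TTL never expire in this sketch.
function findExpiredRuns(runs: QueuedRun[], now: number, batchSize: number): QueuedRun[] {
  const expired: QueuedRun[] = [];
  for (const run of runs) {
    if (expired.length >= batchSize) break;
    if (run.ttlMs !== undefined && run.queuedAt + run.ttlMs <= now) {
      expired.push(run);
    }
  }
  return expired;
}
```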

Lazy Waitpoint Creation

Better idempotency handling through lazy waitpoint creation - waitpoints are now created on-demand rather than eagerly.
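
A get-or-create pattern is one way to read "lazy" here. The sketch below uses an in-memory map as a stand-in for the database, and the `Waitpoint` shape and `LazyWaitpoints` class are hypothetical names chosen for illustration.

```typescript
// Hypothetical shape: the real waitpoint model has more fields.
type Waitpoint = { id: string; runId: string };

class LazyWaitpoints {
  private byRunId = new Map<string, Waitpoint>();
  private nextId = 1;

  // Return the existing waitpoint for a run, creating it only on first use.
  getOrCreate(runId: string): Waitpoint {
    const existing = this.byRunId.get(runId);
    if (existing) return existing;
    const created: Waitpoint = { id: `waitpoint_${this.nextId++}`, runId };
    this.byRunId.set(runId, created);
    return created;
  }

  // Number of waitpoints that have actually been created.
  count(): number {
    return this.byRunId.size;
  }
}
```

Runs that are never waited on create no waitpoint at all, and repeated calls for the same run return the same record.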

Run Queue Improvements

Various improvements to the run queue system with expanded test coverage.

Test plan

  • Test triggering runs when queue is at/near limit
  • Test batchTriggerAndWait with queue limits causing skipped items
  • Verify dashboard displays limits correctly
  • Run new TTL and waitpoint tests

refs TRI-7167

🤖 Generated with Claude Code

@changeset-bot

changeset-bot bot commented Jan 30, 2026

⚠️ No Changeset found

Latest commit: aaea8d6

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@coderabbitai
Contributor

coderabbitai bot commented Jan 30, 2026

Walkthrough

This PR centralizes per-environment queue size logic and enforces per-queue limits across batching and streaming paths, adds a cached environment queue-size lookup, and exposes queueSizeLimit to UI presenters. It refactors queue validation to operate per-queue (resolveQueueNamesForBatchItems, validateMultipleQueueLimits), permits skipping items that would exceed queue limits, and propagates itemsSkipped/runCount through APIs. It introduces TTL-based run expiration (batched TTL system and Redis TTL integration), lazy get-or-create waitpoints, multiple RunEngine/RunQueue API additions, and numerous tests and config additions.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 27.27%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main changes: it highlights three core objectives (TTL system, queue length limits enforcement, and lazy waitpoint creation), all of which are substantially represented in the changeset. |
| Description check | ✅ Passed | The PR description is comprehensive and well-structured, covering implementation details, bug fixes, new features, and testing approaches. |


@vibe-kanban-cloud
Review Complete

Your review story is ready!

Comment !reviewfast on this PR to re-generate the story.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@apps/webapp/app/presenters/v3/LimitsPresenter.server.ts`:
- Around line 170-194: The code silently treats a missing runtime environment as
a zero queue by checking runtimeEnv and continuing; change this to explicitly
handle the case where runtimeEnv is null for the required environmentId by
returning/throwing an error (or returning a clear "not found" result) instead of
falling through to the deployed default logic—locate the block where runtimeEnv
is fetched (the runtimeEnvironment.findFirst call and the subsequent use of
runtimeEnv, currentQueueSize and engine.lengthOfEnvQueue) and add an explicit
branch that raises a descriptive error or returns an appropriate 404-style
response when runtimeEnv is falsy so callers don’t get misleading limits
computed for a non-existent environment.
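
One possible shape for the explicit branch, as a hedged sketch: `requireRuntimeEnv` and `EnvironmentNotFoundError` are hypothetical names, and the `RuntimeEnv` type is reduced to two fields for illustration. The real presenter may prefer to return a not-found result instead of throwing.

```typescript
// Hypothetical, trimmed-down environment shape.
type RuntimeEnv = { id: string; type: string };

class EnvironmentNotFoundError extends Error {
  constructor(environmentId: string) {
    super(`Runtime environment not found: ${environmentId}`);
    this.name = "EnvironmentNotFoundError";
  }
}

// Fail loudly instead of computing limits for an environment that does not exist.
function requireRuntimeEnv(env: RuntimeEnv | null, environmentId: string): RuntimeEnv {
  if (!env) {
    throw new EnvironmentNotFoundError(environmentId);
  }
  return env;
}
```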
🧹 Nitpick comments (1)
apps/webapp/app/presenters/v3/LimitsPresenter.server.ts (1)

311-329: Queue size limit logic is duplicated with EnvironmentQueuePresenter.

The limit computation logic (lines 314-317) and source determination (lines 321-328) duplicate the same pattern found in EnvironmentQueuePresenter.server.ts (lines 44-47). Consider extracting this into a shared utility function to ensure consistency and reduce maintenance burden.

♻️ Example shared utility
// In a shared utils file, e.g., ~/utils/queueSizeLimit.ts
export function getQueueSizeLimitForEnvironment(
  environmentType: string,
  organization: { maximumDevQueueSize: number | null; maximumDeployedQueueSize: number | null },
  defaultDevLimit: number | null,
  defaultDeployedLimit: number | null
): { limit: number | null; source: "plan" | "default" } {
  // Compare against null rather than truthiness so an org override of 0 is
  // reported as "plan", consistent with the `??` fallback used for `limit`.
  if (environmentType === "DEVELOPMENT") {
    return {
      limit: organization.maximumDevQueueSize ?? defaultDevLimit ?? null,
      source: organization.maximumDevQueueSize !== null ? "plan" : "default",
    };
  }
  // Every non-development environment uses the deployed limit.
  return {
    limit: organization.maximumDeployedQueueSize ?? defaultDeployedLimit ?? null,
    source: organization.maximumDeployedQueueSize !== null ? "plan" : "default",
  };
}
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 279102c and a68ef30.

📒 Files selected for processing (4)
  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
🧰 Additional context used
📓 Path-based instructions (7)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

**/*.{ts,tsx}: Always import tasks from @trigger.dev/sdk, never use @trigger.dev/sdk/v3 or deprecated client.defineJob pattern
Every Trigger.dev task must be exported and have a unique id property with no timeouts in the run function

Files:

  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use zod for validation in packages/core and apps/webapp

Files:

  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Import from @trigger.dev/core using subpaths only, never import from root

Files:

  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
apps/webapp/app/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Access all environment variables through the env export of env.server.ts instead of directly accessing process.env in the Trigger.dev webapp

Files:

  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

apps/webapp/**/*.{ts,tsx}: When importing from @trigger.dev/core in the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp

Access environment variables via env export from apps/webapp/app/env.server.ts, never use process.env directly

Files:

  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier before committing

Files:

  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

Files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
🧠 Learnings (10)
📚 Learning: 2025-12-08T15:19:56.823Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 2760
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam/route.tsx:278-281
Timestamp: 2025-12-08T15:19:56.823Z
Learning: In apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam/route.tsx, the tableState search parameter uses intentional double-encoding: the parameter value contains a URL-encoded URLSearchParams string, so decodeURIComponent(value("tableState") ?? "") is required to fully decode it before parsing with new URLSearchParams(). This pattern allows bundling multiple filter/pagination params as a single search parameter.

Applied to files:

  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Control concurrency using the `queue` property with `concurrencyLimit` option

Applied to files:

  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/app/v3/presenters/**/*.server.{ts,tsx} : Organize presenters in the webapp following the pattern `app/v3/presenters/*/*.server.ts` to move complex loader code into classes

Applied to files:

  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
📚 Learning: 2026-01-15T11:50:06.067Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-15T11:50:06.067Z
Learning: Applies to apps/webapp/**/*.{ts,tsx} : Access environment variables via `env` export from `apps/webapp/app/env.server.ts`, never use `process.env` directly

Applied to files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/app/**/*.{ts,tsx} : Access all environment variables through the `env` export of `env.server.ts` instead of directly accessing `process.env` in the Trigger.dev webapp

Applied to files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
📚 Learning: 2026-01-15T11:50:06.067Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-15T11:50:06.067Z
Learning: Applies to apps/webapp/**/*.test.{ts,tsx} : For testable code in the webapp, never import env.server.ts in test files - pass configuration as options instead

Applied to files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/**/*.test.{ts,tsx} : Test files should only import classes and functions from `app/**/*.ts` files and should not import `env.server.ts` directly or indirectly; pass configuration through options instead

Applied to files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
📚 Learning: 2025-11-14T16:03:06.917Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2681
File: apps/webapp/app/services/platform.v3.server.ts:258-302
Timestamp: 2025-11-14T16:03:06.917Z
Learning: In `apps/webapp/app/services/platform.v3.server.ts`, the `getDefaultEnvironmentConcurrencyLimit` function intentionally throws an error (rather than falling back to org.maximumConcurrencyLimit) when the billing client returns undefined plan limits. This fail-fast behavior prevents users from receiving more concurrency than their plan entitles them to. The org.maximumConcurrencyLimit fallback is only for self-hosted deployments where no billing client exists.

Applied to files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
📚 Learning: 2026-01-12T17:18:09.451Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2870
File: apps/webapp/app/services/redisConcurrencyLimiter.server.ts:56-66
Timestamp: 2026-01-12T17:18:09.451Z
Learning: In `apps/webapp/app/services/redisConcurrencyLimiter.server.ts`, the query concurrency limiter will not be deployed with Redis Cluster mode, so multi-key operations (keyKey and globalKey in different hash slots) are acceptable and will function correctly in standalone Redis mode.

Applied to files:

  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Use the Run Engine 2.0 from `internal/run-engine` for new run lifecycle code in the webapp instead of the legacy run engine

Applied to files:

  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
🧬 Code graph analysis (2)
apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx (2)
apps/webapp/app/utils/numberFormatter.ts (1)
  • formatNumberCompact (3-5)
apps/webapp/app/components/primitives/Tooltip.tsx (1)
  • InfoIconTooltip (110-139)
apps/webapp/app/presenters/v3/LimitsPresenter.server.ts (1)
apps/webapp/app/v3/runEngine.server.ts (1)
  • engine (11-11)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: sdk-compat / Cloudflare Workers
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: sdk-compat / Deno Runtime
  • GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
  • GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
  • GitHub Check: sdk-compat / Bun Runtime
  • GitHub Check: typecheck / typecheck
🔇 Additional comments (5)
apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts (1)

1-56: LGTM!

The implementation correctly extends the Environment type with queueSizeLimit and computes it based on environment type with appropriate fallback chain (org override → env var → null). Environment variable access follows coding guidelines via ~/env.server.

apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx (1)

510-511: LGTM!

Clean consolidation from separate dev/deployed queue size quotas to a unified queueSize quota, conditionally rendered only when a limit is set. This aligns well with the LimitsPresenter changes.

apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx (2)

349-391: LGTM!

The queue size limit display with color-coded usage indicators is well-implemented. The conditional rendering shows the limit suffix when queueSizeLimit exists, with appropriate fallback to the paused indicator.

Minor note: The component has env from useEnvironment() (line 313) shadowing the env import from ~/env.server (line 57). This works correctly since the loader uses the imported env and the component uses the hook result, but renaming one variable (e.g., runtimeEnv for the hook result) could improve clarity.


1145-1152: LGTM!

Clean helper function with consistent threshold logic (90% warning, 100% error). Returning undefined when no limit is set allows proper fallback behavior in the calling code.
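
The threshold logic described in this comment might look roughly like the sketch below. The function name, the return values, and the choice to return undefined below 90% are assumptions; only the 90% and 100% thresholds and the undefined-when-no-limit behavior come from the comment itself.

```typescript
type UsageLevel = "warning" | "error";

// Classify queue usage against an optional limit.
// Integer math (x10 vs x9) avoids floating-point rounding at the 90% boundary.
function getQueueUsageLevel(
  queued: number,
  limit: number | null | undefined
): UsageLevel | undefined {
  if (limit === null || limit === undefined) return undefined; // no limit set
  if (queued >= limit) return "error"; // at or over 100%
  if (queued * 10 >= limit * 9) return "warning"; // at or over 90%
  return undefined;
}
```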

apps/webapp/app/presenters/v3/LimitsPresenter.server.ts (1)

15-15: LGTM on the structural changes.

The import of engine and the updated LimitsResult type with consolidated queueSize: QuotaInfo properly support the new unified queue size quota feature.

Also applies to: 70-70

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@apps/webapp/app/env.server.ts`:
- Around line 536-539: The MAXIMUM_DEV_QUEUE_SIZE default of 500 introduces an
unintended hard cap for dev environments because guardQueueSizeLimitsForEnv
reads this value and enforces it; remove the .default(500) so
MAXIMUM_DEV_QUEUE_SIZE remains optional/undefined unless explicitly set in the
environment, leaving the z.coerce.number().int().optional() schema for
MAXIMUM_DEV_QUEUE_SIZE and ensure any code calling guardQueueSizeLimitsForEnv
continues to treat undefined as "no cap" (verify guardQueueSizeLimitsForEnv
behavior and update it only if it currently treats undefined incorrectly).
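
To illustrate the "undefined means no cap" behavior this comment asks for, here is a plain TypeScript stand-in that deliberately avoids zod. Both `parseOptionalInt` and `isOverCap` are hypothetical helpers, and treating an empty string as unset is an assumption of this sketch rather than the behavior of the real schema.

```typescript
// Parse an optional integer setting; an unset variable yields undefined, not a default.
function parseOptionalInt(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const value = Number(raw);
  if (!Number.isInteger(value)) {
    throw new Error(`Expected an integer, got "${raw}"`);
  }
  return value;
}

// With no cap configured, nothing is ever over the cap.
function isOverCap(queuedAfterTrigger: number, cap: number | undefined): boolean {
  return cap !== undefined && queuedAfterTrigger > cap;
}
```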

In `@apps/webapp/app/presenters/v3/LimitsPresenter.server.ts`:
- Around line 313-319: The queue-size quota object (queueSize) in
LimitsPresenter.server.ts is missing the isUpgradable flag so the UI never shows
an Upgrade action; update the queueSize payload returned by the presenter to
include isUpgradable: true for plan-based limits (and add canExceed: true|false
as appropriate), e.g., set isUpgradable to true when
getQueueSizeLimitSource(environmentType, organization) indicates a plan-based
source and ensure the UI-facing fields (queueSize.currentUsage, queueSize.limit,
queueSize.source) remain unchanged.
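
A sketch of what the suggested payload change could look like. The `QuotaInfo` shape below is an assumption reduced to the fields named in this comment, and `buildQueueSizeQuota` is a hypothetical helper, not a function from the presenter.

```typescript
type QuotaSource = "plan" | "default";

// Assumed shape: only the fields mentioned in the review comment.
type QuotaInfo = {
  limit: number | null;
  currentUsage: number;
  source: QuotaSource;
  isUpgradable?: boolean;
};

// Mark plan-based limits as upgradable so the UI can offer an Upgrade action.
function buildQueueSizeQuota(
  currentUsage: number,
  limit: number | null,
  source: QuotaSource
): QuotaInfo {
  return {
    limit,
    currentUsage,
    source,
    isUpgradable: source === "plan",
  };
}
```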
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a0f94ff and a50a5f5.

📒 Files selected for processing (7)
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
🧰 Additional context used
📓 Path-based instructions (7)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

**/*.{ts,tsx}: Always import tasks from @trigger.dev/sdk, never use @trigger.dev/sdk/v3 or deprecated client.defineJob pattern
Every Trigger.dev task must be exported and have a unique id property with no timeouts in the run function

Files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use zod for validation in packages/core and apps/webapp

Files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Import from @trigger.dev/core using subpaths only, never import from root

Files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
apps/webapp/app/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Access all environment variables through the env export of env.server.ts instead of directly accessing process.env in the Trigger.dev webapp

Files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

apps/webapp/**/*.{ts,tsx}: When importing from @trigger.dev/core in the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp

Access environment variables via env export from apps/webapp/app/env.server.ts, never use process.env directly

Files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

Files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier before committing

Files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Control concurrency using the `queue` property with `concurrencyLimit` option
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Control concurrency using the `queue` property with `concurrencyLimit` option

Applied to files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
📚 Learning: 2025-11-14T16:03:06.917Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2681
File: apps/webapp/app/services/platform.v3.server.ts:258-302
Timestamp: 2025-11-14T16:03:06.917Z
Learning: In `apps/webapp/app/services/platform.v3.server.ts`, the `getDefaultEnvironmentConcurrencyLimit` function intentionally throws an error (rather than falling back to org.maximumConcurrencyLimit) when the billing client returns undefined plan limits. This fail-fast behavior prevents users from receiving more concurrency than their plan entitles them to. The org.maximumConcurrencyLimit fallback is only for self-hosted deployments where no billing client exists.

Applied to files:

  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
📚 Learning: 2025-12-08T15:19:56.823Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 2760
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam/route.tsx:278-281
Timestamp: 2025-12-08T15:19:56.823Z
Learning: In apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam/route.tsx, the tableState search parameter uses intentional double-encoding: the parameter value contains a URL-encoded URLSearchParams string, so decodeURIComponent(value("tableState") ?? "") is required to fully decode it before parsing with new URLSearchParams(). This pattern allows bundling multiple filter/pagination params as a single search parameter.

Applied to files:

  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use the `task()` function from `trigger.dev/sdk/v3` to define tasks with id and run properties

Applied to files:

  • apps/webapp/app/runEngine/concerns/queues.server.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/app/**/*.{ts,tsx} : Access all environment variables through the `env` export of `env.server.ts` instead of directly accessing `process.env` in the Trigger.dev webapp

Applied to files:

  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger.config.ts : Specify runtime environment (node or bun) in trigger.config.ts using the `runtime` property

Applied to files:

  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
📚 Learning: 2026-01-12T17:18:09.451Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2870
File: apps/webapp/app/services/redisConcurrencyLimiter.server.ts:56-66
Timestamp: 2026-01-12T17:18:09.451Z
Learning: In `apps/webapp/app/services/redisConcurrencyLimiter.server.ts`, the query concurrency limiter will not be deployed with Redis Cluster mode, so multi-key operations (keyKey and globalKey in different hash slots) are acceptable and will function correctly in standalone Redis mode.

Applied to files:

  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
🧬 Code graph analysis (2)
apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts (1)
apps/webapp/app/v3/utils/queueLimits.server.ts (1)
  • getQueueSizeLimit (24-33)
apps/webapp/app/presenters/v3/LimitsPresenter.server.ts (2)
apps/webapp/app/v3/runEngine.server.ts (1)
  • engine (11-11)
apps/webapp/app/v3/utils/queueLimits.server.ts (2)
  • getQueueSizeLimit (24-33)
  • getQueueSizeLimitSource (42-51)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (26)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: sdk-compat / Cloudflare Workers
  • GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
  • GitHub Check: sdk-compat / Bun Runtime
  • GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
  • GitHub Check: sdk-compat / Deno Runtime
  • GitHub Check: typecheck / typecheck
🔇 Additional comments (7)
apps/webapp/app/runEngine/concerns/queues.server.ts (3)

18-33: LRU cache setup looks solid.

The singleton + bounded LRU store should help reduce Redis churn without unbounded memory growth.


298-317: Confirm stale cache is acceptable for enforcement.

guardQueueSizeLimitsForEnv now relies on an SWR cache; with a ~30s TTL, queue size can be undercounted and allow bursts above the limit. If this is a hard/billing limit, consider bypassing the cache for enforcement (or forcing a fresh read when nearing the limit).
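
A minimal sketch of the "fresh read when nearing the limit" idea, under stated assumptions: `CachedSize`, `canUseCachedSize`, and `freshReadRatio` are hypothetical names for illustration and are not part of the PR. The helper only decides whether a cached value may be trusted; the caller would read from Redis whenever it returns false.

```typescript
// Hypothetical shape of a cached queue-size entry (not the PR's actual type).
type CachedSize = { size: number; fetchedAt: number };

// Decide whether a cached queue size is safe to use for limit enforcement.
function canUseCachedSize(
  cached: CachedSize | undefined,
  limit: number,
  now: number,
  ttlMs: number = 30_000, // mirrors the ~30s TTL mentioned above
  freshReadRatio: number = 0.9 // assumption: force a fresh read above 90% of the limit
): boolean {
  // Cold cache: nothing to trust, so a real read is required.
  if (cached === undefined) return false;
  // Entry older than the TTL: treat as expired.
  if (now - cached.fetchedAt > ttlMs) return false;
  // Near the limit a stale value could admit a burst, so require a fresh read.
  return cached.size < limit * freshReadRatio;
}
```

With this shape, environments far below their limit keep the cache benefit, while those close to the limit pay for an exact read.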


68-68: No issues to flag in these edits.

Also applies to: 94-96, 116-118, 130-131

apps/webapp/app/v3/utils/queueLimits.server.ts (1)

7-51: Queue-limit helpers look good.

Clear resolution order and source labeling; the API surface is minimal and consistent.

apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts (1)

4-52: queueSizeLimit wiring looks consistent.

The new field is derived in one place and keeps the presenter output cohesive.

apps/webapp/app/presenters/v3/LimitsPresenter.server.ts (1)

2-18: API/type expansion for queue limits looks consistent.

The new environmentType parameter and consolidated quota surface align with the updated limit logic.

Also applies to: 58-96

apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx (1)

81-86: Loader and quota table updates look good.

Passing environmentType and gating the queue-size quota on a defined limit is consistent with the new presenter shape.

Also applies to: 511-513

Comment on lines 536 to 539
MAXIMUM_DEV_QUEUE_SIZE: z.coerce.number().int().optional().default(500),
MAXIMUM_DEPLOYED_QUEUE_SIZE: z.coerce.number().int().optional(),
QUEUE_SIZE_CACHE_TTL_MS: z.coerce.number().int().optional().default(30_000), // 30 seconds
QUEUE_SIZE_CACHE_MAX_SIZE: z.coerce.number().int().optional().default(5_000),

⚠️ Potential issue | 🟠 Major

Defaulting MAXIMUM_DEV_QUEUE_SIZE changes enforcement behavior.

This turns previously unlimited dev environments into a hard cap of 500 queued runs (via guardQueueSizeLimitsForEnv). If that’s not intentional, remove the default and require an explicit env var to enable the limit.

💡 Suggested change (avoid unintended hard limit)
-    MAXIMUM_DEV_QUEUE_SIZE: z.coerce.number().int().optional().default(500),
+    MAXIMUM_DEV_QUEUE_SIZE: z.coerce.number().int().optional(),
🤖 Prompt for AI Agents
In `@apps/webapp/app/env.server.ts` around lines 536 - 539, The
MAXIMUM_DEV_QUEUE_SIZE default of 500 introduces an unintended hard cap for dev
environments because guardQueueSizeLimitsForEnv reads this value and enforces
it; remove the .default(500) so MAXIMUM_DEV_QUEUE_SIZE remains
optional/undefined unless explicitly set in the environment, leaving the
z.coerce.number().int().optional() schema for MAXIMUM_DEV_QUEUE_SIZE and ensure
any code calling guardQueueSizeLimitsForEnv continues to treat undefined as "no
cap" (verify guardQueueSizeLimitsForEnv behavior and update it only if it
currently treats undefined incorrectly).
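
To make the "undefined means no cap" expectation concrete, here is a rough sketch of a guard with that behavior. It is not the actual guardQueueSizeLimitsForEnv; `guardQueueSize` and its `incoming` parameter are hypothetical, and only the `ok` / `maximumSize` result fields are taken from the usage shown elsewhere in this review.

```typescript
// Result shape modeled on the queueSizeGuard.ok / queueSizeGuard.maximumSize usage in the PR.
type QueueSizeGuardResult = { ok: true } | { ok: false; maximumSize: number };

// Hypothetical guard: an undefined limit disables enforcement entirely.
function guardQueueSize(
  maximumSize: number | undefined,
  currentSize: number,
  incoming: number = 1 // assumption: number of runs about to be enqueued
): QueueSizeGuardResult {
  // No configured limit: never reject.
  if (maximumSize === undefined) return { ok: true };
  // Reject when the queue would exceed the limit after this trigger.
  if (currentSize + incoming > maximumSize) return { ok: false, maximumSize };
  return { ok: true };
}
```

Under this sketch, dropping `.default(500)` would leave dev environments unlimited unless MAXIMUM_DEV_QUEUE_SIZE is set explicitly.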

Comment on lines 313 to 319
queueSize: {
name: "Max queued runs",
description: "Maximum pending runs across all queues in this environment",
limit: getQueueSizeLimit(environmentType, organization),
currentUsage: currentQueueSize,
source: getQueueSizeLimitSource(environmentType, organization),
},

⚠️ Potential issue | 🟡 Minor

Queue-size quota never shows an upgrade action.

isUpgradable is omitted, so the Upgrade column is empty even for plan-based limits. If upgrades should be offered, set it to true (and optionally canExceed).

💡 Suggested change
         queueSize: {
           name: "Max queued runs",
           description: "Maximum pending runs across all queues in this environment",
           limit: getQueueSizeLimit(environmentType, organization),
           currentUsage: currentQueueSize,
           source: getQueueSizeLimitSource(environmentType, organization),
+          isUpgradable: true,
         },
🤖 Prompt for AI Agents
In `@apps/webapp/app/presenters/v3/LimitsPresenter.server.ts` around lines 313 -
319, The queue-size quota object (queueSize) in LimitsPresenter.server.ts is
missing the isUpgradable flag so the UI never shows an Upgrade action; update
the queueSize payload returned by the presenter to include isUpgradable: true
for plan-based limits (and add canExceed: true|false as appropriate), e.g., set
isUpgradable to true when getQueueSizeLimitSource(environmentType, organization)
indicates a plan-based source and ensure the UI-facing fields
(queueSize.currentUsage, queueSize.limit, queueSize.source) remain unchanged.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
internal-packages/run-engine/src/engine/tests/ttl.test.ts (1)

1-9: ⚠️ Potential issue | 🟡 Minor

Missing vi import from vitest.

Similar to lazyWaitpoint.test.ts, the vi global is used but not imported.

🔧 Proposed fix
 import { containerTest, assertNonNullable } from "@internal/testcontainers";
 import { trace } from "@internal/tracing";
-import { expect } from "vitest";
+import { expect, vi } from "vitest";
 import { RunEngine } from "../index.js";
apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx (1)

5-5: ⚠️ Potential issue | 🟡 Minor

Use subpath export for tryCatch from @trigger.dev/core/utils.

Per coding guidelines, imports from @trigger.dev/core in the webapp must use subpath exports from the package.json. Change import { tryCatch } from "@trigger.dev/core"; to import { tryCatch } from "@trigger.dev/core/utils";.

🤖 Fix all issues with AI agents
In `@internal-packages/run-engine/src/batch-queue/completionTracker.ts`:
- Around line 112-130: The updateRunCount method currently trusts newRunCount;
add input validation in updateRunCount (in completionTracker.ts) to reject
negative, NaN, non-finite or non-integer values before calling
getMeta/storeMeta: use Number.isInteger(newRunCount) && newRunCount >= 0 (and
optionally Number.isFinite) and if the check fails log an error via
this.logger.error with batchId and the invalid value and then throw an Error or
return a rejected Promise so the invalid value is never persisted; keep existing
behavior for meta retrieval (getMeta) and storage (storeMeta) otherwise.
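
The validation described above could look roughly like this. `assertValidRunCount` is a hypothetical standalone helper; in the real method the check would sit at the top of updateRunCount, with the error also logged via this.logger.error.

```typescript
// Hypothetical helper: reject run counts that must never be persisted.
function assertValidRunCount(batchId: string, newRunCount: number): void {
  // Number.isInteger is false for NaN, Infinity, and fractional values,
  // so a separate Number.isFinite check is not strictly needed.
  if (!Number.isInteger(newRunCount) || newRunCount < 0) {
    throw new Error(`Invalid runCount for batch ${batchId}: ${newRunCount}`);
  }
}
```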

In `@internal-packages/run-engine/src/batch-queue/index.ts`:
- Around line 380-386: The updateRunCount method can lower runCount below the
already-processed count and leave a batch stuck; before calling
this.completionTracker.updateRunCount(batchId, newRunCount) clamp newRunCount to
at least the current processed count (e.g. newRunCount = Math.max(newRunCount,
this.completionTracker.getProcessedCount(batchId))) or, alternatively, after
updating call into the completion tracker to run the completion/finalization
check (e.g. this.completionTracker.checkCompletion/finalizeBatch for batchId) so
that if processedCount >= runCount the batch is finalized immediately; update
updateRunCount to perform one of these fixes using the existing
completionTracker APIs.
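
A small sketch combining both suggested fixes (clamp, then check for completion). `BatchProgress` and `applyRunCountUpdate` are hypothetical; the actual completionTracker API may expose this state differently.

```typescript
// Hypothetical snapshot of a batch's progress.
type BatchProgress = { runCount: number; processedCount: number };

// Apply a new run count without letting it drop below what was already processed,
// and report whether the batch should be finalized right away.
function applyRunCountUpdate(
  progress: BatchProgress,
  newRunCount: number
): { progress: BatchProgress; shouldFinalize: boolean } {
  // Clamp so runCount never falls below processedCount.
  const runCount = Math.max(newRunCount, progress.processedCount);
  const next: BatchProgress = { ...progress, runCount };
  // If everything expected has been processed, the batch is complete.
  return { progress: next, shouldFinalize: next.processedCount >= next.runCount };
}
```

For example, lowering a batch from 10 to 5 runs when 7 were already processed yields a run count of 7 and an immediate finalize signal instead of a stuck batch.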

In `@internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts`:
- Around line 1-8: The test calls vi.setConfig but never imports vi; update the
imports to explicitly import vi from 'vitest' (for example, change the existing
import of expect to import { expect, vi } from "vitest") so the vi symbol used
by vi.setConfig is defined.

In `@internal-packages/run-engine/src/run-queue/index.ts`:
- Around line 1238-1304: The code currently calls `#expireTtlRuns`(shard, now,
batchSize) which deletes/marks runs as expired before invoking
options.ttlSystem!.callback(expiredRuns), so callback failures only get logged
and those runs can be lost; update `#startTtlConsumer` (or `#expireTtlRuns`) so that
expiration is durable: either (A) defer permanent deletion/marking of runs until
after the callback completes successfully (move the delete/mark logic to after
options.ttlSystem!.callback returns) or (B) if deletion must happen first, add a
durable retry/enqueue path when the callback throws — e.g., push failed expired
run IDs to a retry store (Redis list/sorted set) with metadata and backoff, and
spawn a retry worker to reprocess them; ensure the callback and retry path are
idempotent and update logging to include run IDs and shard when queuing retries.
- Around line 2561-2637: expireTtlRuns currently removes run entries from
per-queue structures but never updates the master queue (masterQueueKey),
leaving stale entries; update the logic so that either (A) the Lua script also
computes the corresponding master/shard key from queueKey and ZREM the runId
from masterQueueKey (or adjust scores) for each expired member, or (B) return
the affected queueKey values (not just member) from expireTtlRuns so the
TypeScript caller can call the existing rebalance/cleanup routine for
masterQueueKey per returned queue; locate expireTtlRuns, the queueKey local
variable and the final results/return section to implement one of these fixes.
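
Option (A) for the TTL consumer, deferring removal until the callback succeeds, can be illustrated with an in-memory stand-in for the Redis structures. This is only a sketch: `expireBatch` and the `pending` map are hypothetical, and the real implementation would do the removal in Redis (for example in the Lua script). The `ExpiredRun` shape follows the TtlSystemCallback signature quoted in this review.

```typescript
type ExpiredRun = { queueKey: string; runId: string; orgId: string };
type TtlCallback = (runs: ExpiredRun[]) => Promise<void>;

// Hypothetical: `pending` stands in for the TTL set; `due` lists run IDs past their TTL.
async function expireBatch(
  pending: Map<string, ExpiredRun>,
  due: string[],
  callback: TtlCallback
): Promise<{ expired: string[]; retained: string[] }> {
  // Collect the runs that are still present in the pending set.
  const runs = due.flatMap((id) => {
    const run = pending.get(id);
    return run ? [run] : [];
  });
  try {
    await callback(runs);
  } catch {
    // Callback failed: leave the entries in place so the next tick retries them.
    return { expired: [], retained: runs.map((run) => run.runId) };
  }
  // Callback succeeded: only now remove the entries.
  for (const run of runs) pending.delete(run.runId);
  return { expired: runs.map((run) => run.runId), retained: [] };
}
```

Because a failed batch is retried, the callback may see the same run more than once, which is why it needs to be idempotent.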
🧹 Nitpick comments (7)
packages/core/src/v3/schemas/api.ts (1)

412-424: Tighten validation for new count fields.
These are counts, so restricting them to non‑negative integers prevents accidental floats/negatives in API responses.

♻️ Suggested update
-  itemsSkipped: z.number().optional(),
+  itemsSkipped: z.number().int().nonnegative().optional(),
...
-  runCount: z.number().optional(),
+  runCount: z.number().int().nonnegative().optional(),
apps/webapp/app/runEngine/concerns/queues.server.ts (1)

421-432: Consider potential cache stampede scenario.

The SWR pattern handles stale reads well, but when the cache is completely cold (e.g., after deployment), multiple concurrent requests for the same queue could all trigger Redis lookups simultaneously before the first one populates the cache.

For most use cases this is acceptable, but if queue size lookups become a bottleneck, consider adding a coalescing mechanism.
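
If coalescing is ever needed, one common approach is to share a single in-flight promise per key. The sketch below is an assumption about how that might look; `coalesce` is a hypothetical helper and is not tied to the cache library the PR uses.

```typescript
// Wrap a loader so concurrent calls for the same key share one in-flight promise.
function coalesce<T>(load: (key: string) => Promise<T>): (key: string) => Promise<T> {
  const inFlight = new Map<string, Promise<T>>();
  return (key: string): Promise<T> => {
    const existing = inFlight.get(key);
    if (existing) return existing;
    const promise = (async () => {
      try {
        return await load(key);
      } finally {
        // Clear the entry once settled so later calls trigger a new lookup.
        inFlight.delete(key);
      }
    })();
    inFlight.set(key, promise);
    return promise;
  };
}
```

Wrapping the Redis lookup this way would mean a cold cache produces one read per queue rather than one per concurrent request.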

apps/webapp/app/runEngine/services/triggerTask.server.ts (1)

258-269: Queue validation correctly uses resolved queue name.

The reordering to validate after resolveQueueProperties is correct since the queue name is needed for per-queue validation.

Consider enhancing the error message to include the queue name for better debuggability:

💡 Optional: Include queue name in error message
         if (!queueSizeGuard.ok) {
           throw new ServiceValidationError(
-            `Cannot trigger ${taskId} as the queue size limit for this environment has been reached. The maximum size is ${queueSizeGuard.maximumSize}`
+            `Cannot trigger ${taskId} as the queue size limit for queue '${queueName}' has been reached. The maximum size is ${queueSizeGuard.maximumSize}`
           );
         }
internal-packages/run-engine/src/run-queue/types.ts (1)

56-128: Convert RunQueueKeyProducer interface to a type alias to align with TypeScript guidelines.

♻️ Suggested refactor
-export interface RunQueueKeyProducer {
+export type RunQueueKeyProducer = {
   //queue
   queueKey(
     orgId: string,
     projId: string,
     envId: string,
     queue: string,
     concurrencyKey?: string
   ): string;
   queueKey(env: RunQueueKeyProducerEnvironment, queue: string, concurrencyKey?: string): string;

   legacyMasterQueueKey(masterQueueName: string): string;

   masterQueueKeyForEnvironment(envId: string, shardCount: number): string;
   masterQueueKeyForShard(shard: number): string;
   masterQueueShardForEnvironment(envId: string, shardCount: number): number;
   workerQueueKey(workerQueue: string): string;

   envQueueKey(env: RunQueueKeyProducerEnvironment): string;
   envQueueKeyFromQueue(queue: string): string;
   queueConcurrencyLimitKey(env: RunQueueKeyProducerEnvironment, queue: string): string;
   queueConcurrencyLimitKeyFromQueue(queue: string): string;
   queueCurrentConcurrencyKeyFromQueue(queue: string): string;
   queueCurrentConcurrencyKey(
     env: RunQueueKeyProducerEnvironment,
     queue: string,
     concurrencyKey?: string
   ): string;
   queueCurrentDequeuedKeyFromQueue(queue: string): string;
   queueCurrentDequeuedKey(
     env: RunQueueKeyProducerEnvironment,
     queue: string,
     concurrencyKey?: string
   ): string;

   //env concurrency
   envCurrentConcurrencyKey(env: EnvDescriptor): string;
   envCurrentConcurrencyKey(env: RunQueueKeyProducerEnvironment): string;

   envConcurrencyLimitKey(env: EnvDescriptor): string;
   envConcurrencyLimitKey(env: RunQueueKeyProducerEnvironment): string;

   envCurrentDequeuedKey(env: EnvDescriptor): string;
   envCurrentDequeuedKey(env: RunQueueKeyProducerEnvironment): string;

   envConcurrencyLimitBurstFactorKey(env: EnvDescriptor): string;
   envConcurrencyLimitBurstFactorKey(env: RunQueueKeyProducerEnvironment): string;
   envConcurrencyLimitBurstFactorKeyFromQueue(queue: string): string;

   envConcurrencyLimitKeyFromQueue(queue: string): string;
   envCurrentConcurrencyKeyFromQueue(queue: string): string;
   envCurrentDequeuedKeyFromQueue(queue: string): string;

   //message payload
   messageKeyPrefixFromQueue(queue: string): string;
   messageKey(orgId: string, messageId: string): string;
   //utils
   orgIdFromQueue(queue: string): string;
   envIdFromQueue(queue: string): string;
   projectIdFromQueue(queue: string): string;
   descriptorFromQueue(queue: string): QueueDescriptor;

   deadLetterQueueKey(env: RunQueueKeyProducerEnvironment): string;
   deadLetterQueueKey(env: EnvDescriptor): string;
   deadLetterQueueKeyFromQueue(queue: string): string;

   // Concurrency sweeper methods
   markedForAckKey(): string;
   currentConcurrencySetKeyScanPattern(): string;

   // TTL system methods
   ttlQueueKeyForShard(shard: number): string;
-}
+};
internal-packages/run-engine/src/engine/systems/ttlSystem.ts (2)

252-260: Unnecessary type cast for TaskRunStatus.

The string literal "EXPIRED" is a valid TaskRunStatus value, so the as TaskRunStatus cast is redundant.

🧹 Minor cleanup
         await this.$.prisma.taskRun.updateMany({
           where: { id: { in: runIdsToExpire } },
           data: {
-            status: "EXPIRED" as TaskRunStatus,
+            status: "EXPIRED",
             completedAt: now,
             expiredAt: now,
           },
         });

263-321: Consider error handling for individual run processing within the batch.

If any single run fails during snapshot creation or waitpoint completion, Promise.all will reject and subsequent runs won't be processed. Since the batch updateMany on line 252 has already succeeded, this could leave some runs in a partially processed state (status=EXPIRED but no snapshot).

For improved resilience, consider using Promise.allSettled and collecting failures:

💡 Suggested approach
-        await Promise.all(
-          runsToExpire.map(async (run) => {
+        const results = await Promise.allSettled(
+          runsToExpire.map(async (run) => {
             // ... existing processing ...
             expired.push(run.id);
+            return run.id;
           })
         );
+
+        // Log any failures but don't throw - the status update already succeeded
+        for (const result of results) {
+          if (result.status === "rejected") {
+            this.$.logger.error("Failed to fully process expired run", {
+              error: result.reason,
+            });
+          }
+        }
internal-packages/run-engine/src/run-queue/index.ts (1)

108-110: Use a type alias for the callback.

This project prefers type over interface in TS; a function-shaped callback is a clean fit for a type alias.

♻️ Suggested change
-export interface TtlSystemCallback {
-  (runs: Array<{ queueKey: string; runId: string; orgId: string }>): Promise<void>;
-}
+export type TtlSystemCallback = (
+  runs: Array<{ queueKey: string; runId: string; orgId: string }>
+) => Promise<void>;

As per coding guidelines: Use types over interfaces for TypeScript.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a50a5f5 and aaea8d6.

⛔ Files ignored due to path filters (1)
  • references/hello-world/src/trigger/batches.ts is excluded by !references/**
📒 Files selected for processing (32)
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
  • apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • apps/webapp/app/runEngine/services/createBatch.server.ts
  • apps/webapp/app/runEngine/services/streamBatchItems.server.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/runEngine/types.ts
  • apps/webapp/app/v3/runEngine.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • internal-packages/run-engine/src/batch-queue/completionTracker.ts
  • internal-packages/run-engine/src/batch-queue/index.ts
  • internal-packages/run-engine/src/engine/index.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
  • internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • internal-packages/run-engine/src/engine/tests/attemptFailures.test.ts
  • internal-packages/run-engine/src/engine/tests/getSnapshotsSince.test.ts
  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • internal-packages/run-engine/src/engine/tests/ttl.test.ts
  • internal-packages/run-engine/src/engine/types.ts
  • internal-packages/run-engine/src/run-queue/index.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • internal-packages/run-engine/src/run-queue/types.ts
  • packages/core/src/v3/schemas/api.ts
  • packages/trigger-sdk/src/v3/shared.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx
  • apps/webapp/app/presenters/v3/EnvironmentQueuePresenter.server.ts
🧰 Additional context used
📓 Path-based instructions (12)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

**/*.{ts,tsx}: Always import tasks from @trigger.dev/sdk, never use @trigger.dev/sdk/v3 or deprecated client.defineJob pattern
Every Trigger.dev task must be exported and have a unique id property with no timeouts in the run function

Files:

  • packages/core/src/v3/schemas/api.ts
  • apps/webapp/app/runEngine/services/createBatch.server.ts
  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/engine/tests/getSnapshotsSince.test.ts
  • internal-packages/run-engine/src/batch-queue/completionTracker.ts
  • packages/trigger-sdk/src/v3/shared.ts
  • internal-packages/run-engine/src/engine/types.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/runEngine/services/streamBatchItems.server.ts
  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • apps/webapp/app/v3/runEngine.server.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • apps/webapp/app/env.server.ts
  • internal-packages/run-engine/src/engine/tests/ttl.test.ts
  • internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
  • apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
  • internal-packages/run-engine/src/batch-queue/index.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
  • apps/webapp/app/runEngine/types.ts
  • internal-packages/run-engine/src/engine/tests/attemptFailures.test.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • internal-packages/run-engine/src/run-queue/index.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • internal-packages/run-engine/src/engine/index.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use zod for validation in packages/core and apps/webapp

Files:

  • packages/core/src/v3/schemas/api.ts
  • apps/webapp/app/runEngine/services/createBatch.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • apps/webapp/app/runEngine/services/streamBatchItems.server.ts
  • apps/webapp/app/v3/runEngine.server.ts
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
  • apps/webapp/app/runEngine/types.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Import from @trigger.dev/core using subpaths only, never import from root

Files:

  • packages/core/src/v3/schemas/api.ts
  • apps/webapp/app/runEngine/services/createBatch.server.ts
  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/engine/tests/getSnapshotsSince.test.ts
  • internal-packages/run-engine/src/batch-queue/completionTracker.ts
  • packages/trigger-sdk/src/v3/shared.ts
  • internal-packages/run-engine/src/engine/types.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/runEngine/services/streamBatchItems.server.ts
  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • apps/webapp/app/v3/runEngine.server.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • apps/webapp/app/env.server.ts
  • internal-packages/run-engine/src/engine/tests/ttl.test.ts
  • internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
  • apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
  • internal-packages/run-engine/src/batch-queue/index.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
  • apps/webapp/app/runEngine/types.ts
  • internal-packages/run-engine/src/engine/tests/attemptFailures.test.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • internal-packages/run-engine/src/run-queue/index.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • internal-packages/run-engine/src/engine/index.ts
**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

Files:

  • packages/core/src/v3/schemas/api.ts
  • apps/webapp/app/runEngine/services/createBatch.server.ts
  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/engine/tests/getSnapshotsSince.test.ts
  • internal-packages/run-engine/src/batch-queue/completionTracker.ts
  • packages/trigger-sdk/src/v3/shared.ts
  • internal-packages/run-engine/src/engine/types.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/runEngine/services/streamBatchItems.server.ts
  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • apps/webapp/app/v3/runEngine.server.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • apps/webapp/app/env.server.ts
  • internal-packages/run-engine/src/engine/tests/ttl.test.ts
  • internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
  • apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
  • internal-packages/run-engine/src/batch-queue/index.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
  • apps/webapp/app/runEngine/types.ts
  • internal-packages/run-engine/src/engine/tests/attemptFailures.test.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • internal-packages/run-engine/src/run-queue/index.ts
  • internal-packages/run-engine/src/engine/index.ts
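The low-cardinality guidance in this rule can be illustrated with a minimal TypeScript sketch. It assumes a hypothetical queue-limit metric: the attribute names (`result`, `limit_source`, `cached`, `error_code`), the error-code list, and both helper functions are illustrative and are not taken from the PR. The point is only that every attribute value comes from a closed set, so free-form strings are collapsed before they reach a metric.

```typescript
// Hypothetical closed set of error codes; anything else collapses to "UNKNOWN".
const ERROR_CODES = ["QUEUE_FULL", "TIMEOUT", "UNKNOWN"] as const;
type ErrorCode = (typeof ERROR_CODES)[number];

// Hypothetical attribute shape: enums and booleans only, no IDs or free-form strings.
type QueueMetricAttributes = {
  result: "accepted" | "rejected";
  limit_source: "plan" | "default" | "override";
  cached: boolean;
  error_code?: ErrorCode;
};

// Map an arbitrary error string onto the bounded set of codes.
function toBoundedErrorCode(raw: string): ErrorCode {
  const match = ERROR_CODES.some((code) => code === raw);
  return match ? (raw as ErrorCode) : "UNKNOWN";
}

// Build the attribute object from request-level data, dropping anything unbounded.
function buildQueueMetricAttributes(input: {
  accepted: boolean;
  limitSource: "plan" | "default" | "override";
  cached: boolean;
  rawError?: string;
}): QueueMetricAttributes {
  const attributes: QueueMetricAttributes = {
    result: input.accepted ? "accepted" : "rejected",
    limit_source: input.limitSource,
    cached: input.cached,
  };
  if (input.rawError !== undefined) {
    attributes.error_code = toBoundedErrorCode(input.rawError);
  }
  return attributes;
}
```

Typing the attribute object this way means an attempt to attach something like an environment ID fails at compile time rather than surfacing later as a cardinality problem.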
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier before committing

Files:

  • packages/core/src/v3/schemas/api.ts
  • apps/webapp/app/runEngine/services/createBatch.server.ts
  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/engine/tests/getSnapshotsSince.test.ts
  • internal-packages/run-engine/src/batch-queue/completionTracker.ts
  • packages/trigger-sdk/src/v3/shared.ts
  • internal-packages/run-engine/src/engine/types.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/runEngine/services/streamBatchItems.server.ts
  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • apps/webapp/app/v3/runEngine.server.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • apps/webapp/app/env.server.ts
  • internal-packages/run-engine/src/engine/tests/ttl.test.ts
  • internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
  • apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
  • internal-packages/run-engine/src/batch-queue/index.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
  • apps/webapp/app/runEngine/types.ts
  • internal-packages/run-engine/src/engine/tests/attemptFailures.test.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • internal-packages/run-engine/src/run-queue/index.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
  • internal-packages/run-engine/src/engine/index.ts
{packages,integrations}/**/*

📄 CodeRabbit inference engine (CLAUDE.md)

Add a changeset when modifying any public package in packages/* or integrations/* using pnpm run changeset:add

Files:

  • packages/core/src/v3/schemas/api.ts
  • packages/trigger-sdk/src/v3/shared.ts
apps/webapp/app/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Access all environment variables through the env export of env.server.ts instead of directly accessing process.env in the Trigger.dev webapp

Files:

  • apps/webapp/app/runEngine/services/createBatch.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • apps/webapp/app/runEngine/services/streamBatchItems.server.ts
  • apps/webapp/app/v3/runEngine.server.ts
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
  • apps/webapp/app/runEngine/types.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

apps/webapp/**/*.{ts,tsx}: When importing from @trigger.dev/core in the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp

Access environment variables via the env export from apps/webapp/app/env.server.ts; never use process.env directly

Files:

  • apps/webapp/app/runEngine/services/createBatch.server.ts
  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • apps/webapp/app/runEngine/services/streamBatchItems.server.ts
  • apps/webapp/app/v3/runEngine.server.ts
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
  • apps/webapp/app/runEngine/types.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
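As a rough illustration of why the webapp rules route all reads through a single env export, here is a dependency-free TypeScript sketch. It is not the contents of env.server.ts, which this review does not show: `parseEnv`, `AppEnv`, and `describeQueueLimit` are hypothetical names, and treating `MAXIMUM_ENV_QUEUE_SIZE` as an optional non-negative integer is an assumption made for the example.

```typescript
// Hypothetical stand-in for a validated `env` export.
type RawEnv = Record<string, string | undefined>;

interface AppEnv {
  // Assumed here to be an optional non-negative integer.
  MAXIMUM_ENV_QUEUE_SIZE?: number;
}

// Validate once, at the boundary, so callers never see raw strings.
function parseEnv(raw: RawEnv): AppEnv {
  const value = raw.MAXIMUM_ENV_QUEUE_SIZE?.trim();
  if (value === undefined || value === "") {
    return {};
  }
  const parsed = Number(value);
  if (!Number.isInteger(parsed) || parsed < 0) {
    throw new Error(`MAXIMUM_ENV_QUEUE_SIZE must be a non-negative integer, got "${value}"`);
  }
  return { MAXIMUM_ENV_QUEUE_SIZE: parsed };
}

// Callers depend on the parsed object, not on the raw environment.
function describeQueueLimit(env: AppEnv): string {
  return env.MAXIMUM_ENV_QUEUE_SIZE === undefined
    ? "no limit configured"
    : `limit: ${env.MAXIMUM_ENV_QUEUE_SIZE}`;
}
```

With this shape, a malformed value fails once at startup instead of being re-parsed (and possibly misread) at every call site.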
**/*.{test,spec}.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use vitest for all tests in the Trigger.dev repository

Files:

  • internal-packages/run-engine/src/engine/tests/getSnapshotsSince.test.ts
  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • internal-packages/run-engine/src/engine/tests/ttl.test.ts
  • internal-packages/run-engine/src/engine/tests/attemptFailures.test.ts
**/*.test.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.test.{ts,tsx,js,jsx}: Test files should live beside the files under test and use descriptive describe and it blocks
Tests should avoid mocks or stubs and use the helpers from @internal/testcontainers when Redis or Postgres are needed
Use vitest for running unit tests

**/*.test.{ts,tsx,js,jsx}: Use vitest exclusively for testing and never mock anything; use testcontainers instead
Place test files next to source files with naming pattern: source file (e.g., MyService.ts) → MyService.test.ts

Files:

  • internal-packages/run-engine/src/engine/tests/getSnapshotsSince.test.ts
  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • internal-packages/run-engine/src/engine/tests/ttl.test.ts
  • internal-packages/run-engine/src/engine/tests/attemptFailures.test.ts
**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Use testcontainers helpers (redisTest, postgresTest, containerTest) from @internal/testcontainers for Redis/PostgreSQL testing instead of mocks

Files:

  • internal-packages/run-engine/src/engine/tests/getSnapshotsSince.test.ts
  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • internal-packages/run-engine/src/engine/tests/ttl.test.ts
  • internal-packages/run-engine/src/engine/tests/attemptFailures.test.ts
packages/trigger-sdk/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

In the Trigger.dev SDK (packages/trigger-sdk), prefer isomorphic code like fetch and ReadableStream instead of Node.js-specific code

Files:

  • packages/trigger-sdk/src/v3/shared.ts
🧠 Learnings (40)
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `idempotencyKeyTTL` option to define a time window during which duplicate triggers return the original run

Applied to files:

  • internal-packages/run-engine/src/run-queue/types.ts
  • internal-packages/run-engine/src/engine/types.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • internal-packages/run-engine/src/run-queue/keyProducer.ts
  • apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
  • internal-packages/run-engine/src/run-queue/index.ts
  • internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2026-01-12T17:18:09.451Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2870
File: apps/webapp/app/services/redisConcurrencyLimiter.server.ts:56-66
Timestamp: 2026-01-12T17:18:09.451Z
Learning: In `apps/webapp/app/services/redisConcurrencyLimiter.server.ts`, the query concurrency limiter will not be deployed with Redis Cluster mode, so multi-key operations (keyKey and globalKey in different hash slots) are acceptable and will function correctly in standalone Redis mode.

Applied to files:

  • internal-packages/run-engine/src/run-queue/types.ts
  • apps/webapp/app/v3/runEngine.server.ts
  • internal-packages/run-engine/src/engine/tests/ttl.test.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use metadata methods (set, del, replace, append, remove, increment, decrement, stream, flush) to update metadata during task execution

Applied to files:

  • internal-packages/run-engine/src/batch-queue/completionTracker.ts
  • packages/trigger-sdk/src/v3/shared.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `runs.subscribeToBatch()` to subscribe to changes for all runs in a batch

Applied to files:

  • internal-packages/run-engine/src/batch-queue/completionTracker.ts
  • packages/trigger-sdk/src/v3/shared.ts
  • internal-packages/run-engine/src/batch-queue/index.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
  • internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Attach metadata to task runs using the metadata option when triggering, and access/update it inside runs using metadata functions

Applied to files:

  • internal-packages/run-engine/src/batch-queue/completionTracker.ts
  • packages/trigger-sdk/src/v3/shared.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `.withStreams()` to subscribe to realtime streams from task metadata in addition to run changes

Applied to files:

  • packages/trigger-sdk/src/v3/shared.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Subscribe to run updates using `runs.subscribeToRun()` for realtime monitoring of task execution

Applied to files:

  • packages/trigger-sdk/src/v3/shared.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
  • internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTrigger()` to trigger multiple runs of a task from inside another task

Applied to files:

  • packages/trigger-sdk/src/v3/shared.ts
  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
  • apps/webapp/app/runEngine/types.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `tasks.batchTrigger()` to trigger multiple runs of a single task with different payloads

Applied to files:

  • packages/trigger-sdk/src/v3/shared.ts
  • apps/webapp/app/runEngine/services/streamBatchItems.server.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/runEngine/types.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
📚 Learning: 2025-11-27T16:26:37.432Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to packages/trigger-sdk/**/*.{ts,tsx} : In the Trigger.dev SDK (packages/trigger-sdk), prefer isomorphic code like fetch and ReadableStream instead of Node.js-specific code

Applied to files:

  • packages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.trigger()` to trigger multiple different tasks at once from backend code

Applied to files:

  • packages/trigger-sdk/src/v3/shared.ts
  • apps/webapp/app/runEngine/services/streamBatchItems.server.ts
  • apps/webapp/app/runEngine/types.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTaskAndWait()` to batch trigger tasks by passing task instances and wait for results

Applied to files:

  • packages/trigger-sdk/src/v3/shared.ts
  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • apps/webapp/app/runEngine/types.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
  • internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerAndWait()` to batch trigger multiple different tasks and wait for results

Applied to files:

  • packages/trigger-sdk/src/v3/shared.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • apps/webapp/app/runEngine/types.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-14T16:03:06.917Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2681
File: apps/webapp/app/services/platform.v3.server.ts:258-302
Timestamp: 2025-11-14T16:03:06.917Z
Learning: In `apps/webapp/app/services/platform.v3.server.ts`, the `getDefaultEnvironmentConcurrencyLimit` function intentionally throws an error (rather than falling back to org.maximumConcurrencyLimit) when the billing client returns undefined plan limits. This fail-fast behavior prevents users from receiving more concurrency than their plan entitles them to. The org.maximumConcurrencyLimit fallback is only for self-hosted deployments where no billing client exists.

Applied to files:

  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
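The fail-fast behavior described in this learning can be sketched as follows. This is a simplified, synchronous illustration under assumed types: `PlanLimits`, `BillingClient`, `Org`, and `getDefaultConcurrencyLimitSketch` are hypothetical, and the real `getDefaultEnvironmentConcurrencyLimit` may differ in signature and detail.

```typescript
// Hypothetical shapes; the real billing client and organization model are not shown here.
interface PlanLimits {
  concurrentRuns: number;
}

interface BillingClient {
  getPlanLimits(orgId: string): PlanLimits | undefined;
}

interface Org {
  id: string;
  maximumConcurrencyLimit: number;
}

function getDefaultConcurrencyLimitSketch(org: Org, billing?: BillingClient): number {
  if (!billing) {
    // Self-hosted deployment: no billing client exists, so the org-level fallback applies.
    return org.maximumConcurrencyLimit;
  }
  const limits = billing.getPlanLimits(org.id);
  if (!limits) {
    // Fail fast instead of falling back, so nobody receives more than their plan allows.
    throw new Error(`No plan limits returned for organization ${org.id}`);
  }
  return limits.concurrentRuns;
}
```

The two branches are deliberately asymmetric: a missing billing client is a valid configuration, while a billing client that returns nothing is treated as an error.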
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Control concurrency using the `queue` property with `concurrencyLimit` option

Applied to files:

  • apps/webapp/app/v3/utils/queueLimits.server.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/runEngine/types.ts
  • apps/webapp/app/runEngine/concerns/queues.server.ts
  • internal-packages/run-engine/src/run-queue/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use the `task()` function from `trigger.dev/sdk/v3` to define tasks with id and run properties

Applied to files:

  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
  • apps/webapp/app/runEngine/concerns/queues.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.triggerAndWait()` to trigger a task and wait for its result from a parent task

Applied to files:

  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:26:44.496Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/executing-commands.mdc:0-0
Timestamp: 2025-11-27T16:26:44.496Z
Learning: For running tests, navigate into the package directory and run `pnpm run test --run` to enable single-file test execution (e.g., `pnpm run test ./src/engine/tests/ttl.test.ts --run`)

Applied to files:

  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • internal-packages/run-engine/src/engine/tests/ttl.test.ts
  • internal-packages/run-engine/src/engine/tests/attemptFailures.test.ts
  • internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:26:37.432Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to **/*.{test,spec}.{ts,tsx} : Use vitest for all tests in the Trigger.dev repository

Applied to files:

  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
📚 Learning: 2026-01-15T11:50:06.067Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-15T11:50:06.067Z
Learning: Applies to **/*.{ts,tsx} : Every Trigger.dev task must be exported and have a unique `id` property, with no timeouts in its run function

Applied to files:

  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
📚 Learning: 2025-10-08T11:48:12.327Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 2593
File: packages/core/src/v3/workers/warmStartClient.ts:168-170
Timestamp: 2025-10-08T11:48:12.327Z
Learning: The trigger.dev runners execute only in Node 21 and 22 environments, so modern Node.js APIs like AbortSignal.any (introduced in v20.3.0) are supported.

Applied to files:

  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
📚 Learning: 2025-07-12T18:06:04.133Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2264
File: apps/webapp/app/services/runsRepository.server.ts:172-174
Timestamp: 2025-07-12T18:06:04.133Z
Learning: In apps/webapp/app/services/runsRepository.server.ts, the in-memory status filtering after fetching runs from Prisma is intentionally used as a workaround for ClickHouse data delays. This approach is acceptable because the result set is limited to a maximum of 100 runs due to pagination, making the performance impact negligible.

Applied to files:

  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • internal-packages/run-engine/src/engine/tests/ttl.test.ts
  • apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTriggerAndWait()` to batch trigger tasks and wait for all results from a parent task

Applied to files:

  • internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts
  • apps/webapp/app/runEngine/types.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
  • internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `tasks.trigger()` with type-only imports to trigger tasks from backend code without importing the task implementation

Applied to files:

  • internal-packages/run-engine/src/engine/tests/trigger.test.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/app/**/*.{ts,tsx} : Access all environment variables through the `env` export of `env.server.ts` instead of directly accessing `process.env` in the Trigger.dev webapp

Applied to files:

  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
📚 Learning: 2025-08-14T18:35:44.370Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 2390
File: apps/webapp/app/env.server.ts:764-765
Timestamp: 2025-08-14T18:35:44.370Z
Learning: The BoolEnv helper in apps/webapp/app/utils/boolEnv.ts uses z.preprocess, and its default values are typed inconsistently across the codebase: some usages pass boolean defaults (correct) while others pass string defaults (incorrect), which leads to type confusion. The helper should enforce boolean-only defaults or have clearer documentation.

Applied to files:

  • apps/webapp/app/env.server.ts
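One way to enforce boolean-only defaults, as this learning suggests, is to make the default a typed parameter. The sketch below does not use zod and is not the actual BoolEnv helper: `boolEnvSketch` is a hypothetical name, and the set of accepted truthy strings ("true" and "1", case-insensitive) is an assumption for illustration.

```typescript
// Hypothetical helper: the default must be a boolean, so passing the string "false"
// as a default is rejected by the type checker rather than silently accepted.
function boolEnvSketch(defaultValue: boolean): (raw: string | undefined) => boolean {
  return (raw) => {
    if (raw === undefined || raw.trim() === "") {
      return defaultValue;
    }
    const normalized = raw.trim().toLowerCase();
    // Assumed truthy spellings; everything else is treated as false.
    return normalized === "true" || normalized === "1";
  };
}
```

For example, `boolEnvSketch(true)(undefined)` falls back to the default, while `boolEnvSketch(false)("TRUE")` parses the provided value.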
📚 Learning: 2025-08-19T09:49:07.011Z
Learnt from: julienvanbeveren
Repo: triggerdotdev/trigger.dev PR: 2417
File: apps/webapp/app/routes/api.v1.projects.$projectRef.envvars.$slug.import.ts:56-61
Timestamp: 2025-08-19T09:49:07.011Z
Learning: In the Trigger.dev codebase, environment variables should default to `isSecret: false` when not explicitly marked as secrets in the syncEnvVars functionality. This is the intended behavior for both regular variables and parent variables.

Applied to files:

  • apps/webapp/app/env.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Scope idempotency keys globally or to current run using the scope parameter

Applied to files:

  • apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `idempotencyKeys.create()` to create idempotency keys for preventing duplicate task executions

Applied to files:

  • apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Export tasks with unique IDs within the project to enable proper task discovery and execution

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `schedules.task()` for scheduled/cron tasks instead of regular `task()`

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `schemaTask()` from `trigger.dev/sdk/v3` with Zod schema for payload validation

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger.config.ts : Specify runtime environment (node or bun) in trigger.config.ts using the `runtime` property

Applied to files:

  • apps/webapp/app/presenters/v3/LimitsPresenter.server.ts
📚 Learning: 2025-12-18T14:09:10.154Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 2794
File: internal-packages/run-engine/src/engine/systems/debounceSystem.ts:390-397
Timestamp: 2025-12-18T14:09:10.154Z
Learning: In the debounce system (internal-packages/run-engine/src/engine/systems/debounceSystem.ts), millisecond delays are not supported. The minimum debounce delay is 1 second (1s). The parseNaturalLanguageDuration function supports w/d/hr/h/m/s units only.

Applied to files:

  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Limit task duration using the `maxDuration` property (in seconds)

Applied to files:

  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
  • internal-packages/run-engine/src/engine/systems/ttlSystem.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use logger methods (debug, log, info, warn, error) from `trigger.dev/sdk/v3` for structured logging in tasks

Applied to files:

  • internal-packages/run-engine/src/engine/systems/enqueueSystem.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTask()` to batch trigger tasks by passing task instances for static task sets

Applied to files:

  • apps/webapp/app/runEngine/types.ts
  • apps/webapp/app/runEngine/services/batchTrigger.server.ts
📚 Learning: 2026-02-03T18:27:40.429Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 2994
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.environment-variables/route.tsx:553-555
Timestamp: 2026-02-03T18:27:40.429Z
Learning: In apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.environment-variables/route.tsx, the menu buttons (e.g., Edit with PencilSquareIcon) in the TableCellMenu are intentionally icon-only with no text labels as a compact UI pattern. This is a deliberate design choice for this route; preserve the icon-only behavior for consistency in this file.

Applied to files:

  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
📚 Learning: 2025-12-08T15:19:56.823Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 2760
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam/route.tsx:278-281
Timestamp: 2025-12-08T15:19:56.823Z
Learning: In apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam/route.tsx, the tableState search parameter uses intentional double-encoding: the parameter value contains a URL-encoded URLSearchParams string, so decodeURIComponent(value("tableState") ?? "") is required to fully decode it before parsing with new URLSearchParams(). This pattern allows bundling multiple filter/pagination params as a single search parameter.

Applied to files:

  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx
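The double-encoding pattern described in this learning can be reproduced with the standard `URLSearchParams` API. The sketch below is illustrative only: `encodeTableState` and `decodeTableState` are hypothetical helpers, and `URLSearchParams.get` stands in for the route's own `value("tableState")` accessor.

```typescript
// Bundle several filter/pagination params into one `tableState` search parameter.
function encodeTableState(filters: Record<string, string>): string {
  const inner = new URLSearchParams(filters).toString();
  const outer = new URLSearchParams();
  // The value is URL-encoded once here and again when `outer` is serialized.
  outer.set("tableState", encodeURIComponent(inner));
  return outer.toString();
}

// Reverse both layers: URLSearchParams undoes one, decodeURIComponent the other.
function decodeTableState(search: string): URLSearchParams {
  const outer = new URLSearchParams(search);
  const onceDecoded = outer.get("tableState") ?? "";
  return new URLSearchParams(decodeURIComponent(onceDecoded));
}
```

Skipping the explicit `decodeURIComponent` step would leave the inner string still percent-encoded, so it would parse as a single oddly named key rather than the original parameters.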
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Use the Run Engine 2.0 from `internal/run-engine` for new run lifecycle code in the webapp instead of the legacy run engine

Applied to files:

  • internal-packages/run-engine/src/engine/index.ts
🧬 Code graph analysis (15)
apps/webapp/app/v3/utils/queueLimits.server.ts (1)
apps/webapp/app/env.server.ts (1)
  • env (1332-1332)
internal-packages/run-engine/src/engine/systems/waitpointSystem.ts (1)
internal-packages/run-engine/src/engine/statuses.ts (1)
  • isFinalRunStatus (55-57)
apps/webapp/app/v3/runEngine.server.ts (2)
apps/webapp/app/env.server.ts (1)
  • env (1332-1332)
apps/supervisor/src/env.ts (1)
  • env (144-144)
internal-packages/run-engine/src/run-queue/keyProducer.ts (1)
internal-packages/run-engine/src/run-queue/index.ts (4)
  • shard (1170-1236)
  • shard (1258-1318)
  • shard (1324-1350)
  • shard (1420-1545)
apps/webapp/app/env.server.ts (2)
apps/webapp/app/utils/boolEnv.ts (1)
  • BoolEnv (12-14)
apps/supervisor/src/envUtil.ts (1)
  • BoolEnv (15-17)
apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts (1)
packages/core/src/v3/isomorphic/friendlyId.ts (1)
  • RunId (93-93)
apps/webapp/app/runEngine/services/triggerTask.server.ts (3)
internal-packages/run-engine/src/engine/errors.ts (1)
  • ServiceValidationError (69-78)
apps/webapp/app/v3/services/common.server.ts (1)
  • ServiceValidationError (1-6)
apps/webapp/app/utils/packets.ts (1)
  • handleMetadataPacket (11-37)
apps/webapp/app/presenters/v3/LimitsPresenter.server.ts (2)
apps/webapp/app/v3/runEngine.server.ts (1)
  • engine (11-11)
apps/webapp/app/v3/utils/queueLimits.server.ts (2)
  • getQueueSizeLimit (24-33)
  • getQueueSizeLimitSource (42-51)
internal-packages/run-engine/src/batch-queue/index.ts (3)
apps/webapp/app/runEngine/services/batchTrigger.server.ts (1)
  • batchId (164-350)
packages/core/src/v3/apiClient/index.ts (1)
  • batchId (419-537)
internal-packages/run-engine/src/engine/systems/batchSystem.ts (1)
  • batchId (39-137)
internal-packages/run-engine/src/engine/systems/enqueueSystem.ts (1)
packages/core/src/v3/isomorphic/duration.ts (1)
  • parseNaturalLanguageDuration (76-79)
apps/webapp/app/runEngine/types.ts (2)
internal-packages/run-engine/src/run-queue/fairQueueSelectionStrategy.ts (4)
  • env (501-521)
  • env (560-578)
  • env (580-594)
  • env (596-616)
internal-packages/run-engine/src/shared/index.ts (1)
  • AuthenticatedEnvironment (4-6)
apps/webapp/app/runEngine/concerns/queues.server.ts (2)
apps/webapp/app/v3/models/workerDeployment.server.ts (1)
  • findCurrentWorkerFromEnvironment (198-224)
apps/webapp/app/runEngine/types.ts (1)
  • QueueValidationResult (38-46)
internal-packages/run-engine/src/run-queue/index.ts (1)
internal-packages/redis/src/index.ts (2)
  • Callback (4-4)
  • Result (4-4)
apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.limits/route.tsx (1)
apps/webapp/app/utils/numberFormatter.ts (1)
  • formatNumber (15-22)
internal-packages/run-engine/src/engine/index.ts (2)
internal-packages/run-engine/src/run-queue/index.ts (2)
  • options (1152-1168)
  • options (1240-1256)
internal-packages/run-engine/src/shared/index.ts (1)
  • MinimalAuthenticatedEnvironment (8-19)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: sdk-compat / Cloudflare Workers
  • GitHub Check: typecheck / typecheck
  • GitHub Check: sdk-compat / Deno Runtime
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: sdk-compat / Bun Runtime
  • GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +112 to +130
/**
 * Update the runCount in batch metadata.
 * Used when items are skipped due to queue limits.
 */
async updateRunCount(batchId: string, newRunCount: number): Promise<void> {
  const meta = await this.getMeta(batchId);
  if (!meta) {
    this.logger.error("Cannot update runCount: batch metadata not found", { batchId });
    return;
  }

  const updatedMeta: BatchMeta = {
    ...meta,
    runCount: newRunCount,
  };

  await this.storeMeta(batchId, updatedMeta);
  this.logger.debug("Updated batch runCount", { batchId, oldRunCount: meta.runCount, newRunCount });
}

⚠️ Potential issue | 🟡 Minor

Guard against invalid runCount values.
updateRunCount should reject negative/NaN/non‑integer values to avoid corrupting batch metadata.

🛡️ Suggested guard
  async updateRunCount(batchId: string, newRunCount: number): Promise<void> {
+    if (!Number.isInteger(newRunCount) || newRunCount < 0) {
+      this.logger.error("Invalid runCount update", { batchId, newRunCount });
+      return;
+    }
     const meta = await this.getMeta(batchId);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/**
 * Update the runCount in batch metadata.
 * Used when items are skipped due to queue limits.
 */
async updateRunCount(batchId: string, newRunCount: number): Promise<void> {
  if (!Number.isInteger(newRunCount) || newRunCount < 0) {
    this.logger.error("Invalid runCount update", { batchId, newRunCount });
    return;
  }
  const meta = await this.getMeta(batchId);
  if (!meta) {
    this.logger.error("Cannot update runCount: batch metadata not found", { batchId });
    return;
  }
  const updatedMeta: BatchMeta = {
    ...meta,
    runCount: newRunCount,
  };
  await this.storeMeta(batchId, updatedMeta);
  this.logger.debug("Updated batch runCount", { batchId, oldRunCount: meta.runCount, newRunCount });
}
🤖 Prompt for AI Agents
In `@internal-packages/run-engine/src/batch-queue/completionTracker.ts` around
lines 112 - 130, The updateRunCount method currently trusts newRunCount; add
input validation in updateRunCount (in completionTracker.ts) to reject negative,
NaN, non-finite or non-integer values before calling getMeta/storeMeta: use
Number.isInteger(newRunCount) && newRunCount >= 0 (and optionally
Number.isFinite) and if the check fails log an error via this.logger.error with
batchId and the invalid value and then throw an Error or return a rejected
Promise so the invalid value is never persisted; keep existing behavior for meta
retrieval (getMeta) and storage (storeMeta) otherwise.
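
As a standalone illustration of the guard discussed in this comment, the check can be pulled into a small pure function. This is a sketch only: `validateRunCount` is a hypothetical helper that does not exist in the PR, and returning an error message rather than throwing is an assumption made for easy testing.

```typescript
// Hypothetical helper, not part of the PR: returns an error message for an
// invalid run count, or null when the value is safe to persist.
function validateRunCount(newRunCount: number): string | null {
  if (!Number.isInteger(newRunCount)) {
    // Number.isInteger is false for NaN, Infinity, and fractional values,
    // so no separate Number.isFinite check is needed.
    return `runCount must be an integer, got ${newRunCount}`;
  }
  if (newRunCount < 0) {
    return `runCount must be non-negative, got ${newRunCount}`;
  }
  return null;
}
```

A caller such as `updateRunCount` could log the returned message and either return early or throw, depending on whether a skipped update should surface to the caller.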

Comment on lines +380 to +386
/**
 * Update the runCount for a batch.
 * Used when items are skipped due to queue limits.
 */
async updateRunCount(batchId: string, newRunCount: number): Promise<void> {
  return this.completionTracker.updateRunCount(batchId, newRunCount);
}

⚠️ Potential issue | 🟠 Major

Batch can get stuck if runCount drops after processing completes.
If newRunCount is lowered to a value already reached by processedCount, no further items may run to trigger finalization, leaving the batch incomplete.

🛠️ Suggested fix
  async updateRunCount(batchId: string, newRunCount: number): Promise<void> {
-    return this.completionTracker.updateRunCount(batchId, newRunCount);
+    await this.completionTracker.updateRunCount(batchId, newRunCount);
+    const meta = await this.completionTracker.getMeta(batchId);
+    if (!meta) return;
+    const processedCount = await this.completionTracker.getProcessedCount(batchId);
+    if (processedCount >= meta.runCount) {
+      await this.#finalizeBatch(batchId, meta);
+    }
  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/**
 * Update the runCount for a batch.
 * Used when items are skipped due to queue limits.
 */
async updateRunCount(batchId: string, newRunCount: number): Promise<void> {
  await this.completionTracker.updateRunCount(batchId, newRunCount);
  const meta = await this.completionTracker.getMeta(batchId);
  if (!meta) return;
  const processedCount = await this.completionTracker.getProcessedCount(batchId);
  if (processedCount >= meta.runCount) {
    await this.#finalizeBatch(batchId, meta);
  }
}
🤖 Prompt for AI Agents
In `@internal-packages/run-engine/src/batch-queue/index.ts` around lines 380 -
386, The updateRunCount method can lower runCount to or below the
already-processed count and leave a batch stuck; before calling
this.completionTracker.updateRunCount(batchId, newRunCount) clamp newRunCount to
at least the current processed count (e.g. newRunCount = Math.max(newRunCount,
await this.completionTracker.getProcessedCount(batchId))) or, alternatively,
after updating call into the completion tracker to run the
completion/finalization check (e.g.
this.completionTracker.checkCompletion/finalizeBatch for batchId) so that if
processedCount >= runCount the batch is finalized immediately; update
updateRunCount to perform one of these fixes using the existing
completionTracker APIs.
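
The stuck-batch scenario and the "re-check completion after lowering runCount" fix can be modelled in memory. This is a simplified sketch under stated assumptions: `BatchState`, `finalizeIfComplete`, and `recordProcessed` are illustrative names, and the real tracker stores this state in Redis rather than in a plain object.

```typescript
// In-memory stand-in for batch completion state; all names are illustrative.
interface BatchState {
  runCount: number;
  processedCount: number;
  finalized: boolean;
}

// Finalize once every expected run has been processed. Safe to call repeatedly.
function finalizeIfComplete(batch: BatchState): boolean {
  if (!batch.finalized && batch.processedCount >= batch.runCount) {
    batch.finalized = true;
  }
  return batch.finalized;
}

// Called each time an item finishes processing.
function recordProcessed(batch: BatchState): void {
  batch.processedCount += 1;
  finalizeIfComplete(batch);
}

// Lowering runCount re-runs the completion check, because no further item
// may arrive to trigger it.
function updateRunCount(batch: BatchState, newRunCount: number): void {
  batch.runCount = newRunCount;
  finalizeIfComplete(batch);
}
```

Without the `finalizeIfComplete` call inside `updateRunCount`, a batch that has processed 3 of 5 items and is then lowered to a runCount of 3 would never be finalized.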

Comment on lines +1 to +8
import { containerTest, assertNonNullable } from "@internal/testcontainers";
import { trace } from "@internal/tracing";
import { expect } from "vitest";
import { RunEngine } from "../index.js";
import { setTimeout } from "node:timers/promises";
import { setupAuthenticatedEnvironment, setupBackgroundWorker } from "./setup.js";

vi.setConfig({ testTimeout: 60_000 });

⚠️ Potential issue | 🟡 Minor

Missing vi import from vitest.

The vi global is used on line 8 but not imported. While vitest may auto-inject globals, explicit imports are more reliable.

🔧 Proposed fix
 import { containerTest, assertNonNullable } from "@internal/testcontainers";
 import { trace } from "@internal/tracing";
-import { expect } from "vitest";
+import { expect, vi } from "vitest";
 import { RunEngine } from "../index.js";
🤖 Prompt for AI Agents
In `@internal-packages/run-engine/src/engine/tests/lazyWaitpoint.test.ts` around
lines 1 - 8, The test calls vi.setConfig to set the test timeout but never
imports vi; update the imports to explicitly import vi from 'vitest' (for
example change the existing import of expect to import { expect, vi } from
"vitest") so the vi symbol used by vi.setConfig is defined even when vitest
globals are not enabled.

Comment on lines +1238 to +1304
// TTL System Methods

#startTtlConsumers() {
  if (!this.options.ttlSystem) {
    this.logger.debug("TTL system disabled (no ttlSystem config)");
    return;
  }

  const shardCount = this.options.ttlSystem.shardCount ?? this.shardCount;

  for (let i = 0; i < shardCount; i++) {
    this.logger.debug(`Starting TTL consumer ${i}`);
    this.#startTtlConsumer(i).catch((err) => {
      this.logger.error(`Failed to start TTL consumer ${i}`, { error: err });
    });
  }

  this.logger.debug(`Started ${shardCount} TTL consumers`);
}

async #startTtlConsumer(shard: number) {
  if (!this.options.ttlSystem) {
    return;
  }

  const pollIntervalMs = this.options.ttlSystem.pollIntervalMs ?? 1000;
  const batchSize = this.options.ttlSystem.batchSize ?? 100;
  let processedCount = 0;

  try {
    for await (const _ of setInterval(pollIntervalMs, null, {
      signal: this.abortController.signal,
    })) {
      const now = Date.now();

      const [error, expiredRuns] = await tryCatch(
        this.#expireTtlRuns(shard, now, batchSize)
      );

      if (error) {
        this.logger.error(`Failed to expire TTL runs for shard ${shard}`, {
          error,
          service: this.name,
          shard,
        });
        continue;
      }

      if (expiredRuns.length > 0) {
        this.logger.debug(`Expired ${expiredRuns.length} TTL runs in shard ${shard}`, {
          service: this.name,
          shard,
          count: expiredRuns.length,
        });

        // Call the callback with expired runs
        try {
          await this.options.ttlSystem!.callback(expiredRuns);
          processedCount += expiredRuns.length;
        } catch (callbackError) {
          this.logger.error(`TTL callback failed for shard ${shard}`, {
            error: callbackError,
            service: this.name,
            shard,
            runCount: expiredRuns.length,
          });
        }

⚠️ Potential issue | 🟠 Major

Callback failures can silently drop expired runs.

#expireTtlRuns() removes runs before invoking the callback, and failures are only logged. If the callback is responsible for persisting expiration state, those runs may be lost without retry. Consider a durable retry path (e.g., a Redis list/sorted set of failed expirations, or a dedicated worker with backoff) or make the callback itself handle retry + idempotency explicitly.

🤖 Prompt for AI Agents
In `@internal-packages/run-engine/src/run-queue/index.ts` around lines 1238 -
1304, The code currently calls `#expireTtlRuns`(shard, now, batchSize) which
deletes/marks runs as expired before invoking
options.ttlSystem!.callback(expiredRuns), so callback failures only get logged
and those runs can be lost; update `#startTtlConsumer` (or `#expireTtlRuns`) so that
expiration is durable: either (A) defer permanent deletion/marking of runs until
after the callback completes successfully (move the delete/mark logic to after
options.ttlSystem!.callback returns) or (B) if deletion must happen first, add a
durable retry/enqueue path when the callback throws — e.g., push failed expired
run IDs to a retry store (Redis list/sorted set) with metadata and backoff, and
spawn a retry worker to reprocess them; ensure the callback and retry path are
idempotent and update logging to include run IDs and shard when queuing retries.
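
Option (B) above, a retry path for expirations whose callback failed, can be sketched with an in-memory buffer and exponential backoff. This is an illustration under stated assumptions, not the PR's design: `ExpirationRetryBuffer` is a hypothetical class, the callback is synchronous for brevity (the real one is async), and an in-memory array is not durable, so a real implementation would keep these entries in Redis as the comment suggests.

```typescript
// Illustrative retry buffer for expired-run callbacks; all names are hypothetical.
interface RetryEntry {
  member: string; // TTL member, e.g. "queueKey|runId|orgId"
  attempts: number; // failed delivery attempts so far
  nextAttemptAt: number; // epoch milliseconds
}

type ExpiredCallback = (members: string[]) => void;

class ExpirationRetryBuffer {
  private entries: RetryEntry[] = [];
  private readonly baseDelayMs: number;
  private readonly maxDelayMs: number;

  constructor(baseDelayMs = 1000, maxDelayMs = 60_000) {
    this.baseDelayMs = baseDelayMs;
    this.maxDelayMs = maxDelayMs;
  }

  // Deliver members to the callback; on failure, keep them for a later retry.
  deliver(members: string[], callback: ExpiredCallback, now: number, attempts = 0): boolean {
    if (members.length === 0) return true;
    try {
      callback(members);
      return true;
    } catch {
      // Exponential backoff, capped at maxDelayMs.
      const delay = Math.min(this.baseDelayMs * 2 ** attempts, this.maxDelayMs);
      for (const member of members) {
        this.entries.push({ member, attempts: attempts + 1, nextAttemptAt: now + delay });
      }
      return false;
    }
  }

  // Retry every entry whose backoff has elapsed; returns how many were delivered.
  retryDue(callback: ExpiredCallback, now: number): number {
    const due = this.entries.filter((e) => e.nextAttemptAt <= now);
    this.entries = this.entries.filter((e) => e.nextAttemptAt > now);
    let delivered = 0;
    for (const entry of due) {
      if (this.deliver([entry.member], callback, now, entry.attempts)) delivered += 1;
    }
    return delivered;
  }

  pendingCount(): number {
    return this.entries.length;
  }
}
```

Because a member can be delivered more than once if the callback fails partway through, the callback still needs to be idempotent, as the comment notes.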

Comment on lines +2561 to +2637
// Expire TTL runs - atomically removes from TTL set and acknowledges from normal queue
this.redis.defineCommand("expireTtlRuns", {
  numberOfKeys: 1,
  lua: `
local ttlQueueKey = KEYS[1]
local keyPrefix = ARGV[1]
local currentTime = tonumber(ARGV[2])
local batchSize = tonumber(ARGV[3])
local shardCount = tonumber(ARGV[4])

-- Get expired runs from TTL sorted set (score <= currentTime)
local expiredMembers = redis.call('ZRANGEBYSCORE', ttlQueueKey, '-inf', currentTime, 'LIMIT', 0, batchSize)

if #expiredMembers == 0 then
  return {}
end

local results = {}

for i, member in ipairs(expiredMembers) do
  -- Parse member format: "queueKey|runId|orgId"
  local pipePos1 = string.find(member, "|", 1, true)
  if pipePos1 then
    local pipePos2 = string.find(member, "|", pipePos1 + 1, true)
    if pipePos2 then
      local queueKey = string.sub(member, 1, pipePos1 - 1)
      local runId = string.sub(member, pipePos1 + 1, pipePos2 - 1)
      local orgId = string.sub(member, pipePos2 + 1)

      -- Remove from TTL set
      redis.call('ZREM', ttlQueueKey, member)

      -- Construct keys for acknowledging the run from normal queue
      -- Extract org from queueKey: {org:orgId}:proj:...
      local orgKeyStart = string.find(queueKey, "{org:", 1, true)
      local orgKeyEnd = string.find(queueKey, "}", orgKeyStart, true)
      local orgFromQueue = string.sub(queueKey, orgKeyStart + 5, orgKeyEnd - 1)

      local messageKey = keyPrefix .. "{org:" .. orgFromQueue .. "}:message:" .. runId

      -- Delete message key
      redis.call('DEL', messageKey)

      -- Remove from queue sorted set
      redis.call('ZREM', queueKey, runId)

      -- Remove from env queue (derive from queueKey)
      -- queueKey format: {org:X}:proj:Y:env:Z:queue:Q[:ck:C]
      local envQueueKey = string.match(queueKey, "(.+):queue:")
      if envQueueKey then
        -- envQueueKey is now "{org:X}:proj:Y:env:Z" but we need "{org:X}:env:Z"
        local envMatch = string.match(queueKey, ":env:([^:]+)")
        if envMatch then
          envQueueKey = "{org:" .. orgFromQueue .. "}:env:" .. envMatch
          redis.call('ZREM', envQueueKey, runId)
        end
      end

      -- Remove from concurrency sets
      local concurrencyKey = queueKey .. ":currentConcurrency"
      local dequeuedKey = queueKey .. ":currentDequeued"
      redis.call('SREM', concurrencyKey, runId)
      redis.call('SREM', dequeuedKey, runId)

      -- Env concurrency (derive from queueKey)
      local envConcurrencyKey = "{org:" .. orgFromQueue .. "}:env:" .. (string.match(queueKey, ":env:([^:]+)") or "") .. ":currentConcurrency"
      local envDequeuedKey = "{org:" .. orgFromQueue .. "}:env:" .. (string.match(queueKey, ":env:([^:]+)") or "") .. ":currentDequeued"
      redis.call('SREM', envConcurrencyKey, runId)
      redis.call('SREM', envDequeuedKey, runId)

      -- Add to results
      table.insert(results, member)
    end
  end
end

return results

⚠️ Potential issue | 🟡 Minor

TTL expiry doesn’t rebalance the master queue.

expireTtlRuns removes messages from the queue/env queue but never updates masterQueueKey. That can leave stale entries and cause repeated empty dequeues. Consider rebalancing per affected queue (either inside the Lua script if you can derive the shard/master key, or in TS after expiry based on returned queueKey values).

🤖 Prompt for AI Agents
In `@internal-packages/run-engine/src/run-queue/index.ts` around lines 2561 -
2637, expireTtlRuns currently removes run entries from per-queue structures but
never updates the master queue (masterQueueKey), leaving stale entries; update
the logic so that either (A) the Lua script also computes the corresponding
master/shard key from queueKey and ZREM the runId from masterQueueKey (or adjust
scores) for each expired member, or (B) return the affected queueKey values (not
just member) from expireTtlRuns so the TypeScript caller can call the existing
rebalance/cleanup routine for masterQueueKey per returned queue; locate
expireTtlRuns, the queueKey local variable and the final results/return section
to implement one of these fixes.
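
For option (B), the TypeScript caller first needs the distinct queue keys touched by an expiry batch. The sketch below parses the `"queueKey|runId|orgId"` member format used by the Lua script; `parseTtlMember` and `affectedQueueKeys` are hypothetical helper names, and the rebalance routine that would consume the result is not shown because its name and signature are not given in this comment.

```typescript
// Hypothetical helpers for post-expiry cleanup; not part of the PR.
interface TtlMember {
  queueKey: string;
  runId: string;
  orgId: string;
}

// Mirrors the Lua parsing: split on the first two "|" characters only, so any
// later "|" stays inside orgId.
function parseTtlMember(member: string): TtlMember | undefined {
  const first = member.indexOf("|");
  if (first === -1) return undefined;
  const second = member.indexOf("|", first + 1);
  if (second === -1) return undefined;
  return {
    queueKey: member.slice(0, first),
    runId: member.slice(first + 1, second),
    orgId: member.slice(second + 1),
  };
}

// Collect each affected queue key once, skipping malformed members.
function affectedQueueKeys(members: string[]): string[] {
  const keys = new Set<string>();
  for (const member of members) {
    const parsed = parseTtlMember(member);
    if (parsed) keys.add(parsed.queueKey);
  }
  return Array.from(keys);
}
```

Deduplicating here means a batch of 100 expired runs from the same queue would trigger a single rebalance for that queue rather than 100.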

@ericallam changed the title from "feat(dashboard): Display environment queue length limits on queues and limits page" to "feat(server): New TTL system, enforce max queue length limits, lazy waitpoint creation" on Feb 5, 2026
