@vaibhavshn
Collaborator

@vaibhavshn vaibhavshn commented Jan 30, 2026

Summary

Adds @cloudflare/tanstack-ai — the canonical package for using TanStack AI with Cloudflare. Supports Workers AI (direct binding and REST API), third-party providers (OpenAI, Anthropic, Gemini, Grok) routed through AI Gateway, and plain Workers AI without a gateway.

Also includes a full-featured demo app at demos/tanstack-ai-chat-react.

Package: @cloudflare/tanstack-ai

Adapters

| Capability | OpenAI | Anthropic | Gemini | Grok | Workers AI |
| --- | --- | --- | --- | --- | --- |
| Chat | createOpenAiChat | createAnthropicChat | createGeminiChat | createGrokChat | createWorkersAiChat |
| Image | createOpenAiImage | | createGeminiImage | createGrokImage | |
| Summarize | createOpenAiSummarize | createAnthropicSummarize | createGeminiSummarize | createGrokSummarize | |
| Transcription | createOpenAiTranscription | | | | |
| TTS | createOpenAiTts | | | | |
| Video | createOpenAiVideo | | | | |

Workers AI config modes

// Direct binding — zero config, no API keys
createWorkersAiChat(model, { binding: env.AI })

// Direct REST — no binding needed
createWorkersAiChat(model, { accountId, apiKey })

// AI Gateway (binding) — routed through gateway
createWorkersAiChat(model, { binding: env.AI.gateway(id), apiKey })

// AI Gateway (credentials) — fully credential-based
createWorkersAiChat(model, { gatewayId, accountId, cfApiKey })

Design decisions

openai as a hard dependency

The Workers AI adapter internally uses the OpenAI SDK to provide a consistent interface. In direct binding mode (env.AI), a custom fetch shim intercepts OpenAI SDK requests, extracts the model/params from the JSON body, calls env.AI.run(), and transforms the Workers AI response into OpenAI-compatible SSE chunks via a TransformStream. This means openai is always needed at runtime, not just for the OpenAI provider.
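
To make the shim's two translation steps concrete, here is a minimal sketch, not the code in this PR. The helper names `splitOpenAiBody` and `toOpenAiSse` are hypothetical, and the sketch assumes each Workers AI stream event looks like `data: {"response":"..."}`. The real shim runs the second step inside a TransformStream; the sketch handles one event at a time so it stays synchronous.

```typescript
// Hypothetical helper: split an OpenAI-style request body into the
// (model, inputs) pair that a Workers AI binding's run() call takes.
function splitOpenAiBody(body: string): {
  model: string;
  inputs: Record<string, unknown>;
} {
  const { model, ...inputs } = JSON.parse(body) as { model: string } & Record<
    string,
    unknown
  >;
  return { model, inputs };
}

// Hypothetical helper: convert one Workers AI SSE event into an
// OpenAI-compatible chat.completion.chunk event.
// Assumes the event payload has the shape {"response": "..."}.
function toOpenAiSse(
  event: string,
  id: string,
  created: number,
  model: string,
): string | null {
  const payload = event.replace(/^data:\s*/, "").trim();
  if (payload === "") return null; // blank separator line, nothing to emit
  if (payload === "[DONE]") return "data: [DONE]\n\n"; // pass the terminator through
  const { response } = JSON.parse(payload) as { response?: string };
  const chunk = {
    id,
    object: "chat.completion.chunk",
    created,
    model,
    choices: [
      { index: 0, delta: { content: response ?? "" }, finish_reason: null },
    ],
  };
  return `data: ${JSON.stringify(chunk)}\n\n`;
}
```

Because the OpenAI SDK only ever sees a fetch-compatible function, the rest of the adapter does not need to know whether the bytes came from the REST API or from the binding.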

Structural discrimination for binding vs gateway

env.AI and env.AI.gateway(id) are both objects with a .run() method, but env.AI also has a .gateway() method. We use this to distinguish them at runtime (isDirectBindingConfig checks for the presence of .gateway on the binding), avoiding any need for explicit mode or type config fields.
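
A minimal sketch of that check, assuming simplified structural types (`DirectBindingLike` and `GatewayLike` are illustrative stand-ins, not the types from @cloudflare/workers-types, and the function body is a guess at the approach rather than the library's source):

```typescript
// Illustrative structural types; the real ones come from @cloudflare/workers-types.
interface GatewayLike {
  run(request: unknown): Promise<unknown>;
}

interface DirectBindingLike {
  run(model: string, inputs: unknown): Promise<unknown>;
  gateway(id: string): GatewayLike;
}

type BindingConfig = { binding: DirectBindingLike | GatewayLike };

// Only the direct env.AI binding exposes .gateway(), so its presence
// is enough to tell the two apart without a `mode` field.
function isDirectBindingConfig(
  config: BindingConfig,
): config is { binding: DirectBindingLike } {
  return (
    "gateway" in config.binding && typeof config.binding.gateway === "function"
  );
}
```

The type predicate lets callers narrow the config in an `if` branch and call `config.binding.run(model, inputs)` without a cast.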

Gemini: credentials only, no binding support

The @google/genai SDK doesn't support a custom fetch override — only baseUrl and headers via httpOptions. So the Gemini adapter can't use the AI Gateway binding (env.AI.gateway(id).run()). Instead it directly sets baseUrl to the AI Gateway REST endpoint (gateway.ai.cloudflare.com/v1/{accountId}/{gatewayId}/google-ai-studio). A runtime guard throws a clear error if a binding config is accidentally passed. Tracking upstream: googleapis/js-genai#999.
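
As a rough illustration of the credentials-only path, the sketch below builds the gateway base URL from the shape quoted above and rejects binding configs. The function names and the error message are hypothetical and are not taken from the adapter source.

```typescript
interface GeminiGatewayCredentials {
  accountId: string;
  gatewayId: string;
}

// URL shape as described above; the https:// scheme is assumed.
function buildGeminiGatewayBaseUrl(config: GeminiGatewayCredentials): string {
  return `https://gateway.ai.cloudflare.com/v1/${config.accountId}/${config.gatewayId}/google-ai-studio`;
}

// Hypothetical runtime guard: fail fast if a binding config is passed,
// since the Gemini SDK cannot be pointed at env.AI.gateway(id).run().
function assertNotBindingConfig(config: object): void {
  if ("binding" in config) {
    throw new Error(
      "Gemini adapter does not support binding configs; pass accountId and gatewayId instead.",
    );
  }
}
```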

Provider SDKs as optional dependencies

@tanstack/ai-openai, @tanstack/ai-anthropic, @tanstack/ai-gemini, @tanstack/ai-grok, and their underlying provider SDKs (@anthropic-ai/sdk, @google/genai) are all optionalDependencies. Users only install what they need. Each adapter file is a separate entry point in tsup and package.json exports, so tree-shaking works and unused adapters aren't bundled.

Workers AI embeddings/image generation held back

The adapters for WorkersAiEmbeddingAdapter and WorkersAiImageAdapter are fully implemented (workers-ai-embedding.ts, workers-ai-image.ts) but intentionally excluded from the public API and build output. TanStack AI doesn't yet export BaseEmbeddingAdapter or BaseImageAdapter for third-party use — once they do, these are ready to ship with no code changes.

Stable stream IDs

transformWorkersAiStream generates a single streamId and created timestamp at the start of each stream and reuses them across all SSE chunks, matching OpenAI's convention. Tool call IDs use a monotonic counter per stream (call_{streamId}_{n}).
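
The idea can be sketched as a small per-stream factory. This is an illustration under assumptions, not `transformWorkersAiStream` itself: the `chatcmpl-` prefix, the random ID scheme, and the counter starting at 0 are all placeholders.

```typescript
// Hypothetical per-stream factory: the ID and timestamp are fixed once,
// then reused for every chunk emitted on that stream.
function createChunkFactory(model: string) {
  const streamId = `chatcmpl-${Math.random().toString(36).slice(2, 12)}`;
  const created = Math.floor(Date.now() / 1000);
  let toolCallCount = 0;

  return {
    // Build one OpenAI-style chunk that shares the stream's id/created.
    chunk(delta: Record<string, unknown>, finishReason: string | null = null) {
      return {
        id: streamId,
        object: "chat.completion.chunk",
        created,
        model,
        choices: [{ index: 0, delta, finish_reason: finishReason }],
      };
    },
    // Monotonic tool call IDs in the call_{streamId}_{n} format.
    nextToolCallId(): string {
      return `call_${streamId}_${toolCallCount++}`;
    },
  };
}
```

Keeping the counter inside the closure means two concurrent streams never share tool call IDs, since each has its own `streamId` prefix.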

Demo: demos/tanstack-ai-chat-react

A Cloudflare Workers + Vite + React demo with three tabs:

  • Chat — streaming multi-provider chat with tool calling (Llama 4 Scout, Qwen3 30B, GPT-5.2, Claude Sonnet 4.5, Gemini 2.5 Flash, Grok 4)
  • Images — image generation with OpenAI, Gemini, and Grok
  • Summarize — text summarization with OpenAI, Anthropic, and Gemini

Users can enter their own Cloudflare credentials (Account ID, AI Gateway ID, API Token) directly in the frontend UI — stored in sessionStorage, passed as request headers. The worker reads credentials from headers and falls back to environment variables for deployed instances.

Tests

  • Gateway fetch: createGatewayFetch with binding/credentials configs, cache headers, Workers AI provider specifics, endpoint extraction
  • Binding fetch: createWorkersAiBindingFetch request translation, stream transform, stable chunk IDs, OpenAI-format passthrough
  • Config detection: isDirectBindingConfig, isDirectCredentialsConfig, isGatewayConfig
  • WorkersAiTextAdapter: chatStream and structuredOutput across all config modes, error emission
  • Message builders: buildOpenAIMessages, buildOpenAITools
  • Public API surface: verifies correct exports, confirms held-back adapters are not exported
  • E2E (Workers AI REST): 12 models tested against live API for chat, multi-turn, tool calling, tool round-trip, and structured output
  • E2E (Workers AI Binding): same 12 models tested via env.AI binding through a wrangler dev harness

Notes for reviewers

  1. The as WorkersAiTextModel casts in the demo: the @cloudflare/workers-types conditional type that derives WorkersAiTextModel doesn't structurally match some model names (e.g. @cf/meta/llama-4-scout-17b-16e-instruct). The casts are only in the demo, not the library. Worth investigating if this is a workers-types bug.

  2. Gemini limitation: the @google/genai SDK lacks a fetch override, so there's no way to use the AI binding for Gemini. The upstream issue is googleapis/js-genai#999 (has an open PR: #1215). We've added a runtime guard that throws a clear error if a binding config is passed.

  3. openai version: pinned to ^6.16.0 as a hard dep. This is the minimum version that supports the APIs we need. If there are concerns about the dep size, we could consider vendoring just the streaming logic, but that's a lot of surface area.

  4. Workers AI embedding/image adapters: the code is complete in workers-ai-embedding.ts and workers-ai-image.ts but excluded from tsup.config.ts entry points and index.ts exports. Once TanStack AI ships BaseEmbeddingAdapter/BaseImageAdapter, we flip the switch — no new code needed.

  5. Demo credentials UX: credentials entered in the frontend are passed as X-CF-Account-Id, X-CF-Gateway-Id, X-CF-Api-Token headers. The worker checks for these first, then falls back to env vars. This means the deployed demo works both with and without server-side secrets configured.
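
For note 5, the header-first lookup might look roughly like the sketch below. The header names come from the note; the env var names (`CF_ACCOUNT_ID` and so on) and the `resolveCredentials` helper are assumptions made for illustration. `HeaderReader` is a structural stand-in that the standard `Headers` object satisfies.

```typescript
// Anything with a Headers-style get(); request.headers fits this shape.
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical env var names; the demo's actual names may differ.
interface DemoEnv {
  CF_ACCOUNT_ID?: string;
  CF_GATEWAY_ID?: string;
  CF_API_TOKEN?: string;
}

interface Credentials {
  accountId?: string;
  gatewayId?: string;
  apiToken?: string;
}

// Prefer credentials sent by the frontend, then fall back to env vars.
// `||` (not `??`) is used so an empty header value also falls through.
function resolveCredentials(headers: HeaderReader, env: DemoEnv): Credentials {
  return {
    accountId: headers.get("X-CF-Account-Id") || env.CF_ACCOUNT_ID,
    gatewayId: headers.get("X-CF-Gateway-Id") || env.CF_GATEWAY_ID,
    apiToken: headers.get("X-CF-Api-Token") || env.CF_API_TOKEN,
  };
}
```

In a worker this would be called as `resolveCredentials(request.headers, env)` at the top of the request handler.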

Includes a new package which exports adapters for various providers which are routed via Cloudflare AI Gateway.

The 4 providers added currently are: Anthropic, OpenAI, Gemini, Grok.
@changeset-bot

changeset-bot bot commented Jan 30, 2026

🦋 Changeset detected

Latest commit: 95f8b48

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
Name Type
workers-ai-provider Minor
@cloudflare/tanstack-ai Minor
@cloudflare/tanstack-ai-example Patch


@pkg-pr-new

pkg-pr-new bot commented Jan 30, 2026

npx https://pkg.pr.new/cloudflare/ai/ai-gateway-provider@389
npx https://pkg.pr.new/cloudflare/ai/@cloudflare/tanstack-ai@389
npx https://pkg.pr.new/cloudflare/ai/workers-ai-provider@389

commit: 95f8b48

@threepointone
Collaborator

Taking over this PR.

Add @cloudflare/tanstack-ai package for using TanStack AI with Cloudflare Workers AI and AI Gateway. Supports chat via Workers AI (binding and REST), plus routing through AI Gateway for OpenAI, Anthropic, Gemini, Grok, and Workers AI.
@threepointone threepointone changed the title feat: add new tanstack ai adapters package feat: add @cloudflare/tanstack-ai package Feb 8, 2026
@threepointone
Collaborator

Hey @vaibhavshn — thanks for the original work here, I've built on top of it quite a bit. Here's a summary of what changed:

Scope expansion: The package is now the canonical way to use TanStack AI with Cloudflare — not just AI Gateway, but also plain Workers AI (direct binding and REST API, no gateway required). Think of it as the TanStack AI equivalent of workers-ai-provider.

API changes:

  • openai is now a hard dependency (the Workers AI adapter uses it internally for the binding shim)
  • Config keys changed, e.g. ai: env.AI → binding: env.AI, to match Cloudflare conventions
  • Workers AI embedding and image adapters are written but held back from the public API until TanStack AI exports the base adapter types for those capabilities
  • Provider SDKs remain optional dependencies — users only install what they need

Build & packaging:

  • Aligned tsconfig.json, tsup.config.ts, and package.json exports with workers-ai-provider and ai-gateway-provider for consistency across the monorepo
  • moduleResolution: "bundler", no file extensions on relative imports
  • Added noUncheckedIndexedAccess, noFallthroughCasesInSwitch, noUnusedLocals, noUnusedParameters
  • Renamed the directory from tanstack-ai-adapters to tanstack-ai

Tests: Added comprehensive test suites — gateway fetch, binding fetch, config detection, Workers AI adapter (all config modes), message builders, and public API surface verification. 80 tests total.

Demo app overhaul:

  • Added Image Generation and Summarize tabs (using direct fetch since @tanstack/ai-react only has useChat)
  • Added a frontend credentials UI — users can enter their Cloudflare Account ID, Gateway ID, and API Token directly in the browser (stored in localStorage, passed as headers). The worker falls back to env vars if no headers are present.
  • Updated to latest models (GPT-5.2, Gemini 2.5 Flash, Grok 4, Llama 4 Scout 17B, Qwen3 30B)
  • Added plain Workers AI as a chat provider option (no gateway)
  • General style cleanup

Important — I need you to test this end to end. I don't have API keys set up, so I haven't been able to actually run the demo or verify the adapters against live services. Please test all the providers and config modes, especially:

  • Workers AI via binding (env.AI) and via REST API
  • Workers AI through AI Gateway (both binding and credentials modes)
  • The third-party providers (OpenAI, Anthropic, Gemini, Grok) through AI Gateway
  • The demo app — all three tabs, both with env vars and with frontend-entered credentials
  • The Gemini adapter specifically, since it can't use the binding path (only credentials)

Review the code too and flag anything that looks off.

@vaibhavshn
Collaborator Author

vaibhavshn commented Feb 10, 2026

There is already an existing issue in the Google GenAI repo for supporting a custom fetch implementation. I have sent a message there and linked to it above.

Introduce support for two connection modes (env.AI binding vs Cloudflare REST) and per-provider API keys with a Workers AI model selector in the demo app and worker.

- Demo UI: revamped settings to toggle binding vs REST, store provider API keys, clear config, and select Workers AI models; updated Chat, Image, and Summarize tabs to use new models and provider labels.
- Config: migrated to a richer DemoConfig (cloudflare, providerKeys, useBinding), persisted to localStorage (key bumped), and expose helpers to set cloudflare/provider keys/useBinding and clear state; headers now include binding flag, gateway ID, and optional provider keys.
- Worker: extract credentials from request headers (including X-Use-Binding, provider keys, and X-Workers-AI-Model), support both binding and REST adapter configs, allow frontend model override for Workers AI routes, and route adapter creation accordingly.
- Adapters & demos: updated default model names for several providers and added Workers AI model list and selection logic.
- Packaging & tests: added .env.example for tests, updated package versions/dev deps and test scripts, and added e2e test fixtures for Workers AI binding/REST.

These changes enable BYOK/provider-key overrides, let users choose binding vs REST operation, and allow selecting/overriding Workers AI models from the demo UI.
Multiple fixes and enhancements across demos, adapters, and tests:

- Demo: switched demo config storage to sessionStorage so API keys aren't persisted across tabs; added vite build script.
- Worker: tightened web_scrape tool validation (only http/https and block private/internal hostnames) and added try/catch around chat request handling to return 500 on errors.
- Anthropic: extracted repeated gateway config into buildAnthropicConfig to reduce duplication.
- Workers AI adapters: throw on non-OK image gateway responses; ensure generated tool_call_id fallback is created and sanitized; use nullish coalescing for model selection; improve error handling to preserve message and optional code.
- Gateway fetch: normalize cache header types to strings (skip-cache and cache-ttl) and fix header typing; ensure cache values serialized to strings.
- Stream handling: log malformed SSE events instead of silently skipping for easier debugging; documented sanitizeToolCallId rationale and behavior.
- Tests: added e2e fixture files; added tests for OpenAI-format passthrough streams; adjusted several tests to use try/finally to restore global fetch and updated expectations for stringified headers; increased E2E timeouts and test harness time budget.

These changes improve security, robustness, observability, and test coverage.
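
For reference, the tightened web_scrape validation could look something like the following. This is a sketch under assumptions: the blocklist patterns and the `isAllowedScrapeUrl` name are illustrative, not the worker's actual code, and a hostname check alone does not protect against DNS names that resolve to private addresses.

```typescript
// Assumed blocklist of private/internal hostnames; the real list may differ.
const PRIVATE_HOST_PATTERNS: RegExp[] = [
  /^localhost$/i,
  /^127\./, // IPv4 loopback
  /^10\./, // RFC 1918
  /^192\.168\./, // RFC 1918
  /^172\.(1[6-9]|2\d|3[01])\./, // RFC 1918 (172.16.0.0/12)
  /^169\.254\./, // link-local
  /^\[?::1\]?$/, // IPv6 loopback
  /\.(internal|local)$/i,
];

// Accept only http/https URLs whose hostname is not obviously private.
function isAllowedScrapeUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  const host = url.hostname;
  return !PRIVATE_HOST_PATTERNS.some((pattern) => pattern.test(host));
}
```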
Remove verbose debug logging and related displayModel logic from ChatTab to reduce console noise. Clean up tests: add vi to the vitest import list and remove the duplicate import in the e2e file, fix test block scoping/indentation in gateway-fetch, and adjust formatting in workers-ai-adapter.test for consistent alignment.
Apply formatting and small refactors across the demo, adapters, utils, and tests.

Key changes:
- demos: tidy JSX formatting, restructure settings panel (toggle/input order), add Cloudflare API Token input and adjust gateway/account fields, and clean up Chat view message markup/alignment.
- config: normalize EMPTY_CONFIG formatting and stabilize useMemo/useCallback dependency arrays.
- worker demo: minor formatting and consistent default model selection formatting.
- adapters: adjust function signatures formatting (Anthropic/Gemini), add runtime guard and docs for Gemini to reject binding configs (googleapis/js-genai lacks fetch override), and export formatting for Grok types.
- utils/create-fetcher: tighten stream parsing/types and normalize string/JSON handling for tool call arguments.
- workers-ai adapter: small formatting cleanups around tool call handling and error message extraction.
- tests: update expectations/formatting to match refactors and ensure SSE/event parsing and mocked binding usages remain stable.

These changes improve readability, tighten runtime checks (Gemini binding), and fix subtle serialization/formatting issues without altering core behavior.
@threepointone
Collaborator

Ok, I pushed a whole bunch of changes. Importantly, I've fixed all the quirks around Workers AI, and the demo is a lot more comprehensive now. Feel free to look at the individual commits I've pushed since my last comment. I also rewrote the PR description. I think this is ready to review and land once you've tested e2e.

Add a package-lock.json for the tanstack-ai-chat-react demo and update demos.json package_json_hash to reflect package.json changes. Adjust packages/tanstack-ai/src/utils/create-fetcher.ts and update related tests (embedding-adapter, gateway-adapters, gateway-fetch, message-builder, workers-ai-adapter) to match the updated fetcher/adapter behavior and demo package changes.
@threepointone
Collaborator

ok pushed, let's see what CI says

Remove the "build" script from demos/tanstack-ai-chat-react/package.json and update the corresponding package_json_hash in demos.json to reflect the change. This keeps the demo's scripts consistent (preview and deploy still invoke build when needed) and records the updated package.json checksum.
Add a file:../../packages/tanstack-ai dependency to demos/tanstack-ai-chat-react/package.json and update generated lockfiles and demos.json package_json_hash. This wires the demo to use the local packages/tanstack-ai during development.
@threepointone
Collaborator

CI passes! please test end to end, the demo app should be pretty comprehensive

threepointone and others added 9 commits February 11, 2026 01:30
Introduce @cloudflare/tanstack-ai adapters and refactor the demo UI. Adds Workers AI adapters (chat, image, transcription, TTS, summarization, embeddings) plus AI Gateway routing adapters for OpenAI, Anthropic, Gemini, Grok, and OpenRouter, along with shared utilities (gateway fetch, binding fetch shim, config detection, binary helpers). Extensive tests and E2E coverage added for adapters and REST/binding modes. Example app refactor replaces per-capability tabs with a provider-driven UI (ProviderView and panels), removes legacy Chat/Summarize tabs, and updates example config and package metadata. Misc: formatting/whitespace tweaks, new adapter implementations, and test additions.
Improve Workers AI adapter robustness and add non-chat capabilities.

- Detect and gracefully handle premature stream termination (no finish_reason) and emit closing events so consumers don't hang.
- Add graceful non-streaming fallback when models ignore stream: true and return a complete response.
- Add Deepgram Nova-3 support: binding format and REST binary path (raw audio bytes + Content-Type). Introduce workersAiRestFetchBinary, buildAudioPayload, audio normalization (normalizeAudioToBytes), and detectAudioContentType helpers.
- Update TTS adapter to use text field and adjust example TTSPanel to correctly read error JSON before parsing success payload.
- Extend README feature list and note on Gemini gateway routing.
- Add comprehensive unit and E2E tests and fixtures for TTS, transcription (including Nova-3), image generation, and summarization; include tests for non-streaming fallback and premature stream termination.

These changes ensure more robust streaming behavior, broader model support, and expanded test coverage for non-chat capabilities.
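
The premature-termination handling can be pictured with a small wrapper like the one below. It is a simplified, synchronous sketch: the `StreamChunk` shape, the `withGuaranteedFinish` name, and the choice of "stop" as the synthesized finish reason are assumptions, and the adapter itself works on an async stream.

```typescript
interface StreamChunk {
  delta: string;
  finishReason: string | null;
}

// Pass chunks through unchanged; if the source ends without ever
// reporting a finish reason, append a closing chunk so consumers
// waiting for the end-of-stream signal do not hang.
function* withGuaranteedFinish(
  chunks: Iterable<StreamChunk>,
): Generator<StreamChunk> {
  let finished = false;
  for (const chunk of chunks) {
    if (chunk.finishReason !== null) finished = true;
    yield chunk;
  }
  if (!finished) {
    yield { delta: "", finishReason: "stop" }; // assumed closing value
  }
}
```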
Introduce transcription, text-to-speech, and reranking capabilities to the Workers AI provider and update example apps.

Changes include:
- Implement provider.transcription, provider.speech, and provider.reranking with model-specific handling (Whisper, Nova-3, Deepgram Aura-1, BGE rerankers).
- Add utilities for binary uploads and raw response handling (createRunBinary / returnRawResponse) and ensure AbortSignal is passed through REST shims for request cancellation.
- Add new model/setting files and tests for transcription, speech, and reranking in packages/workers-ai-provider.
- Update tanstack-ai example and workers-ai example: add a Config provider, new UI components (Transcription, TTS, Reranking), refine Chat/Images/Embeddings components, styles, and example API usage.
- Add example project housekeeping: .oxlintrc.json and Tailwind/Vite dev deps.

This enables richer multimedia workflows (speech input/output and document reranking) for the Workers AI integration and provides example UI + tests to validate behavior.
Include the new TanStack AI adapter in the repo and release pipeline, plus documentation and example renames. Changes: added @cloudflare/tanstack-ai to the release workflow, updated README with package table, examples and local dev instructions, reformatted demos.json, renamed examples/tanstack-ai-chat-react → examples/tanstack-ai (updated package name and wrangler worker name), updated examples/workers-ai package name, added GPT-OSS models to the workers-ai example, and bumped pnpm lock entries to reflect the renames.
@threepointone threepointone merged commit 531d070 into main Feb 11, 2026
3 checks passed
@threepointone threepointone deleted the vaibhavshn/tanstack-ai-adapters branch February 11, 2026 18:30
@github-actions github-actions bot mentioned this pull request Feb 11, 2026