feat: add @cloudflare/tanstack-ai package #389
Includes a new package that exports adapters for various providers, routed via Cloudflare AI Gateway. The four providers added so far are Anthropic, OpenAI, Gemini, and Grok.
🦋 Changeset detected. Latest commit: 95f8b48. The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages.
Taking over this PR.
Add @cloudflare/tanstack-ai package for using TanStack AI with Cloudflare Workers AI and AI Gateway. Supports chat via Workers AI (binding and REST), plus routing through AI Gateway for OpenAI, Anthropic, Gemini, Grok, and Workers AI.
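To make the intended usage concrete, here's a minimal sketch of wiring up one of the gateway-routed adapters in a Worker; the option names (`gateway`, `apiKey`) are placeholders for illustration, not necessarily the package's final API:

```ts
// Hypothetical usage sketch: route OpenAI chat traffic through Cloudflare AI Gateway.
import { createOpenAiChat } from "@cloudflare/tanstack-ai";

interface Env {
  AI: Ai; // Workers AI binding (Ai type comes from @cloudflare/workers-types)
  OPENAI_API_KEY: string;
}

export default {
  async fetch(_request: Request, env: Env): Promise<Response> {
    const chat = createOpenAiChat({
      gateway: env.AI.gateway("<gateway-id>"), // placeholder option name
      apiKey: env.OPENAI_API_KEY,
    });
    // ...pass `chat` to TanStack AI's server-side chat handling.
    return new Response("ok");
  },
};
```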
Hey @vaibhavshn — thanks for the original work here, I've built on top of it quite a bit. Here's a summary of what changed:
- Scope expansion: The package is now the canonical way to use TanStack AI with Cloudflare — not just AI Gateway, but also plain Workers AI (direct binding and REST API, no gateway required). Think of it as the TanStack AI equivalent of
- API changes:
- Build & packaging:
- Tests: Added comprehensive test suites — gateway fetch, binding fetch, config detection, Workers AI adapter (all config modes), message builders, and public API surface verification. 80 tests total.
- Demo app overhaul:
Important — I need you to test this end to end. I don't have API keys set up, so I haven't been able to actually run the demo or verify the adapters against live services. Please test all the providers and config modes, especially:
Review the code too and flag anything that looks off.
There's an existing issue in the Google GenAI repo for supporting a custom fetch implementation. I've sent a message and linked to it above.
Introduce support for two connection modes (env.AI binding vs Cloudflare REST) and per-provider API keys with a Workers AI model selector in the demo app and worker.
- Demo UI: revamped settings to toggle binding vs REST, store provider API keys, clear config, and select Workers AI models; updated Chat, Image, and Summarize tabs to use new models and provider labels.
- Config: migrated to a richer DemoConfig (cloudflare, providerKeys, useBinding), persisted to localStorage (key bumped), and expose helpers to set cloudflare/provider keys/useBinding and clear state; headers now include binding flag, gateway ID, and optional provider keys.
- Worker: extract credentials from request headers (including X-Use-Binding, provider keys, and X-Workers-AI-Model), support both binding and REST adapter configs, allow frontend model override for Workers AI routes, and route adapter creation accordingly.
- Adapters & demos: updated default model names for several providers and added Workers AI model list and selection logic.
- Packaging & tests: added .env.example for tests, updated package versions/dev deps and test scripts, and added e2e test fixtures for Workers AI binding/REST.
These changes enable BYOK/provider-key overrides, let users choose binding vs REST operation, and allow selecting/overriding Workers AI models from the demo UI.
Multiple fixes and enhancements across demos, adapters, and tests:
- Demo: switched demo config storage to sessionStorage so API keys aren't persisted across tabs; added vite build script.
- Worker: tightened web_scrape tool validation (only http/https and block private/internal hostnames) and added try/catch around chat request handling to return 500 on errors.
- Anthropic: extracted repeated gateway config into buildAnthropicConfig to reduce duplication.
- Workers AI adapters: throw on non-OK image gateway responses; ensure the generated tool_call_id fallback is created and sanitized; use nullish coalescing for model selection; improve error handling to preserve the message and optional code.
- Gateway fetch: normalize cache header types to strings (skip-cache and cache-ttl) and fix header typing; ensure cache values are serialized to strings.
- Stream handling: log malformed SSE events instead of silently skipping them for easier debugging; documented sanitizeToolCallId rationale and behavior.
- Tests: added e2e fixture files; added tests for OpenAI-format passthrough streams; adjusted several tests to use try/finally to restore global fetch and updated expectations for stringified headers; increased E2E timeouts and the test harness time budget.
These changes improve security, robustness, observability, and test coverage.
Remove verbose debug logging and related displayModel logic from ChatTab to reduce console noise. Clean up tests: add vi to the vitest import list and remove the duplicate import in the e2e file, fix test block scoping/indentation in gateway-fetch, and adjust formatting in workers-ai-adapter.test for consistent alignment.
Apply formatting and small refactors across the demo, adapters, utils, and tests. Key changes:
- demos: tidy JSX formatting, restructure settings panel (toggle/input order), add Cloudflare API Token input and adjust gateway/account fields, and clean up Chat view message markup/alignment.
- config: normalize EMPTY_CONFIG formatting and stabilize useMemo/useCallback dependency arrays.
- worker demo: minor formatting and consistent default model selection formatting.
- adapters: adjust function signature formatting (Anthropic/Gemini), add a runtime guard and docs for Gemini to reject binding configs (googleapis/js-genai lacks a fetch override), and export formatting for Grok types.
- utils/create-fetcher: tighten stream parsing/types and normalize string/JSON handling for tool call arguments.
- workers-ai adapter: small formatting cleanups around tool call handling and error message extraction.
- tests: update expectations/formatting to match refactors and ensure SSE/event parsing and mocked binding usages remain stable.
These changes improve readability, tighten runtime checks (Gemini binding), and fix subtle serialization/formatting issues without altering core behavior.
OK, I pushed a whole bunch of changes. Importantly, I've fixed all the quirks around Workers AI, and the demo is a lot more comprehensive now. Feel free to look at the individual commits I've pushed since my last comment. I also rewrote the PR description. I think this is ready to review and land once you've tested e2e.
…cloudflare/ai into vaibhavshn/tanstack-ai-adapters
Add a package-lock.json for the tanstack-ai-chat-react demo and update demos.json package_json_hash to reflect package.json changes. Adjust packages/tanstack-ai/src/utils/create-fetcher.ts and update related tests (embedding-adapter, gateway-adapters, gateway-fetch, message-builder, workers-ai-adapter) to match the updated fetcher/adapter behavior and demo package changes.
OK, pushed. Let's see what CI says.
Remove the "build" script from demos/tanstack-ai-chat-react/package.json and update the corresponding package_json_hash in demos.json to reflect the change. This keeps the demo's scripts consistent (preview and deploy still invoke build when needed) and records the updated package.json checksum.
Add a file:../../packages/tanstack-ai dependency to demos/tanstack-ai-chat-react/package.json and update generated lockfiles and demos.json package_json_hash. This wires the demo to use the local packages/tanstack-ai during development.
CI passes! Please test end to end; the demo app should be pretty comprehensive.
Introduce @cloudflare/tanstack-ai adapters and refactor the demo UI. Adds Workers AI adapters (chat, image, transcription, TTS, summarization, embeddings) plus AI Gateway routing adapters for OpenAI, Anthropic, Gemini, Grok, and OpenRouter, along with shared utilities (gateway fetch, binding fetch shim, config detection, binary helpers). Extensive tests and E2E coverage added for adapters and REST/binding modes. Example app refactor replaces per-capability tabs with a provider-driven UI (ProviderView and panels), removes legacy Chat/Summarize tabs, and updates example config and package metadata. Misc: formatting/whitespace tweaks, new adapter implementations, and test additions.
Improve Workers AI adapter robustness and add non-chat capabilities.
- Detect and gracefully handle premature stream termination (no finish_reason) and emit closing events so consumers don't hang.
- Add a graceful non-streaming fallback when models ignore stream: true and return a complete response.
- Add Deepgram Nova-3 support: binding format and REST binary path (raw audio bytes + Content-Type). Introduce workersAiRestFetchBinary, buildAudioPayload, audio normalization (normalizeAudioToBytes), and detectAudioContentType helpers.
- Update the TTS adapter to use the text field and adjust the example TTSPanel to correctly read error JSON before parsing the success payload.
- Extend the README feature list and note on Gemini gateway routing.
- Add comprehensive unit and E2E tests and fixtures for TTS, transcription (including Nova-3), image generation, and summarization; include tests for non-streaming fallback and premature stream termination.
These changes ensure more robust streaming behavior, broader model support, and expanded test coverage for non-chat capabilities.
Introduce transcription, text-to-speech, and reranking capabilities to the Workers AI provider and update example apps. Changes include:
- Implement provider.transcription, provider.speech, and provider.reranking with model-specific handling (Whisper, Nova-3, Deepgram Aura-1, BGE rerankers).
- Add utilities for binary uploads and raw response handling (createRunBinary / returnRawResponse) and ensure AbortSignal is passed through REST shims for request cancellation.
- Add new model/setting files and tests for transcription, speech, and reranking in packages/workers-ai-provider.
- Update the tanstack-ai and workers-ai examples: add a Config provider, new UI components (Transcription, TTS, Reranking), refine Chat/Images/Embeddings components, styles, and example API usage.
- Add example project housekeeping: .oxlintrc.json and Tailwind/Vite dev deps.
This enables richer multimedia workflows (speech input/output and document reranking) for the Workers AI integration and provides example UI + tests to validate behavior.
Include the new TanStack AI adapter in the repo and release pipeline, plus documentation and example renames. Changes: added @cloudflare/tanstack-ai to the release workflow, updated README with package table, examples and local dev instructions, reformatted demos.json, renamed examples/tanstack-ai-chat-react → examples/tanstack-ai (updated package name and wrangler worker name), updated examples/workers-ai package name, added GPT-OSS models to the workers-ai example, and bumped pnpm lock entries to reflect the renames.
Summary

Adds `@cloudflare/tanstack-ai` — the canonical package for using TanStack AI with Cloudflare. Supports Workers AI (direct binding and REST API), third-party providers (OpenAI, Anthropic, Gemini, Grok) routed through AI Gateway, and plain Workers AI without a gateway.

Also includes a full-featured demo app at `demos/tanstack-ai-chat-react`.

Package: `@cloudflare/tanstack-ai`

Adapters

- Chat: `createOpenAiChat`, `createAnthropicChat`, `createGeminiChat`, `createGrokChat`, `createWorkersAiChat`
- Image: `createOpenAiImage`, `createGeminiImage`, `createGrokImage`
- Summarize: `createOpenAiSummarize`, `createAnthropicSummarize`, `createGeminiSummarize`, `createGrokSummarize`
- Transcription / TTS / Video: `createOpenAiTranscription`, `createOpenAiTts`, `createOpenAiVideo`

Workers AI config modes
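To illustrate the three modes, here's a hedged sketch of the config shapes; the field names are assumptions inferred from the config detectors exercised in the tests (`isDirectBindingConfig`, `isDirectCredentialsConfig`, `isGatewayConfig`), not the exact public API:

```ts
// Illustrative config shapes only; field names are assumptions, not the final API.
import { createWorkersAiChat } from "@cloudflare/tanstack-ai";

declare const env: { AI: Ai }; // Ai comes from @cloudflare/workers-types

// 1. Direct binding: run models on env.AI, no gateway involved.
const viaBinding = createWorkersAiChat({ binding: env.AI });

// 2. Direct credentials: call the Workers AI REST API with an account ID and token.
const viaRest = createWorkersAiChat({
  accountId: "<account-id>",
  apiToken: "<api-token>",
});

// 3. Gateway: route Workers AI requests through an AI Gateway.
const viaGateway = createWorkersAiChat({
  binding: env.AI.gateway("<gateway-id>"),
});
```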
Design decisions
`openai` as a hard dependency

The Workers AI adapter internally uses the OpenAI SDK to provide a consistent interface. In direct binding mode (`env.AI`), a custom `fetch` shim intercepts OpenAI SDK requests, extracts the model/params from the JSON body, calls `env.AI.run()`, and transforms the Workers AI response into OpenAI-compatible SSE chunks via a `TransformStream`. This means `openai` is always needed at runtime, not just for the OpenAI provider.
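For intuition, here's a heavily simplified sketch of that idea. It assumes each Workers AI chunk arrives as complete `data:` lines and ignores tool calls, usage, and finish reasons, all of which the real shim handles:

```ts
// Simplified sketch of the binding fetch shim idea; not the package's actual code.
type AiBinding = {
  run(model: string, opts: Record<string, unknown>): Promise<ReadableStream<Uint8Array>>;
};

export function bindingFetchSketch(ai: AiBinding): typeof fetch {
  return async (_input, init) => {
    // The OpenAI SDK sends the chat.completions request body as a JSON string.
    const body = JSON.parse(String(init?.body)) as { model: string; messages: unknown[] };

    // Run the model on the binding instead of going over the network.
    const workersAiStream = await ai.run(body.model, { messages: body.messages, stream: true });

    const decoder = new TextDecoder();
    const encoder = new TextEncoder();
    const toOpenAiSse = new TransformStream<Uint8Array, Uint8Array>({
      transform(chunk, controller) {
        // Assumes whole `data: {...}` lines per chunk; real code buffers partial events.
        for (const line of decoder.decode(chunk).split("\n")) {
          if (!line.startsWith("data: ") || line.includes("[DONE]")) continue;
          const { response } = JSON.parse(line.slice(6)) as { response?: string };
          const openAiChunk = {
            object: "chat.completion.chunk",
            choices: [{ index: 0, delta: { content: response ?? "" } }],
          };
          controller.enqueue(encoder.encode(`data: ${JSON.stringify(openAiChunk)}\n\n`));
        }
      },
      flush(controller) {
        controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      },
    });

    return new Response(workersAiStream.pipeThrough(toOpenAiSse), {
      headers: { "content-type": "text/event-stream" },
    });
  };
}
```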
Structural discrimination for binding vs gateway

`env.AI` and `env.AI.gateway(id)` are both objects with a `.run()` method, but `env.AI` also has a `.gateway()` method. We use this to distinguish them at runtime (`isDirectBindingConfig` checks for the presence of `.gateway` on the binding), avoiding any need for explicit `mode` or `type` config fields.
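In code, the check is roughly the following (type names here are illustrative):

```ts
// Sketch of the structural check: only the top-level env.AI binding has .gateway().
type RunnableBinding = { run: (...args: unknown[]) => unknown };
type DirectAiBinding = RunnableBinding & { gateway: (id: string) => unknown };

function isDirectBinding(binding: RunnableBinding): binding is DirectAiBinding {
  return typeof (binding as Partial<DirectAiBinding>).gateway === "function";
}
```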
Gemini: credentials only, no binding support

The `@google/genai` SDK doesn't support a custom `fetch` override — only `baseUrl` and `headers` via `httpOptions`. So the Gemini adapter can't use the AI Gateway binding (`env.AI.gateway(id).run()`). Instead it directly sets `baseUrl` to the AI Gateway REST endpoint (`gateway.ai.cloudflare.com/v1/{accountId}/{gatewayId}/google-ai-studio`). A runtime guard throws a clear error if a binding config is accidentally passed. Tracking upstream: googleapis/js-genai#999.
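A sketch of what that looks like with `@google/genai`'s `httpOptions` (the angle-bracket placeholders and the exact construction are illustrative):

```ts
// Sketch: point the Gemini SDK at the AI Gateway REST endpoint via httpOptions,
// since @google/genai has no fetch override.
import { GoogleGenAI } from "@google/genai";

const accountId = "<account-id>";
const gatewayId = "<gateway-id>";

const gemini = new GoogleGenAI({
  apiKey: "<gemini-api-key>",
  httpOptions: {
    // All Gemini traffic now flows through Cloudflare AI Gateway.
    baseUrl: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/google-ai-studio`,
  },
});
```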
Provider SDKs as optional dependencies

`@tanstack/ai-openai`, `@tanstack/ai-anthropic`, `@tanstack/ai-gemini`, `@tanstack/ai-grok`, and their underlying provider SDKs (`@anthropic-ai/sdk`, `@google/genai`) are all `optionalDependencies`. Users only install what they need. Each adapter file is a separate entry point in `tsup` and `package.json` exports, so tree-shaking works and unused adapters aren't bundled.
Workers AI embeddings/image generation held back

The adapters `WorkersAiEmbeddingAdapter` and `WorkersAiImageAdapter` are fully implemented (`workers-ai-embedding.ts`, `workers-ai-image.ts`) but intentionally excluded from the public API and build output. TanStack AI doesn't yet export `BaseEmbeddingAdapter` or `BaseImageAdapter` for third-party use — once it does, these are ready to ship with no code changes.
Stable stream IDs

`transformWorkersAiStream` generates a single `streamId` and `created` timestamp at the start of each stream and reuses them across all SSE chunks, matching OpenAI's convention. Tool call IDs use a monotonic counter per stream (`call_{streamId}_{n}`).
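In other words, something along these lines (the `chatcmpl-` prefix is illustrative, not taken from the package):

```ts
// Sketch: one id/created pair per stream, reused by every chunk; tool call IDs
// come from a per-stream monotonic counter.
const streamId = `chatcmpl-${crypto.randomUUID()}`; // prefix is illustrative
const created = Math.floor(Date.now() / 1000);
let toolCallIndex = 0;

const nextToolCallId = () => `call_${streamId}_${toolCallIndex++}`;

// Every SSE chunk emitted for this stream carries the same id and created values.
const chunkEnvelope = { id: streamId, created, object: "chat.completion.chunk" };
```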
Demo: `demos/tanstack-ai-chat-react`

A Cloudflare Workers + Vite + React demo with three tabs.
Users can enter their own Cloudflare credentials (Account ID, AI Gateway ID, API Token) directly in the frontend UI — stored in sessionStorage, passed as request headers. The worker reads credentials from headers and falls back to environment variables for deployed instances.
Tests

- Gateway fetch: `createGatewayFetch` with binding/credentials configs, cache headers, Workers AI provider specifics, endpoint extraction
- Binding fetch: `createWorkersAiBindingFetch` request translation, stream transform, stable chunk IDs, OpenAI-format passthrough
- Config detection: `isDirectBindingConfig`, `isDirectCredentialsConfig`, `isGatewayConfig`
- Workers AI adapter: `chatStream` and `structuredOutput` across all config modes, error emission
- Message builders: `buildOpenAIMessages`, `buildOpenAITools`
- E2E: the `env.AI` binding through a `wrangler dev` harness
Notes for reviewers

- The `as WorkersAiTextModel` casts in the demo: the `@cloudflare/workers-types` conditional type that derives `WorkersAiTextModel` doesn't structurally match some model names (e.g. `@cf/meta/llama-4-scout-17b-16e-instruct`). The casts are only in the demo, not the library. Worth investigating whether this is a workers-types bug.
- Gemini limitation: the `@google/genai` SDK lacks a `fetch` override, so there's no way to use the AI binding for Gemini. The upstream issue is googleapis/js-genai#999 (has an open PR: #1215). We've added a runtime guard that throws a clear error if a binding config is passed.
- `openai` version: pinned to `^6.16.0` as a hard dep. This is the minimum version that supports the APIs we need. If there are concerns about the dep size, we could consider vendoring just the streaming logic, but that's a lot of surface area.
- Workers AI embedding/image adapters: the code is complete in `workers-ai-embedding.ts` and `workers-ai-image.ts` but excluded from `tsup.config.ts` entry points and `index.ts` exports. Once TanStack AI ships `BaseEmbeddingAdapter`/`BaseImageAdapter`, we flip the switch — no new code needed.
- Demo credentials UX: credentials entered in the frontend are passed as `X-CF-Account-Id`, `X-CF-Gateway-Id`, `X-CF-Api-Token` headers. The worker checks for these first, then falls back to env vars. This means the deployed demo works both with and without server-side secrets configured.
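For reference, the header-then-env fallback could look roughly like this in the worker (the header names are the ones listed above; the env var names are assumptions):

```ts
// Sketch: resolve Cloudflare credentials from request headers first, then env vars.
interface DemoEnv {
  CF_ACCOUNT_ID?: string; // assumed env var names for the deployed demo
  CF_GATEWAY_ID?: string;
  CF_API_TOKEN?: string;
}

function resolveCredentials(request: Request, env: DemoEnv) {
  return {
    accountId: request.headers.get("X-CF-Account-Id") ?? env.CF_ACCOUNT_ID,
    gatewayId: request.headers.get("X-CF-Gateway-Id") ?? env.CF_GATEWAY_ID,
    apiToken: request.headers.get("X-CF-Api-Token") ?? env.CF_API_TOKEN,
  };
}
```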