chore(pricing): Update vertex-ai pricing #182

Closed
sivadurga-d wants to merge 1 commit into main from pricing-update/vertex-ai-20260301185618-jb0i2w

@sivadurga-d
Contributor

🔄 Pricing Update: vertex-ai

📊 Summary

| Change Type | Count |
| --- | --- |
| ➕ Models added | 30 |
| 🔄 Prices updated | 22 |

➕ New Models

  • gemini-3.1-flash-image-preview
  • gemini-2.5-pro-computer-use-preview
  • gemini-2.5-flash-live-api
  • gemini-2.0-flash-image-generation
  • gemini-2.0-flash-live-api
  • gemini-1.5-flash
  • gemini-1.5-pro
  • multilingual-e5-small
  • multilingual-e5-large
  • gpt-oss-120b
  • gpt-oss-20b
  • llama-3.3-70b-instruct-maas
  • llama-4-scout-17b-16e-instruct-maas
  • llama-4-maverick-34b-16e-instruct-maas
  • mistral-ocr-25-05
  • mistral-medium-3
  • mistral-small-3.1-25-03
  • codestral-2
  • qwen3-next-80b-thinking
  • qwen3-next-80b-instruct
  • ... and 10 more

🔄 Updated Models (any field change)

  • gemini-3.1-pro-preview
  • gemini-3-pro-preview
  • gemini-3-flash-preview
  • gemini-2.5-pro
  • gemini-2.5-flash
  • gemini-2.5-flash-lite
  • gemini-2.0-flash
  • gemini-2.0-flash-lite
  • gemini-1.0-pro
  • imagen-4.0-ultra-generate-001
  • imagen-4.0-generate-001
  • imagen-4.0-fast-generate-001
  • imagen-3.0-generate-001
  • imagen-3.0-fast-generate-001
  • veo-3.1-generate-001
  • veo-3.1-fast-generate-001
  • veo-3.0-generate-001
  • veo-3.0-fast-generate-001
  • text-embedding-004
  • textembedding-gecko@003
  • multimodalembedding@001
  • llama-3.1-405b-instruct-maas

📋 Model → Pricing Page Mapping

Google Models (Gemini, Imagen, Veo, Embeddings)

| Model ID | Pricing Page Section | Notes |
| --- | --- | --- |
| gemini-3.1-pro-preview | Gemini 3 > Standard | Input/output tokens + cache read + web search (1.4¢) + image output tokens |
| gemini-3.1-flash-image-preview | Gemini 3 > Standard | Input/output tokens + image output tokens (60¢/1M) |
| gemini-3-pro-preview | Gemini 3 > Standard | Input/output tokens + cache read + web search + image output tokens |
| gemini-3-flash-preview | Gemini 3 > Standard | Input/output/audio tokens + cache read + web search |
| gemini-2.5-pro | Gemini 2.5 > Standard | Input/output tokens + cache read + web search (3.5¢) |
| gemini-2.5-pro-computer-use-preview | Gemini 2.5 > Standard | Input/output tokens only |
| gemini-2.5-flash | Gemini 2.5 > Standard | Input/output tokens + cache read + web search + image output tokens |
| gemini-2.5-flash-live-api | Gemini 2.5 > Standard | Text/audio/image input tokens + text/audio output tokens |
| gemini-2.5-flash-lite | Gemini 2.5 > Standard | Input/output tokens + cache read + web search |
| gemini-2.0-flash | Gemini 2.0 > Standard | Input/output tokens + audio input + web search + batch API |
| gemini-2.0-flash-image-generation | Gemini 2.0 > Standard | Input/output tokens + audio/image input + image output tokens |
| gemini-2.0-flash-live-api | Gemini 2.0 > Standard | Text/audio/image input + text/audio output tokens |
| gemini-2.0-flash-lite | Gemini 2.0 > Standard | Input/output tokens + audio input + batch API |
| gemini-1.5-flash | Other Gemini models | Character-based pricing (converted to per_thousand_tokens) |
| gemini-1.5-pro | Other Gemini models | Character-based pricing (converted to per_thousand_tokens) |
| gemini-1.0-pro | Other Gemini models | Character-based pricing (converted to per_thousand_tokens) |
| imagen-4.0-ultra-generate-001 | Imagen > Imagen 4 Ultra | $0.06 per image |
| imagen-4.0-generate-001 | Imagen > Imagen 4 | $0.04 per image |
| imagen-4.0-fast-generate-001 | Imagen > Imagen 4 Fast | $0.02 per image |
| imagen-3.0-generate-001 | Imagen > Imagen 3 | $0.04 per image |
| imagen-3.0-fast-generate-001 | Imagen > Imagen 3 Fast | $0.02 per image |
| veo-3.1-generate-001 | Veo > Veo 3.1 | 20¢/sec video, 40¢/sec video+audio |
| veo-3.1-fast-generate-001 | Veo > Veo 3.1 Fast | 10¢/sec video, 15¢/sec video+audio |
| veo-3.0-generate-001 | Veo > Veo 3 | 20¢/sec video, 40¢/sec video+audio |
| veo-3.0-fast-generate-001 | Veo > Veo 3 Fast | 10¢/sec video, 15¢/sec video+audio |
| veo-2.0-generate-001 | Veo > Veo 2 | 50¢/sec video |
| text-embedding-004 | Embedding models | $0.00015 per 1K tokens |
| gemini-embedding-001 | Embedding models | $0.00015 per 1K tokens (with batch pricing) |
| textembedding-gecko@003 | Embedding models | $0.000025 per 1K tokens (with batch pricing) |
| multimodalembedding@001 | Embedding models | $0.0002 per 1K tokens + image/video pricing |
| multilingual-e5-small | Embedding models > Open Source | $0.000015 per 1K tokens (with batch pricing) |
| multilingual-e5-large | Embedding models > Open Source | $0.000025 per 1K tokens (with batch pricing) |
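
The table notes that the older Gemini models are priced per character and were converted to per_thousand_tokens, but the PR does not state the conversion factor. The following is a minimal sketch of how such a conversion could work, assuming roughly 4 characters per token. That ratio, the function name, and the example price are all illustrative assumptions, not values taken from this PR.

```python
from decimal import Decimal

# Assumption: about 4 characters per token. This is a rule of thumb,
# not a figure stated anywhere in this PR.
CHARS_PER_TOKEN = Decimal(4)


def per_1k_chars_to_per_1k_tokens(price_per_1k_chars: Decimal,
                                  chars_per_token: Decimal = CHARS_PER_TOKEN) -> Decimal:
    """Convert a price per 1,000 characters into a price per 1,000 tokens."""
    # 1,000 tokens cover 1,000 * chars_per_token characters, so the
    # per-1K-token price is the per-1K-character price times that ratio.
    return price_per_1k_chars * chars_per_token


# Placeholder price, chosen only to show the arithmetic.
placeholder_price = Decimal("0.000125")
print(per_1k_chars_to_per_1k_tokens(placeholder_price))
```

Using `Decimal` rather than `float` keeps very small per-unit prices exact.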

Anthropic Models (Claude)

| Model ID | Pricing Page Section | Notes |
| --- | --- | --- |
| claude-opus-4-6 | Anthropic's Claude models > Global | Input: $5, Output: $25, 5m Cache Write: $6.25, Cache Hit: $0.5 (with batch pricing) |
| claude-opus-4-5@20251101 | Anthropic's Claude models > Global | Input: $5, Output: $25, 5m Cache Write: $6.25, Cache Hit: $0.5 (with batch pricing) |
| claude-opus-4-1@20250805 | Anthropic's Claude models > Uniform | Input: $15, Output: $75, 5m Cache Write: $18.75, Cache Hit: $1.5 (with batch pricing) |
| claude-opus-4@20250514 | Anthropic's Claude models > Uniform | Input: $15, Output: $75, 5m Cache Write: $18.75, Cache Hit: $1.5 (with batch pricing) |
| claude-sonnet-4-6 | Anthropic's Claude models > Global | Input: $3, Output: $15, 5m Cache Write: $3.75, Cache Hit: $0.3 (with batch pricing) |
| claude-sonnet-4-5@20250929 | Anthropic's Claude models > Global | Input: $3, Output: $15, 5m Cache Write: $3.75, Cache Hit: $0.3 (with batch pricing) |
| claude-sonnet-4@20250514 | Anthropic's Claude models > Uniform | Input: $3, Output: $15, 5m Cache Write: $3.75, Cache Hit: $0.3 (with batch pricing) |
| claude-haiku-4-5@20251001 | Anthropic's Claude models > Global | Input: $1, Output: $5, 5m Cache Write: $1.25, Cache Hit: $0.1 (with batch pricing) |

Model IDs source: the Claude on Vertex AI documentation. The canonical Vertex API model IDs from its official table were used.
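
In the Claude rows above, every 5m Cache Write price is 1.25× the input price and every Cache Hit price is 0.1× the input price. A reviewer could check that pattern with a short script such as the sketch below. The dictionary layout and function names are hypothetical, and the ratios are inferred from the listed numbers rather than stated as a rule in this PR.

```python
from decimal import Decimal

# (input, 5m cache write, cache hit) copied from the Claude table in this PR.
CLAUDE_PRICES = {
    "claude-opus-4-6": ("5", "6.25", "0.5"),
    "claude-opus-4-5@20251101": ("5", "6.25", "0.5"),
    "claude-opus-4-1@20250805": ("15", "18.75", "1.5"),
    "claude-opus-4@20250514": ("15", "18.75", "1.5"),
    "claude-sonnet-4-6": ("3", "3.75", "0.3"),
    "claude-sonnet-4-5@20250929": ("3", "3.75", "0.3"),
    "claude-sonnet-4@20250514": ("3", "3.75", "0.3"),
    "claude-haiku-4-5@20251001": ("1", "1.25", "0.1"),
}


def cache_ratios(input_price: str, cache_write: str, cache_hit: str):
    """Return (cache write / input, cache hit / input) as exact decimals."""
    base = Decimal(input_price)
    return Decimal(cache_write) / base, Decimal(cache_hit) / base


def inconsistent_models(prices, write_ratio=Decimal("1.25"), hit_ratio=Decimal("0.1")):
    """List the models whose cache prices do not match the expected ratios."""
    return [
        model
        for model, row in prices.items()
        if cache_ratios(*row) != (write_ratio, hit_ratio)
    ]


print(inconsistent_models(CLAUDE_PRICES))
```

An empty list means every row follows the same ratios. A non-empty list flags rows worth re-checking against the pricing page.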

OpenAI Models

| Model ID | Pricing Page Section | Notes |
| --- | --- | --- |
| gpt-oss-120b | OpenAI's models | Input: $0.09, Output: $0.36 (with batch pricing) |
| gpt-oss-20b | OpenAI's models | Input: $0.07, Output: $0.25, Cache Hit: $0.007 (with batch pricing) |

Partner Models

| Model ID | Pricing Page Section | Notes |
| --- | --- | --- |
| llama-3.1-405b-instruct-maas | Meta's Llama models | Input: $5, Output: $16 |
| llama-3.3-70b-instruct-maas | Meta's Llama models | Input: $0.72, Output: $0.72 (with batch pricing) |
| llama-4-scout-17b-16e-instruct-maas | Meta's Llama models | Input: $0.25, Output: $0.70 (with batch pricing) |
| llama-4-maverick-34b-16e-instruct-maas | Meta's Llama models | Input: $0.35, Output: $1.15 (with batch pricing) |
| mistral-ocr-25-05 | Mistral AI's models | Input/Output: $0.0005 per 1M tokens (or $0.0005/page) |
| mistral-medium-3 | Mistral AI's models | Input: $0.40, Output: $2.00 |
| mistral-small-3.1-25-03 | Mistral AI's models | Input: $0.10, Output: $0.30 |
| codestral-2 | Mistral AI's models | Input: $0.30, Output: $0.90 |
| qwen3-next-80b-thinking | Qwen's models | Input: $0.15, Output: $1.20 |
| qwen3-next-80b-instruct | Qwen's models | Input: $0.15, Output: $1.20 |
| qwen3-coder-480b-a35b-instruct | Qwen's models | Input: $0.22, Output: $1.80, Cache Hit: $0.022 (with batch pricing) |
| qwen3-235b-a22b-instruct-2507 | Qwen's models | Input: $0.22, Output: $0.88 (with batch pricing) |
| deepseek-v3.1 | Deepseek's models | Input: $0.60, Output: $1.70, Cache Hit: $0.06 (with batch pricing) |
| deepseek-v3.2 | Deepseek's models | Input: $0.56, Output: $1.68, Cache Hit: $0.056 (with batch pricing) |
| deepseek-r1-0528 | Deepseek's models | Input: $1.35, Output: $5.40 (with batch pricing) |
| deepseek-ocr | Deepseek's models | Input: $0.30, Output: $1.20 (or $0.0003/page, $0.00012/page) |
| minimax-m2 | MiniMax's models | Input: $0.30, Output: $1.20, Cache Hit: $0.03 |
| kimi-k2-thinking | Moonshot's models | Input: $0.60, Output: $2.50, Cache Hit: $0.06 |
| glm-5 | GLM's models | Input: $1.00, Output: $3.20, Cache Hit: $0.10 |
| glm-4.7 | GLM's models | Input: $0.60, Output: $2.20 |

🔑 Key Pricing Details

  • Grounding with Google Search: $35 per 1,000 grounded prompts (with free daily allowances for some models)
  • Web Grounding for Enterprise: $45 per 1,000 grounded prompts
  • Grounding with Google Maps: $25 per 1,000 grounded prompts
  • Web search for Gemini 3: $14 per 1,000 search queries → converted to 1.4¢ per search
  • Web search for Gemini 2.x: $35 per 1,000 grounded prompts → converted to 3.5¢ per search
  • Cache pricing: Applied for Claude models (5m Cache Write preferred) and some Gemini models (cache read only)
  • Batch API: 50% discount applied where available
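
The two conversions in this list, dollars per 1,000 searches to cents per search and the 50% batch discount, can be sketched as follows. The function names are illustrative only and are not taken from the pricing agent's code.

```python
from decimal import Decimal


def dollars_per_thousand_to_cents_each(dollars_per_1000: Decimal) -> Decimal:
    """Convert '$X per 1,000 units' into cents per single unit."""
    # Divide by 1,000 to get dollars per unit, then multiply by 100 for cents.
    return dollars_per_1000 / Decimal(1000) * Decimal(100)


def batch_price(standard_price: Decimal, discount: Decimal = Decimal("0.5")) -> Decimal:
    """Apply a fractional batch discount (50% by default) to a standard price."""
    return standard_price * (Decimal(1) - discount)


# Web search for Gemini 3: $14 per 1,000 queries.
print(dollars_per_thousand_to_cents_each(Decimal("14")))
# Web search for Gemini 2.x: $35 per 1,000 grounded prompts.
print(dollars_per_thousand_to_cents_each(Decimal("35")))
# A $5 per 1M-token input price with the 50% batch discount applied.
print(batch_price(Decimal("5")))
```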

📝 Processing Notes

  1. Used Global pricing tab (not regional) as specified in the skill
  2. Web search unit conversion: Pricing page shows "$X per 1,000 searches" → converted to cents per search (e.g., $14/1000 = 1.4¢)
  3. Embedding models: Pricing page uses per 1,000 tokens (not per 1M), so used price_unit: "per_thousand_tokens"
  4. Claude model IDs: Used canonical Vertex API model IDs from Claude on Vertex AI documentation
  5. No lte/gt categories: One entry per model as specified
  6. Veo video pricing: Includes both video-only and video+audio pricing where applicable
  7. Image output tokens for Gemini: Used additional.image_token field (not image_pricing which is for Imagen only)
  8. All pricing in $/1M tokens except embeddings and the character-based Gemini 1.x models ($/1K tokens), Imagen ($/image), and Veo (cents/second)
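
Because the units in note 8 differ by model family, anything that consumes these entries has to normalize by unit before totaling costs. Below is a minimal sketch of that normalization. Only `per_thousand_tokens` is a unit name that appears in this PR. The other unit names, the lookup table, and the function are hypothetical.

```python
from decimal import Decimal

# Divisor that turns (price * quantity) into dollars for each unit.
# "per_thousand_tokens" comes from the PR; the other keys are made-up names.
UNIT_DIVISORS = {
    "per_million_tokens": Decimal(1_000_000),
    "per_thousand_tokens": Decimal(1_000),
    "per_image": Decimal(1),
    "cents_per_second": Decimal(100),  # cents to dollars
}


def cost_in_dollars(price: str, price_unit: str, quantity: int) -> Decimal:
    """Return the dollar cost of `quantity` units at `price` in `price_unit`."""
    try:
        divisor = UNIT_DIVISORS[price_unit]
    except KeyError:
        raise ValueError(f"unknown price_unit: {price_unit!r}") from None
    return Decimal(price) * Decimal(quantity) / divisor


# text-embedding-004: $0.00015 per 1K tokens, 20,000 tokens.
print(cost_in_dollars("0.00015", "per_thousand_tokens", 20_000))
# imagen-4.0-generate-001: $0.04 per image, 3 images.
print(cost_in_dollars("0.04", "per_image", 3))
# veo-3.1-generate-001: 20 cents per second of video, 8 seconds.
print(cost_in_dollars("20", "cents_per_second", 8))
```

Raising on an unknown unit, rather than falling back to a default, keeps a mislabeled entry from silently producing a cost that is off by a factor of 1,000.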

Generated by Pricing Agent on 2026-03-01 (update_mode: full)

@sivadurga-d closed this on Mar 2, 2026