Google models

Featured Gemini models

Generally available Gemini models

Gemini 2.5 Pro Our high-capability model for complex reasoning and coding. Features adaptive thinking capabilities to solve complex agentic and multimodal challenges with a 1 million token context.
Gemini 2.5 Flash Lightning-fast and highly capable. Delivers a balance of intelligence and latency with controllable thinking budgets for versatile applications.
Gemini 2.5 Flash Image Turn ideas into production-ready assets. Features conversational editing, multi-image fusion, and character consistency for advanced creative workflows.
Gemini 2.5 Flash-Lite Built for massive scale. Balances cost and performance for high-throughput tasks, optimized for efficiency without sacrificing multimodal understanding.
Gemini 2.0 Flash Multimodal performance for developers needing a cost-effective model for general-purpose tasks.
Gemini 2.0 Flash-Lite Streamlined and ultra-efficient for simple, high-frequency tasks where speed and price are the priority.

Preview Gemini models

Gemini 3 Pro Our latest reasoning-first model optimized for complex agentic workflows and coding. Features adaptive thinking, a 1M token context window, and integrated grounding for sophisticated multimodal problem solving.
Gemini 3 Pro Image High-fidelity image generation with reasoning-enhanced composition. Supports legible text rendering, complex multi-turn editing, and character consistency using up to 14 reference inputs.
Gemini 2.5 Flash Live API Designed for real-time, bidirectional streaming. Features low-latency built-in audio and affective dialogue capabilities for natural, conversational interactions.

Gemma models

Gemma 3n An open model designed for efficient execution on low-resource devices, supporting multimodal input (text, image, video, and audio) and text output in over 140 languages.
Gemma 3 An open model featuring text and image input, support for over 140 languages, and a 128K context window.
Gemma 2 An open model supporting text generation, summarization, and extraction.
Gemma A small, lightweight open model supporting text generation, summarization, and extraction.
ShieldGemma 2 Instruction-tuned models for evaluating text and image safety against defined policies.
PaliGemma An open vision-language model combining SigLIP and Gemma.
CodeGemma A powerful, lightweight open model for coding tasks, including code completion, generation, and understanding.
TxGemma A model that generates predictions, classifications, or text based on therapeutic-related data, for building AI models with less data and compute.
MedGemma A collection of Gemma 3 variants trained for performance on medical text and image comprehension.
MedSigLIP A SigLIP variant trained to encode medical images and text into a common embedding space.
T5Gemma A family of lightweight encoder-decoder research models.

Embeddings models

Embeddings for Text Converts text data into vector representations for semantic search, classification, and clustering.
Multimodal Embeddings Generates vectors based on images, for tasks such as image classification and search.
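
To make the semantic search use case concrete, the following is a minimal sketch of ranking documents by cosine similarity between embedding vectors. It does not call any embedding model: the three-dimensional vectors, the document names, and the helper functions cosine_similarity and rank_documents are all hypothetical stand-ins, and real embedding vectors have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document ids sorted from most to least similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), doc_id)
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored]


# Hypothetical toy vectors standing in for real model output.
DOCS = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "gift-cards": [0.3, 0.3, 0.9],
}
QUERY = [0.8, 0.2, 0.1]

ranking = rank_documents(QUERY, DOCS)
print(ranking)
```

In practice the vectors would come from an embeddings model, and a vector database would usually replace the linear scan once the collection grows large.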

Imagen models

Imagen 4 for Generation Use text prompts to generate novel images with higher quality than our previous image generation models.
Imagen 4 for Fast Generation Use text prompts to generate novel images with higher quality and lower latency than our previous image generation models.
Imagen 4 for Ultra Generation Use text prompts to generate novel images with higher quality and better prompt adherence than our previous image generation models.
Imagen 3 for Generation 002 Use text prompts to generate novel images.
Imagen 3 for Generation 001 Use text prompts to generate novel images.
Imagen 3 for Fast Generation Use text prompts to generate novel images with lower latency than our other image generation models.
Imagen 3 for Editing and Customization Edits existing images or generates new images based on text prompts and provided context.

Preview Imagen models

Virtual Try-On Generates images of people wearing clothing products.
Imagen product recontext on Vertex AI Edits product images to place them in different scenes or backgrounds based on text prompts.

Veo models

Veo 2 Generate Generates videos from text prompts and images.
Veo 3 Generate Generates videos from text prompts and images with high quality.
Veo 3 Fast Generates videos from text prompts and images with high quality and low latency.
Veo 3.1 Generate Generates videos from text prompts and images with high quality.
Veo 3.1 Fast Generates videos from text prompts and images with high quality and low latency.

Preview Veo models

Veo 3 Generate preview Generates videos from text prompts and images with high quality.
Veo 3 Fast preview Generates videos from text prompts and images with high quality and low latency.
Veo 3.1 Generate preview Generates videos from text prompts and images with high quality.
Veo 3.1 Fast preview Generates videos from text prompts and images with high quality and low latency.
Veo 2 Preview Generates videos from text prompts and images, supporting inpaint and outpaint.

Experimental Veo models

Veo 2 Experimental An experimental model with features under test.

MedLM models

MedLM-medium A HIPAA-compliant model for medical question answering and summarization of healthcare documents.
MedLM-large A HIPAA-compliant model for medical question answering and summarization of healthcare documents.

Language support

Gemini

All Gemini models can understand and respond in the following languages:

Afrikaans (af), Albanian (sq), Amharic (am), Arabic (ar), Armenian (hy), Assamese (as), Azerbaijani (az), Basque (eu), Belarusian (be), Bengali (bn), Bosnian (bs), Bulgarian (bg), Catalan (ca), Cebuano (ceb), Chinese (Simplified and Traditional) (zh), Corsican (co), Croatian (hr), Czech (cs), Danish (da), Dhivehi (dv), Dutch (nl), English (en), Esperanto (eo), Estonian (et), Filipino (Tagalog) (fil), Finnish (fi), French (fr), Frisian (fy), Galician (gl), Georgian (ka), German (de), Greek (el), Gujarati (gu), Haitian Creole (ht), Hausa (ha), Hawaiian (haw), Hebrew (iw), Hindi (hi), Hmong (hmn), Hungarian (hu), Icelandic (is), Igbo (ig), Indonesian (id), Irish (ga), Italian (it), Japanese (ja), Javanese (jv), Kannada (kn), Kazakh (kk), Khmer (km), Korean (ko), Krio (kri), Kurdish (ku), Kyrgyz (ky), Lao (lo), Latin (la), Latvian (lv), Lithuanian (lt), Luxembourgish (lb), Macedonian (mk), Malagasy (mg), Malay (ms), Malayalam (ml), Maltese (mt), Maori (mi), Marathi (mr), Meiteilon (Manipuri) (mni-Mtei), Mongolian (mn), Myanmar (Burmese) (my), Nepali (ne), Norwegian (no), Nyanja (Chichewa) (ny), Odia (Oriya) (or), Pashto (ps), Persian (fa), Polish (pl), Portuguese (pt), Punjabi (pa), Romanian (ro), Russian (ru), Samoan (sm), Scots Gaelic (gd), Serbian (sr), Sesotho (st), Shona (sn), Sindhi (sd), Sinhala (Sinhalese) (si), Slovak (sk), Slovenian (sl), Somali (so), Spanish (es), Sundanese (su), Swahili (sw), Swedish (sv), Tajik (tg), Tamil (ta), Telugu (te), Thai (th), Turkish (tr), Ukrainian (uk), Urdu (ur), Uyghur (ug), Uzbek (uz), Vietnamese (vi), Welsh (cy), Xhosa (xh), Yiddish (yi), Yoruba (yo), and Zulu (zu).

Gemma

Gemma and Gemma 2 support only the English (en) language. Gemma 3 and Gemma 3n provide multilingual support in over 140 languages.

Embeddings

Multilingual text embedding models support the following languages:

Afrikaans (af), Albanian (sq), Amharic (am), Arabic (ar), Armenian (hy), Azerbaijani (az), Basque (eu), Belarusian (be), Bengali (bn), Bulgarian (bg), Catalan (ca), Cebuano (ceb), Chinese (Simplified and Traditional) (zh), Corsican (co), Czech (cs), Danish (da), Dutch (nl), English (en), Esperanto (eo), Estonian (et), Filipino (Tagalog) (fil), Finnish (fi), French (fr), Frisian (fy), Galician (gl), Georgian (ka), German (de), Greek (el), Gujarati (gu), Haitian Creole (ht), Hausa (ha), Hawaiian (haw), Hebrew (iw), Hindi (hi), Hmong (hmn), Hungarian (hu), Icelandic (is), Igbo (ig), Indonesian (id), Irish (ga), Italian (it), Japanese (ja), Javanese (jv), Kannada (kn), Kazakh (kk), Khmer (km), Korean (ko), Kurdish (ku), Kyrgyz (ky), Lao (lo), Latin (la), Latvian (lv), Lithuanian (lt), Luxembourgish (lb), Macedonian (mk), Malagasy (mg), Malay (ms), Malayalam (ml), Maltese (mt), Maori (mi), Marathi (mr), Mongolian (mn), Myanmar (Burmese) (my), Nepali (ne), Nyanja (Chichewa) (ny), Norwegian (no), Pashto (ps), Persian (fa), Polish (pl), Portuguese (pt), Punjabi (pa), Romanian (ro), Russian (ru), Samoan (sm), Scots Gaelic (gd), Serbian (sr), Sesotho (st), Shona (sn), Sindhi (sd), Sinhala (Sinhalese) (si), Slovak (sk), Slovenian (sl), Somali (so), Spanish (es), Sundanese (su), Swahili (sw), Swedish (sv), Tajik (tg), Tamil (ta), Telugu (te), Thai (th), Turkish (tr), Ukrainian (uk), Urdu (ur), Uzbek (uz), Vietnamese (vi), Welsh (cy), Xhosa (xh), Yiddish (yi), Yoruba (yo), and Zulu (zu).

Imagen 3

Imagen 3 supports the following languages:

English (en), Chinese (Simplified and Traditional) (zh), Hindi (hi), Japanese (ja), Korean (ko), Portuguese (pt), and Spanish (es).

MedLM

The MedLM models support only the English (en) language.
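
The shorter language lists in this section can be captured in a small lookup table, for example to check a prompt's language before sending a request. The sketch below is illustrative only: the family labels ("imagen-3", "medlm", "gemma", "gemma-2") and the function is_language_supported are hypothetical names chosen for this example, not model IDs or part of any Google API. The language codes themselves are copied from the lists above.

```python
# Language codes copied from the lists in this section.
# The dictionary keys are hypothetical labels, not real model IDs.
SUPPORTED_LANGUAGES = {
    "imagen-3": {"en", "zh", "hi", "ja", "ko", "pt", "es"},
    "medlm": {"en"},
    "gemma": {"en"},
    "gemma-2": {"en"},
}


def is_language_supported(family, language_code):
    """Return True if the family's list includes the language code.

    Raises KeyError when no language list is recorded for the family,
    so an unknown family is never silently treated as unsupported.
    """
    if family not in SUPPORTED_LANGUAGES:
        raise KeyError(f"no language list recorded for {family!r}")
    return language_code.lower() in SUPPORTED_LANGUAGES[family]


print(is_language_supported("imagen-3", "ja"))
print(is_language_supported("medlm", "fr"))
```

The check matches bare codes such as "pt" only; region-qualified tags such as "pt-BR" would need to be reduced to their primary subtag first.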

Explore all models in Model Garden

Model Garden is a platform that helps you discover, test, customize, and deploy Google proprietary and select OSS models and assets. To explore the generative AI models and APIs that are available on Vertex AI, go to Model Garden in the Google Cloud console.

To learn more about Model Garden, including available models and capabilities, see Explore AI models in Model Garden.

Model versions

To see all model versions, including legacy and retired models, see Model versions and lifecycle.

What's next