Google Launches Gemini 3.1 Flash Live for Real-Time Voice and Vision Agents

Details

Google AI announced the launch of Gemini 3.1 Flash Live on March 26, 2026, targeting developers building real-time voice and vision agents.
Key features include responses as fast as natural dialogue and improved task completion in noisy environments.
Available now in Gemini Live within the Gemini App and Search Live; Gemini Live API and Google AI Studio in preview; Gemini Enterprise for Customer Experience.
Model details from blog: Gemini 3.1 Flash-Lite, the fastest and most cost-efficient in the Gemini 3 series, priced at $0.25/1M input tokens and $1.50/1M output tokens.
Outperforms Gemini 2.5 Flash with 2.5x faster time to first token, 45% higher output speed, and better benchmarks like 86.9% on GPQA Diamond.
Supports multimodal inputs (text, images, audio, video), function calling, code execution, grounding with Google Search, and large context windows for high-volume tasks.

Impact

Google's Gemini 3.1 Flash Live launch intensifies competition in low-latency AI agents, pressuring rivals like OpenAI's GPT-4o mini and Anthropic's Claude Haiku by offering superior speed at lower costs—$0.25 input vs. $0.25 for GPT-5 mini but with 363 tokens/s output speed exceeding Grok 4.1's 145. This widens access for real-time apps like voice agents and content moderation, accelerating adoption in high-volume enterprise workflows while matching or exceeding prior Gemini models in quality. It positions Google to capture more developer market share amid the push for efficient multimodal AI.