Google AI Releases Veo 3.1 Lite: Giving Developers Low-Cost, High-Speed Video Generation via the Gemini API
Google has announced the release of Veo 3.1 Lite, a new model tier within its generative video portfolio designed to address the primary bottleneck for production-scale deployments: pricing. While the generative video space has seen rapid progress in visual fidelity, the cost per second of generated content has remained high, often prohibitive for developers building high-volume applications.
Veo 3.1 Lite is now available via the Gemini API and Google AI Studio for users in the paid tier. By offering the same generation speed as the existing Veo 3.1 Fast model at approximately half the cost, Google is positioning this model as the standard for developers focused on programmatic video generation and iterative prototyping.
https://blog.google/innovation-and-ai/technology/ai/veo-3-1-lite/
Technical Architecture: The Diffusion Transformer (DiT)
The most significant aspect of the Veo 3.1 family is its underlying Diffusion Transformer (DiT) architecture. Traditional generative video models often relied on U-Net-based diffusion, which can struggle with high-dimensional data and long-range temporal dependencies.
Veo 3.1 Lite utilizes a transformer-based backbone that operates on spatio-temporal patches. In this architecture, video frames are not processed as static 2D images but as a continuous sequence of tokens in a latent space. By applying self-attention across these patches, the model maintains better temporal consistency. This ensures that objects, lighting, and textures remain coherent across the duration of the clip, reducing the artifacts commonly seen in earlier models.
The model performs its computation in a compressed latent space rather than pixel space. This allows the model to handle the high computational demands of video generation while maintaining a lower memory footprint. For developers, this translates to a model that can generate high-definition content without the exponential increase in compute time that usually accompanies resolution scaling.
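To make the patch-based tokenization concrete, the sketch below counts the tokens produced when a latent video volume is tiled into spatio-temporal patches. This is illustrative only: the latent dimensions and patch size are hypothetical assumptions for the example, not Veo's actual configuration.

```python
# Illustrative only: token count for a spatio-temporal patch grid.
# The latent dimensions and patch size below are hypothetical, not
# Veo's published configuration.

def num_spatiotemporal_tokens(frames: int, height: int, width: int,
                              pt: int, ph: int, pw: int) -> int:
    """Number of (pt x ph x pw) patches tiling a latent video volume."""
    assert frames % pt == 0 and height % ph == 0 and width % pw == 0
    return (frames // pt) * (height // ph) * (width // pw)

# e.g. a 48-frame, 90x160 latent volume cut into 2x10x10 patches:
tokens = num_spatiotemporal_tokens(48, 90, 160, 2, 10, 10)
print(tokens)  # 24 * 9 * 16 = 3456
```

Self-attention then runs over this flattened token sequence, which is why consistency holds across both space and time: a patch late in the clip can attend directly to a patch in the first frame.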
Performance and Output Specifications
Veo 3.1 Lite provides specific parameters for resolution and duration, allowing developers to integrate it into structured workflows. Unlike the flagship Veo 3.1 model, which supports 4K resolution, the Lite version is optimized for high-definition (HD) outputs.
- Supported Resolutions: 720p and 1080p.
- Aspect Ratios: Native support for both landscape (16:9) and portrait (9:16) orientations.
- Clip Durations: Developers can specify generation lengths of 4, 6, or 8 seconds.
- Prompt Adherence: The model is optimized for ‘Cinematic Control,’ recognizing technical directives such as ‘pan,’ ‘tilt,’ and specific lighting instructions.
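Because the output space is this constrained, request parameters are easy to validate client-side before spending any inference budget. The helper below is a hypothetical sketch; the constants come from the specifications listed above, but the function itself is not part of any Google SDK.

```python
# Hypothetical client-side validator for the documented Veo 3.1 Lite
# output options. The constants mirror the article's spec list; the
# helper itself is illustrative, not an official SDK function.

VALID_RESOLUTIONS = {"720p", "1080p"}
VALID_ASPECT_RATIOS = {"16:9", "9:16"}
VALID_DURATIONS_S = {4, 6, 8}

def validate_request(resolution: str, aspect_ratio: str, duration_s: int) -> None:
    """Raise ValueError if any parameter falls outside the supported set."""
    if resolution not in VALID_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(VALID_RESOLUTIONS)}")
    if aspect_ratio not in VALID_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(VALID_ASPECT_RATIOS)}")
    if duration_s not in VALID_DURATIONS_S:
        raise ValueError(f"duration must be one of {sorted(VALID_DURATIONS_S)} seconds")

validate_request("1080p", "9:16", 8)  # passes silently
```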
The ‘Lite’ tag does not refer to a reduction in generation speed compared to the ‘Fast’ tier. Instead, it refers to an optimized parameter set that allows Google to offer the model at a significantly lower price point while maintaining the same low-latency performance characteristics as Veo 3.1 Fast.
The Pricing Shift: Democratizing Video Inference
The core value proposition of Veo 3.1 Lite is its cost structure. In the current market, high-quality video inference often costs several dollars per minute of footage, making it difficult to justify for applications like dynamic ad generation or social media automation.
Veo 3.1 Lite pricing is structured as follows:
- 720p: $0.05 per second.
- 1080p: $0.08 per second.
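At these rates, batch costs are straightforward to estimate. The snippet below uses the published per-second prices; note that billing granularity (e.g. whether partial seconds are rounded) is not covered in the announcement, so the calculation assumes exact per-second metering.

```python
# Cost estimate from the published per-second rates for Veo 3.1 Lite.
# Assumes exact per-second metering; billing granularity is not
# specified in the announcement.

RATE_PER_SECOND_USD = {"720p": 0.05, "1080p": 0.08}

def clip_cost(resolution: str, duration_s: int) -> float:
    """Price in USD for a single generated clip."""
    return RATE_PER_SECOND_USD[resolution] * duration_s

# A batch of 1,000 eight-second 720p clips:
batch_cost = 1000 * clip_cost("720p", 8)
print(f"${batch_cost:,.2f}")  # $400.00
```

At $0.40 per maximum-length 720p clip, use cases like dynamic ad variants or per-user social clips become feasible at volumes where dollars-per-minute pricing was previously prohibitive.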
Deployment via Gemini API and AI Studio
Access is handled through the Gemini API, which allows developers to integrate video generation into existing Python or Node.js applications using standard REST or gRPC calls.
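As a rough sketch of what such an integration might send, the snippet below builds a JSON request body offline. The model ID and field names here are illustrative assumptions, not the confirmed Gemini API schema; consult the official API reference for the exact endpoint and payload shape.

```python
import json

# Sketch of a request body for a REST call to the Gemini API.
# The model ID ("veo-3.1-lite") and field names are illustrative
# assumptions; check the official Gemini API reference for the
# actual schema before use.

def build_request(prompt: str, resolution: str = "720p",
                  aspect_ratio: str = "16:9", duration_s: int = 8) -> str:
    body = {
        "model": "veo-3.1-lite",  # hypothetical model ID
        "prompt": prompt,
        "config": {
            "resolution": resolution,
            "aspectRatio": aspect_ratio,
            "durationSeconds": duration_s,
        },
    }
    return json.dumps(body)

payload = build_request("Slow pan across a rain-soaked neon street at night")
```

Keeping request construction separate from transport like this also makes it easy to log, validate, and replay generation jobs in a production pipeline.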
One critical technical feature for enterprise developers is the inclusion of SynthID. Developed by Google DeepMind, SynthID is a tool for watermarking and identifying AI-generated content. It embeds a digital watermark directly into the pixels of the video that is imperceptible to the human eye but detectable by specialized software. This is a mandatory component for developers concerned with safety, compliance, and distinguishing synthetic media from captured footage.
Key Takeaways
- Half the Cost, Same Speed: Offers the same low-latency performance as the ‘Fast’ tier at less than 50% of the price ($0.05/sec for 720p).
- Scalable HD Output: Supports 720p and 1080p resolutions in 4, 6, or 8-second clips with native 16:9 and 9:16 aspect ratios.
- Architecture: Built on a Diffusion Transformer (DiT) using spatio-temporal patches for superior motion and physical consistency.
- Developer Ready: Available now via the Gemini API (paid tier) and Google AI Studio, featuring built-in SynthID digital watermarking.
Michal Sutter
Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.