
Open Source Project of the Day (Part 30): banana-slides - Native AI PPT Generation App Based on nano banana pro

Dev.to AI · by WonderLab · April 5, 2026 · 9 min read

Introduction

"Vibe your PPT like vibing code."

This is Part 30 of the "Open Source Project of the Day" series. Today we explore banana-slides (GitHub), open-sourced by Anionex.

Have you ever found yourself the night before a presentation with a blank slide deck, full of brilliant ideas but completely drained by the drudgery of layouts and design? Traditional AI PPT tools may be "fast," but they are often locked into preset templates, offer little freedom, and produce homogeneous results. banana-slides is built on Google's nano banana pro image generation model and aims to deliver a "native Vibe PPT" experience. It offers three creation paths (one sentence, outline, and page description), accepts any uploaded template or materials, and parses PDF/Docx/MD files. You can describe edits to specific areas in natural language (e.g., "change page three to a case study," "replace this chart with a pie chart") and export to PPTX or PDF in one click. Editable PPTX (Beta) export keeps text and images freely editable in PowerPoint. The project uses a React + Flask full-stack architecture, supports one-click Docker deployment, and targets a wide audience from beginners to professionals, with the goal of "lowering the bar for PPT creation so anyone can quickly produce beautiful, professional presentations."

What You'll Learn

  • banana-slides' positioning: a native AI PPT generation app based on nano banana pro, moving toward true "Vibe PPT"

  • Three creation paths: idea, outline, page description — plus Vibe-style natural language editing

  • Material parsing capabilities: multi-format uploads, smart extraction, style references

  • Technical architecture: React + Vite + Flask + SQLite + Gemini API

  • Comparison with NotebookLM slide decks and project advantages

Prerequisites

  • Comfortable with Docker or Node.js/Python development environments

  • Basic familiarity with LLM APIs (e.g., Gemini, OpenAI)

  • For self-hosted deployment, a Google Gemini API Key is required (image generation requires a paid tier) or access via a proxy like AIHubMix

Project Background

Project Introduction

banana-slides is a native AI PPT generation app built on nano banana pro (Google Gemini image generation model). It supports creating presentations from three paths — idea, outline, and page description — automatically extracts charts and text from attachments, accepts uploaded template images for style customization, and allows natural language voice edits on specific areas or entire pages. Results can be exported as standard PPTX or PDF, with support for Editable PPTX (Beta): exported pages have text and images that can be freely edited in PowerPoint, with text styles (font size, color, bold, etc.) preserved as closely as possible. The project's slogan is "Vibe your PPT like vibing code," aiming to satisfy both "fast" and "beautiful" PPT needs, solving the problems of fixed templates, low flexibility, and homogenization in traditional AI PPT tools.

Target user groups:

  • Beginners: Zero-barrier generation of attractive PPTs with no design experience needed

  • PPT professionals: Reference AI-generated layouts and text-image compositions for design inspiration

  • Educators: Quickly convert teaching content into illustrated lesson plans

  • Students: Quickly complete assignment presentations, focusing on content rather than layout

  • Professionals: Rapidly visualize business proposals and product introductions

Author/Team Introduction

  • Organization: Anionex (GitHub)

  • Website: bananaslides.online

  • Community: WeChat group available in README; sponsors include AIHubMix, AI Huobao, Yuyun, etc.

  • Commercial license: Free for personal/educational/non-profit use under AGPL-3.0; commercial closed-source or private deployment requires a Commercial License from the author

Project Stats

  • ⭐ GitHub Stars: 12.1k+

  • 🍴 Forks: 1.4k+

  • 📦 Version: v0.4.0 (February 2026)

  • 📄 License: AGPL-3.0

  • 🌐 Website: bananaslides.online

  • 🐳 Docker: Supports amd64 / arm64 with pre-built images

Main Features

Core Purpose

banana-slides' core purpose is to quickly generate high-quality, editable PPTs driven by natural language and materials:

  • Multi-path creation: Start from a one-sentence idea, a structured outline, or page-by-page descriptions — AI auto-completes the outline and page content

  • Material parsing: Upload PDF, Docx, MD, Txt files; automatically extract key points, image links, and chart data as generation materials

  • Style customization: Upload reference images or templates to control the overall visual style

  • Vibe-style editing: Verbally edit with natural language (e.g., "change page three to a case study," "replace this chart with a pie chart") — AI responds in real time

  • Export: One-click export to PPTX or PDF; editable PPTX mode allows text and images to be freely modified in PowerPoint

Use Cases

  • Reports/Proposals: Presentation due tomorrow — input a topic or outline and quickly generate a professional PPT

  • Lesson plans: Upload teaching content or documents and auto-generate illustrated lesson plans

  • Assignment presentations: Students input their topic and focus on content rather than layout

  • Design inspiration: Professionals reference AI-generated layouts and compositions

  • Iterative refinement: Verbally request changes after generation, without going through menus

Quick Start

Recommended: Docker Compose Deployment

Edit .env to configure GOOGLE_API_KEY (or a proxy such as AIHubMix), then start the stack:

docker compose -f docker-compose.prod.yml up -d

Access the frontend at http://localhost:3000, backend API at http://localhost:5000.

Sample environment variables (Gemini format):

Proxy example: https://aihubmix.com/gemini

From source: Requires Python 3.10+, uv, Node.js 16+; backend: uv sync then uv run python app.py; frontend: npm install then npm run dev.
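Before starting the containers, it can help to confirm that .env really defines the key the article mentions. The following is a minimal sketch, not part of banana-slides: `parse_env` and `missing_keys` are hypothetical helper names, the parser handles only plain KEY=VALUE lines, and GOOGLE_API_KEY is the only variable name taken from the text above.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str], required=("GOOGLE_API_KEY",)) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]
```

A typical use would be to read the file with `pathlib.Path(".env").read_text()`, pass the result to `parse_env`, and stop before `docker compose` if `missing_keys` returns anything.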

Core Features

  • Three creation paths: Idea (one sentence generates outline and descriptions), Outline (manual or AI-generated), Page Description (per-page control)

  • Natural language editing: Verbally modify outlines or descriptions (e.g., "change page three to a case study") — AI adjusts in real time

  • Multi-format material parsing: PDF, Docx, MD, Txt upload; auto-parses key points, images, charts

  • Style references: Upload templates or reference images to customize PPT visual style

  • Local re-rendering: Select an unsatisfactory area and verbally describe the change (e.g., "replace this chart with a pie chart")

  • Full-page optimization: High-quality, visually consistent pages generated by nano banana pro

  • Multi-format export: PPTX, PDF, default 16:9, ready to present

  • Editable PPTX (Beta): Export high-fidelity, clean-background editable pages with text styles preserved; Baidu OCR API recommended for best results (see issue #121)

  • Multi-model support: Gemini, OpenAI, Vertex AI, Lazyllm (can mix DeepSeek, Doubao, Tongyi, etc.)

  • Internationalization and dark mode: Chinese/English toggle, light/dark/system theme
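One way to picture the three creation paths is as different entry points into the same sequence of stages: the more the user supplies, the fewer stages the AI still has to run. The sketch below is an illustration under that assumption, not the project's code; the stage names, `EntryPoint`, and `remaining_stages` are all hypothetical.

```python
from enum import Enum


class EntryPoint(Enum):
    IDEA = "idea"                     # user supplies one sentence
    OUTLINE = "outline"               # user supplies the outline
    PAGE_DESCRIPTION = "description"  # user supplies per-page descriptions


# Ordered stages (hypothetical names for the steps the article describes).
STAGES = ["generate_outline", "generate_descriptions", "render_images", "export"]

# First stage that still has to run for each entry point.
FIRST_STAGE = {
    EntryPoint.IDEA: "generate_outline",
    EntryPoint.OUTLINE: "generate_descriptions",
    EntryPoint.PAGE_DESCRIPTION: "render_images",
}


def remaining_stages(entry: EntryPoint) -> list[str]:
    """Return the stages left to run when starting from the given entry point."""
    start = STAGES.index(FIRST_STAGE[entry])
    return STAGES[start:]
```

Starting from an idea runs every stage, while starting from page descriptions skips straight to rendering and export.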

Project Advantages

Comparison with NotebookLM slide decks (official README comparison, may change with updates):

| Feature | NotebookLM | banana-slides |
| --- | --- | --- |
| Page limit | 15 pages | Unlimited |
| Re-editing | Not supported | Selection edit + voice edit |
| Adding materials | Cannot add after generation | Add freely after generation |
| Export format | PDF only | PDF + Editable PPTX |
| Watermark | Free tier has watermarks | No watermark, freely add/remove elements |

Why choose banana-slides?

  • True Vibe: Built on nano banana pro — good text-image quality and consistency, accurate text rendering and adherence to reference image styles

  • Flexible creation: Not locked into preset templates; upload any materials and templates, make multi-round voice edits

  • Editable export: Supports exporting PPTX freely editable in PowerPoint, not just stacked images

  • Open source and self-hostable: One-click Docker deployment, suitable for private hosting; supports multiple LLM APIs for cost and compliance control

Detailed Project Analysis

Technical Architecture

Frontend: React 18 + TypeScript + Vite 5, Zustand state management, React Router v6, Tailwind CSS, @dnd-kit drag-and-drop, Lucide React icons, Axios HTTP client.

Backend: Python 3.10+, Flask 3.0, uv package manager, SQLite + Flask-SQLAlchemy, Google Gemini API (or OpenAI/Vertex/Lazyllm), python-pptx for PPT handling, Pillow for image processing, ThreadPoolExecutor for concurrency, Flask-CORS for cross-origin.

AI Capabilities: Both text generation (outlines, descriptions, etc.) and image generation (page rendering) depend on LLMs; the core image generation is nano banana pro, requiring an API that supports image generation (Gemini free tier only supports text, not images).
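The backend stack lists ThreadPoolExecutor for concurrency, which suits image generation because each page render is a slow, I/O-bound API call. The sketch below shows the general pattern only; `render_page` is a hypothetical stand-in for the real model call, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def render_page(description: str) -> str:
    """Hypothetical stand-in for a slow image-generation API call."""
    return f"image for: {description}"


def render_all(descriptions: list[str], max_workers: int = 4) -> list[str]:
    """Render pages concurrently; results keep the input (page) order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which call finishes first.
        return list(pool.map(render_page, descriptions))
```

Because `Executor.map` preserves input order, the rendered pages can be assembled into a deck without re-sorting.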

Project Structure

  • frontend/: React app; pages/ contains Home, OutlineEditor, DetailEditor, SlidePreview, History; components/ contains outline, preview, shared, layout, history; store/ is Zustand; api/ is interface wrappers

  • backend/: Flask app; models/ contains Project, Page, Task, Material, UserTemplate, ReferenceFile, PageImageVersion; services/ contains ai_service, file_service, file_parser_service, export_service, task_manager, prompts; controllers/ is REST API

  • tests/: Tests; v0_demo/: Early demo; output/: Exported files

Key Implementation

  • Creation pipeline: Idea → AI generates outline → generates per-page descriptions → calls nano banana pro to generate page images → assembles PPT

  • Material parsing: file_parser_service parses PDF/Docx/MD/Txt, extracting text, images, charts for use during generation

  • Editable PPTX: Text in generated images is recognized via OCR and restored as editable text boxes, preserving font size, color, bold, and other styles as much as possible; requires Baidu OCR API (see issue #121)

  • Multi-model support: Via AI_PROVIDER_FORMAT and Lazyllm configuration, mix text and image models from different vendors
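For the Editable PPTX step above, text recognized in a page image has to be placed back on the slide at the matching position. PPTX measures positions in English Metric Units (EMU, 914,400 per inch), and python-pptx uses the same unit, so one plausible approach is to scale pixel boxes to slide coordinates. This is a sketch under assumptions: `OcrBox`, `TextBoxSpec`, and `to_slide_coords` are hypothetical names, the slide is taken to be the default 16:9 size, and the project's actual export code may work differently.

```python
from dataclasses import dataclass

# Default 16:9 PowerPoint slide: 13.333 in x 7.5 in at 914,400 EMU per inch.
SLIDE_W_EMU = 12_192_000
SLIDE_H_EMU = 6_858_000


@dataclass(frozen=True)
class OcrBox:
    """Hypothetical OCR result: a pixel box on the rendered page image."""
    text: str
    left: int
    top: int
    width: int
    height: int


@dataclass(frozen=True)
class TextBoxSpec:
    """Text box position and size on the slide, in EMU."""
    text: str
    left: int
    top: int
    width: int
    height: int


def to_slide_coords(box: OcrBox, image_w: int, image_h: int) -> TextBoxSpec:
    """Scale a pixel box on the image to EMU coordinates on the slide."""
    sx = SLIDE_W_EMU / image_w
    sy = SLIDE_H_EMU / image_h
    return TextBoxSpec(
        text=box.text,
        left=round(box.left * sx),
        top=round(box.top * sy),
        width=round(box.width * sx),
        height=round(box.height * sy),
    )
```

The resulting values could then be passed to a PPTX library as the left, top, width, and height of a text box.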

Development Roadmap (Selected)

  • ✅ Completed: Three creation paths, Markdown image parsing, single-page material addition, selection Vibe editing, various file parsing, editable PPTX export

  • 🔄 In progress: Multi-layer precise cutout editable export, web search, Agent mode

  • 🧭 Planned: Online playback, animations and page transitions, multi-language support

  • 🏢 Commercial: User system

Project Resources

Official Resources

  • 🌟 GitHub: github.com/Anionex/banana-slides

  • 🌐 Website: bananaslides.online

  • 📄 English README: README_EN.md

  • 🐛 Issues: GitHub Issues

  • 📋 Editable PPTX notes: issue #121

  • 📖 Beginner deployment tutorial: @ShellMonster's tutorial

Related Resources

  • AIHubMix (multi-model API proxy, reduces migration cost)

  • Baidu Cloud OCR (editable export optimization)

  • uv (Python package manager)

  • nano banana pro (Google Gemini image generation)

Who Should Use This

  • Users who need to create PPTs quickly: Reports, proposals, lesson plans, assignment presentations

  • Creators who want "fast and beautiful": Don't want to be constrained by fixed templates; need multi-round natural language edits

  • Technical teams: Want to self-host AI PPT services and control data and costs

  • Developers interested in Vibe PPT and the nano banana ecosystem: Learn full-stack AI application architecture and multi-model integration

Feel free to visit my personal homepage for more useful knowledge and interesting products.
