A native AI PPT generation application based on nano banana pro 🍌
From idea to presentation in minutes: no tedious formatting, edits by simple verbal instruction, a step towards true "Vibe PPT"
🚀 Online Demo | 📖 Documentation | Deployment
If this project is helpful to you, feel free to Star 🌟 & Fork 🍴
Have you ever found yourself in this dilemma: a presentation is due tomorrow, but your slides are still blank; your mind is full of brilliant ideas, yet your enthusiasm is drained by tedious layout and design?
We long to quickly create presentations that are both professional and aesthetically pleasing. While traditional AI PPT generation apps generally meet the need for "speed," they still face the following issues:
These shortcomings make it difficult for traditional AI PPT generators to satisfy our dual needs for "speed" and "beauty." Even those claiming to be "Vibe PPT" are, in my eyes, far from having enough "Vibe."
However, the emergence of the nano banana🍌 model has changed everything. I tried using 🍌pro to generate PPT pages and found that the results were exceptional in terms of quality, aesthetics, and consistency. It can accurately render almost all text requested in the prompts while following the style of reference images. So why not build a native "Vibe PPT" application based on 🍌pro?
🎯 Goal: Lower the barrier to PPT creation, enabling everyone to quickly create beautiful and professional presentations.
| Software Development Best Practices | DeepSeek-V3.2 Technical Showcase |
| R&D and Industrialization of Intelligent Production Equipment for Prepared Meals | The Evolution of Money: From Shells to Banknotes |
See more Use Cases
Supports three starting methods—Ideas, Outlines, and Page Descriptions—to suit various creative workflows.
No longer restricted by complex menus and buttons: issue modification commands directly in natural language.
🌟 Feature Comparison with NotebookLM Slide Deck
| Feature | NotebookLM | This Project |
|---|---|---|
| Page Limit | 15 pages | Unlimited |
| Secondary Editing | Modify via prompts | Selection editing + Verbal editing |
| Adding Assets | Cannot add after generation | Add freely after generation |
| Export Formats | PDF and PPTX (non-editable images) | PDF, PPTX (image-based or editable), and presentation video |
| Watermark | Watermarks in free version | No watermarks, freely add/remove elements |
Note: This comparison may become outdated as new features are added.
| Status | Milestones |
|---|---|
| ✅ Completed | Create PPT via three paths: idea, outline, and page description |
| ✅ Completed | Parse Markdown-formatted images in text |
| ✅ Completed | Add more assets to a single PPT slide |
| ✅ Completed | Vibe verbal editing for selected areas on a single PPT slide |
| ✅ Completed | Asset module: Asset generation, uploading, etc. |
| ✅ Completed | Support for uploading and parsing multiple file types |
| ✅ Completed | Support Vibe verbal adjustment of outlines and descriptions |
| ✅ Completed | Initial support for exporting editable .pptx files |
| 🔄 In Progress | Support for multi-layer, precise background removal in editable .pptx exports |
| 🔄 In Progress | Web search |
| 🔄 In Progress | Agent mode |
| ✅ Completed | TTS narration video export (CN/EN/JP multi-voice, subtitles, Ken Burns effects) |
| 🚍 Partial | Optimize front-end loading speed |
| 🧭 Planned | Online playback functionality |
| 🧭 Planned | Simple animations and slide transitions |
| 🚍 Partial | Multi-language support |
This is the simplest method, requiring no Docker installation or project downloading. You can access the application immediately after creation.
Quickly start front-end and back-end services via Docker Compose.
If you are using Windows or macOS, please install Docker Desktop first and ensure Docker is running (Windows users can check the system tray icon; macOS users can check the menu bar icon). Then follow the same steps as described in the documentation.
Tip: If you encounter issues, Windows users should enable the WSL 2 backend in Docker Desktop settings (recommended). Also, ensure ports 3000 and 5000 are not occupied.
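Before starting the containers, the port check mentioned in the tip can be done with a few lines of Python. This is a minimal sketch, not part of the project: the helper name `port_is_free` is hypothetical, and it only tests whether the local machine can bind to the port on `127.0.0.1`.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if we can bind to host:port, i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": another process owns the port.
            return False
    return True
```

Calling `port_is_free(3000)` and `port_is_free(5000)` before `docker compose up -d` should both return `True`; if either returns `False`, stop the process that holds the port first.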
git clone https://github.com/Anionex/banana-slides
cd banana-slides
Create the .env file (refer to .env.example):
cp .env.example .env
(Optional, can also be configured in the UI after startup; click here for the tutorial) Edit the .env file to configure the necessary environment variables:
The LLM API in this project follows the AIHubMix platform format. It is recommended to use AIHubMix (click here to visit) to obtain API keys and reduce migration costs.
Note: The Google Nano Banana Pro model API has higher costs; please be mindful of usage expenses.
```env
# AI Provider Configuration Format (gemini / openai / vertex / lazyllm)
AI_PROVIDER_FORMAT=gemini

# Gemini Format Configuration (used when AI_PROVIDER_FORMAT=gemini)
GOOGLE_API_KEY=your-api-key-here
GOOGLE_API_BASE=https://generativelanguage.googleapis.com
# Proxy Example: https://aihubmix.com/gemini

# OpenAI Format Configuration (used when AI_PROVIDER_FORMAT=openai)
OPENAI_API_KEY=your-api-key-here
OPENAI_API_BASE=https://api.openai.com/v1
# Proxy Example: https://aihubmix.com/v1

# Vertex AI Configuration (used when AI_PROVIDER_FORMAT=vertex)
# GCP Project and Service Account Key Required
# VERTEX_PROJECT_ID=your-gcp-project-id
# VERTEX_LOCATION=global
# GOOGLE_APPLICATION_CREDENTIALS=./gcp-service-account.json

# Lazyllm Format Configuration (used when AI_PROVIDER_FORMAT=lazyllm)
# Select vendors for text and image generation
TEXT_MODEL_SOURCE=deepseek # Text generation model provider
IMAGE_MODEL_SOURCE=doubao # Image editing model provider
IMAGE_CAPTION_MODEL_SOURCE=qwen # Image captioning model provider

# API Keys for Each Provider (Only configure the ones you want to use)
DOUBAO_API_KEY=your-doubao-api-key # Volcengine/Doubao
DEEPSEEK_API_KEY=your-deepseek-api-key # DeepSeek
QWEN_API_KEY=your-qwen-api-key # Alibaba Cloud/Qwen
GLM_API_KEY=your-glm-api-key # Zhipu GLM
SILICONFLOW_API_KEY=your-siliconflow-api-key # SiliconFlow
SENSENOVA_API_KEY=your-sensenova-api-key # SenseTime SenseNova
MINIMAX_API_KEY=your-minimax-api-key # MiniMax
...
```
For better editable-export results, use the new editable export configuration: obtain an API key from the Baidu Intelligent Cloud Platform (click here to enter) and set it in the BAIDU_API_KEY field of the .env file (the free usage quota is sufficient). See https://github.com/Anionex/banana-slides/issues/121 for details.
Google Cloud Vertex AI allows calling Gemini models through a GCP service account, and new users can use trial credits. Configuration steps:
Save the service account key file as gcp-service-account.json in the project root directory.
Edit .env:
AI_PROVIDER_FORMAT=vertex
VERTEX_PROJECT_ID=your-gcp-project-id
VERTEX_LOCATION=global
In docker-compose.yml, mount the key file into the container, and set the GOOGLE_APPLICATION_CREDENTIALS environment variable.
Note: The gemini-3-* series models require VERTEX_LOCATION=global.
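A quick way to catch a half-filled .env before starting the services is to check that the keys listed for the chosen provider format are present. The sketch below is an assumption-laden illustration, not the project's own validation: the key groups are copied from the template comments above, the helper names `parse_env` and `missing_keys` are hypothetical, and the lazyllm format is deliberately not covered.

```python
# Keys the .env template lists for each provider format (taken from the
# template comments; whether the backend treats all of them as mandatory
# is an assumption of this sketch).
TEMPLATE_KEYS = {
    "gemini": ["GOOGLE_API_KEY", "GOOGLE_API_BASE"],
    "openai": ["OPENAI_API_KEY", "OPENAI_API_BASE"],
    "vertex": ["VERTEX_PROJECT_ID", "VERTEX_LOCATION", "GOOGLE_APPLICATION_CREDENTIALS"],
}


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and '#' comment lines."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "value # note".
        values[key.strip()] = value.split(" #", 1)[0].strip()
    return values


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return the template keys that are absent or empty for the chosen format."""
    provider = env.get("AI_PROVIDER_FORMAT")
    if not provider:
        return ["AI_PROVIDER_FORMAT"]
    if provider not in TEMPLATE_KEYS:
        raise ValueError(f"format not covered by this sketch: {provider}")
    return [key for key in TEMPLATE_KEYS[provider] if not env.get(key)]
```

For example, `missing_keys(parse_env(open(".env").read()))` returns an empty list when every key for the selected format has a value.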
⚡ Use Pre-built Images (Recommended)
The project provides pre-built frontend and backend images on Docker Hub (synced with the latest version of the main branch), allowing you to skip the local build steps for rapid deployment:
# Launch with Pre-built Images (No need to build from scratch)
docker compose -f docker-compose.prod.yml up -d
Image names:
anoinex/banana-slides-frontend:latest
anoinex/banana-slides-backend:latest
Build images from scratch
docker compose up -d
Tip: If you encounter network issues, you can uncomment the mirror source configurations in the .env file and then rerun the startup command:
# Uncomment the following in the .env file to use domestic mirror sources
DOCKER_REGISTRY=docker.1ms.run/
GHCR_REGISTRY=ghcr.nju.edu.cn/
APT_MIRROR=mirrors.aliyun.com
PYPI_INDEX_URL=https://mirrors.cloud.tencent.com/pypi/simple
NPM_REGISTRY=https://registry.npmmirror.com/
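Uncommenting these lines by hand is usually enough. If you prefer to script it, the following sketch shows one way, assuming the mirror entries appear in .env as `# KEY=value` lines (that layout and the helper name `uncomment_keys` are assumptions, not something the project guarantees).

```python
import re

# Mirror-related keys shown in the snippet above.
MIRROR_KEYS = ("DOCKER_REGISTRY", "GHCR_REGISTRY", "APT_MIRROR", "PYPI_INDEX_URL", "NPM_REGISTRY")


def uncomment_keys(text: str, keys: tuple[str, ...] = MIRROR_KEYS) -> str:
    """Strip the leading '#' from lines of the form '# KEY=value' for the given keys."""
    pattern = re.compile(r"^\s*#\s*(" + "|".join(map(re.escape, keys)) + r")=")
    out = []
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            # Keep everything from the key name onwards, dropping '#' and padding.
            line = line[match.start(1):]
        out.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if text.endswith("\n") else "")
```

Ordinary comments and unrelated keys are left untouched, so the function can be applied to the whole file content and the result written back to .env.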
docker logs --tail 200 banana-slides-backend
docker logs -f --tail 100 banana-slides-backend
docker logs --tail 100 banana-slides-frontend
docker compose down
Using Pre-built Images (docker-compose.prod.yml)
docker compose -f docker-compose.prod.yml pull
docker compose -f docker-compose.prod.yml up -d
Using Local Build (docker-compose.yml)
Note: If you have manually modified the code, this method is not applicable. You must first revert the code to the version it was when pulled.
git pull
docker compose down
docker compose build --no-cache
docker compose up -d
Note: Thanks to our fellow developer @ShellMonster for providing the Newbie Deployment Tutorial. It is specially designed for beginners without any server deployment experience. You can click the link to view.
libass / ass subtitle filters.
docker exec -it banana-slides-backend bash -c "apt-get update && apt-get install -y libreoffice-impress && rm -rf /var/lib/apt/lists/*"
Note: LibreOffice installed via this method will be lost when the container is rebuilt and will need to be reinstalled.
git clone https://github.com/Anionex/banana-slides
cd banana-slides
curl -LsSf https://astral.sh/uv/install.sh | sh
Run the following command in the project root directory:
```bash
# macOS (Homebrew)
brew install ffmpeg-full
brew unlink ffmpeg 2>/dev/null || true
brew link --overwrite --force ffmpeg-full

# Ubuntu / Debian
sudo apt-get update
sudo apt-get install -y ffmpeg libass9

# Then install Python dependencies
uv sync
```
This will automatically install all dependencies based on pyproject.toml.
Copy the environment variable template:
cp .env.example .env
A powerful, type-safe GORM code generation tool, specifically designed for cloud-native architectures.
Cloud-Native-GORM-Gen is an enhanced code generator based on GORM (Go Object Relational Mapping). It not only generates basic CRUD operations but also supports complex query logic, association mapping, and a highly customizable template engine, aiming to reduce boilerplate code and improve development efficiency.
interface{}, allowing errors to be discovered at compile time.
go install github.com/cloud-native-gen/gorm-gen/tools/gentool@latest
gen.tool configuration file:
version: "1.0"
database:
dsn: "root:password@tcp(127.0.0.1:3306)/dbname?charset=utf8mb4&parseTime=True&loc=Local"
dbType: "mysql"
outPath: "./dao/query"
modelPkgPath: "./dao/model"
gentool -c gen.tool
Welcome to submit Pull Requests or report Issues. Please ensure you read our Contributing Guide before submitting.
This project is licensed under the MIT License.
cd frontend
npm install
The frontend will automatically connect to the backend service at http://localhost:5000. To modify this, please edit src/api/client.ts.
(Optional) If you have important local data, it is recommended to back up the database before upgrading:
cp backend/instance/database.db backend/instance/database.db.bak
Note: In the default configuration, templates, assets, and final products are stored in the uploads/ folder.
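If you upgrade often, a timestamped copy avoids overwriting an earlier backup. The sketch below is one possible approach under stated assumptions: the helper name `backup_database` and the `.YYYYMMDD-HHMMSS.bak` naming are illustrative choices, and the default path is the one used in the command above. Stopping the backend before copying is advisable, since copying a SQLite file while it is being written can capture an inconsistent state.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_database(db_path: str = "backend/instance/database.db") -> Path:
    """Copy the database file next to itself with a timestamped .bak suffix."""
    src = Path(db_path)
    if not src.is_file():
        raise FileNotFoundError(f"database not found: {src}")
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = src.with_name(f"{src.name}.{stamp}.bak")
    shutil.copy2(src, dest)  # copy2 also preserves file timestamps
    return dest
```

Run from the project root, `backup_database()` returns the path of the new backup file.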
cd backend
uv run alembic upgrade head && uv run python app.py
The backend service will start at http://localhost:5000.
Visit http://localhost:5000/health to verify that the service is running correctly.
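The same check can be scripted, for instance to wait for the backend before starting the frontend. This is a minimal sketch that assumes only what is stated above, namely that a running backend answers `GET /health`; it treats HTTP 200 as healthy and makes no assumption about the response body. The helper name `backend_is_healthy` is hypothetical.

```python
import urllib.error
import urllib.request


def backend_is_healthy(base_url: str = "http://localhost:5000", timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    # The target is local, so bypass any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status raised as HTTPError.
        return False
```

A `False` result usually means the backend has not finished starting, or that port 5000 is served by a different process.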
cd frontend
npm run dev
The frontend development server will start at http://localhost:3000.
Open your browser to access and use the application.
React 18 + TypeScript + Vite 5 + Zustand
Python 3.10+ + Flask 3.0 + uv + SQLite
To facilitate communication and mutual assistance, this WeChat group has been created.
Feel free to suggest new features or provide feedback. I will also answer your questions at my own pace.
Follow the author on social media, where I share information about this project and AI:
See the official documentation
Contributions to this project via Issues and Pull Requests are welcome!
Important: Please read CONTRIBUTING.md before contributing.
This project is open-sourced under the GNU Affero General Public License v3.0 (AGPL-3.0). It can be freely used for non-commercial purposes such as personal learning, research, experimentation, education, or non-profit scientific research activities;
Open source is not easy 🙏 If you find this project valuable, feel free to buy the developer a coffee ☕️
Thanks to the following friends for their generous sponsorship and support:
@雅俗共赏, @曹峥, @以年观日, @John, @胡yun星Ethan, @azazo1, @刘聪NLP, @🍟, @苍何, @万瑾, @biubiu, @law, @方源, @寒松Falcon
If you have any questions regarding the sponsorship list, please contact the author.