This demo runs the MiniCPM-V family of multimodal models fully on-device on iOS, Android, and HarmonyOS NEXT. Three model versions are currently supported:
This repository contains three on-device demos for MiniCPM-V (multimodal LLM) running fully locally via llama.cpp:
- MiniCPM-V-demo/: iOS demo (Xcode project)
- MiniCPM-V-demo-Android/: Android demo (Gradle / Kotlin)
- MiniCPM-V-demo-HarmonyOS/: HarmonyOS NEXT demo (DevEco Studio / ArkTS)

All three demos share the same llama.cpp submodule (branch Support-iOS-Demo) at the repo root.
NOTE: This project bundles llama.cpp as a git submodule. Clone the repository, then initialise the submodule:
git clone https://github.com/OpenBMB/MiniCPM-V-Apps.git
cd MiniCPM-V-Apps
git submodule update --init --recursive
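To confirm that the submodule was actually checked out, one option is to inspect the output of `git submodule status`, which prefixes an entry with `-` when that submodule has not been initialised. The sketch below is an illustration only; the helper name `submodules_initialised` is hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of this repo): reads `git submodule status`
# output on stdin and succeeds only if no entry starts with '-',
# the marker git uses for a submodule that has not been initialised.
# Note: empty input (no submodules at all) also counts as success.
submodules_initialised() {
  ! grep -q '^-'
}

# Usage, from the repository root:
#   git submodule status | submodules_initialised && echo "submodules ready"
```

If the check fails, re-running `git submodule update --init --recursive` from the repository root is the usual remedy.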
The README is organised in two parts:
Just want to install the app? Pre-built TestFlight (iOS) / APK (Android) / HAP (HarmonyOS) packages and step-by-step install instructions are in DOWNLOAD.md. The rest of this README is only needed if you want to build from source.
NOTE: To deploy and test the app on an iOS device, you may need an Apple Developer account.
Install Xcode:
Download Xcode from the App Store
Install the Command Line Tools:
xcode-select --install
Agree to the software license agreement:
sudo xcodebuild -license
Open MiniCPM-V-demo/MiniCPM-V-demo.xcodeproj with Xcode. It may take a moment for Xcode to automatically download the required dependencies.
In Xcode, select the target device at the top of the window, then click the "Run" (triangle) button to launch the demo.
NOTE: If you encounter errors related to the thirdparty/llama.xcframework path, please follow the steps below to build the llama.xcframework manually.
Build directly inside the submodule (no extra clone needed):
cd llama.cpp
./build-xcframework.sh
cp -r ./build-apple/llama.xcframework ../MiniCPM-V-demo/thirdparty
Requirements:
- … 28.2.13676358 and CMake 3.22.1)
- … arm64-v8a)

Build & run:
cd MiniCPM-V-demo-Android
./gradlew assembleDebug
Or open MiniCPM-V-demo-Android/ directly in Android Studio and click Run.
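If you build from the command line, the debug APK still has to be installed on a device. The sketch below assumes Gradle's default output location (`<module>/build/outputs/apk/debug/`) and that `adb` is on your PATH. The module name is not stated in this README, so the hypothetical helper `find_debug_apk` searches for the APK rather than hard-coding a path.

```shell
# Hypothetical helper: print the first debug APK found under the given
# project directory. Assumes Gradle's default output layout,
# <module>/build/outputs/apk/debug/<name>-debug.apk.
find_debug_apk() {
  find "$1" -type f -name '*-debug.apk' -path '*/build/outputs/apk/*' | head -n 1
}

# Usage, after ./gradlew assembleDebug (requires a connected device):
#   apk=$(find_debug_apk .) && [ -n "$apk" ] && adb install -r "$apk"
```

The `-r` flag asks `adb` to replace an existing installation, which is convenient when rebuilding repeatedly.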
The first launch downloads the GGUF model files into the app's external storage. You can also sideload model files manually via adb push; see the in-app Model Manager for the expected directory layout.
Requirements:
- … arm64-v8a)

Build & run:
- Open MiniCPM-V-demo-HarmonyOS/ in DevEco Studio.
- Go to File → Project Structure → Signing Configs and tick Automatically generate signature (requires a Huawei developer account; this only needs to be done once).

After the first launch, open the in-app Model Manager and tap Download. You can also sideload model files via hdc file send; see MiniCPM-V-demo-HarmonyOS/README_zh.md for the expected directory layout.
The HarmonyOS port shares the exact same llama.cpp submodule, model catalogue, OBS direct-link URLs, and MD5 hashes with the iOS / Android demos.
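Because the model catalogue carries MD5 hashes, a manually downloaded file can be checked before it is sideloaded. The following is a minimal sketch, not the app's own verification code: `file_md5` and `verify_md5` are hypothetical helper names, and the expected hash must be taken from the catalogue (it is not reproduced here).

```shell
# Hypothetical helper: print the MD5 of a file as lowercase hex.
# Uses md5sum (GNU coreutils) when available, otherwise macOS's `md5 -q`.
file_md5() {
  if command -v md5sum >/dev/null 2>&1; then
    md5sum "$1" | awk '{print $1}'
  else
    md5 -q "$1"
  fi
}

# Hypothetical helper: succeed only if the file's MD5 equals the expected
# value (second argument, lowercase hex).
verify_md5() {
  [ "$(file_md5 "$1")" = "$2" ]
}

# Usage (the hash is a placeholder to be copied from the model catalogue):
#   verify_md5 mmproj-model-f16.gguf "<expected-md5>" && echo "hash matches"
```

Checking on the host first avoids pushing a multi-gigabyte file that turns out to be truncated or corrupted.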
The on-device memory needed to run a model is roughly (model file size) + KV cache + a few hundred MB of working memory for the vision encoder and llama.cpp internals. The recommended values below leave enough headroom for the OS and the demo app itself.
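The rule of thumb above can be turned into a quick estimate. The sketch below simply adds the terms, counting both the LLM file and the mmproj file as model file size (an assumption). The KV-cache and working-memory figures in the usage line (0.5 GB each) are illustrative assumptions rather than measured values, and the file sizes are the approximate MiniCPM-V 2.6 figures from the table below.

```shell
# Hypothetical helper: sum the memory terms (all in GB) and print the total.
# Arguments: LLM file size, mmproj file size, KV cache, working memory.
estimate_ram_gb() {
  awk -v llm="$1" -v mmproj="$2" -v kv="$3" -v work="$4" \
    'BEGIN { printf "%.1f\n", llm + mmproj + kv + work }'
}

# MiniCPM-V 2.6 at Q4_K_M, with an assumed 0.5 GB KV cache
# and an assumed 0.5 GB of working memory:
estimate_ram_gb 4.4 1.0 0.5 0.5   # prints 6.4
```

Under these assumptions the estimate comes to about 6.4 GB, which is consistent with recommending a device with at least 8 GB of RAM once the OS and the app itself are accounted for.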
| Model | LLM params | Recommended quant | LLM file (Q4) | mmproj (f16) | Total download | Recommended device RAM |
|---|---|---|---|---|---|---|
| MiniCPM-V 2.6 | 8B | Q4_K_M | ~4.4 GB | ~1.0 GB | ~5.4 GB | ≥ 8 GB |
| MiniCPM-V 4.0 | 4.1B | Q4_K_M | ~2.0 GB | ~0.9 GB | ~2.9 GB | ≥ 6 GB |
| MiniCPM-V 4.6 | 1.3B | Q4_K_M | ~0.5 GB | ~1.1 GB | ~1.6 GB | ≥ 6 GB |
Notes:
- mmproj is the vision projector + ViT weights; it is shipped in f16 because quantising the visual tower hurts perception quality noticeably more than quantising the LLM.

Download the language model file (e.g., ggml-model-Q4_0.gguf) and the vision model file (mmproj-model-f16.gguf) from the repository.
Download the language model file (e.g., ggml-model-Q4_K_M.gguf) and the vision model file (mmproj-model-f16.gguf) from the repository.
Download the language model file (e.g., MiniCPM-V-4_6-Q4_K_M.gguf) and the vision model file (mmproj-model-f16.gguf) from the repository.