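MiniCPM-V-4_5 is an efficient multimodal large model that supports conversational understanding of images and video. This guide walks through local deployment step by step and uses services hosted in mainland China to speed up downloads and deployment.
Model features:
Project features:
# If the project is a git repository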
git clone <repository-url>
cd MiniCPM-V-4_5
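# Or download the files directly into a local directory
⚠️ Note: this step must be run manually
# Make the script executable
chmod +x install_env.sh
# Run the environment setup script
./install_env.sh
The script automatically:
Script functions:
Environment activation:
# Option 1: use the auto-activation script
source activate_env.sh
# Option 2: activate the virtual environment manually
source venv/bin/activate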
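Before moving on, it can help to confirm that the virtual environment is really active. The following is a minimal sketch using only the standard library; the function name `in_virtualenv` is illustrative and not part of the project, and it assumes `venv/` was created with Python's `venv` module or `virtualenv`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

If this prints `False` after sourcing `activate_env.sh`, the later `pip` and `python` commands are probably using the system interpreter.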
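⚠️ Note: this step must be run manually
# Make sure the environment is activated
source activate_env.sh
# Download the model
python download_model.py
The script automatically:
Download notes:
Model files are stored in the ./model/OpenBMB/MiniCPM-V-4_5-int4/ directory
⚠️ Note: this step must be run manually
# Make sure the environment is activated
source activate_env.sh
# Option 1: start with the Python script
python start_api_server.py
# Option 2: start with the shell script (Linux/macOS)
./start_api_server.sh
# Option 3: start with the batch script (Windows)
start_api_server.bat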
# Production mode with multiple worker processes
python start_api_server.py --config production
# Or use the shell script
./start_api_server.sh --production
# Start with a custom configuration
python start_api_server.py --custom --host 0.0.0.0 --port 8080 --workers 4
# Start in testing mode
python start_api_server.py --config testing
# Start in low-memory mode
python start_api_server.py --config low_memory
# Start and run a health check
python start_api_server.py --config development --health-check
# List all available configurations
python start_api_server.py --list-configs
# Custom host and port
python start_api_server.py --custom --host 127.0.0.1 --port 8080
# Development mode with hot reload
python start_api_server.py --custom --reload
# Skip the environment check
python start_api_server.py --skip-env-check
FastAPI API service features:
# Text chat
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MiniCPM-V-4_5",
    "messages": [
      {
        "role": "user",
        "content": "Please describe the key features of the MiniCPM-V model"
      }
    ],
    "max_tokens": 512,
    "temperature": 0.7
  }'
# Image chat (image URL)
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MiniCPM-V-4_5",
    "messages": [
      {
        "role": "user",
        "content": "Please describe the contents of this image in detail"
      }
    ],
    "image_url": "https://example.com/image.jpg",
    "max_tokens": 512,
    "temperature": 0.7
  }'
# Image chat (Base64-encoded image)
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MiniCPM-V-4_5",
    "messages": [
      {
        "role": "user",
        "content": "Please describe this image"
      }
    ],
    "image_base64": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQ...",
    "max_tokens": 512
  }'
# Video chat
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MiniCPM-V-4_5",
    "messages": [
      {
        "role": "user",
        "content": "Please describe what happens in this video"
      }
    ],
    "video_url": "https://example.com/video.mp4",
    "max_tokens": 512
  }'
# Streaming response
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MiniCPM-V-4_5",
    "messages": [
      {
        "role": "user",
        "content": "Please tell a short story"
      }
    ],
    "stream": true,
    "max_tokens": 512
  }'
import requests
# Basic text chat
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "MiniCPM-V-4_5",
        "messages": [
            {"role": "user", "content": "Please describe the key features of the MiniCPM-V model"}
        ],
        "max_tokens": 512,
        "temperature": 0.7
    }
)
print(response.json())
# Image chat
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "MiniCPM-V-4_5",
        "messages": [
            {"role": "user", "content": "Please describe the contents of this image in detail"}
        ],
        "image_url": "https://example.com/image.jpg",
        "max_tokens": 512
    }
)
print(response.json())
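The request bodies above differ only in which media field they carry, so a small builder can keep them consistent. This is a sketch rather than part of the project: `build_payload` is a hypothetical helper, the field names are taken from the examples above, and the rule that at most one media field may be set is an assumption, not something the API is known to enforce.

```python
from typing import Optional


def build_payload(prompt: str,
                  image_url: Optional[str] = None,
                  image_base64: Optional[str] = None,
                  video_url: Optional[str] = None,
                  max_tokens: int = 512,
                  temperature: float = 0.7,
                  stream: bool = False) -> dict:
    """Build a JSON body for POST /v1/chat/completions."""
    media = {
        "image_url": image_url,
        "image_base64": image_base64,
        "video_url": video_url,
    }
    supplied = {key: value for key, value in media.items() if value is not None}
    # Assumption: one request carries at most one image or video.
    if len(supplied) > 1:
        raise ValueError("pass at most one of image_url, image_base64, video_url")

    payload = {
        "model": "MiniCPM-V-4_5",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    if stream:
        payload["stream"] = True
    payload.update(supplied)
    return payload
```

The result can be passed directly as `requests.post(url, json=build_payload(...))`.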
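MiniCPM-V-4_5/
├── .cnb.yml                  # CNB development environment configuration
├── install_env.sh            # Environment setup script (run manually)
├── activate_env.sh           # Environment activation script (generated after installation)
├── download_model.py         # Model download script (run manually)
├── system_check.py           # System environment check script (enhanced)
│
│   # FastAPI API service core files
├── api_server.py             # FastAPI API server core
├── start_api_server.py       # FastAPI service launcher (Python)
│
├── utils/                    # Utility modules
│   ├── __init__.py           # Package initializer
│   ├── image_utils.py        # Image processing utilities
│   └── video_utils.py        # Video processing utilities
│
│   # Launch scripts
├── start_api_server.sh       # Shell launcher (Linux/macOS)
├── start_api_server.bat      # Batch launcher (Windows)
│
│   # Dependency configuration
├── requirements.txt          # Full dependency list
├── requirements-prod.txt     # Production dependencies
├── requirements-dev.txt      # Development dependencies
│
├── readme.md                 # Project documentation (formerly 部署文档.md)
├── venv/                     # Python virtual environment (generated after installation)
├── model/                    # Model files (generated after download)
│   └── OpenBMB/MiniCPM-V-4_5-int4/
├── data/                     # Data directory (generated after installation)
│   ├── logs/                 # Logs
│   └── scripts/              # Scripts
└── test_image.png            # Test image (generated after testing)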
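.cnb.yml configuration:
# Start a remote development environment with GPU access
$:
  vscode:
    - runner:
        tags: cnb:arch:amd64:gpu
      services:
        - vscode
      stages:
        - name: Inspect the development environment
          script: python system_check.py
Configuration notes:
system_check.py enhanced features:
Usage:
# Basic system check
python system_check.py
# The output includes:
# - A full hardware summary
# - Network connectivity analysis
# - Python environment assessment
# - GPU status detection
# - Smart optimization suggestions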
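To give a feel for the kind of information such a check reports, here is a minimal standard-library sketch. It is not the project's `system_check.py`: the function name `collect_system_info`, the selection of fields, and the Python 3.8 minimum are all assumptions made for illustration.

```python
import platform
import shutil
import sys


def collect_system_info(path: str = ".") -> dict:
    """Gather a few basic facts about the host using only the standard library."""
    disk = shutil.disk_usage(path)
    return {
        "os": platform.system(),
        "machine": platform.machine(),
        "python": platform.python_version(),
        # Assumed minimum version; the project may require something newer.
        "python_ok": sys.version_info >= (3, 8),
        "disk_free_gb": round(disk.free / 1024 ** 3, 1),
        # Only checks that the NVIDIA CLI is on PATH, not that a GPU is usable.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for key, value in collect_system_info().items():
        print(f"{key}: {value}")
```

Finding `nvidia-smi` on `PATH` is only a hint; `torch.cuda.is_available()` remains the authoritative check once PyTorch is installed.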
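api_server.py core features:
/v1/health endpoint: real-time health status
/docs endpoint: interactive API documentation
/redoc endpoint: professional API documentation
Usage:
# Start the API service
python start_api_server.py
# Open the API documentation (macOS; on other systems open the URL in a browser)
open http://localhost:8000/docs
# Health check
curl http://localhost:8000/v1/health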
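Loading the model can take a while, so a script that starts the server and then sends requests usually needs to wait for the health endpoint first. The sketch below polls until a check succeeds; `http_ok` and `wait_until_healthy` are hypothetical helpers, and it assumes `/v1/health` answers with HTTP 200 once the service is ready.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError are subclasses of OSError, as are
        # connection-refused and timeout errors.
        return False


def wait_until_healthy(check: Callable[[], bool],
                       attempts: int = 30,
                       delay: float = 2.0) -> bool:
    """Call check() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A typical call would be `wait_until_healthy(lambda: http_ok("http://localhost:8000/v1/health"))`. Passing the check in as a callable keeps the retry loop easy to test without a running server.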
start_api_server.py configuration features:
Usage:
# Start in development mode
python start_api_server.py --config development
# Start in production mode
python start_api_server.py --config production
# Start with a custom configuration
python start_api_server.py --custom --host 0.0.0.0 --port 8080
# List all available configurations
python start_api_server.py --list-configs
utils/ utility modules:
Usage:
from utils.image_utils import ImageProcessor
from utils.video_utils import VideoProcessor
# Image processing
img_processor = ImageProcessor()
image = img_processor.load_image("input.jpg")
processed = img_processor.resize_image(image, (512, 512))
# Video processing
video_processor = VideoProcessor()
frames = video_processor.extract_frames("input.mp4", max_frames=16)
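The `max_frames=16` argument implies that longer videos are subsampled. How `VideoProcessor` chooses its frames is not shown here; one common approach, sketched below under that assumption, is to split the video into equal segments and take the middle frame of each. The function `sample_frame_indices` is hypothetical.

```python
from typing import List


def sample_frame_indices(total_frames: int, max_frames: int = 16) -> List[int]:
    """Pick up to max_frames frame indices spread evenly across the video."""
    if total_frames <= 0 or max_frames <= 0:
        return []
    if total_frames <= max_frames:
        # Short video: keep every frame.
        return list(range(total_frames))
    step = total_frames / max_frames
    # Take the middle frame of each of the max_frames equal-length segments.
    return [int(step * i + step / 2) for i in range(max_frames)]


print(sample_frame_indices(100, 4))  # [12, 37, 62, 87]
```

The resulting indices can be handed to a decoder such as `decord` or OpenCV to read only the selected frames.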
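install_env.sh latest features:
# 1. Make the script executable
chmod +x install_env.sh
# 2. Run the setup script (adapts to the environment automatically)
./install_env.sh
# 3. Activate the environment
source activate_env.sh
Original features:
Usage:
# 1. Make the script executable
chmod +x install_env.sh
# 2. Run the setup script
./install_env.sh
# 3. Activate the environment
source activate_env.sh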
Dependency management
The project provides several dependency files:
# Core dependencies
torch>=2.0.0
torchvision>=0.15.0
transformers>=4.35.0
# Model and data handling
modelscope>=1.8.0
decord>=0.6.0
numpy>=1.24.0
Pillow>=9.5.0
# vLLM API service dependencies
vllm>=0.4.0
fastapi>=0.104.0
uvicorn[standard]>=0.24.0
pydantic>=2.5.0
# Image and video processing
opencv-python>=4.8.0
imageio[ffmpeg]>=2.31.0
# Networking and I/O
requests>=2.31.0
aiofiles>=23.2.0
# System monitoring and utilities
psutil>=5.9.0
tqdm>=4.66.0
Installation:
# Install the base dependencies
pip install -r requirements.txt
# Install the production dependencies
pip install -r requirements-prod.txt
# Install the development dependencies
pip install -r requirements-dev.txt
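After installation, the minimum versions listed above can be checked against what is actually installed. The sketch below handles only the simple `name>=version` form used in this list; all function names are hypothetical, and the version comparison ignores pre-release and local suffixes, so the `packaging` library is the better choice for anything more precise.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Matches lines such as "torch>=2.0.0" or "uvicorn[standard]>=0.24.0".
REQ_PATTERN = re.compile(r"^\s*([A-Za-z0-9_.-]+)(?:\[[^\]]*\])?\s*>=\s*([0-9][0-9.]*)")


def parse_requirement(line: str) -> Optional[Tuple[str, str]]:
    """Return (package, minimum_version), or None for comments and other forms."""
    match = REQ_PATTERN.match(line)
    return (match.group(1), match.group(2)) if match else None


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '2.1.0+cu121' into (2, 1, 0), dropping any non-numeric suffix."""
    parts = []
    for piece in version.split("."):
        digits = re.match(r"\d+", piece)
        if not digits:
            break
        parts.append(int(digits.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions numerically, padding the shorter one with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)) >= b + (0,) * (width - len(b))


def check_requirement(line: str) -> Optional[bool]:
    """True or False if the package is installed; None if missing or unparsable."""
    parsed = parse_requirement(line)
    if parsed is None:
        return None
    name, minimum = parsed
    try:
        return meets_minimum(metadata.version(name), minimum)
    except metadata.PackageNotFoundError:
        return None
```

Looping `check_requirement` over the lines of `requirements.txt` gives a quick report of packages that are missing or too old.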
To speed up downloads, configure a mirror hosted in mainland China:
pip mirrors:
# Alibaba Cloud mirror
pip install -i https://mirrors.aliyun.com/pypi/simple/ [package_name]
# Tsinghua mirror
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ [package_name]
PyTorch installation:
# CUDA 12.1 build
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
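The `cu121` suffix in the index URL encodes the CUDA version (12.1). The sketch below maps a CUDA version string to the matching index URL. The helper `pytorch_index_url` is hypothetical, and the table of supported versions is an illustrative subset; check the official PyTorch installation page for the builds published for your PyTorch release.

```python
PYTORCH_WHEEL_BASE = "https://download.pytorch.org/whl"

# Illustrative subset only; not every PyTorch release ships every CUDA build.
KNOWN_CUDA_TAGS = {"11.8": "cu118", "12.1": "cu121", "12.4": "cu124"}


def pytorch_index_url(cuda_version: str) -> str:
    """Map a CUDA version such as '12.1' or '12.1.105' to a wheel index URL."""
    major_minor = ".".join(cuda_version.strip().split(".")[:2])
    tag = KNOWN_CUDA_TAGS.get(major_minor)
    if tag is None:
        raise ValueError(f"no known PyTorch wheel index for CUDA {cuda_version}")
    return f"{PYTORCH_WHEEL_BASE}/{tag}"


print(pytorch_index_url("12.1"))  # https://download.pytorch.org/whl/cu121
```

The returned URL goes after `--index-url` in the `pip install` command.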
import requests

class MiniCPMClient:
    def __init__(self, base_url="http://localhost:8000"):
        self.base_url = base_url

    def _post(self, payload):
        """POST a payload to the chat endpoint; return the JSON body or an error dict."""
        endpoint = f"{self.base_url}/v1/chat/completions"
        try:
            # json= serializes the payload and sets the Content-Type header
            response = requests.post(endpoint, json=payload)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            return {"error": str(e)}

    def chat_completion(self, messages, max_tokens=512, temperature=0.7, stream=False):
        """Send a chat completion request."""
        return self._post({
            "model": "MiniCPM-V-4_5",
            "messages": messages,
            "max_tokens": max_tokens,
            "temperature": temperature,
            "stream": stream
        })

    def chat_with_image(self, messages, image_url, max_tokens=512, temperature=0.7):
        """Send an image chat request."""
        return self._post({
            "model": "MiniCPM-V-4_5",
            "messages": messages,
            "image_url": image_url,
            "max_tokens": max_tokens,
            "temperature": temperature
        })

    def chat_with_video(self, messages, video_url, max_tokens=512, temperature=0.7):
        """Send a video chat request."""
        return self._post({
            "model": "MiniCPM-V-4_5",
            "messages": messages,
            "video_url": video_url,
            "max_tokens": max_tokens,
            "temperature": temperature
        })

def print_reply(label, response):
    """Print the reply text, or the error message if the request failed."""
    if "error" in response:
        print(f"{label} failed:", response["error"])
    else:
        print(f"{label} reply:", response['choices'][0]['message']['content'])

# Usage example
client = MiniCPMClient()
# Text chat
messages = [{"role": "user", "content": "Please describe the key features of the MiniCPM-V model"}]
print_reply("Text chat", client.chat_completion(messages))
# Image chat
messages = [{"role": "user", "content": "Please describe the contents of this image in detail"}]
image_url = "https://example.com/image.jpg"
print_reply("Image chat", client.chat_with_image(messages, image_url))
# Video chat
messages = [{"role": "user", "content": "Please describe what happens in this video"}]
video_url = "https://example.com/video.mp4"
print_reply("Video chat", client.chat_with_video(messages, video_url))
class MiniCPMClient {
  constructor(baseUrl = 'http://localhost:8000') {
    this.baseUrl = baseUrl;
  }

  // Shared helper: POST a payload and return the JSON body or an { error } object
  async post(payload) {
    const endpoint = `${this.baseUrl}/v1/chat/completions`;
    try {
      const response = await fetch(endpoint, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json'
        },
        body: JSON.stringify(payload)
      });
      if (!response.ok) {
        throw new Error(`HTTP error! status: ${response.status}`);
      }
      return await response.json();
    } catch (error) {
      return { error: error.message };
    }
  }

  async chatCompletion(messages, maxTokens = 512, temperature = 0.7, stream = false) {
    return this.post({
      model: 'MiniCPM-V-4_5',
      messages: messages,
      max_tokens: maxTokens,
      temperature: temperature,
      stream: stream
    });
  }

  async chatWithImage(messages, imageUrl, maxTokens = 512, temperature = 0.7) {
    return this.post({
      model: 'MiniCPM-V-4_5',
      messages: messages,
      image_url: imageUrl,
      max_tokens: maxTokens,
      temperature: temperature
    });
  }

  async chatWithVideo(messages, videoUrl, maxTokens = 512, temperature = 0.7) {
    return this.post({
      model: 'MiniCPM-V-4_5',
      messages: messages,
      video_url: videoUrl,
      max_tokens: maxTokens,
      temperature: temperature
    });
  }
}

// The client returns { error } instead of throwing, so check for it before reading the reply
function printReply(label, response) {
  if (response.error) {
    console.error(`${label} failed:`, response.error);
  } else {
    console.log(`${label} reply:`, response.choices[0].message.content);
  }
}

// Usage example
const client = new MiniCPMClient();
// Text chat
const messages = [{ role: 'user', content: 'Please describe the key features of the MiniCPM-V model' }];
client.chatCompletion(messages)
  .then(response => printReply('Text chat', response));
// Image chat
const imageMessages = [{ role: 'user', content: 'Please describe the contents of this image in detail' }];
const imageUrl = 'https://example.com/image.jpg';
client.chatWithImage(imageMessages, imageUrl)
  .then(response => printReply('Image chat', response));
import requests
import json

class StreamingMiniCPMClient:
    def __init__(self, base_url="http://localhost:8000"):
        self.base_url = base_url

    def stream_chat_completion(self, messages, max_tokens=512, temperature=0.7):
        """Stream a chat completion, yielding text fragments as they arrive."""
        endpoint = f"{self.base_url}/v1/chat/completions"
        payload = {
            "model": "MiniCPM-V-4_5",
            "messages": messages,
            "max_tokens": max_tokens,
            "temperature": temperature,
            "stream": True
        }
        headers = {
            "Content-Type": "application/json"
        }
        try:
            response = requests.post(endpoint, json=payload, headers=headers, stream=True)
            response.raise_for_status()
            # Process the SSE stream
            for line in response.iter_lines():
                if line:
                    line = line.decode('utf-8')
                    if line.startswith('data: '):
                        data = line[6:]  # Strip the 'data: ' prefix
                        if data == '[DONE]':
                            break
                        try:
                            chunk = json.loads(data)
                            if 'choices' in chunk and len(chunk['choices']) > 0:
                                delta = chunk['choices'][0].get('delta', {})
                                content = delta.get('content', '')
                                if content:
                                    yield content
                        except json.JSONDecodeError:
                            continue
        except requests.exceptions.RequestException as e:
            yield f"Error: {str(e)}"

# Usage example
client = StreamingMiniCPMClient()
messages = [{"role": "user", "content": "Please tell a short story"}]
print("Streaming response:")
for chunk in client.stream_chat_completion(messages):
    print(chunk, end='', flush=True)
print()
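The parsing step in the streaming client can be tried without a running server by feeding it sample lines. The sketch below isolates that step as a pure function; `parse_sse_line` is a hypothetical helper, and the sample lines are hand-written in the OpenAI-style chunk format that the client above expects, not captured server output.

```python
import json
from typing import Optional


def parse_sse_line(line: str) -> Optional[str]:
    """Return the text fragment carried by one SSE line, or None if it has none."""
    if not line.startswith("data: "):
        return None  # comments, blank lines, other SSE fields
    data = line[len("data: "):]
    if data == "[DONE]":
        return None  # end-of-stream marker
    try:
        chunk = json.loads(data)
    except json.JSONDecodeError:
        return None
    if not isinstance(chunk, dict):
        return None
    choices = chunk.get("choices") or []
    if not choices:
        return None
    return choices[0].get("delta", {}).get("content") or None


# Hand-written sample stream (assumed format).
sample_stream = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Once"}}]}',
    'data: {"choices": [{"delta": {"content": " upon a time"}}]}',
    'data: [DONE]',
]
story = "".join(filter(None, (parse_sse_line(item) for item in sample_stream)))
print(story)  # Once upon a time
```

Keeping the parser separate from the HTTP call makes it straightforward to unit-test against malformed or partial lines.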
import base64
import mimetypes
import os
import requests

class ImageMiniCPMClient:
    def __init__(self, base_url="http://localhost:8000"):
        self.base_url = base_url

    def image_to_base64(self, image_path):
        """Encode an image file as a Base64 data URI."""
        # Guess the MIME type from the file extension; fall back to JPEG
        mime_type = mimetypes.guess_type(image_path)[0] or "image/jpeg"
        with open(image_path, "rb") as image_file:
            encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
        return f"data:{mime_type};base64,{encoded_string}"

    def chat_with_image_file(self, messages, image_path, max_tokens=512, temperature=0.7):
        """Chat about a local image file."""
        image_base64 = self.image_to_base64(image_path)
        endpoint = f"{self.base_url}/v1/chat/completions"
        payload = {
            "model": "MiniCPM-V-4_5",
            "messages": messages,
            "image_base64": image_base64,
            "max_tokens": max_tokens,
            "temperature": temperature
        }
        try:
            response = requests.post(endpoint, json=payload)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            return {"error": str(e)}

    def upload_image(self, image_path):
        """Upload an image file."""
        endpoint = f"{self.base_url}/v1/image/upload"
        mime_type = mimetypes.guess_type(image_path)[0] or "image/jpeg"
        try:
            # Keep the file open until the upload has finished
            with open(image_path, 'rb') as image_file:
                files = {'file': (os.path.basename(image_path), image_file, mime_type)}
                response = requests.post(endpoint, files=files)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            return {"error": str(e)}

# Usage example
client = ImageMiniCPMClient()
# Upload an image
upload_result = client.upload_image("test_image.jpg")
print("Upload result:", upload_result)
# Chat about a local image
messages = [{"role": "user", "content": "Please describe the contents of this image in detail"}]
response = client.chat_with_image_file(messages, "test_image.jpg")
if "error" in response:
    print("Image chat failed:", response["error"])
else:
    print("Image chat reply:", response['choices'][0]['message']['content'])
Solution:
pip install -i https://mirrors.aliyun.com/pypi/simple/ modelscope
Solution:
.to("cpu")
Solution:
# Check the CUDA version
nvcc --version
# Check PyTorch CUDA support
python -c "import torch; print(torch.cuda.is_available())"
# Install a matching PyTorch build
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Solution:
Solution:
from PIL import Image
# Check the image format (Image.open raises an error for unsupported files)
image = Image.open("your_image.jpg")
# Resize the image
image = image.resize((448, 448))
# Check the image channels
if image.mode != 'RGB':
    image = image.convert('RGB')
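Resizing straight to 448×448 stretches images that are not square. Where distortion matters, one option is to scale the longer side to the target and keep the aspect ratio. The sketch below computes that size; `fit_within` is a hypothetical helper, and whether the server or model needs an exact 448×448 input is not stated here, so treat this as an alternative to try rather than a requirement.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_side: int = 448) -> Tuple[int, int]:
    """Scale (width, height) so the longer side equals max_side, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = max_side / max(width, height)
    # Never let a side collapse to zero for extremely thin images.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(1920, 1080))  # (448, 252)
```

With Pillow, `image.resize(fit_within(*image.size))` applies the result, since `Image.size` is a `(width, height)` tuple. Note that this also enlarges images smaller than `max_side`.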
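import torch
# Enable half precision
model = model.half()
# Enable gradient checkpointing (reduces memory during training or fine-tuning only; it does not help inference)
model.gradient_checkpointing_enable()
# Batch processing
def batch_inference(images, batch_size=4):
    for i in range(0, len(images), batch_size):
        batch = images[i:i+batch_size]
        # Process the batch (process_batch is a placeholder for your own inference function)
        yield process_batch(batch)
# Clear the cache
torch.cuda.empty_cache()
# Use a generator
def stream_response(model, tokenizer, image, msgs):
    # Assumes the loaded model class provides stream_chat; if it does not,
    # the upstream MiniCPM-V examples stream via model.chat(..., stream=True)
    for chunk in model.stream_chat(
        image=image,
        msgs=msgs,
        tokenizer=tokenizer
    ):
        yield chunk
Once deployment is complete, you will have a fully functional MiniCPM-V-4_5-int4 multimodal conversation system! 🎉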