Operation Guide for the VoxCPM2 Optimization Repository

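1. Start the machine

Click Fork in the upper-right corner, then click Cloud Native Build (云原生构建), also in the upper-right corner.
Once the environment has started, click webIDE.
In the webIDE, a terminal usually opens automatically at the bottom of the screen; if it does not, click Toggle panel in the upper-right corner to open it.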

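2. Check the machine's free GPU memory

In the terminal, type nvidia-smi and press Enter to see the current GPU memory usage.
In the sample output below, usage is 38783MiB / 46068MiB, which leaves about 7 GiB free (46068 - 38783 = 7285 MiB):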


root@91b65d35e7e4:/workspace# nvidia-smi
Fri Apr 10 14:05:10 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L40                     On  |   00000000:84:00.0 Off |                    0 |
| N/A   54C    P0            194W /  300W |   38783MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
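
The free amount can also be computed instead of read by eye. The following is a minimal sketch, assuming a bash shell; free_mib and mib_to_gib are hypothetical helper names introduced here for illustration, not part of this repository. The commented nvidia-smi query at the end uses the tool's standard --query-gpu option.

```shell
# free_mib USED TOTAL: prints TOTAL - USED (both values in MiB).
free_mib() {
    echo $(( $2 - $1 ))
}

# mib_to_gib MIB: prints the value converted to GiB, with one decimal place.
mib_to_gib() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1024 }'
}

# Values taken from the sample output above: 38783MiB used of 46068MiB.
free_mib 38783 46068                    # prints 7285
mib_to_gib "$(free_mib 38783 46068)"    # prints 7.1

# On a real machine, nvidia-smi can report the two numbers directly (in MiB):
#   nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits
```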

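CNB's cloud compute service shares GPU memory among users, so check how much memory is free on the current device before starting the service. VoxCPM2 is a 2B-parameter model, so:

If at least 5 GB of GPU memory is free, go straight to step 3 and start the service;

if very little memory is free, go through step 4 and then step 1 again to get a different machine.

If the new machine's memory is also heavily used, wait half an hour to a few hours before starting the project again.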
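
The decision rule above can be written as a small check. This is only a sketch under stated assumptions: the 5 GB threshold is treated as 5 * 1024 MiB, and MIN_FREE_MIB, can_start and next_step are hypothetical names introduced for illustration.

```shell
# Threshold from this guide: about 5 GB free.
# Treated here as 5 * 1024 MiB, which is an assumption.
MIN_FREE_MIB=5120

# can_start FREE_MIB: succeeds (exit status 0) when FREE_MIB meets the threshold.
can_start() {
    [ "$1" -ge "$MIN_FREE_MIB" ]
}

# next_step FREE_MIB: prints which step of this guide to follow next.
next_step() {
    if can_start "$1"; then
        echo "step 3: start the service"
    else
        echo "step 4, then step 1: shut down and pick another machine"
    fi
}

# The sample machine in step 2 has 7285 MiB free.
next_step 7285

# On a real machine (memory.free is reported in MiB; first GPU only):
#   next_step "$(nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits | head -n 1)"
```

Note that nvidia-smi's memory.free can be slightly lower than total minus used, because the driver reserves some memory for itself.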

In testing, a machine with too little free GPU memory cannot run VoxCPM2 inference, and the run fails with an error like this:


torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 24.00 MiB. GPU 0 has a total capacity of 44.39 GiB of which 4.00 MiB is free. Process 2945700 has 8.59 GiB memory in use. Process 3064112 has 7.12 GiB memory in use. Process 3206528 has 16.80 GiB memory in use. Process 3930486 has 7.43 GiB memory in use. Process 144336 has 4.42 GiB memory in use. Of the allocated memory 3.77 GiB is allocated by PyTorch, and 166.25 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)
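
Adding up the per-process figures in the message shows that the processes sharing the card already hold nearly all of its 44.39 GiB. The sketch below does that sum; sum_gib is a hypothetical helper name, and it assumes the "has X GiB memory in use" wording shown in the error above.

```shell
# sum_gib: reads text on stdin and sums every "has X GiB memory in use" figure.
sum_gib() {
    grep -oE 'has [0-9.]+ GiB memory in use' \
        | awk '{ total += $2 } END { printf "%.2f\n", total }'
}

# The per-process part of the error message above.
msg='Process 2945700 has 8.59 GiB memory in use. Process 3064112 has 7.12 GiB memory in use. Process 3206528 has 16.80 GiB memory in use. Process 3930486 has 7.43 GiB memory in use. Process 144336 has 4.42 GiB memory in use.'

printf '%s\n' "$msg" | sum_gib   # prints 44.36
```

With 44.36 GiB of 44.39 GiB in use, even a 24.00 MiB allocation fails, which is why switching to a less loaded machine is the practical fix here.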

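3. Start the VoxCPM2 service

In the terminal, type bash run.sh and press Enter to start the web page.
After about 10 seconds, a blue Open in Browser button pops up in the lower-right corner; click it to open the new page.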

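4. Shut down

Switch from the web page back to the webIDE terminal and press Ctrl+C to stop the VoxCPM2 service. (Ctrl+Z also returns you to the prompt, but it only suspends the process instead of ending it.)
Then type kill 1 in the terminal and press Enter to shut the machine down.

Always shut down when you are done, so that you do not waste your free monthly core-hours.

If you forget to shut down, the machine is forcibly shut down after it has been idle for a while, 4 hours at most.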
