
Cube Studio


Infra


cube-studio is a one-stop, cloud-native machine learning platform open-sourced by Tencent Music. It currently includes the following main functions:

  • 1. Data management: feature store (online and offline features); dataset management for structured data and media data; data labeling platform
  • 2. Development: notebooks (vscode/jupyter); docker image management; online image building
  • 3. Training: online drag-and-drop pipelines; open template market; distributed computing/training tasks, for example tf/pytorch/mxnet/spark/ray/horovod/kaldi/volcano; batch priority scheduling; resource monitoring/alerting/balancing; cron scheduling
  • 4. AutoML: nni, ray
  • 5. Inference: model management; serverless traffic control; tf/pytorch/onnx/tensorrt model deployment; tfserving/torchserver/onnxruntime/triton inference; VGPU; load balancing, high availability, elastic scaling
  • 6. Infra: multi-user; multi-project; multi-cluster; edge cluster mode; blockchain sharing

Doc

https://github.com/tencentmusic/cube-studio/wiki

Job Template

tips:

  • 1. You can develop your own templates. They are easy to develop and can be tailored to your own scenarios.
| template | type | description |
|---|---|---|
| linux | base | Custom stand-alone operating environment, free to implement all custom stand-alone functions |
| datax | import export | Import and export of heterogeneous data sources |
| hadoop | data processing | hdfs, hbase, sqoop, spark client |
| sparkjob | data processing | spark serverless |
| volcanojob | data processing | volcano multi-machine distributed framework |
| ray | data processing | python ray multi-machine distributed framework |
| ray-sklearn | machine learning | sklearn based on ray framework supports multi-machine distributed parallel computing |
| xgb | machine learning | xgb model training and inference |
| tfjob | deep learning | Multi-machine distributed training of tensorflow |
| pytorchjob | deep learning | Multi-machine distributed training of pytorch |
| horovod | deep learning | Multi-machine distributed training of horovod |
| paddle | deep learning | Multi-machine distributed training of paddle |
| mxnet | deep learning | Multi-machine distributed training of mxnet |
| kaldi | deep learning | Multi-machine distributed training of kaldi |
| tfjob-train | model train | distributed training of tensorflow: plain and runner |
| tfjob-runner | model train | distributed training of tensorflow: runner method |
| tfjob-plain | model train | distributed training of tensorflow: plain method |
| tf-model-evaluation | model evaluate | distributed model evaluation of tensorflow2.3 |
| tf-offline-predict | model inference | distributed offline model inference of tensorflow2.3 |
| model-register | model service | register model to platform |
| model-offline-predict | model service | distributed offline model inference of framework |
| deploy-service | model service | deploy inference service |
| media-download | multimedia data processing | Distributed download of media files |
| video-audio | multimedia data processing | Distributed extraction of audio from video |
| video-img | multimedia data processing | Distributed extraction of pictures from video |
| yolov7 | machine vision | object-detection with yolov7 |
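
The tip about developing your own template can be illustrated with a small sketch. This README does not describe the template contract, so the example assumes a template is a container image whose entrypoint script receives task parameters as command-line flags. The flag names (`--input_path`, `--output_path`, `--lowercase`) and the functions `parse_args`, `run`, and `main` are hypothetical and not part of cube-studio; see the wiki for the actual template format.

```python
import argparse
import json


def parse_args(argv):
    """Parse task parameters (hypothetical flags, not a cube-studio API)."""
    parser = argparse.ArgumentParser(description="Hypothetical custom job template")
    parser.add_argument("--input_path", required=True, help="text file to read")
    parser.add_argument("--output_path", required=True, help="file to write")
    parser.add_argument("--lowercase", action="store_true",
                        help="lowercase every line before writing")
    return parser.parse_args(argv)


def run(args):
    """Do the stand-alone work of the task and return a small summary dict."""
    with open(args.input_path, encoding="utf-8") as src:
        lines = [line.rstrip("\n") for line in src]
    if args.lowercase:
        lines = [line.lower() for line in lines]
    with open(args.output_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(lines))
    return {"lines": len(lines), "output_path": args.output_path}


def main(argv):
    """Entry point: a container entrypoint would call main(sys.argv[1:])."""
    summary = run(parse_args(argv))
    # Print a machine-readable summary so the task log shows what happened.
    print(json.dumps(summary))
    return 0
```

Under this assumption, the image entrypoint would invoke the script with something like `--input_path in.txt --output_path out.txt --lowercase`, and the pipeline UI would supply those values per task. Keeping argument parsing separate from `run` makes the task logic testable outside a container.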

Deploy

wiki


Company
