
OmniCoder-9B-GGUF

GGUF quantizations of OmniCoder-9B



Available Quantizations

Quantization | Size     | Use Case
-------------|----------|------------------------------------
Q2_K         | ~3.8 GB  | Extreme compression, lowest quality
Q3_K_S       | ~4.3 GB  | Small footprint
Q3_K_M       | ~4.6 GB  | Small footprint, balanced
Q3_K_L       | ~4.9 GB  | Small footprint, higher quality
Q4_0         | ~5.3 GB  | Good balance
Q4_K_S       | ~5.4 GB  | Good balance
Q4_K_M       | ~5.7 GB  | Recommended for most users
Q5_0         | ~6.3 GB  | High quality
Q5_K_S       | ~6.3 GB  | High quality
Q5_K_M       | ~6.5 GB  | High quality, balanced
Q6_K         | ~7.4 GB  | Near-lossless
Q8_0         | ~9.5 GB  | Highest quality quantization
BF16         | ~17.9 GB | Full precision
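
The file sizes above scale roughly with bits per weight. As a quick sanity check (assuming the "9B" in the model name means roughly 9 billion parameters, which is an assumption, not stated in this card), you can estimate a file's effective bit width from its size:

# Effective bits per weight = file size in bytes * 8 / parameter count.
# Q4_K_M size taken from the table above; 9e9 parameters is assumed.
awk 'BEGIN {
  size_gb = 5.7   # Q4_K_M file size in GB
  params  = 9e9   # assumed parameter count
  printf "Q4_K_M ~= %.2f bits per weight\n", size_gb * 1e9 * 8 / params
}'

This lands near the nominal 4-5 bits of a K-quant at this tier; the extra fraction comes from K-quants keeping some tensors at higher precision.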

Usage

# Install llama.cpp
brew install llama.cpp  # macOS
# or build from source: https://github.com/ggml-org/llama.cpp

# Interactive chat
llama-cli --hf-repo Tesslate/OmniCoder-9B-GGUF --hf-file omnicoder-9b-q4_k_m.gguf -p "Your prompt" -c 8192

# Server mode (OpenAI-compatible API)
llama-server --hf-repo Tesslate/OmniCoder-9B-GGUF --hf-file omnicoder-9b-q4_k_m.gguf -c 8192
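
Once the server is running, it speaks the OpenAI chat-completions protocol. A minimal sketch of a request, assuming the default port 8080 and a locally running llama-server (the prompt text is illustrative):

# Build a request body for llama-server's /v1/chat/completions endpoint.
body='{"messages":[{"role":"user","content":"Write a FizzBuzz function in Python"}],"temperature":0.2,"max_tokens":256}'
echo "$body"
# Send it against the server started above (requires llama-server to be running):
# curl http://localhost:8080/v1/chat/completions -H 'Content-Type: application/json' -d "$body"

Because the API is OpenAI-compatible, existing OpenAI client libraries can also be pointed at the server by overriding the base URL.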

Built by Tesslate | See full model card: OmniCoder-9B
