ai-models/tencent/Hy-MT2-1.8B-GGUF

🖥️ Official Website | 💬 GitHub | 🪡 AngelSlim

Model Introduction

Hy-MT2 is a family of “fast-thinking” multilingual translation models designed for complex real-world scenarios. It includes three model sizes: 1.8B, 7B, and 30B-A3B (MoE), all of which support translation among 33 languages and effectively follow translation instructions in multiple languages. For on-device deployment, AngelSlim 1.25-bit extreme quantization reduces the storage requirement of the 1.8B model to only 440 MB and improves inference speed by 1.5x. Multi-dimensional evaluations show that Hy-MT2 delivers outstanding performance across general, real-world business, domain-specific, and instruction-following translation tasks. The 7B and 30B-A3B models outperform open-source models such as DeepSeek-V4-Pro and Kimi K2.6 in fast-thinking mode, while the lightweight 1.8B model also surpasses mainstream commercial APIs from providers such as Microsoft and Doubao overall.

In this release, we also open-source IFMTBench, a benchmark for evaluating translation instruction-following capabilities.

We have also released the Hy-MT2-Translator Skill, which makes it easy to integrate Hy-MT2 series models into translation tasks. Download links: ClawHub and SkillHub.

Tencent Hy is now an official partner of WMT26 for the "Video Subtitle Translation Task" (https://www2.statmt.org/wmt26/video-subtitle-translation.html). Participants who use the Hy-MT model series to compete in the "General Machine Translation Task" (https://www2.statmt.org/wmt26/translation-task.html) and the "Video Subtitle Translation Task" will have the chance to win special awards sponsored by Hunyuan. We invite everyone to participate and help push the boundaries of machine translation technology.

News

2026.5.21 We open-sourced Hy-MT2-1.8B/Hy-MT2-7B/Hy-MT2-30B-A3B/IFMTBench on HuggingFace and ModelScope.
2025.12.30 We open-sourced HY-MT1.5-1.8B and HY-MT1.5-7B on HuggingFace and ModelScope.
2025.9.1 We open-sourced Hunyuan-MT-7B and Hunyuan-MT-Chimera-7B on HuggingFace and ModelScope.

Results

For more experimental results and analysis, please refer to our report.

Model Links

Model Name	Description	Download Link
Hy-MT2-1.8B	Hy 1.8B translation model	🤗 Model
Hy-MT2-1.8B-FP8	Hy 1.8B translation model, FP8 quantization	🤗 Model
Hy-MT2-1.8B-GGUF	Hy 1.8B translation model, llama.cpp	🤗 Model
Hy-MT2-1.8B-2bit-GGUF	Hy 1.8B translation model, llama.cpp, 2bit	🤗 Model
Hy-MT2-1.8B-1.25bit-GGUF	Hy 1.8B translation model, llama.cpp, 1.25bit	🤗 Model
Hy-MT2-7B	Hy 7B translation model	🤗 Model
Hy-MT2-7B-FP8	Hy 7B translation model, FP8 quantization	🤗 Model
Hy-MT2-7B-GGUF	Hy 7B translation model, llama.cpp	🤗 Model
Hy-MT2-30B-A3B	Hy 30B-A3B translation model	🤗 Model
Hy-MT2-30B-A3B-FP8	Hy 30B-A3B translation model, FP8 quantization	🤗 Model

Hy-MT2 Translation Task Instruction Examples (Chinese-English Comparison)

Note: In the following examples, both source_lang and target_lang should use the full language names. Chinese names should be used in Chinese prompts, and English names should be used in English prompts.

Type	Chinese prompt	English prompt
Default Translation	将以下文本翻译为 `{target_lang}`，注意只需要输出翻译后的结果，不要额外解释： `{source_text}`	Translate the following text into `{target_lang}`. Note that you should only output the translated result without any additional explanation: `{source_text}`
Terminology	参考下面的翻译： `{text}` 翻译成 `{text}` `{text}` 翻译成 `{text}` `{text}` 翻译成 `{text}` 将以下文本翻译为 `{target_lang}`，注意只需要输出翻译后的结果，不要额外解释： `{source_text}`	Reference the following translations: `{text}` translates to `{text}` `{text}` translates to `{text}` `{text}` translates to `{text}` Translate the following text into `{target_lang}`. Note that you must ONLY output the translated result without any additional explanation: `{source_text}`
Style	请将以下文本翻译为 `{target_lang}`。注意翻译的风格要严格符合【`{target_style}`】 `{source_text}`	Please translate the following text into `{target_lang}`. Note that the translation style must strictly conform to [`{target_style}`]: `{source_text}`
Personalization	【待翻译文本】 `{source_text}` 【翻译任务】 1、`{user_preferences}` 2、`{user_preferences}` 3、…… 4、将【待翻译文本】翻译为 `{target_lang}`。	[Source Text] `{source_text}` [Translation Tasks] 1. `{user_preferences}` 2. `{user_preferences}` 3. ... 4. Translate the [Source Text] into `{target_lang}`.
Delimiters	请将以下文本准确翻译为 `{target_lang}`。你必须在译文中保留等量的分隔符，绝对不可遗漏、转义或翻译该符号，并注意分隔符的位置。 `{source_text}`	Please accurately translate the following text into `{target_lang}`. You must retain the exact same number of delimiters in the translation. Strictly do not omit, escape, or translate these symbols, and pay close attention to their placement. `{source_text}`
Structured Data 1	# 任务目标 将下方 `{source_text}` 中的 `{format_type}` 格式数据翻译为 `{target_lang}`。 # 严格约束 1. 结构锁定：绝对保持原有的 `{format_type}` 数据结构、缩进和层级完全不变。 2. 选择性翻译：仅翻译面向用户展示的可见文本内容。 3. 禁止修改：严禁翻译或更改任何代码标签、键名 (Key)、变量占位符（如 `{{var}}`、`${var}`、`%s`、`%d` 等）或代码属性。 # 数据输入 `{source_text}`	### Task Translate the user-facing text within the following `{format_type}` data into `{target_lang}`. ### Strict Rules 1. Structure Preservation: You MUST preserve the original `{format_type}` data structure, nesting, hierarchy, and indentation exactly as they are. 2. Selective Translation: Translate ONLY the visible, user-facing text content/values. 3. Strict Non-Translation: NEVER translate or alter code tags, keys, properties, object names, or variable placeholders. Leave them exactly in their original English/code form. ### Source Data `{source_text}`
Structured Data 2	【背景信息】 `{background_text}` 请结合背景信息将以下文本翻译为 `{target_lang}`。【待翻译文本】 `{source_text}`	[Background Information] `{background_text}` Please translate the following text into `{target_lang}`, taking the provided background information into consideration. [Source Text] `{source_text}`
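
The templates above can be filled in programmatically. The following is a minimal sketch covering the Default Translation, Terminology, and Style rows of the English column. The function names are hypothetical, and the blank line placed between the instruction and the source text is an assumption taken from the transformers example in this README, since the table does not show line breaks.

```python
from typing import Mapping

# Assumption: a blank line separates the instruction from the source text,
# as in the transformers example prompt; the table itself shows no line breaks.
SEP = "\n\n"


def build_default_prompt(source_text: str, target_lang: str) -> str:
    """Fill the 'Default Translation' English template."""
    return (
        f"Translate the following text into {target_lang}. Note that you should only "
        f"output the translated result without any additional explanation:{SEP}{source_text}"
    )


def build_terminology_prompt(
    source_text: str, target_lang: str, glossary: Mapping[str, str]
) -> str:
    """Fill the 'Terminology' English template with one line per glossary pair."""
    lines = ["Reference the following translations:"]
    lines += [f"{src} translates to {tgt}" for src, tgt in glossary.items()]
    lines.append(
        f"Translate the following text into {target_lang}. Note that you must ONLY "
        f"output the translated result without any additional explanation:{SEP}{source_text}"
    )
    return "\n".join(lines)


def build_style_prompt(source_text: str, target_lang: str, target_style: str) -> str:
    """Fill the 'Style' English template."""
    return (
        f"Please translate the following text into {target_lang}. Note that the "
        f"translation style must strictly conform to [{target_style}]:{SEP}{source_text}"
    )


if __name__ == "__main__":
    # target_lang uses the full English language name because the prompt is in English.
    print(build_terminology_prompt("混元模型支持端侧部署。", "English", {"混元": "Hunyuan"}))
```

The resulting string is intended to be sent as the single user message, since the models have no default system prompt.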

Inference and Deployment

transformers

pip install "transformers>=5.6.0"

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "tencent/Hy-MT2-30B-A3B"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Load model
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

model.eval()

# Example inference
prompt = "将以下文本翻译成英语,注意只需要输出翻译后的结果,不要额外解释:\n\n今天天气真好。"
messages = [{"role": "user", "content": prompt}]
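inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,  # return input_ids and attention_mask so that **inputs works below
    return_tensors="pt",
).to(model.device)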

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=4096,
    )
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)
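
The `generate` call above does not pass any sampling settings. As a minimal sketch, the sampling parameters recommended later in this section could be mapped onto `generate()` keyword arguments as shown below. The helper name `to_generate_kwargs` is hypothetical, and mapping `top_k = -1` to `top_k=0` (which transformers treats as "no top-k filtering") is an assumption about how the recommended values are meant to be interpreted.

```python
# Recommended values copied from this README.
PARAMS_1_8B_7B = {
    "temperature": 0.7,
    "top_p": 0.6,
    "top_k": 20,
    "repetition_penalty": 1.05,
    "max_tokens": 4096,
}
PARAMS_30B_A3B = {
    "temperature": 0.7,
    "top_p": 1.0,
    "top_k": -1,
    "repetition_penalty": 1.0,
    "max_tokens": 4096,
}


def to_generate_kwargs(params: dict) -> dict:
    """Translate the README's parameter names into generate() keyword arguments."""
    return {
        "do_sample": True,  # sampling must be enabled for temperature/top_p/top_k to apply
        "temperature": params["temperature"],
        "top_p": params["top_p"],
        # Assumption: a non-positive top_k means "disabled"; 0 disables top-k in transformers.
        "top_k": params["top_k"] if params["top_k"] > 0 else 0,
        "repetition_penalty": params["repetition_penalty"],
        "max_new_tokens": params["max_tokens"],
    }
```

With these helpers, the call would become `model.generate(**inputs, **to_generate_kwargs(PARAMS_30B_A3B))`, replacing the explicit `max_new_tokens` argument.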

vllm

Build vLLM from source:

uv venv --python 3.12 --seed --managed-python
source .venv/bin/activate
git clone https://github.com/vllm-project/vllm.git
cd vllm
uv pip install --editable . --torch-backend=auto

Start the vLLM server:

vllm serve tencent/Hy-MT2-30B-A3B --tensor-parallel-size 1

sglang

Build SGLang from source:

git clone https://github.com/sgl-project/sglang
cd sglang
pip3 install pip --upgrade
pip3 install "transformers>=5.6.0"
pip3 install -e "python"

Launch SGLang server:

python3 -m sglang.launch_server --model tencent/Hy-MT2-30B-A3B --tp 1

llama_cpp

❕❕ This GGUF model depends on our STQ kernel, which is released in PR #22836.

Clone llama.cpp

git clone https://github.com/ggml-org/llama.cpp.git

Enter the llama.cpp folder

cd llama.cpp

Build llama.cpp

cmake -B build
cmake --build build --config Release

Run a completion example

./build/bin/llama-completion \
  --model model.gguf \
  -p "Translate the following segment into Chinese, without additional explanation: Hello" \
  --jinja \
  -ngl 0 \
  -n 64 -st

Run the llama.cpp benchmark

./build/bin/llama-bench -m model_zoo/model.gguf -ngl 0

For 1.8B and 7B, we recommend using the following parameters for inference. Note that our models do not have a default system_prompt.


{
  "temperature": 0.7,
  "top_p": 0.6,
  "top_k": 20,
  "repetition_penalty": 1.05,
  "max_tokens": 4096
}

For 30B-A3B, we recommend using the following parameters for inference. Note that our models do not have a default system_prompt.


{
  "temperature": 0.7,
  "top_p": 1.0,
  "top_k": -1,
  "repetition_penalty": 1.0,
  "max_tokens": 4096
}
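
As a minimal sketch, the two parameter sets above could be selected by model name and sent to a server started with `vllm serve` or `sglang.launch_server`. Several things here are assumptions rather than part of this README: the helper names, the base URL `http://localhost:8000/v1`, and that the server exposes an OpenAI-compatible `/chat/completions` route that accepts `top_k` and `repetition_penalty` as extra request fields.

```python
import json
import urllib.request

# Recommended values copied from this README.
RECOMMENDED = {
    "1.8B/7B": {"temperature": 0.7, "top_p": 0.6, "top_k": 20,
                "repetition_penalty": 1.05, "max_tokens": 4096},
    "30B-A3B": {"temperature": 0.7, "top_p": 1.0, "top_k": -1,
                "repetition_penalty": 1.0, "max_tokens": 4096},
}


def sampling_params_for(model_name: str) -> dict:
    """Pick the recommended parameter set based on the model name."""
    key = "30B-A3B" if "30B-A3B" in model_name else "1.8B/7B"
    return dict(RECOMMENDED[key])  # copy so callers can modify it safely


def build_chat_request(model_name: str, prompt: str) -> dict:
    """Build the request body; only a user message is sent (no system prompt)."""
    payload = {"model": model_name, "messages": [{"role": "user", "content": prompt}]}
    payload.update(sampling_params_for(model_name))
    return payload


def translate(base_url: str, model_name: str, prompt: str, timeout: float = 60.0) -> str:
    """POST the request to an assumed OpenAI-compatible endpoint and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(model_name, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

For example, `translate("http://localhost:8000/v1", "tencent/Hy-MT2-30B-A3B", prompt)` would send the 30B-A3B settings, provided a server is running at that address.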

Model Training

Hy-MT2 provides a complete model training pipeline, supporting both full-parameter fine-tuning and LoRA fine-tuning, as well as multiple DeepSpeed ZeRO configurations and LLaMA-Factory integration.

For detailed training documentation, please refer to: Model Training Guide

Quantization Tool

We provide AngelSlim, an easy-to-use, comprehensive, and efficient large model compression toolkit covering common quantization algorithms, low-bit quantization, speculative sampling, and more.

Supported Languages

Languages	Abbr.	Chinese Names
Chinese	zh	中文
English	en	英语
French	fr	法语
Portuguese	pt	葡萄牙语
Spanish	es	西班牙语
Japanese	ja	日语
Turkish	tr	土耳其语
Russian	ru	俄语
Arabic	ar	阿拉伯语
Korean	ko	韩语
Thai	th	泰语
Italian	it	意大利语
German	de	德语
Vietnamese	vi	越南语
Malay	ms	马来语
Indonesian	id	印尼语
Filipino	tl	菲律宾语
Hindi	hi	印地语
Traditional Chinese	zh-Hant	繁体中文
Polish	pl	波兰语
Czech	cs	捷克语
Dutch	nl	荷兰语
Khmer	km	高棉语
Burmese	my	缅甸语
Persian	fa	波斯语
Gujarati	gu	古吉拉特语
Urdu	ur	乌尔都语
Telugu	te	泰卢固语
Marathi	mr	马拉地语
Hebrew	he	希伯来语
Bengali	bn	孟加拉语
Tamil	ta	泰米尔语
Ukrainian	uk	乌克兰语
Tibetan	bo	藏语
Kazakh	kk	哈萨克语
Mongolian	mn	蒙古语
Uyghur	ug	维吾尔语
Cantonese	yue	粤语
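
Because the prompts expect full language names (Chinese names in Chinese prompts, English names in English prompts), a lookup from abbreviation to name can be handy. The following is a minimal sketch built from the table above; the names `LANGUAGES` and `language_name` are hypothetical.

```python
# Rows copied from the table above: abbreviation | English name | Chinese name.
_TABLE = """\
zh|Chinese|中文
en|English|英语
fr|French|法语
pt|Portuguese|葡萄牙语
es|Spanish|西班牙语
ja|Japanese|日语
tr|Turkish|土耳其语
ru|Russian|俄语
ar|Arabic|阿拉伯语
ko|Korean|韩语
th|Thai|泰语
it|Italian|意大利语
de|German|德语
vi|Vietnamese|越南语
ms|Malay|马来语
id|Indonesian|印尼语
tl|Filipino|菲律宾语
hi|Hindi|印地语
zh-Hant|Traditional Chinese|繁体中文
pl|Polish|波兰语
cs|Czech|捷克语
nl|Dutch|荷兰语
km|Khmer|高棉语
my|Burmese|缅甸语
fa|Persian|波斯语
gu|Gujarati|古吉拉特语
ur|Urdu|乌尔都语
te|Telugu|泰卢固语
mr|Marathi|马拉地语
he|Hebrew|希伯来语
bn|Bengali|孟加拉语
ta|Tamil|泰米尔语
uk|Ukrainian|乌克兰语
bo|Tibetan|藏语
kk|Kazakh|哈萨克语
mn|Mongolian|蒙古语
ug|Uyghur|维吾尔语
yue|Cantonese|粤语
"""

# Map each abbreviation to its (English name, Chinese name) pair.
LANGUAGES = {
    code: (english, chinese)
    for code, english, chinese in (row.split("|") for row in _TABLE.splitlines())
}


def language_name(code: str, prompt_lang: str = "en") -> str:
    """Return the full language name to use in an English ('en') or Chinese ('zh') prompt."""
    english, chinese = LANGUAGES[code]  # raises KeyError for unsupported abbreviations
    return chinese if prompt_lang == "zh" else english
```

For instance, `language_name("ja")` gives the name for an English prompt and `language_name("ja", "zh")` gives the name for a Chinese prompt.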

Citing Hy-MT2

@misc{hy-mt1.5,
      title={HY-MT1.5 Technical Report}, 
      author={Mao Zheng and Zheng Li and Tao Chen and Mingyang Song and Di Wang},
      year={2025},
      eprint={2512.24092},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.24092}, 
}

Contact Us

If you would like to leave feedback for our R&D and product teams, you are welcome to contact the Tencent Hunyuan LLM team. You can reach us by email at hunyuan_opensource@tencent.com.
