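This deployment uses Ubuntu 24.04 as the base image and mirrors.tencent.com as the apt source. It automatically handles:

- building and installing pgvector from source
- building and installing pg_textsearch from source
- building and installing pg_jieba from source
- custom jieba dictionaries

Files:

- `Dockerfile`
- `docker-compose.yml`
- `build-and-up.sh`
- `init/001_extensions.sql`
- `scripts/entrypoint.sh`
- `conf/postgresql.conf`
- `jieba/dicts/*.dict`
- `docs/pg-jieba.md`
- `check-pg-capabilities.sh`

Build and start:

```shell
cd /home/user/diablo2-data
bash deploy/pgsql17-ubuntu24/build-and-up.sh
```

Dry run:

```shell
DRY_RUN=true bash deploy/pgsql17-ubuntu24/build-and-up.sh
```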
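How `build-and-up.sh` handles `DRY_RUN` is not shown in this document. As a rough illustration only, a common way to support such a switch is a small wrapper that prints a command instead of running it. The `run` helper and the placeholder steps below are hypothetical and are not taken from the actual script.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=true, otherwise run it.
# The real build-and-up.sh may implement DRY_RUN differently.
run() {
    if [ "${DRY_RUN:-false}" = "true" ]; then
        printf '[dry-run] %s\n' "$*"
    else
        "$@"
    fi
}

# Placeholder steps, standing in for whatever the real script executes.
run echo "building image"
run echo "starting container"
```

With `DRY_RUN=true` in the environment, each step is only printed with a `[dry-run]` prefix; without it, the commands execute normally.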
Open a psql session inside the container:

```shell
docker exec -it d2-pg17 /opt/postgresql/bin/psql -U d2 -d d2
```

List the installed extensions:

```sql
SELECT extname FROM pg_extension ORDER BY extname;
```
The result should now include at least:

- pg_textsearch
- pg_jieba
- vector
- pg_trgm
- unaccent
- ltree
- hstore
- pg_stat_statements

The current image already integrates pg_jieba into PostgreSQL 17, and `docker-compose.yml` enables it by default:

```yaml
environment:
  PG_JIEBA_BASE_DICT: jieba_base
  PG_JIEBA_HMM_MODEL: jieba_hmm
  PG_JIEBA_USER_DICT: d2_core,d2_items,d2_skills
volumes:
  - ./jieba/dicts:/bootstrap/jieba:ro
```
When the container starts, it copies `./jieba/dicts/*.dict` to:

```
/opt/postgresql/share/tsearch_data/*.dict
```

`pg_jieba.user_dict` then loads each dictionary by its file name with the `.dict` suffix removed.
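The copy step and the naming rule can be sketched as follows. This is a minimal illustration, not the contents of `scripts/entrypoint.sh`: the function names `copy_dicts` and `dict_names` are hypothetical, and the demo works in a temporary directory rather than `/opt/postgresql/share/tsearch_data`.

```shell
#!/bin/sh
# Sketch only: function names and the temporary directories are hypothetical.

# Copy every *.dict file from a source directory into a target directory.
copy_dicts() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    for f in "$src"/*.dict; do
        [ -e "$f" ] || continue   # no match: the glob stays literal, skip it
        cp "$f" "$dst"/
    done
}

# Print a comma-separated list of dictionary names (file name minus .dict).
dict_names() {
    names=""
    for f in "$1"/*.dict; do
        [ -e "$f" ] || continue
        name=$(basename "$f" .dict)
        names="${names:+$names,}$name"
    done
    printf '%s\n' "$names"
}

# Demo in a throwaway directory with empty placeholder dictionaries.
work=$(mktemp -d)
mkdir -p "$work/src"
: > "$work/src/d2_core.dict"
: > "$work/src/d2_items.dict"
: > "$work/src/d2_skills.dict"
copy_dicts "$work/src" "$work/tsearch_data"
dict_names "$work/src"    # prints d2_core,d2_items,d2_skills
rm -rf "$work"
```

The printed value has the same shape as `PG_JIEBA_USER_DICT` in the compose file. The names are derived from the source directory because the target directory may also hold dictionaries that are not user dictionaries.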
For example:

| File | Config value |
|---|---|
| jieba/dicts/d2_core.dict | d2_core |
| jieba/dicts/d2_items.dict | d2_items |
| jieba/dicts/d2_skills.dict | d2_skills |

For full details, see `docs/pg-jieba.md`.
Segmentation checks:

```sql
SELECT to_tsvector('jiebacfg', '谜团 破隐法杖 祝福之锤 开荒');
SELECT alias, token, lexemes FROM ts_debug('jiebacfg', '暗金 破隐法杖 符文之语 谜团');
SELECT to_tsquery('jiebacfg', '祝福之锤 圣骑士 谜团');
```

Sample dictionary entries:

```
符文之语 30 nz
谜团 30 nz
祝福之锤 30 nz
超市 15 nz
```
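Each sample entry has three whitespace-separated columns: a word, a number, and a tag such as `nz`. Assuming every line in a custom dictionary follows this three-column layout (the format may permit other layouts that this document does not show), a quick sanity check before rebuilding could look like the sketch below. The `check_dict` function is a hypothetical helper, not part of the deployment.

```shell
#!/bin/sh
# Hypothetical helper: report lines that are not "word number tag".
# Assumes the three-column layout seen in the sample entries.
check_dict() {
    awk '
        NF == 0 { next }                      # ignore blank lines
        NF != 3 || $2 !~ /^[0-9]+$/ {
            printf "line %d: expected word, number, tag; got: %s\n", NR, $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Demo using the sample entries from this document.
sample=$(mktemp)
cat > "$sample" <<'EOF'
符文之语 30 nz
谜团 30 nz
祝福之锤 30 nz
超市 15 nz
EOF
check_dict "$sample" && echo "dictionary looks well-formed"
rm -f "$sample"
```

The function exits non-zero and prints the offending line numbers when a line has the wrong number of columns or a non-numeric second column.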
Capability check:

```shell
bash deploy/pgsql17-ubuntu24/check-pg-capabilities.sh
```

It checks:

- pgvector: embedding columns and ivfflat indexes
- pg_textsearch: the BM25 access method and BM25 indexes
- pg_trgm: similarity support
- pg_jieba: Chinese word segmentation
- knowledge-graph tables such as d2.strategy_edge_facts, canonical_entities, canonical_claims, and provenance

Row counts:

```sql
SELECT count(*) FROM d2.documents;
SELECT count(*) FROM d2.chunks;
SELECT count(*) FROM d2.canonical_entities;
SELECT count(*) FROM d2.canonical_claims;
SELECT count(*) FROM d2.provenance;
```
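Stop the stack:

```shell
docker-compose -f deploy/pgsql17-ubuntu24/docker-compose.yml down
```

Verify a running instance:

```shell
bash deploy/pgsql17-ubuntu24/verify-running.sh
```

Check the pg_jieba settings and reload the dictionaries:

```sql
SHOW shared_preload_libraries;
SHOW pg_jieba.user_dict;
SELECT jieba_reload_dict();
SELECT to_tsvector('jiebacfg', '谜团 破隐法杖 祝福之锤');
```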
Based on the current workspace, the following runtime checks were completed previously (historical record):

- the pgsql17-ubuntu24_pg17:latest image built successfully
- the d2-pg17 container started and was healthy
- pg_trgm / pg_textsearch / vector / ltree / unaccent / pg_stat_statements / hstore
- d2.documents = 455
- d2.chunks = 8652
- d2.canonical_entities = 661
- d2.canonical_claims = 899
- d2.provenance = 927
- dict.item_dictionary = 1080
- d2.chunks with embedding IS NOT NULL = 8652

If you rebuild the image this time, rerun:

```shell
bash deploy/pgsql17-ubuntu24/build-and-up.sh
bash deploy/pgsql17-ubuntu24/verify-running.sh
```
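After a rebuild, the expected extension list from earlier in this document can also be checked mechanically. The sketch below is an illustration under assumptions: `missing_extensions` is a hypothetical helper that reads extension names on stdin, and the demo feeds it a stubbed list instead of querying a live database.

```shell
#!/bin/sh
# Extension names this document says should be present.
REQUIRED="pg_textsearch pg_jieba vector pg_trgm unaccent ltree hstore pg_stat_statements"

# Hypothetical helper: read installed extension names on stdin (one per line)
# and print the required ones that are absent. Returns non-zero if any is missing.
missing_extensions() {
    installed=$(cat)
    status=0
    for ext in $REQUIRED; do
        if ! printf '%s\n' "$installed" | grep -qxF -- "$ext"; then
            printf 'missing: %s\n' "$ext"
            status=1
        fi
    done
    return $status
}

# Demo with a stubbed list; pg_jieba is deliberately left out to show the report.
printf '%s\n' hstore ltree pg_stat_statements pg_textsearch pg_trgm plpgsql unaccent vector \
    | missing_extensions || true
```

Against the running container, the input could come from `docker exec d2-pg17 /opt/postgresql/bin/psql -U d2 -d d2 -At -c "SELECT extname FROM pg_extension"` piped into `missing_extensions`.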