Ubuntu 24.04 + PostgreSQL 17 Docker Deployment

This deployment uses Ubuntu 24.04 as the base image and mirrors.tencent.com as the apt source. It automatically performs the following:

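  • Builds and installs PostgreSQL 17 from source
  • Builds and installs pgvector from source
  • Builds and installs pg_textsearch from source
  • Builds and installs pg_jieba from source
  • Uses a multi-stage build (builder/runtime); the runtime image keeps only the bin/lib/share files PostgreSQL needs at run time, plus the extension artifacts
  • Initializes the database
  • Creates the extensions
  • Loads the schema / bundles stored in the repository
  • Syncs the jieba custom dictionaries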

Directory

  • Dockerfile
  • docker-compose.yml
  • build-and-up.sh
  • init/001_extensions.sql
  • scripts/entrypoint.sh
  • conf/postgresql.conf
  • jieba/dicts/*.dict
  • docs/pg-jieba.md
  • check-pg-capabilities.sh

Quick start

cd /home/user/diablo2-data
bash deploy/pgsql17-ubuntu24/build-and-up.sh

Dry run (show the plan without executing it)

DRY_RUN=true bash deploy/pgsql17-ubuntu24/build-and-up.sh
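
How build-and-up.sh handles DRY_RUN internally is not shown here. The following is only a minimal sketch of one common way to implement such a switch in bash; the run helper is hypothetical and is not taken from the script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a dry-run switch; NOT the contents of build-and-up.sh.
DRY_RUN="${DRY_RUN:-false}"

# run CMD [ARGS...]: print the command when DRY_RUN=true, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "true" ]; then
    printf '[dry-run]'
    printf ' %q' "$@"   # %q shell-quotes each argument so the plan can be copy-pasted
    printf '\n'
  else
    "$@"
  fi
}

# Illustrative use inside a deployment script:
#   run docker-compose -f deploy/pgsql17-ubuntu24/docker-compose.yml up -d
```

Routing every side-effecting command through one wrapper keeps the printed plan and the real execution path identical, which is the main reason for this pattern.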

Connect after startup

docker exec -it d2-pg17 /opt/postgresql/bin/psql -U d2 -d d2

Check extensions

SELECT extname FROM pg_extension ORDER BY extname;

The result should include at least:

  • pg_textsearch
  • pg_jieba
  • vector
  • pg_trgm
  • unaccent
  • ltree
  • hstore
  • pg_stat_statements
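
The list above can be checked mechanically. A minimal sketch, assuming the installed extension names arrive one per line on stdin; the missing_extensions helper and the REQUIRED_EXTENSIONS variable are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# Extensions this README expects to find in pg_extension.
REQUIRED_EXTENSIONS="pg_textsearch pg_jieba vector pg_trgm unaccent ltree hstore pg_stat_statements"

# missing_extensions: read installed extension names (one per line) from stdin
# and print every required extension that is absent, one per line.
missing_extensions() {
  local installed ext
  installed="$(cat)"
  for ext in $REQUIRED_EXTENSIONS; do
    # -x matches the whole line, -F treats the name as a fixed string
    if ! printf '%s\n' "$installed" | grep -qxF -- "$ext"; then
      printf '%s\n' "$ext"
    fi
  done
}

# Usage against the running container (container name, psql path, user and
# database are the ones from the connection command in this README):
#   docker exec d2-pg17 /opt/postgresql/bin/psql -U d2 -d d2 -At \
#     -c "SELECT extname FROM pg_extension" | missing_extensions
```

Empty output means every expected extension is present.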

pg_jieba integration notes

The current image already integrates pg_jieba into PostgreSQL 17, and docker-compose.yml enables it by default:

environment:
  PG_JIEBA_BASE_DICT: jieba_base
  PG_JIEBA_HMM_MODEL: jieba_hmm
  PG_JIEBA_USER_DICT: d2_core,d2_items,d2_skills
volumes:
  - ./jieba/dicts:/bootstrap/jieba:ro

On container startup, the files matching ./jieba/dicts/*.dict are copied to:

/opt/postgresql/share/tsearch_data/*.dict

pg_jieba.user_dict then loads each dictionary by its file name with the .dict suffix removed.

For example:

File                         Config value
jieba/dicts/d2_core.dict     d2_core
jieba/dicts/d2_items.dict    d2_items
jieba/dicts/d2_skills.dict   d2_skills

For full details, see docs/pg-jieba.md
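
The file-name-to-config-value rule can be sketched as a small helper that builds a PG_JIEBA_USER_DICT value from a dictionary directory. The user_dict_value function is a hypothetical name for illustration and is not part of scripts/entrypoint.sh.

```shell
#!/usr/bin/env bash
# user_dict_value DIR: print the comma-separated dictionary names for every
# *.dict file in DIR (file name with the .dict suffix removed).
user_dict_value() {
  local dir="$1" f names=""
  for f in "$dir"/*.dict; do
    [ -e "$f" ] || continue   # no matches: the unexpanded glob is skipped
    names="${names:+$names,}$(basename "$f" .dict)"
  done
  printf '%s\n' "$names"
}

# Example:
#   user_dict_value ./jieba/dicts
```

With the three dictionary files listed in this README, the function prints d2_core,d2_items,d2_skills, which matches the default value in docker-compose.yml.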

jieba tokenization examples

SELECT to_tsvector('jiebacfg', '谜团 破隐法杖 祝福之锤 开荒');
SELECT alias, token, lexemes FROM ts_debug('jiebacfg', '暗金 破隐法杖 符文之语 谜团');
SELECT to_tsquery('jiebacfg', '祝福之锤 圣骑士 谜团');

Custom dictionary format example

符文之语 30 nz
谜团 30 nz
祝福之锤 30 nz
超市 15 nz
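
Judging from the sample above, each entry is a word, a numeric weight, and a tag, separated by whitespace. A minimal sketch of a pre-flight check under that assumption; check_dict_file is a hypothetical helper, and the exact rules pg_jieba enforces may differ.

```shell
#!/usr/bin/env bash
# check_dict_file FILE: succeed only if every non-empty line has exactly three
# whitespace-separated fields and the second field is a non-negative integer.
# Assumption: this mirrors the sample format only, not pg_jieba's own parser.
check_dict_file() {
  awk '
    BEGIN { bad = 0 }
    NF == 0 { next }                      # ignore blank lines
    NF != 3 || $2 !~ /^[0-9]+$/ {
      printf "%s:%d: bad entry: %s\n", FILENAME, NR, $0
      bad = 1
    }
    END { exit bad }
  ' "$1"
}

# Example:
#   check_dict_file jieba/dicts/d2_core.dict && echo "dictionary looks well-formed"
```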

Check the current PostgreSQL search capabilities

bash deploy/pgsql17-ubuntu24/check-pg-capabilities.sh

It checks:

  • pgvector / embedding columns and ivfflat indexes
  • pg_textsearch and the BM25 access method / BM25 indexes
  • pg_trgm similarity support
  • pg_jieba Chinese word segmentation
  • Knowledge-graph tables such as d2.strategy_edge_facts / canonical_entities / canonical_claims / provenance

Check the main tables

SELECT count(*) FROM d2.documents;
SELECT count(*) FROM d2.chunks;
SELECT count(*) FROM d2.canonical_entities;
SELECT count(*) FROM d2.canonical_claims;
SELECT count(*) FROM d2.provenance;
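
To collect these counts in a single round trip, the queries can be combined with UNION ALL. A minimal sketch; count_sql is a hypothetical helper that only generates the SQL text, and it assumes the table names passed in are trusted identifiers.

```shell
#!/usr/bin/env bash
# count_sql TABLE...: print one SQL statement that returns a
# (table_name, count) row for each table given on the command line.
# Table names are interpolated as-is, so pass trusted identifiers only.
count_sql() {
  local t sep=""
  for t in "$@"; do
    printf "%sSELECT '%s' AS table_name, count(*) FROM %s" "$sep" "$t" "$t"
    sep=" UNION ALL "
  done
  printf ';\n'
}

# Usage (container name, psql path, user and database as used in this README):
#   count_sql d2.documents d2.chunks d2.canonical_entities \
#             d2.canonical_claims d2.provenance \
#     | docker exec -i d2-pg17 /opt/postgresql/bin/psql -U d2 -d d2
```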

Stop

docker-compose -f deploy/pgsql17-ubuntu24/docker-compose.yml down

Post-startup verification

bash deploy/pgsql17-ubuntu24/verify-running.sh

Recommended checks

SHOW shared_preload_libraries;
SHOW pg_jieba.user_dict;
SELECT jieba_reload_dict();
SELECT to_tsvector('jiebacfg', '谜团 破隐法杖 祝福之锤');

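Local verification results

Based on the current workspace, the following runtime checks were completed earlier (historical record):

  • The pgsql17-ubuntu24_pg17:latest image built successfully
  • The d2-pg17 container started and reported healthy
  • Extensions actually loaded: pg_trgm / pg_textsearch / vector / ltree / unaccent / pg_stat_statements / hstore
  • PostgreSQL main bundle imported
  • PostgreSQL dictionary bundle imported
  • PostgreSQL embedding bundle imported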
  • d2.documents = 455
  • d2.chunks = 8652
  • d2.canonical_entities = 661
  • d2.canonical_claims = 899
  • d2.provenance = 927
  • dict.item_dictionary = 1080
  • d2.chunks rows with embedding IS NOT NULL = 8652

If you rebuild the image, re-run:

bash deploy/pgsql17-ubuntu24/build-and-up.sh
bash deploy/pgsql17-ubuntu24/verify-running.sh
