Server Monitoring System

A distributed server monitoring solution built on Grafana Alloy, Prometheus, and Grafana.

Architecture

                          ┌─────────────────────────────────────────────┐
                          │           Central server (1 host)           │
                          │                                             │
┌──────────────┐  OTLP    │  ┌─────────┐    remote     ┌────────────┐  │   PromQL    ┌─────────┐
│  Agent #1    │─gRPC────▶│  │  Alloy   │───write──────▶│ Prometheus │◀─│────────────│ Grafana  │
│  (web-01)    │          │  │  :4317   │               │   :9090    │  │            │  :3000   │
└──────────────┘          │  └─────────┘    ┌──────────▶└────────────┘  │            └─────────┘
                          │       ▲         │                           │
┌──────────────┐  OTLP    │       │         │                           │
│  Agent #2    │─gRPC────▶│───────┘         │                           │
│  (db-01)     │          │                 │                           │
└──────────────┘          └─────────────────────────────────────────────┘
                          
┌──────────────┐  OTLP
│  Agent #N    │─gRPC────▶ ...
│  (app-xx)    │
└──────────────┘

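Core design: push mode. Each Agent pushes its metrics to the central server, so the central configuration never changes. To monitor another machine, deploy the Agent on it.

Project structure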

server/                        # Central server side
├── docker-compose.yaml        # Orchestrates Alloy + Prometheus + Grafana
├── alloy/
│   └── config.alloy           # Central Alloy configuration
├── prometheus/
│   └── prometheus.yaml        # Prometheus configuration
├── grafana/
│   ├── datasources/
│   │   └── datasources.yaml   # Prometheus data source
│   └── dashboards/
│       ├── dashboard.yaml     # Dashboard auto-loading configuration
│       └── node-exporter.json # Node Exporter dashboard
└── .env                       # Environment variables

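agent/                         # Client side (one per monitored server)
├── docker-compose.yaml        # Container orchestration
├── Dockerfile                 # Image build
├── main.go                    # Entry point: HTTP /metrics endpoint
├── collector.go               # Collector: CPU, memory, disk, network, and other metrics
└── go.mod / go.sum            # Go dependencies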

Quick start

1. Deploy the central server

cd server
docker compose up -d

Once the stack is up, the following are available:

Grafana: http://<central-IP>:3000 (default credentials: admin / grafana123)
Prometheus: http://<central-IP>:9090
Alloy UI: http://<central-IP>:12345
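
A quick way to confirm that all three services came up is to probe their health endpoints. The sketch below assumes the default ports listed above and the services' standard readiness paths (/api/health for Grafana, /-/ready for Prometheus and Alloy). The function names central_health_urls and check_central are illustrative and not part of this repository.

```shell
# Print the health-check URL of each central service, one per line.
# Assumes the default ports from this README (3000, 9090, 12345).
central_health_urls() {
  ip=$1
  printf 'http://%s:3000/api/health\n' "$ip"   # Grafana
  printf 'http://%s:9090/-/ready\n' "$ip"      # Prometheus
  printf 'http://%s:12345/-/ready\n' "$ip"     # Alloy
}

# Probe every URL and report which services answer without an HTTP error.
# Returns non-zero if any service is not ready.
check_central() {
  failed=0
  for url in $(central_health_urls "$1"); do
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}
```

Running check_central <central-IP> right after docker compose up -d may report failures for a few seconds while the containers finish starting.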

2. Deploy the client Agent

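Method 1: Docker Compose

Copy the agent/ directory to the server you want to monitor, then pass the configuration through environment variables and start it: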

cd agent
SERVER_NAME=web-01 HOSTNAME=$(hostname -s) docker compose up -d --build

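Environment variables:

SERVER_NAME: a human-readable name for this server, used for filtering in the dashboard
HOSTNAME: the host name; defaults to the current machine's hostname

Method 2: build and run directly (for development and debugging)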

cd agent
go build -o node-exporter .
SERVER_NAME=local-mac HOSTNAME=$(hostname -s) ./node-exporter
# Listens on :9100 by default; use -port to choose another port
SERVER_NAME=local-mac HOSTNAME=$(hostname -s) ./node-exporter -port 9200

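Visit http://localhost:9100/metrics to verify the metrics output.

3. Add a monitored server

Run the same start command on the new machine. No changes are needed on the central server: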

SERVER_NAME=db-01 HOSTNAME=$(hostname -s) docker compose up -d --build

4. Remove a monitored server

On the target machine, stop and clean up the containers:

cd agent
docker compose down

Verification

Check Agent status

# Show container status
docker compose ps

# Follow the logs
docker compose logs -f

# Verify the metrics endpoint
curl http://localhost:9100/metrics
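
To check for one specific metric instead of reading the whole output, the response can be filtered by metric name. This sketch assumes the endpoint serves the Prometheus text format and that the Agent exposes node_uname_info (the metric the dashboard variables rely on). has_metric is a hypothetical helper, not something shipped in agent/.

```shell
# Succeed if the Prometheus text-format input on stdin contains at least one
# sample of the given metric. A sample line starts with the metric name,
# followed by either '{' (labels) or a space (no labels), so "# HELP" and
# "# TYPE" comment lines never match.
has_metric() {
  grep -Eq "^$1(\{| )"
}

# Example usage (run on the monitored machine):
#   curl -s http://localhost:9100/metrics | has_metric node_uname_info \
#     && echo "node_uname_info is exported"
```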

Check the central Alloy

Open http://<central-IP>:12345 and confirm that the components are running normally (green status).

Check Prometheus

Open http://<central-IP>:9090 and run these queries:

up
node_uname_info

If the data reported by each Agent shows up, the pipeline is working end to end.
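
The same check can be scripted against the Prometheus HTTP API (/api/v1/query). The sketch below assumes port 9090 as above. prom_query and count_series are illustrative names, and counting "metric": keys is a rough shortcut that avoids a JSON parser rather than a robust way to read the response.

```shell
# Run an instant query against the Prometheus HTTP API and print the raw JSON.
#   $1 = central server address, $2 = PromQL expression
prom_query() {
  curl -fsS -G --max-time 5 --data-urlencode "query=$2" "http://$1:9090/api/v1/query"
}

# Count the series in an instant-query response read from stdin.
# Each series in the result array carries exactly one "metric" object.
count_series() {
  grep -o '"metric":' | wc -l | tr -d ' '
}

# Example usage:
#   prom_query <central-IP> node_uname_info | count_series
```

A count equal to the number of deployed Agents suggests that every machine is reporting.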

View the Grafana dashboard

Open http://<central-IP>:3000
Log in with admin / grafana123
Open the "Node Exporter - 服务器监控" dashboard
Use the Instance / Server Name dropdowns at the top to filter servers

FAQ

Agent cannot connect to the central Alloy

# Check that the central port is reachable
nc -zv <central-IP> 4317

# Check the Agent logs
docker compose logs -f

# Check the firewall
firewall-cmd --list-ports        # CentOS
ufw status                       # Ubuntu
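
A single probe can fail simply because the central stack is still starting, so it may help to retry a few times before suspecting the network. This is a minimal sketch: wait_for_port and probe are hypothetical helpers, the default of 5 attempts with a 2-second delay is an arbitrary assumption, and probe wraps nc -z so that it can be swapped out.

```shell
# Single reachability probe; kept separate so it can be replaced.
probe() {
  nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# Retry the probe until it succeeds or the attempts run out.
#   $1 = host, $2 = port, $3 = attempts (default 5), $4 = delay in seconds (default 2)
wait_for_port() {
  host=$1
  port=$2
  attempts=${3:-5}
  delay=${4:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if probe "$host" "$port"; then
      echo "reachable after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt follows.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  echo "unreachable after $attempts attempt(s)" >&2
  return 1
}

# Example usage:
#   wait_for_port <central-IP> 4317
```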

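No data in Grafana

Confirm that Prometheus has data: query node_uname_info at http://<central-IP>:9090
Confirm that the components in the Alloy UI (:12345) report no errors
Check the Agent container logs: run docker compose logs and look for errors

The dashboard dropdown lists no servers

The template variables are based on the node_uname_info metric; after an Agent starts, it takes about 30-60 seconds for data to appear
Refresh the dashboard page, or click the refresh button next to the dropdown

How to change the Grafana password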

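Edit server/.env, change GF_SECURITY_ADMIN_PASSWORD, then recreate the Grafana container so that it picks up the new value (docker compose restart does not re-read changed environment variables):

cd server
docker compose up -d grafana

Grafana applies this variable only when it first initializes its database. If Grafana's data is persisted in a volume that already exists, reset the password inside the container instead, for example with grafana cli admin reset-admin-password <new-password>.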
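
The edit to server/.env can also be scripted. The sketch below assumes the file contains plain KEY=value lines without quoting or multi-line values. set_env_var is a hypothetical helper, not part of this repository.

```shell
# Set KEY=value in an env file, replacing an existing assignment if present.
#   $1 = path to the env file, $2 = key, $3 = new value
set_env_var() {
  env_file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  if [ -f "$env_file" ]; then
    # Keep every line except an existing assignment of this key.
    # grep exits non-zero when no lines remain, which is fine here.
    grep -v "^${key}=" "$env_file" > "$tmp" || true
  fi
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # mktemp creates the file readable by the owner only, so the rewritten
  # env file ends up with restrictive permissions.
  mv "$tmp" "$env_file"
}

# Example usage:
#   set_env_var server/.env GF_SECURITY_ADMIN_PASSWORD 'a-new-password'
```

The new value takes effect only after the Grafana container has been brought up again from the server/ directory.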
