deployments/observability/README.md
@@ -0,0 +1,174 @@
# chat-deploy Observability Deployment Guide

This directory contains the configuration files needed to connect chat-deploy to the itom-platform observability center.

## Features

- **Log collection**: Promtail collects Docker container logs and pushes them to Loki on itom-platform.
- **MongoDB monitoring**: MongoDB Exporter collects MongoDB metrics.
- **Redis monitoring** (optional): Redis Exporter collects Redis metrics.
- **System monitoring** (optional): Node Exporter collects system-level metrics (CPU/Memory/Disk/Network).

## Quick Start

### 1. Configure environment variables

Copy the example configuration file and edit it:

```bash
cp config.env.example config.env
```

**Settings you must change**:

| Setting | Description | Example |
|--------|------|------|
| `OBS_HOST` | Address of the itom-platform observability center | `192.168.1.100` or `obs.example.com` |
| `OBS_AUTH_TOKEN` | Auth token (must match the center side) | `your-secret-token` |
| `MONGODB_URI` | MongoDB connection URI | `mongodb://user:pass@host:27017/db` |

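Before starting the stack, it can help to confirm that none of the required settings is still a placeholder. The following is a minimal sketch, not part of this commit: `check_required` is a hypothetical helper that reads an env file and reports keys that are missing, empty, or still set to `CHANGE_ME`. It assumes plain `KEY=value` lines as in `config.env.example`.

```bash
# Hypothetical helper (an assumption, not shipped with this commit).
# Usage: check_required <env-file> <KEY>...
# Returns non-zero if any key is missing, empty, or still CHANGE_ME.
check_required() {
  env_file="$1"; shift
  rc=0
  for key in "$@"; do
    # Use the last assignment of the key, so later lines override earlier ones.
    value="$(sed -n "s/^${key}=//p" "$env_file" | tail -n 1)"
    if [ -z "$value" ] || [ "$value" = "CHANGE_ME" ]; then
      echo "missing or placeholder: $key" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `check_required config.env OBS_HOST OBS_AUTH_TOKEN MONGODB_URI` would exit non-zero until all three settings in the table above have real values.
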
### 2. Start log collection

To enable log collection only:

```bash
docker compose --env-file config.env -f docker-compose-observability.yaml up -d
```

### 3. Start full observability (logs + metrics)

To enable both log and metrics collection:

```bash
docker compose --env-file config.env -f docker-compose-observability.yaml --profile metrics up -d
```

### 4. Verify service status

```bash
# Check service status
docker compose --env-file config.env -f docker-compose-observability.yaml ps

# View Promtail logs (docker logs cannot read them while the compose file sets logging.driver "none" for promtail)
docker logs chat-deploy-promtail

# View MongoDB Exporter metrics (requires --profile metrics)
curl -s http://localhost:9216/metrics | head -50
```

## Configuration

### MongoDB configuration

MongoDB Exporter needs a valid connection URI. Based on the settings in `config/mongodb.yml`:

```env
# Default
MONGODB_URI=mongodb://openIM:openIM123@localhost:37017/openim_v3?authSource=openim_v3
```

Inside the exporter container, `localhost` refers to the container itself. If MongoDB runs in another Docker container or on the host, use the container name or the host IP instead:

```env
# Use the container name (both containers must share a network)
MONGODB_URI=mongodb://openIM:openIM123@mongo:27017/openim_v3?authSource=openim_v3

# Use the host IP (on Linux, host.docker.internal resolves only if it is mapped, e.g. to host-gateway via extra_hosts)
MONGODB_URI=mongodb://openIM:openIM123@host.docker.internal:37017/openim_v3?authSource=openim_v3
```

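When the exporter cannot connect, a useful first question is which host and port the URI actually points at. The sketch below is an illustration only: `mongo_hostport` is a hypothetical helper, not part of this commit, and it assumes a single-host URI whose credentials are percent-encoded (so they contain no raw `@` or `/`).

```bash
# Hypothetical helper (an assumption, not shipped with this commit).
# Print the host:port part of a single-host MongoDB URI.
mongo_hostport() {
  # 1) drop the scheme, 2) drop "user:pass@" if present, 3) drop path and query
  printf '%s\n' "$1" | sed -E 's#^mongodb(\+srv)?://##; s#^[^@/]*@##; s#[/?].*$##'
}

mongo_hostport "mongodb://openIM:openIM123@localhost:37017/openim_v3?authSource=openim_v3"
# prints: localhost:37017
```

The printed value can then be checked from inside the exporter's network with whatever connectivity tool is available there.
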
### Redis configuration (optional)

To monitor Redis:

```env
REDIS_ADDR=localhost:6379
REDIS_PASSWORD=your_redis_password
```

### Network configuration

If the chat-deploy services run in a different Docker network, the observability components must join that network.

Edit `docker-compose-observability.yaml` and add the external network:

```yaml
networks:
  chat-deploy-obs:
    driver: bridge
  # Add the external network
  chat-deploy-network:
    external: true

services:
  mongodb-exporter:
    networks:
      - chat-deploy-obs
      - chat-deploy-network  # Join the chat-deploy network
```

## Dashboards

In Grafana on itom-platform, open `chat-deploy-dashboard` to view:

- Service health
- Go runtime metrics (CPU, memory, GC)
- MongoDB operation statistics (query/insert/update/delete)
- MongoDB connection count
- Redis metrics (if configured)
- Application logs

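## Troubleshooting

### Logs are not collected

1. Check that Promtail is running. The compose file sets `logging.driver: "none"` for Promtail, so `docker logs` cannot read its output until that setting is temporarily removed:
   ```bash
   docker logs chat-deploy-promtail
   ```

2. Confirm that `LOKI_URL` resolves to the expected value:
   ```bash
   docker compose --env-file config.env -f docker-compose-observability.yaml config | grep LOKI_URL
   # The value should resemble: http://192.168.1.100/loki/api/v1/push
   ```

3. Test network connectivity. Any HTTP response, even an error status, shows that the endpoint is reachable:
   ```bash
   curl -v http://<OBS_HOST>/loki/api/v1/push
   ```
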
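To go one step beyond a connectivity check, a single test line can be pushed through the gateway. This is a rough sketch, not part of this commit: `loki_payload` and `loki_smoke_test` are hypothetical helpers, the `job` label value is made up for illustration, and the sketch assumes the gateway accepts the same bearer token that `config/promtail.yaml` sends. The message must not contain characters that need JSON escaping.

```bash
# Hypothetical helpers (assumptions, not shipped with this commit).

# Build a Loki push payload with one log line.
# $1: timestamp in nanoseconds since the Unix epoch, $2: project label, $3: message
loki_payload() {
  printf '{"streams":[{"stream":{"job":"smoke-test","project":"%s"},"values":[["%s","%s"]]}]}' "$2" "$1" "$3"
}

# Push one line and print the HTTP status code.
# Requires LOKI_URL, OBS_AUTH_TOKEN and OBS_PROJECT in the environment.
loki_smoke_test() {
  ts="$(date +%s)000000000"  # seconds since the epoch, padded to nanoseconds
  loki_payload "$ts" "$OBS_PROJECT" "observability smoke test" |
    curl -sS -o /dev/null -w '%{http_code}\n' \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OBS_AUTH_TOKEN" \
      --data-binary @- "$LOKI_URL"
}
```

A 2xx status code suggests the push was accepted; a 401 or 403 points at the token, and a connection error points back at step 3.
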
### MongoDB metrics are missing

1. Check that MongoDB Exporter is running:
   ```bash
   docker logs chat-deploy-mongodb-exporter
   ```

2. Verify the MongoDB connection:
   ```bash
   curl -s http://localhost:9216/metrics | grep mongodb_up
   # Expected: a mongodb_up sample with value 1
   ```

3. Confirm that the MongoDB URI is well-formed and that the host is reachable over the network.

### Metrics are not pushed to the center

1. Check the Prometheus Agent logs:
   ```bash
   docker logs chat-deploy-prometheus-agent
   ```

2. Confirm that the remote_write URL (`METRICS_REMOTE_WRITE_URL`) is correct.

3. Check that the network firewall allows access to the center's port.

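The entrypoint script exits with status 2 when the remote_write URL is empty, or when auth is enabled without a token. The sketch below mirrors those two checks so they can be run against local values before the container starts. `preflight_remote_write` is a hypothetical name, not part of this commit.

```bash
# Hypothetical pre-flight check (an assumption, not shipped with this commit);
# it mirrors the validation in config/prometheus-agent-entrypoint.sh.
# $1: METRICS_REMOTE_WRITE_URL, $2: OBS_AUTH_ENABLE, $3: OBS_AUTH_TOKEN
preflight_remote_write() {
  url="$1"; auth_enable="$2"; token="$3"
  if [ -z "$url" ]; then
    echo "METRICS_REMOTE_WRITE_URL is empty" >&2
    return 2
  fi
  case "$auth_enable" in
    1|true|TRUE|yes|YES|on|ON)
      if [ -z "$token" ]; then
        echo "OBS_AUTH_ENABLE is set but OBS_AUTH_TOKEN is empty" >&2
        return 2
      fi
      ;;
  esac
  return 0
}
```

For example, `preflight_remote_write "$METRICS_REMOTE_WRITE_URL" "$OBS_AUTH_ENABLE" "$OBS_AUTH_TOKEN"` after loading the values from `config.env`.
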
## Updating the configuration

After editing `config.env`, restart the services:

```bash
docker compose --env-file config.env -f docker-compose-observability.yaml --profile metrics down
docker compose --env-file config.env -f docker-compose-observability.yaml --profile metrics up -d
```
deployments/observability/config.env.example
@@ -0,0 +1,71 @@
# ==============================
# chat-deploy observability configuration
# ==============================

# Project identity (must match the configuration in itom-platform)
OBS_PROJECT=chat-deploy
OBS_SERVICE=chat-deploy
OBS_SERVICE_NAME=chat-deploy
OBS_ENV=prod

# ==============================
# Center-side connection settings (must be changed)
# ==============================
# Address of the itom-platform observability center
OBS_HOST=CHANGE_ME
OBS_SCHEME=http

# Auth token (must match the itom-platform center side)
OBS_AUTH_TOKEN=CHANGE_ME

# ==============================
# Log collection (Promtail)
# ==============================
# Loki push URL (via the gateway)
LOKI_URL=${OBS_SCHEME}://${OBS_HOST}/loki/api/v1/push

# Promtail image version
PROMTAIL_IMAGE=grafana/promtail:3.0.0

# Docker API version (avoids protocol incompatibility with older Docker daemons)
DOCKER_API_VERSION=1.44

# ==============================
# Metrics collection (Prometheus Agent)
# ==============================
# Center-side remote_write receiver address
METRICS_REMOTE_WRITE_URL=${OBS_SCHEME}://${OBS_HOST}/api/v1/write

# Whether the center side requires auth for remote_write
OBS_AUTH_ENABLE=false

# Scrape targets for business services (optional; format: name=host:port,name2=host:port)
METRICS_TARGETS=

# ==============================
# MongoDB (must be changed)
# ==============================
# MongoDB connection URI (used by mongodb-exporter)
# Format: mongodb://username:password@host:port/database?authSource=admin
MONGODB_URI=mongodb://openIM:openIM123@localhost:37017/openim_v3?authSource=openim_v3

# MongoDB Exporter scrape targets (configured automatically; usually no change needed)
MONGODB_EXPORTER_TARGETS=mongodb-exporter:9216
MONGODB_EXPORTER_SERVICE=chat-deploy

# ==============================
# Redis (optional)
# ==============================
# Redis address (used by redis-exporter)
REDIS_ADDR=localhost:6379
REDIS_PASSWORD=

# Redis Exporter scrape targets
REDIS_EXPORTER_TARGETS=redis-exporter:9121
REDIS_EXPORTER_SERVICE=chat-deploy

# ==============================
# Node Exporter (optional)
# ==============================
NODE_EXPORTER_TARGETS=node-exporter:9100
NODE_EXPORTER_SERVICE=chat-deploy
deployments/observability/config/prometheus-agent-entrypoint.sh
@@ -0,0 +1,236 @@
#!/usr/bin/env sh
set -eu

# ------------------------------
# chat-deploy metrics collection: Prometheus Agent entrypoint
# - Generates /prometheus/prometheus.yml and /prometheus/targets.json from environment variables
# - Then starts Prometheus in agent mode (remote_write pushes to the itom-platform center)
#
# Key environment variables (from config.env):
# - METRICS_REMOTE_WRITE_URL=http(s)://<OBS_HOST>/api/v1/write
# - METRICS_TARGETS=name=host:port,name2=host:port
# - OBS_AUTH_ENABLE=false/true (whether the center side requires auth)
# - OBS_AUTH_TOKEN=xxxxx (required when OBS_AUTH_ENABLE=true)
# - OBS_PROJECT/OBS_ENV: written as labels so the center side can filter on them
# ------------------------------

METRICS_REMOTE_WRITE_URL="${METRICS_REMOTE_WRITE_URL:-}"
OBS_AUTH_ENABLE="${OBS_AUTH_ENABLE:-false}"
OBS_AUTH_TOKEN="${OBS_AUTH_TOKEN:-}"
METRICS_TARGETS="${METRICS_TARGETS:-}"

OBS_PROJECT="${OBS_PROJECT:-chat-deploy}"
OBS_ENV="${OBS_ENV:-prod}"
OBS_SERVICE="${OBS_SERVICE:-chat-deploy}"
OBS_SERVICE_NAME="${OBS_SERVICE_NAME:-$OBS_SERVICE}"
if [ -z "$OBS_SERVICE_NAME" ]; then
  OBS_SERVICE_NAME="chat-deploy"
fi

is_truthy() { case "$1" in 1|true|TRUE|yes|YES|on|ON) return 0 ;; esac; return 1; }

if [ -z "$METRICS_REMOTE_WRITE_URL" ]; then
  echo "[prometheus-agent] FAIL: METRICS_REMOTE_WRITE_URL is empty" >&2
  exit 2
fi

if is_truthy "$OBS_AUTH_ENABLE" && [ -z "$OBS_AUTH_TOKEN" ]; then
  echo "[prometheus-agent] FAIL: OBS_AUTH_ENABLE=true but OBS_AUTH_TOKEN is empty" >&2
  exit 2
fi

# Ensure the data directory exists and is writable
mkdir -p /prometheus/data
chmod -R 777 /prometheus 2>/dev/null || true

TARGETS_JSON="/prometheus/targets.json"
echo "[" > "$TARGETS_JSON"

# Parse METRICS_TARGETS (name=host:port,...) into file_sd target groups
first=1
IFS=','
for item in $METRICS_TARGETS; do
  item="$(printf "%s" "$item" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')"
  [ -z "$item" ] && continue

  name="$(printf "%s" "$item" | cut -d= -f1 | tr -d '[:space:]')"
  target="$(printf "%s" "$item" | cut -d= -f2- | tr -d '[:space:]')"
  [ -z "$name" ] && continue
  [ -z "$target" ] && continue

  # Strip any scheme and path so only host:port remains
  target="$(printf "%s" "$target" | sed -E 's#^https?://##; s#/.*$##')"

  if [ "$first" -eq 1 ]; then first=0; else echo "," >> "$TARGETS_JSON"; fi
  cat >> "$TARGETS_JSON" <<EOF
  {"targets":["$target"],"labels":{"service":"$name","project":"$OBS_PROJECT","env":"$OBS_ENV"}}
EOF
done
unset IFS
echo "]" >> "$TARGETS_JSON"

if [ "$first" -eq 1 ]; then
  echo "[prometheus-agent] WARN: METRICS_TARGETS is empty or malformed; scraping only the prometheus-agent's own metrics and exporter metrics" >&2
fi

CONFIG="/prometheus/prometheus.yml"
cat > "$CONFIG" <<EOF
global:
  scrape_interval: 15s
  external_labels:
    project: $OBS_PROJECT
    env: $OBS_ENV
    service: $OBS_SERVICE_NAME

scrape_configs:
  # Prometheus Agent's own metrics (used to verify that the collection pipeline works)
  - job_name: "prometheus-agent"
    static_configs:
      - targets: ["localhost:9090"]

  # Business service metrics (configured through METRICS_TARGETS or file_sd_configs)
  - job_name: "services"
    metrics_path: /metrics
    file_sd_configs:
      - files: ["$TARGETS_JSON"]
        refresh_interval: 10s
EOF

# Auto-discover and scrape exporter metrics (Redis, MongoDB, Node)
echo "" >> "$CONFIG"
echo "  # Auto-discovered / manually specified exporter jobs" >> "$CONFIG"

# Strip leading and trailing whitespace
trim() {
  printf "%s" "$1" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'
}

# Reduce a target to host:port, appending the default port when none is given
normalize_target() {
  # Trim first so that a leading space does not hide the scheme from sed
  raw="$(trim "$1")"
  default_port="$2"
  raw="$(printf "%s" "$raw" | sed -E 's#^https?://##; s#/.*$##')"
  [ -z "$raw" ] && return 1
  case "$raw" in
    *:*) printf "%s" "$raw" ;;
    *) printf "%s:%s" "$raw" "$default_port" ;;
  esac
}

# Append one scrape job with static targets to the generated config
# Arguments: job name, comma-separated targets, default port, service label
append_exporter_job() {
  job="$1"
  targets="$2"
  default_port="$3"
  service_label="$4"
  targets_list=""

  IFS=','
  for item in $targets; do
    item="$(normalize_target "$item" "$default_port" || true)"
    [ -z "$item" ] && continue
    if [ -z "$targets_list" ]; then
      targets_list="$item"
    else
      targets_list="${targets_list},$item"
    fi
  done
  unset IFS

  if [ -z "$targets_list" ]; then
    return 0
  fi

  echo "  - job_name: '$job'" >> "$CONFIG"
  echo "    static_configs:" >> "$CONFIG"
  echo "      - targets:" >> "$CONFIG"
  IFS=','
  for item in $targets_list; do
    echo "          - '$item'" >> "$CONFIG"
  done
  unset IFS
  cat >> "$CONFIG" <<EOF
        labels:
          project: '${OBS_PROJECT}'
          service: '${service_label}'
EOF
}

# Redis Exporter (port 9121)
REDIS_EXPORTER_HOST=""
REDIS_EXPORTER_TARGETS="${REDIS_EXPORTER_TARGETS:-}"
REDIS_EXPORTER_SERVICE="${REDIS_EXPORTER_SERVICE:-$OBS_SERVICE_NAME}"
if getent hosts redis-exporter >/dev/null 2>&1; then
  REDIS_EXPORTER_HOST="redis-exporter"
elif getent hosts chat-deploy-redis-exporter >/dev/null 2>&1; then
  REDIS_EXPORTER_HOST="chat-deploy-redis-exporter"
fi
if [ -n "$REDIS_EXPORTER_HOST" ]; then
  if [ -n "$REDIS_EXPORTER_TARGETS" ]; then
    REDIS_EXPORTER_TARGETS="${REDIS_EXPORTER_TARGETS},${REDIS_EXPORTER_HOST}:9121"
  else
    REDIS_EXPORTER_TARGETS="${REDIS_EXPORTER_HOST}:9121"
  fi
fi
append_exporter_job "redis" "$REDIS_EXPORTER_TARGETS" "9121" "$REDIS_EXPORTER_SERVICE"
if [ -n "$REDIS_EXPORTER_TARGETS" ]; then
  echo "[prometheus-agent] Redis Exporter scrape targets configured (project=${OBS_PROJECT} service=${REDIS_EXPORTER_SERVICE})"
fi

# MongoDB Exporter (port 9216); chat-deploy uses MongoDB
MONGODB_EXPORTER_HOST=""
MONGODB_EXPORTER_TARGETS="${MONGODB_EXPORTER_TARGETS:-}"
MONGODB_EXPORTER_SERVICE="${MONGODB_EXPORTER_SERVICE:-$OBS_SERVICE_NAME}"
if getent hosts mongodb-exporter >/dev/null 2>&1; then
  MONGODB_EXPORTER_HOST="mongodb-exporter"
elif getent hosts chat-deploy-mongodb-exporter >/dev/null 2>&1; then
  MONGODB_EXPORTER_HOST="chat-deploy-mongodb-exporter"
fi
if [ -n "$MONGODB_EXPORTER_HOST" ]; then
  if [ -n "$MONGODB_EXPORTER_TARGETS" ]; then
    MONGODB_EXPORTER_TARGETS="${MONGODB_EXPORTER_TARGETS},${MONGODB_EXPORTER_HOST}:9216"
  else
    MONGODB_EXPORTER_TARGETS="${MONGODB_EXPORTER_HOST}:9216"
  fi
fi
append_exporter_job "mongodb" "$MONGODB_EXPORTER_TARGETS" "9216" "$MONGODB_EXPORTER_SERVICE"
if [ -n "$MONGODB_EXPORTER_TARGETS" ]; then
  echo "[prometheus-agent] MongoDB Exporter scrape targets configured (project=${OBS_PROJECT} service=${MONGODB_EXPORTER_SERVICE})"
fi

# Node Exporter (port 9100); system-level metrics (CPU/Memory/Disk/Network/IO)
NODE_EXPORTER_HOST=""
NODE_EXPORTER_TARGETS="${NODE_EXPORTER_TARGETS:-}"
NODE_EXPORTER_SERVICE="${NODE_EXPORTER_SERVICE:-$OBS_SERVICE_NAME}"
if getent hosts node-exporter >/dev/null 2>&1; then
  NODE_EXPORTER_HOST="node-exporter"
elif getent hosts chat-deploy-node-exporter >/dev/null 2>&1; then
  NODE_EXPORTER_HOST="chat-deploy-node-exporter"
fi
if [ -n "$NODE_EXPORTER_HOST" ]; then
  if [ -n "$NODE_EXPORTER_TARGETS" ]; then
    NODE_EXPORTER_TARGETS="${NODE_EXPORTER_TARGETS},${NODE_EXPORTER_HOST}:9100"
  else
    NODE_EXPORTER_TARGETS="${NODE_EXPORTER_HOST}:9100"
  fi
fi
append_exporter_job "node" "$NODE_EXPORTER_TARGETS" "9100" "$NODE_EXPORTER_SERVICE"
if [ -n "$NODE_EXPORTER_TARGETS" ]; then
  echo "[prometheus-agent] Node Exporter scrape targets configured (project=${OBS_PROJECT} service=${NODE_EXPORTER_SERVICE})"
fi

cat >> "$CONFIG" <<EOF

remote_write:
  - url: "$METRICS_REMOTE_WRITE_URL"
EOF

if is_truthy "$OBS_AUTH_ENABLE"; then
  echo "    bearer_token: \"$OBS_AUTH_TOKEN\"" >> "$CONFIG"
  echo "[prometheus-agent] remote_write auth enabled" >&2
else
  echo "[prometheus-agent] remote_write auth disabled" >&2
fi

echo "[prometheus-agent] Generated configuration file:"
cat "$CONFIG"
echo ""

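# Prometheus 3.x removed --enable-feature=agent; agent mode is selected with --agent
# and keeps its WAL under --storage.agent.path (this assumes the image is 3.x or later)
exec /bin/prometheus --agent --config.file=/prometheus/prometheus.yml --storage.agent.path=/prometheus/data --web.enable-lifecycle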
deployments/observability/config/promtail.yaml
@@ -0,0 +1,61 @@
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: ${LOKI_URL}
    bearer_token: ${OBS_AUTH_TOKEN}

scrape_configs:
  # ============================================
  # chat-deploy business-layer log collection
  # ============================================
  - job_name: chat-deploy-logs
    static_configs:
      - targets:
          - localhost
        labels:
          job: chat-deploy-logs
          project: ${OBS_PROJECT}
          service: ${OBS_SERVICE}
          log_layer: business
          __path__: /var/lib/docker/containers/*/*-json.log

    pipeline_stages:
      # Parse the Docker JSON log format
      - docker: {}

      # Extract the container ID from the file path
      - regex:
          source: filename
          expression: '/var/lib/docker/containers/(?P<container_id>[0-9a-f]{12})[0-9a-f]*/'
      - labels:
          container_id:

      # ============================================
      # Filter rules: drop only infrastructure-layer logs
      # Keep chat-deploy business-layer logs (including caller logs from Go services)
      # ============================================

      # 1. Drop internal logs of infrastructure components such as promtail/prometheus/grafana
      - drop:
          expression: 'caller=.*(promtail|prometheus|grafana|loki|exporter).*\.go:[0-9]+'
          drop_counter_reason: infra_go_internal_log

      # 2. Drop infrastructure component logs (prometheus/grafana/exporter, etc.)
      - drop:
          expression: '(component=|target=|scrape_pool=|instance=.*exporter)'
          drop_counter_reason: infrastructure_component

      # 3. Drop Docker service discovery and API error logs
      - drop:
          expression: '(docker_discovery|Unable to refresh target groups|client version.*is too old|Minimum supported API version)'
          drop_counter_reason: docker_api_error

      # 4. Drop file watcher event logs (promtail internals)
      - drop:
          expression: '(file watcher event|filetargetmanager|fsnotify)'
          drop_counter_reason: file_watcher
deployments/observability/docker-compose-observability.yaml
@@ -0,0 +1,135 @@
# chat-deploy observability components
# Pushes logs and metrics to the itom-platform observability center
#
# Usage:
#   1. Copy config.env.example to config.env and edit it
#   2. Start the services: docker compose --env-file config.env -f docker-compose-observability.yaml up -d
#   3. To enable metrics collection: docker compose --env-file config.env -f docker-compose-observability.yaml --profile metrics up -d

services:
  # ==============================
  # Log collection (Promtail)
  # Collects Docker container logs and pushes them to Loki on itom-platform
  # ==============================
  promtail:
    image: "${PROMTAIL_IMAGE:-grafana/promtail:3.0.0}"
    container_name: chat-deploy-promtail
    restart: always
    user: "0"
    command: ["-config.file=/etc/promtail/config.yml", "-config.expand-env=true"]
    # Disable Promtail's own log output to Docker to avoid a log feedback loop
    logging:
      driver: "none"
    environment:
      - LOKI_URL=${LOKI_URL}
      - OBS_AUTH_TOKEN=${OBS_AUTH_TOKEN}
      - OBS_PROJECT=${OBS_PROJECT}
      - OBS_SERVICE=${OBS_SERVICE}
      - DOCKER_API_VERSION=${DOCKER_API_VERSION:-1.44}
    volumes:
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./config/promtail.yaml:/etc/promtail/config.yml:ro
    networks:
      - chat-deploy-obs

  # ==============================
  # Metrics collection (Prometheus Agent, remote write)
  # Optional; enabled with --profile metrics
  # ==============================
  prometheus-agent:
    profiles: ["metrics"]
    image: prom/prometheus:latest
    container_name: chat-deploy-prometheus-agent
    restart: always
    user: "0"
    # Note: the entrypoint script below does not read these arguments; it passes its own flags
    command:
      - "--config.file=/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
    environment:
      - METRICS_REMOTE_WRITE_URL=${METRICS_REMOTE_WRITE_URL}
      - METRICS_TARGETS=${METRICS_TARGETS}
      - OBS_AUTH_ENABLE=${OBS_AUTH_ENABLE:-false}
      - OBS_AUTH_TOKEN=${OBS_AUTH_TOKEN}
      - OBS_PROJECT=${OBS_PROJECT}
      - OBS_SERVICE=${OBS_SERVICE}
      - OBS_SERVICE_NAME=${OBS_SERVICE_NAME}
      - OBS_ENV=${OBS_ENV:-prod}
      - REDIS_EXPORTER_TARGETS=${REDIS_EXPORTER_TARGETS}
      - MONGODB_EXPORTER_TARGETS=${MONGODB_EXPORTER_TARGETS}
      - NODE_EXPORTER_TARGETS=${NODE_EXPORTER_TARGETS}
      - REDIS_EXPORTER_SERVICE=${REDIS_EXPORTER_SERVICE}
      - MONGODB_EXPORTER_SERVICE=${MONGODB_EXPORTER_SERVICE}
      - NODE_EXPORTER_SERVICE=${NODE_EXPORTER_SERVICE}
    volumes:
      - prometheus_agent_data:/prometheus
      - ./config/prometheus-agent-entrypoint.sh:/etc/prometheus/entrypoint.sh:ro
    entrypoint: ["/bin/sh", "/etc/prometheus/entrypoint.sh"]
    networks:
      - chat-deploy-obs
    depends_on:
      - mongodb-exporter

  # ==============================
  # MongoDB Exporter
  # Collects MongoDB metrics; enabled with --profile metrics
  # ==============================
  mongodb-exporter:
    profiles: ["metrics"]
    image: percona/mongodb_exporter:0.40.0
    container_name: chat-deploy-mongodb-exporter
    restart: always
    command:
      - "--mongodb.uri=${MONGODB_URI}"
      - "--compatible-mode"
      - "--collect-all"
    environment:
      - MONGODB_URI=${MONGODB_URI}
    ports:
      - "9216:9216"
    networks:
      - chat-deploy-obs

  # ==============================
  # Redis Exporter (optional)
  # Collects Redis metrics; enabled with --profile metrics
  # ==============================
  redis-exporter:
    profiles: ["metrics"]
    image: oliver006/redis_exporter:latest
    container_name: chat-deploy-redis-exporter
    restart: always
    environment:
      - REDIS_ADDR=${REDIS_ADDR}
      - REDIS_PASSWORD=${REDIS_PASSWORD}
    ports:
      - "9121:9121"
    networks:
      - chat-deploy-obs

  # ==============================
  # Node Exporter (optional)
  # Collects system-level metrics (CPU/Memory/Disk/Network)
  # Enabled with --profile metrics
  # ==============================
  node-exporter:
    profiles: ["metrics"]
    image: prom/node-exporter:latest
    container_name: chat-deploy-node-exporter
    restart: always
    command:
      - "--path.rootfs=/host"
      - "--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)"
    volumes:
      - /:/host:ro,rslave
    ports:
      - "9100:9100"
    networks:
      - chat-deploy-obs

networks:
  chat-deploy-obs:
    driver: bridge

volumes:
  prometheus_agent_data: {}