Fix wrong links

This commit is contained in:
Baohua Yang
2026-02-22 16:04:41 -08:00
parent 4ca47b0ea1
commit 572266b2f4
78 changed files with 626 additions and 1136 deletions

.gitignore vendored
View File

@@ -3,6 +3,7 @@
*.tmp
.idea/
_book/
format_report.txt
*.swp
*.edx
.DS_Store
@@ -21,3 +22,10 @@ __pycache__/
# Check scripts
check_project_rules.py
check_dashes.py
checker.py
find_lists_no_space.py
fix_missing_spaces.py
fix_project_rules.py
fixer.py
format_headings.py

View File

@@ -129,8 +129,8 @@ $ docker rm abc123
| 方式 | 说明 | 适用场景 |
|------|------|---------|
-| **[数据卷 (Volume) ](../08_data/volume.md)** | Docker 管理的存储 | 数据库应用数据 |
-| **[绑定挂载 (Bind Mount) ](../08_data/bind-mounts.md)** | 挂载宿主机目录 | 开发时共享代码 |
+| **[数据卷 (Volume) ](../08_data/8.1_volume.md)** | Docker 管理的存储 | 数据库应用数据 |
+| **[绑定挂载 (Bind Mount) ](../08_data/8.2_bind-mounts.md)** | 挂载宿主机目录 | 开发时共享代码 |
```bash
## 使用数据卷(推荐)

View File

@@ -49,4 +49,4 @@
- [列出镜像](4.2_list.md)查看和过滤镜像
- [删除容器](../05_container/5.6_rm.md)清理容器
-- [数据卷](../08_data/volume.md)清理数据卷
+- [数据卷](../08_data/8.1_volume.md)清理数据卷

View File

@@ -54,4 +54,4 @@
- [终止容器](5.3_stop.md)优雅停止容器
- [删除镜像](../04_image/4.3_rm.md)清理镜像
-- [数据卷](../08_data/volume.md)数据卷管理
+- [数据卷](../08_data/8.1_volume.md)数据卷管理

View File

@@ -191,8 +191,8 @@
### 7.19.14 延伸阅读
-- [数据卷](../08_data/volume.md)卷的管理和使用
-- [挂载主机目录](../08_data/bind-mounts.md)Bind Mount
+- [数据卷](../08_data/8.1_volume.md)卷的管理和使用
+- [挂载主机目录](../08_data/8.2_bind-mounts.md)Bind Mount
- [Compose 数据管理](../11_compose/11.5_compose_file.md)Compose 中的卷配置
| 要点 | 说明 |
@@ -206,5 +206,5 @@
### 7.19.15 延伸阅读
- [网络配置](../09_network/README.md)Docker 网络详解
-- [端口映射](../09_network/port_mapping.md)-p 参数详解
+- [端口映射](../09_network/9.5_port_mapping.md)-p 参数详解
- [Compose 端口](../11_compose/11.5_compose_file.md)Compose 中的端口配置

View File

@@ -384,7 +384,7 @@ $ docker run -v mydata:/app/data nginx
$ docker run -v /host/path:/app/data nginx
```
-详见[绑定挂载](bind-mounts.md)章节
+详见[绑定挂载](8.2_bind-mounts.md)章节
---

View File

@@ -8,5 +8,5 @@
这一章介绍如何在 Docker 内部以及容器之间管理数据在容器中管理数据主要有两种方式
-* [数据卷](volume.md)
-* [挂载主机目录](bind-mounts.md)
+* [数据卷](8.1_volume.md)
+* [挂载主机目录](8.2_bind-mounts.md)

View File

@@ -12,8 +12,8 @@
### 8.5.1 延伸阅读
-- [数据卷](volume.md)Docker 管理的持久化存储
-- [tmpfs 挂载](tmpfs.md)内存临时存储
+- [数据卷](8.1_volume.md)Docker 管理的持久化存储
+- [tmpfs 挂载](8.3_tmpfs.md)内存临时存储
- [Compose 数据管理](../11_compose/11.5_compose_file.md)Compose 中的挂载配置
| 操作 | 命令 |
@@ -27,6 +27,6 @@
### 8.5.2 延伸阅读
-- [绑定挂载](bind-mounts.md)挂载宿主机目录
-- [tmpfs 挂载](tmpfs.md)内存中的临时存储
+- [绑定挂载](8.2_bind-mounts.md)挂载宿主机目录
+- [tmpfs 挂载](8.3_tmpfs.md)内存中的临时存储
- [存储驱动](../12_implementation/12.4_ufs.md)Docker 存储的底层原理

View File

@@ -33,9 +33,9 @@ graph TD
## 本章内容
-* [配置 DNS](dns.md)
-* [外部访问容器](port_mapping.md)
-* [网络类型](network_types.md)
-* [自定义网络](custom_network.md)
-* [容器互联](container_linking.md)
-* [网络隔离](network_isolation.md)
+* [配置 DNS](9.1_dns.md)
+* [外部访问容器](9.5_port_mapping.md)
+* [网络类型](9.2_network_types.md)
+* [自定义网络](9.3_custom_network.md)
+* [容器互联](9.4_container_linking.md)
+* [网络隔离](9.6_network_isolation.md)

View File

@@ -14,11 +14,11 @@
### 9.8.1 延伸阅读
-- [配置 DNS](dns.md)自定义 DNS 设置
-- [网络类型](network_types.md)BridgeHostNone 等网络模式
-- [自定义网络](custom_network.md)创建和管理自定义网络
-- [容器互联](container_linking.md)容器间通信方式
-- [端口映射](port_mapping.md)高级端口配置
-- [网络隔离](network_isolation.md)网络安全与隔离策略
+- [配置 DNS](9.1_dns.md)自定义 DNS 设置
+- [网络类型](9.2_network_types.md)BridgeHostNone 等网络模式
+- [自定义网络](9.3_custom_network.md)创建和管理自定义网络
+- [容器互联](9.4_container_linking.md)容器间通信方式
+- [端口映射](9.5_port_mapping.md)高级端口配置
+- [网络隔离](9.6_network_isolation.md)网络安全与隔离策略
- [EXPOSE 指令](../07_dockerfile/7.9_expose.md) Dockerfile 中声明端口
- [Compose 网络](../11_compose/11.5_compose_file.md)Compose 中的网络配置

View File

@@ -217,5 +217,5 @@ $ docker compose restart wordpress
### 11.8.7 延伸阅读
- [Compose 模板文件](11.5_compose_file.md)深入了解配置项
-- [数据卷](../08_data/volume.md)理解数据持久化
+- [数据卷](../08_data/8.1_volume.md)理解数据持久化
- [Docker Hub WordPress](https://hub.docker.com/_/wordpress):官方镜像文档

View File

@@ -2,7 +2,7 @@
`kubeadm` 提供了 `kubeadm init` 以及 `kubeadm join` 这两个命令作为快速创建 `Kubernetes` 集群的最佳实践
-> **重要说明** Kubernetes 1.24 内置 `dockershim` 已被移除Kubernetes 默认不再直接使用 Docker Engine 作为容器运行时 (CRI)因此**更推荐参考** 同目录下的[使用 kubeadm 部署 Kubernetes (CRI 使用 containerd)](kubeadm.md)
+> **重要说明** Kubernetes 1.24 内置 `dockershim` 已被移除Kubernetes 默认不再直接使用 Docker Engine 作为容器运行时 (CRI)因此**更推荐参考** 同目录下的[使用 kubeadm 部署 Kubernetes (CRI 使用 containerd)](14.1_kubeadm.md)
>
> 本文档主要用于历史环境/学习目的如果你确实需要在较新版本中继续使用 Docker Engine通常需要额外部署 `cri-dockerd` 并在 `kubeadm init/join` 中指定 `--cri-socket`

View File

@@ -13,5 +13,5 @@
### 14.9.1 延伸阅读
- [容器编排基础](../13_kubernetes_concepts/README.md)Kubernetes 核心概念
-- [Dashboard](dashboard.md)部署可视化管理界面
-- [kubectl](kubectl.md)命令行工具使用指南
+- [Dashboard](14.7_dashboard.md)部署可视化管理界面
+- [kubectl](14.8_kubectl.md)命令行工具使用指南

View File

@@ -0,0 +1,268 @@
## 19.1 Prometheus + Grafana
Prometheus 与 Grafana 是目前最流行的开源监控组合:前者负责数据采集与存储,后者负责数据可视化。
[Prometheus](https://prometheus.io/) 是一个开源的系统监控和报警工具包。它受 Google Borgmon 的启发,由 SoundCloud 在 2012 年创建。
### 19.1.1 架构简介
Prometheus 的主要组件包括:
* **Prometheus Server**:核心组件,负责收集和存储时间序列数据
* **Exporters**:负责向 Prometheus 暴露监控数据 (如 Node Exporter、cAdvisor)
* **Alertmanager**:处理报警发送
* **Pushgateway**:用于支持短生命周期的 Job 推送数据
### 19.1.2 快速部署
我们可以使用 Docker Compose 快速部署一套 Prometheus + Grafana 监控环境。
本节示例使用了:
* `node-exporter`:采集宿主机指标 (CPU、内存、磁盘、网络等)
* `cAdvisor`:采集容器指标 (容器 CPU/内存/网络 IO、文件系统等)
在生产环境中,建议将 Prometheus 的数据目录做持久化,并显式配置数据保留周期。
#### 1. 准备配置文件
创建 `prometheus.yml`:
```yaml
global:
scrape_interval: 15s
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: 'node-exporter'
static_configs:
- targets: ['node-exporter:9100']
- job_name: 'cadvisor'
static_configs:
- targets: ['cadvisor:8080']
rule_files:
- /etc/prometheus/rules.yml
```
#### 2. 编写 Docker Compose 文件
创建 `compose.yaml` (或 `docker-compose.yml`):
```yaml
services:
prometheus:
image: prom/prometheus:latest
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- ./rules.yml:/etc/prometheus/rules.yml
- prometheus_data:/prometheus
ports:
- "9090:9090"
command:
- --config.file=/etc/prometheus/prometheus.yml
- --storage.tsdb.path=/prometheus
- --storage.tsdb.retention.time=15d
networks:
- monitoring
grafana:
image: grafana/grafana:latest
ports:
- "3000:3000"
environment:
- GF_SECURITY_ADMIN_PASSWORD=admin
networks:
- monitoring
depends_on:
- prometheus
node-exporter:
image: prom/node-exporter:latest
ports:
- "9100:9100"
networks:
- monitoring
cadvisor:
image: gcr.io/cadvisor/cadvisor:latest
ports:
- "8080:8080"
volumes:
- /:/rootfs:ro
- /var/run:/var/run:ro
- /sys:/sys:ro
- /var/lib/docker/:/var/lib/docker:ro
networks:
- monitoring
networks:
monitoring:
volumes:
prometheus_data:
```
#### 3. 启动服务
运行以下命令:
```bash
$ docker compose up -d
```
启动后,访问以下地址:
* Prometheus: `http://localhost:9090`
* Grafana: `http://localhost:3000` (默认账号密码:admin/admin)
### 19.1.3 配置 Grafana 面板
1. 在 Grafana 中添加 Prometheus 数据源,URL 填写 `http://prometheus:9090`
2. 导入现成的 Dashboard 模板,例如 [Node Exporter Full](https://grafana.com/grafana/dashboards/1860) (ID: 1860) 和 [Docker Container](https://grafana.com/grafana/dashboards/193) (ID: 193)。
这样,你就拥有了一个直观的容器监控大屏。
### 19.1.4 生产要点与告警闭环
完成部署后,建议补齐以下生产要点。
#### 指标采集的最小闭环
1. 在 Prometheus 页面打开 **Status -> Targets**,确认 `prometheus`、`node-exporter`、`cadvisor` 的 `State` 均为 `UP`
2. 在 **Graph** 中尝试查询:
    * `up`
    * `rate(container_cpu_usage_seconds_total[5m])`
3. 在 Grafana Dashboard 中重点关注:
    * 宿主机 CPU/Load/内存/磁盘
    * 容器 CPU/内存使用率、容器重启次数
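上面的 Targets 检查也可以脚本化。下面是一个解析 Prometheus HTTP API `/api/v1/query?query=up` 返回结构的最小示例,用于找出状态不为 UP 的抓取目标;其中的响应样例为假设数据,仅用于演示返回格式,真实环境可用 `urllib` 请求 `http://localhost:9090` 获取。

```python
import json

def down_targets(api_response: dict) -> list:
    """从 query=up 的返回中找出样本值不为 "1" 的目标 (即 DOWN)。"""
    results = api_response.get("data", {}).get("result", [])
    down = []
    for item in results:
        # value 形如 [<时间戳>, "<样本值字符串>"]
        if item["value"][1] != "1":
            labels = item["metric"]
            down.append(f'{labels.get("job")}/{labels.get("instance")}')
    return down

# 假设的响应样例 (结构参考 Prometheus instant query API)
sample = json.loads('''{
  "status": "success",
  "data": {"resultType": "vector", "result": [
    {"metric": {"job": "prometheus", "instance": "localhost:9090"}, "value": [1700000000, "1"]},
    {"metric": {"job": "cadvisor", "instance": "cadvisor:8080"}, "value": [1700000000, "0"]}
  ]}
}''')

print(down_targets(sample))  # 只有 cadvisor 目标为 DOWN
```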
如果你发现面板为空,通常不是 Grafana 的问题,而是 Prometheus 没抓到数据,或查询标签与 Dashboard 不匹配。
#### 常见问题排查
* **Target down**:检查容器网络是否互通、端口是否暴露到同一网络,以及 exporter 是否在容器内正常监听
* **cAdvisor 无数据或报错**:确认挂载了 Docker 目录与宿主机的 `/sys`、`/var/run` 等路径,并确保宿主机上 Docker 运行正常
* **指标缺失**:确认你的 Docker/内核版本与 cAdvisor 兼容,对于 containerd 等运行时,采集方式会不同
#### 关键指标速查 (节点/容器)
在生产环境排障时,建议优先关注下面几类指标,并在 Grafana 面板中建立对应的常用视图:
* **节点 CPU 使用率**:`100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)`
* **节点内存使用率**:`(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100`
* **节点磁盘空间使用率**:`(1 - (node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"})) * 100`
* **容器 CPU**:`sum by (name) (rate(container_cpu_usage_seconds_total[5m]))`
* **容器内存**:`sum by (name) (container_memory_working_set_bytes)`
说明:不同版本的 cAdvisor/Docker,label 命名可能存在差异 (如 `name`、`container`、`container_name`)。如果查询为空,建议先用 `label_values(container_cpu_usage_seconds_total, __name__)` 或在 Prometheus 的图形界面查看可用 label。
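针对上述 label 差异,可以在查询侧做一层简单封装,排查时逐个尝试候选 label。下面是一个示意,label 候选列表为经验假设,并非权威列表:

```python
def container_cpu_query(group_label: str) -> str:
    """按给定的分组 label 生成容器 CPU 使用率的 PromQL 查询。"""
    return f'sum by ({group_label}) (rate(container_cpu_usage_seconds_total[5m]))'

# 不同版本的 cAdvisor 可能用 name / container / container_name 作为容器名 label
for label in ("name", "container", "container_name"):
    print(container_cpu_query(label))
```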
#### Targets down 排错清单
当 **Status -> Targets** 出现 `DOWN` 时,建议按以下顺序排查:
1. **网络连通性**:Prometheus 容器是否能解析并访问目标 (同一 Docker network、DNS、端口)
2. **端口/路径**:确认 exporter 监听端口与 Prometheus 配置一致,必要时在 Prometheus 容器内执行 `curl http://node-exporter:9100/metrics` 验证
3. **权限/挂载**:cAdvisor 需要访问宿主机 `/sys`、`/var/lib/docker` 等挂载路径,缺失会导致指标不全或报错
4. **时间问题**:宿主机与容器时间偏差过大,可能导致数据看起来断档,需要检查 NTP/时区配置
5. **目标本身异常**:确认 exporter 容器是否在重启,查看 `docker logs`
#### 告警 (Alertmanager) 建议
生产环境建议引入 Alertmanager 做告警聚合与路由,并在 Prometheus 中配置 `alerting` 与 `rule_files`。
为了保持最小告警闭环,建议至少覆盖两类告警:
* **采集链路告警**:例如 `up == 0`,用于发现 exporter 或网络故障
* **资源风险告警**:例如节点磁盘空间不足,用于提前发现容量风险
##### 1. 准备告警规则文件
创建 `rules.yml`:
```yaml
groups:
- name: docker_practice
rules:
- alert: PrometheusTargetDown
expr: up == 0
for: 2m
labels:
severity: warning
annotations:
summary: "Prometheus 抓取目标不可达"
description: "Job={{ $labels.job }}, Instance={{ $labels.instance }}"
- alert: HostDiskSpaceLow
expr: |
(node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) < 0.10
for: 10m
labels:
severity: critical
annotations:
summary: "磁盘可用空间不足"
description: "Instance={{ $labels.instance }}, Mountpoint={{ $labels.mountpoint }}"
```
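`HostDiskSpaceLow` 的表达式本质上是"可用空间占比低于 10%"这一判断。下面用 Python 复现该阈值逻辑,便于在调整阈值前先核对边界行为;示例中的磁盘数值为假设值:

```python
def disk_space_low(avail_bytes: float, size_bytes: float, threshold: float = 0.10) -> bool:
    """对应 PromQL 表达式: avail / size < threshold"""
    if size_bytes <= 0:
        raise ValueError("size_bytes 必须为正数")
    return avail_bytes / size_bytes < threshold

# 假设 100 GiB 的盘:只剩 5 GiB 可用 -> 触发;剩 20 GiB -> 不触发
print(disk_space_low(5 * 1024 ** 3, 100 * 1024 ** 3))   # True
print(disk_space_low(20 * 1024 ** 3, 100 * 1024 ** 3))  # False
```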
说明:这里的规则是"可用空间低于 10%"的阈值告警,并非"未来 24 小时写满"的预测。生产环境建议针对特定文件系统与挂载点做更精确的过滤。
##### 2. 配置 Prometheus 加载规则并接入 Alertmanager
修改 `prometheus.yml`,增加:
```yaml
rule_files:
- /etc/prometheus/rules.yml
alerting:
alertmanagers:
- static_configs:
- targets: ["alertmanager:9093"]
```
并在 Compose 中挂载规则文件。
##### 3. 部署 Alertmanager
创建 `alertmanager.yml`:
```yaml
route:
receiver: default
receivers:
- name: default
webhook_configs:
- url: http://example.com/webhook
```
再在 `compose.yaml` 中增加服务:
```yaml
alertmanager:
image: prom/alertmanager:latest
volumes:
- ./alertmanager.yml:/etc/alertmanager/alertmanager.yml
ports:
- "9093:9093"
networks:
- monitoring
```
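webhook 接收端收到的是 Alertmanager 推送的 JSON 载荷,其中包含 `alerts` 列表及各自的 `labels`/`annotations`。下面是解析该载荷的一个最小示例;字段结构按 Alertmanager webhook 格式整理,样例数据为假设值:

```python
import json

def summarize_alerts(payload: dict) -> list:
    """把 Alertmanager webhook 载荷整理成可读的告警摘要列表。"""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        lines.append(
            f'[{alert.get("status")}] {labels.get("alertname")} '
            f'severity={labels.get("severity")} instance={labels.get("instance")}'
        )
    return lines

# 假设的 webhook 载荷样例
sample = json.loads('''{
  "status": "firing",
  "alerts": [
    {"status": "firing",
     "labels": {"alertname": "PrometheusTargetDown", "severity": "warning",
                "instance": "node-exporter:9100", "job": "node-exporter"},
     "annotations": {"summary": "Prometheus 抓取目标不可达"}}
  ]
}''')

for line in summarize_alerts(sample):
    print(line)
```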
生产环境中,建议将告警发送到可追踪的渠道 (如 IM 机器人、事件平台、工单系统),并在告警中附带 Dashboard 链接与排障入口,避免告警成为噪声。
#### 建议的文件清单
为了避免示例难以复现,建议在同一目录下准备以下文件:
* `compose.yaml`:Prometheus、Grafana、exporters、Alertmanager 的部署文件
* `prometheus.yml`:Prometheus 抓取配置与告警配置
* `rules.yml`:告警规则
* `alertmanager.yml`:告警路由与接收器配置

View File

@@ -0,0 +1,212 @@
## 19.2 ELK/EFK 堆栈
ELK (Elasticsearch、Logstash、Kibana) 是目前业界最流行的开源日志解决方案。而在容器领域,由于 Fluentd 更加轻量级且对容器支持更好,EFK (Elasticsearch、Fluentd、Kibana) 组合也变得非常流行。
### 19.2.1 方案架构
我们将采用以下架构:
1. **Docker Container**:容器将日志输出到标准输出 (stdout/stderr)
2. **Fluentd**:作为 Docker Logging Driver 或运行为守护容器,收集容器日志
3. **Elasticsearch**:存储从 Fluentd 接收到的日志数据
4. **Kibana**:从 Elasticsearch 读取数据并进行可视化展示
### 19.2.2 部署流程
我们将使用 Docker Compose 来一键部署整个日志堆栈。
#### 1. 编写 Compose 文件
1. 编写 `compose.yaml` (或 `docker-compose.yml`),配置如下:
```yaml
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0
container_name: elasticsearch
environment:
- "discovery.type=single-node"
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
ports:
- "9200:9200"
volumes:
- es_data:/usr/share/elasticsearch/data
networks:
- logging
kibana:
image: docker.elastic.co/kibana/kibana:7.17.0
container_name: kibana
environment:
- ELASTICSEARCH_HOSTS=http://elasticsearch:9200
ports:
- "5601:5601"
links:
- elasticsearch
networks:
- logging
fluentd:
image: fluent/fluentd-kubernetes-daemonset:v1.14.3-debian-elasticsearch7-1.0
container_name: fluentd
environment:
- "FLUENT_ELASTICSEARCH_HOST=elasticsearch"
- "FLUENT_ELASTICSEARCH_PORT=9200"
- "FLUENT_ELASTICSEARCH_SCHEME=http"
- "FLUENT_UID=0"
ports:
- "24224:24224"
- "24224:24224/udp"
links:
- elasticsearch
volumes:
- ./fluentd/conf:/fluentd/etc
networks:
- logging
volumes:
es_data:
networks:
logging:
```
#### 2. 配置 Fluentd
创建 `fluentd/conf/fluent.conf`:
```ini
<source>
@type forward
port 24224
bind 0.0.0.0
</source>
<match *.**>
@type copy
<store>
@type elasticsearch
host elasticsearch
port 9200
logstash_format true
logstash_prefix docker
logstash_dateformat %Y%m%d
include_tag_key true
type_name access_log
tag_key @log_name
flush_interval 1s
</store>
<store>
@type stdout
</store>
</match>
```
#### 3. 配置应用容器使用 fluentd 驱动
启动一个测试容器,指定日志驱动为 `fluentd`:
```bash
docker run -d \
--log-driver=fluentd \
--log-opt fluentd-address=localhost:24224 \
--log-opt tag=nginx-test \
--name nginx-test \
nginx
```
**注意**:确保 `fluentd` 容器已经启动并监听在 `localhost:24224`。在生产环境中,如果应用运行在不同机器上,需要将 `localhost` 替换为运行 fluentd 的主机 IP。
#### 4. 在 Kibana 中查看日志
1. 访问 `http://localhost:5601`
2. 进入 **Management** -> **Kibana** -> **Index Patterns**
3. 创建新的 Index Pattern,输入 `docker-*` (即我们在 fluent.conf 中配置的前缀)
4. 选择 `@timestamp` 作为时间字段
5. 在 **Discover** 页面,你就能看到 Nginx 容器的日志了
#### Kibana 建索引模式常见坑
首次接入 EFK/ELK 时,"Elasticsearch 有数据但 Kibana 看不到"很常见,通常是 Kibana 配置或时间窗口问题:
* **Index Pattern 不匹配**:确认 Kibana Index Pattern 与实际索引前缀一致,可以先用 `_cat/indices` 查看真实索引名
* **时间字段选择错误**:若索引里包含 `@timestamp`,一般选择它。如果选择了错误的字段,会导致 Discover 无法按时间筛选
* **时间窗口/时区**:Discover 右上角的时间范围默认可能是"最近 15 分钟",且时区可能影响显示,建议先把范围扩大到最近 24 小时再验证
* **数据解析失败**:若日志是非结构化文本,仍可入库但字段不可用。生产环境建议输出 JSON 并在采集端解析
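"Index Pattern 是否匹配"可以用通配符规则自查。下面用 Python 标准库的 `fnmatch` 模拟 Kibana 按模式匹配索引名的过程,索引名为假设样例:

```python
from fnmatch import fnmatch

def match_indices(pattern: str, indices: list) -> list:
    """返回能被给定 Index Pattern 匹配到的索引名。"""
    return [name for name in indices if fnmatch(name, pattern)]

# 假设通过 _cat/indices 看到的真实索引如下
indices = ["docker-2026.02.20", "docker-2026.02.21", "fluentd-2026.02.21", ".kibana_1"]

print(match_indices("docker-*", indices))    # 匹配到两个 docker- 索引
print(match_indices("logstash-*", indices))  # 为空 -> Discover 自然看不到数据
```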
#### 5. 验证日志是否写入 Elasticsearch (生产排错必备)
当你在 Kibana 看不到日志时,建议先跳过 UI,从存储端直接验证日志是否入库。
1. 查看索引是否创建:
```bash
curl -s http://localhost:9200/_cat/indices?v
```
如果 Fluentd 使用了 `logstash_format true` 与 `logstash_prefix docker`,通常会看到形如 `docker-YYYY.MM.DD` 的索引。
2. 查看最近一段时间的日志文档:
```bash
curl -s -H 'Content-Type: application/json' \
http://localhost:9200/docker-*/_search \
-d '{"size":1,"sort":[{"@timestamp":"desc"}]}'
```
如果 Elasticsearch 中已经有文档,但 Kibana 仍然为空,常见原因是:
* Index Pattern 没匹配到索引 (例如写成了 `docker-*` 但实际索引前缀不同)
* 时间字段没选对,或时区不一致,导致 Discover 时间窗口内看不到数据
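上面 `_cat/indices` 的输出也可以脚本化比对。下面的示例从 `_cat/indices?v` 的文本输出中提取 `index` 列;输出样例为假设数据,列名以实际版本的返回为准:

```python
def parse_cat_indices(cat_output: str) -> list:
    """从 Elasticsearch _cat/indices?v 的文本输出中提取 index 列。"""
    lines = [l for l in cat_output.strip().splitlines() if l.strip()]
    header = lines[0].split()
    idx = header.index("index")  # 定位 index 列所在位置
    return [line.split()[idx] for line in lines[1:]]

# 假设的 _cat/indices?v 输出样例
sample = """health status index              uuid pri rep docs.count store.size
yellow open   docker-2026.02.21  abc1 1   1   1024       1mb
yellow open   docker-2026.02.20  abc2 1   1   2048       2mb
"""

print(parse_cat_indices(sample))
```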
### 19.2.3 总结
通过 Docker 的日志驱动机制,结合 ELK/EFK 强大的收集和分析能力,我们可以轻松构建一个能够处理海量日志的监控平台,这对于排查生产问题至关重要。
### 19.2.4 生产要点
在生产环境中,日志系统往往比监控系统更容易因为容量与写入压力出问题,建议特别关注:
* **容量规划**:日志增长速度与磁盘占用直接相关,建议设置日志保留周期与索引生命周期策略 (ILM),避免 Elasticsearch 因磁盘水位触发只读或不可用
* **资源配置**:Elasticsearch 对 JVM Heap 较敏感,除示例中的 `ES_JAVA_OPTS` 外,生产环境需要结合节点内存、分片规模、查询压力做评估
* **链路可靠性**:采集端到存储端要考虑网络抖动、背压与重试策略,当 Elasticsearch 写入变慢时,采集端的缓冲与落盘策略决定了是否会丢日志
* **日志格式**:推荐应用输出结构化日志 (JSON),并包含关键字段 (如 `trace_id`、`request_id`、`service`、`env`),以便快速过滤与关联分析
#### 索引与保留策略的落地建议
无论是 EFK 还是 ELK,生产上都需要回答两个问题:
* 日志保留多久?
* 保留期内的日志如何保证可查询,且不过度占用存储?
建议按环境与业务重要性对日志分层,并制定不同的保留周期,例如:
* **生产环境**:7~30 天
* **测试环境**:1~7 天
实现方式通常有两类:
* **按天滚动索引**:如 `docker-YYYY.MM.DD`,再定期删除过期索引
* **使用 ILM**:定义 Hot/Warm/Cold/删除等阶段,按时间与容量自动滚动与回收
对于中小规模集群,先把"按天滚动 + 过期删除"做扎实,往往就能解决 80% 的容量问题。当日志量上来、查询压力变大后,再逐步引入 ILM、分层存储与更精细的分片规划。
#### 最小可用的过期索引清理示例
如果你采用按天滚动索引 (例如 `docker-YYYY.MM.DD`),可以通过 Elasticsearch API 定期清理过期索引。
下面的示例仅用于演示思路:获取所有 `docker-` 前缀索引,并删除指定索引。生产环境建议在基于日期计算、灰度验证与权限控制之后,再执行自动化清理。
1. 列出索引:
```bash
curl -s http://localhost:9200/_cat/indices/docker-*?v
```
2. 删除某个过期索引 (示例):
```bash
curl -X DELETE http://localhost:9200/docker-2026.02.01
```
如果你希望更自动化的治理能力,可以进一步使用 ILM 为索引配置滚动与删除策略。
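"基于日期计算"这一步可以先在本地验证。下面的示例根据保留天数计算哪些按天索引已过期,再交由删除脚本处理;索引名与保留期均为假设值:

```python
from datetime import date, timedelta

def expired_indices(indices, today, retention_days, prefix="docker-"):
    """返回按 docker-YYYY.MM.DD 命名、早于保留期的索引列表。"""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in indices:
        if not name.startswith(prefix):
            continue
        try:
            day = date(*map(int, name[len(prefix):].split(".")))
        except (ValueError, TypeError):
            continue  # 非日期后缀的索引跳过
        if day < cutoff:
            expired.append(name)
    return expired

# 假设今天是 2026-02-22,保留 7 天
indices = ["docker-2026.02.01", "docker-2026.02.20", "docker-2026.02.21"]
print(expired_indices(indices, today=date(2026, 2, 22), retention_days=7))
```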

View File

@@ -1,8 +1,17 @@
# 第十九章 容器监控与日志
-在生产环境中,容器化应用部署完成后,实时掌握容器集群的状态以及应用日志非常重要。本章将介绍针对 Docker 容器和 Kubernetes 集群的监控与日志管理方案。
+在生产环境中,容器化应用部署完成后,实时掌握容器的运行状态以及应用日志非常重要。本章将以 Docker/Compose 的场景为主,介绍容器监控与日志管理的落地思路与最小实践闭环。
+对于 Kubernetes 场景,可观测性链路与组件选择通常会有所不同 (例如使用 Prometheus Operator、以 DaemonSet 方式采集日志等),本章会在关键点给出迁移提示,但不会展开为完整的 Kubernetes 教程。
我们将重点探讨以下内容:
- **容器监控**:以 Prometheus 为主,讲解如何采集和展示容器性能指标
- **日志管理**:以 ELK (Elasticsearch, Logstash, Kibana) 套件为例,介绍集中式日志收集平台
+为了让读者能够在生产环境中真正用起来,本章会补齐以下最小闭环:
+* 关键指标与日志的验证方法
+* 常见故障排查路径
+* 最小告警闭环 (Prometheus -> Alertmanager -> 接收端)
+* 日志容量治理的最小实践

View File

@@ -1,130 +0,0 @@
## 19.2 ELK/EFK 堆栈
ELK (ElasticsearchLogstashKibana) 是目前业界最流行的开源日志解决方案而在容器领域由于 Fluentd 更加轻量级且对容器支持更好EFK (ElasticsearchFluentdKibana) 组合也变得非常流行
### 19.2.1 方案架构
我们将采用以下架构
1. **Docker Container**容器将日志输出到标准输出 (stdout/stderr)
2. **Fluentd**作为 Docker Logging Driver 或运行为守护容器收集容器日志
3. **Elasticsearch**存储从 Fluentd 接收到的日志数据
4. **Kibana** Elasticsearch 读取数据并进行可视化展示
### 19.2.2 部署流程
我们将使用 Docker Compose 来一键部署整个日志堆栈
#### 1. 编写 Compose 文件
1. 编写 `compose.yaml` ( `docker-compose.yml`) 配置如下
```yaml
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0
container_name: elasticsearch
environment:
- "discovery.type=single-node"
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
ports:
- "9200:9200"
volumes:
- es_data:/usr/share/elasticsearch/data
networks:
- logging
kibana:
image: docker.elastic.co/kibana/kibana:7.17.0
container_name: kibana
environment:
- ELASTICSEARCH_HOSTS=http://elasticsearch:9200
ports:
- "5601:5601"
links:
- elasticsearch
networks:
- logging
fluentd:
image: fluent/fluentd-kubernetes-daemonset:v1.14.3-debian-elasticsearch7-1.0
container_name: fluentd
environment:
- "FLUENT_ELASTICSEARCH_HOST=elasticsearch"
- "FLUENT_ELASTICSEARCH_PORT=9200"
- "FLUENT_ELASTICSEARCH_SCHEME=http"
- "FLUENT_UID=0"
ports:
- "24224:24224"
- "24224:24224/udp"
links:
- elasticsearch
volumes:
- ./fluentd/conf:/fluentd/etc
networks:
- logging
volumes:
es_data:
networks:
logging:
```
#### 2. 配置 Fluentd
创建 `fluentd/conf/fluent.conf`
```ini
<source>
@type forward
port 24224
bind 0.0.0.0
</source>
<match *.**>
@type copy
<store>
@type elasticsearch
host elasticsearch
port 9200
logstash_format true
logstash_prefix docker
logstash_dateformat %Y%m%d
include_tag_key true
type_name access_log
tag_key @log_name
flush_interval 1s
</store>
<store>
@type stdout
</store>
</match>
```
#### 3. 配置应用容器使用 fluentd 驱动
启动一个测试容器指定日志驱动为 `fluentd`
```bash
docker run -d \
--log-driver=fluentd \
--log-opt fluentd-address=localhost:24224 \
--log-opt tag=nginx-test \
--name nginx-test \
nginx
```
**注意**确保 `fluentd` 容器已经启动并监听在 `localhost:24224`在生产环境中如果你是在不同机器上需要将 `localhost` 替换为运行 fluentd 的主机 IP
#### 4. Kibana 中查看日志
1. 访问 `http://localhost:5601`
2. 进入 **Management**->**Kibana**->**Index Patterns**
3. 创建新的 Index Pattern输入 `docker-*` (我们在 fluent.conf 中配置的前缀)
4. 选择 `@timestamp` 作为时间字段
5. **Discover** 页面你就能看到 Nginx 容器的日志了
### 19.2.3 总结
通过 Docker 的日志驱动机制结合 ELK/EFK 强大的收集和分析能力我们可以轻松构建一个能够处理海量日志的监控平台这对于排查生产问题至关重要

View File

@@ -1,109 +0,0 @@
## 19.1 Prometheus + Grafana
Prometheus Grafana 是目前最流行的开源监控组合前者负责数据采集与存储后者负责数据可视化
[Prometheus](https://prometheus.io/) 是一个开源的系统监控和报警工具包。它受 Google Borgmon 的启发,由 SoundCloud 在 2012 年创建。
### 19.1.1 架构简介
Prometheus 的主要组件包括
* **Prometheus Server**核心组件负责收集和存储时间序列数据
* **Exporters**负责向 Prometheus 暴露监控数据 ( Node ExportercAdvisor)
* **Alertmanager**处理报警发送
* **Pushgateway**用于支持短生命周期的 Job 推送数据
### 19.1.2 快速部署
我们可以使用 Docker Compose 快速部署一套 Prometheus + Grafana 监控环境
#### 1. 准备配置文件
创建 `prometheus.yml`
```yaml
global:
scrape_interval: 15s
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: 'node-exporter'
static_configs:
- targets: ['node-exporter:9100']
- job_name: 'cadvisor'
static_configs:
- targets: ['cadvisor:8080']
```
#### 2. 编写 Docker Compose 文件
创建 `compose.yaml` ( `docker-compose.yml`)
```yaml
services:
prometheus:
image: prom/prometheus:latest
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
ports:
- "9090:9090"
networks:
- monitoring
grafana:
image: grafana/grafana:latest
ports:
- "3000:3000"
environment:
- GF_SECURITY_ADMIN_PASSWORD=admin
networks:
- monitoring
depends_on:
- prometheus
node-exporter:
image: prom/node-exporter:latest
ports:
- "9100:9100"
networks:
- monitoring
cadvisor:
image: gcr.io/cadvisor/cadvisor:latest
ports:
- "8080:8080"
volumes:
- /:/rootfs:ro
- /var/run:/var/run:ro
- /sys:/sys:ro
- /var/lib/docker/:/var/lib/docker:ro
networks:
- monitoring
networks:
monitoring:
```
#### 3. 启动服务
运行以下命令
```bash
$ docker compose up -d
```
启动后访问以下地址
* Prometheus: `http://localhost:9090`
* Grafana`http://localhost:3000` (默认账号密码admin/admin)
### 19.1.3 配置 Grafana 面板
1. Grafana 中添加 Prometheus 数据源URL 填写 `http://prometheus:9090`
2. 导入现成的 Dashboard 模板例如 [Node Exporter Full](https://grafana.com/grafana/dashboards/1860) (ID1860) 和 [Docker Container](https://grafana.com/grafana/dashboards/193) (ID193)。
这样你就拥有了一个直观的容器监控大屏

View File

@@ -1,10 +1,15 @@
-# 日志管理
+# 本章小结
-在容器化环境中,日志管理比传统环境更为复杂。容器是短暂的,意味着容器内的日志文件可能会随着容器的销毁而丢失,因此我们需要一种集中式的日志管理方案来收集、存储和分析容器日志。
+本章从两个维度介绍了容器可观测性:
+* **指标监控**:以 Prometheus + Grafana 为主,完成指标采集、存储与可视化
+* **日志管理**:以 EFK/ELK 为例,完成容器日志的集中采集、检索与分析
+生产环境中,建议将可观测性当成一个完整闭环:**采集 -> 存储 -> 展示 -> 告警 -> 排错 -> 容量治理**。
## 19.3 Docker 日志驱动
-Docker 提供了多种日志驱动 (Log Driver) 机制,允许我们将容器日志转发到不同后端。
+Docker 提供了多种日志驱动 (Log Driver),用于将容器标准输出的日志转发到不同后端。
常见的日志驱动包括:
@@ -15,12 +20,30 @@ Docker 提供了多种日志驱动 (Log Driver) 机制,允许我们将容器
* `gelf`:支持 GELF 协议的日志后端 (如 Graylog)
* `awslogs`:发送到 Amazon CloudWatch Logs
-## 19.3 日志管理方案
+生产建议:无论采用哪种驱动,都要明确日志的保留周期、容量上限与传输可靠性,避免日志把磁盘写满,或链路抖动导致丢日志。
-对于大规模的容器集群,我们通常会采用 EFK (Elasticsearch + Fluentd + Kibana) 或 ELK (Elasticsearch + Logstash + Kibana) 方案:
+## 19.4 日志平台选型对比与注意事项
-* **Elasticsearch**:负责日志的存储和全文检索
-* **Fluentd/Logstash**:负责日志的采集、过滤和转发
-* **Kibana**:负责日志的可视化展示
+日志平台通常由采集/处理/存储/查询展示几部分组成,常见选型包括:
-本章将介绍如何使用 EFK 方案来处理 Docker 容器日志。
+* **EFK/ELK**:Elasticsearch + Fluentd/Logstash + Kibana,适合全文检索与结构化查询
+* **Loki + Grafana**:更偏"日志像指标一样存储"的思路,部署与成本可能更友好,但查询能力与使用习惯不同
+选型时建议关注:
+* **写入压力与背压**:当存储端变慢时,采集端是否会缓冲、落盘、重试,是否会影响业务
+* **容量治理**:是否具备按天/按大小滚动、保留策略、生命周期管理 (ILM) 等能力
+* **安全与合规**:鉴权、TLS、审计、敏感字段脱敏
+* **可运维性**:升级策略、备份恢复、告警指标是否齐全
+## 19.5 上线前检查清单
+你可以用下面的清单,快速检查是否具备最小生产可用性:
+* Prometheus 数据目录已持久化,并设置了合理的保留周期
+* Prometheus Targets 全部为 `UP`,并且关键查询 (CPU/内存/容器指标) 有数据
+* Grafana 已导入面板,并能定位到具体实例/容器,默认账号密码已修改
+* 至少有一条关键告警已打通 Alertmanager 的接收链路,并验证告警能被正确发送与抑制
+* Elasticsearch 数据目录已持久化,并有明确的日志保留周期与容量上限策略
+* Kibana 能查询到最新日志,且 UI 异常时,能用 Elasticsearch API 验证入库
+* 可观测性组件未直接暴露到公网,访问已加鉴权或置于内网

View File

@@ -2,8 +2,8 @@
本章将介绍 Docker 在不同操作系统镜像场景下的实战案例
-* [Busybox](busybox.md)
-* [Alpine](alpine.md)
-* [Debian Ubuntu](debian.md)
-* [CentOS Fedora](centos.md)
+* [Busybox](20.1_busybox.md)
+* [Alpine](20.2_alpine.md)
+* [Debian Ubuntu](20.3_debian.md)
+* [CentOS Fedora](20.4_centos.md)
* [本章小结](summary.md)

View File

@@ -2,10 +2,10 @@
本章将介绍 Docker DevOps 场景下的实战案例
-* [DevOps 完整工作流](devops_workflow.md)
-* [GitHub Actions](github_actions.md)
-* [Drone](drone.md)
-* [Drone Demo](drone_demo.md)
-* [ IDE 中使用 Docker](ide.md)
-* [VS Code](vsCode.md)
+* [DevOps 完整工作流](21.1_devops_workflow.md)
+* [GitHub Actions](21.2_github_actions.md)
+* [Drone](21.3_drone.md)
+* [Drone Demo](21.4_drone_demo.md)
+* [ IDE 中使用 Docker](21.5_ide.md)
+* [VS Code](21.6_vsCode.md)
* [本章小结](summary.md)

View File

@@ -2,7 +2,7 @@
[![](https://img.shields.io/github/stars/yeasy/docker_practice.svg?style=social&label=Stars)](https://github.com/yeasy/docker_practice) [![图](https://img.shields.io/github/release/yeasy/docker_practice/all.svg)](https://github.com/yeasy/docker_practice/releases) [![图](https://img.shields.io/badge/Based-Docker%20Engine%20v29.x-blue.svg)](https://docs.docker.com/engine/release-notes/) [![图](https://img.shields.io/badge/Docker%20%E6%8A%80%E6%9C%AF%E5%85%A5%E9%97%A8%E4%B8%8E%E5%AE%9E%E6%88%98-jd.com-red.svg)][1]
-**v1.5.8**
+**v1.5.9**
[Docker](https://www.docker.com) 是个划时代的开源项目,它彻底释放了计算虚拟化的威力,极大提高了应用的维护效率,降低了云计算应用开发的成本!使用 Docker可以让应用的部署、测试和分发都变得前所未有的高效和轻松

View File

@@ -75,17 +75,17 @@
* [7.18 实战多阶段构建 Laravel 镜像](07_dockerfile/7.18_multistage_builds_laravel.md)
* [本章小结](07_dockerfile/summary.md)
* [第八章 数据管理](08_data/README.md)
-* [8.1 数据卷](08_data/volume.md)
-* [8.2 挂载主机目录](08_data/bind-mounts.md)
-* [8.3 tmpfs 挂载](08_data/tmpfs.md)
+* [8.1 数据卷](08_data/8.1_volume.md)
+* [8.2 挂载主机目录](08_data/8.2_bind-mounts.md)
+* [8.3 tmpfs 挂载](08_data/8.3_tmpfs.md)
* [本章小结](08_data/summary.md)
* [第九章 网络配置](09_network/README.md)
-* [9.1 配置 DNS](09_network/dns.md)
-* [9.2 网络类型](09_network/network_types.md)
-* [9.3 自定义网络](09_network/custom_network.md)
-* [9.4 容器互联](09_network/container_linking.md)
-* [9.5 外部访问容器](09_network/port_mapping.md)
-* [9.6 网络隔离](09_network/network_isolation.md)
+* [9.1 配置 DNS](09_network/9.1_dns.md)
+* [9.2 网络类型](09_network/9.2_network_types.md)
+* [9.3 自定义网络](09_network/9.3_custom_network.md)
+* [9.4 容器互联](09_network/9.4_container_linking.md)
+* [9.5 外部访问容器](09_network/9.5_port_mapping.md)
+* [9.6 网络隔离](09_network/9.6_network_isolation.md)
* [本章小结](09_network/summary.md)
* [第十章 Docker Buildx](10_buildx/README.md)
* [10.1 BuildKit](10_buildx/10.1_buildkit.md)
@@ -115,68 +115,67 @@
* [12.6 网络](12_implementation/12.6_network.md)
* [本章小结](12_implementation/summary.md)
* [第十三章 容器编排基础](13_kubernetes_concepts/README.md)
-* [13.1 简介](13_kubernetes_concepts/intro.md)
-* [13.2 基本概念](13_kubernetes_concepts/concepts.md)
-* [13.3 架构设计](13_kubernetes_concepts/design.md)
-* [13.4 高级特性](13_kubernetes_concepts/advanced.md)
-* [13.5 实战练习](13_kubernetes_concepts/practice.md)
+* [13.1 简介](13_kubernetes_concepts/13.1_intro.md)
+* [13.2 基本概念](13_kubernetes_concepts/13.2_concepts.md)
+* [13.3 架构设计](13_kubernetes_concepts/13.3_design.md)
+* [13.4 高级特性](13_kubernetes_concepts/13.4_advanced.md)
+* [13.5 实战练习](13_kubernetes_concepts/13.5_practice.md)
* [本章小结](13_kubernetes_concepts/summary.md)
* [第十四章 部署 Kubernetes](14_kubernetes_setup/README.md)
-* [14.1 使用 kubeadm 部署 Kubernetes (CRI 使用 containerd)](14_kubernetes_setup/kubeadm.md)
-* [14.2 使用 kubeadm 部署 Kubernetes (使用 Docker)](14_kubernetes_setup/kubeadm-docker.md)
-* [14.3 Docker Desktop 使用](14_kubernetes_setup/docker-desktop.md)
-* [14.4 Kind - Kubernetes IN Docker](14_kubernetes_setup/kind.md)
-* [14.5 K3s - 轻量级 Kubernetes](14_kubernetes_setup/k3s.md)
-* [14.6 一步步部署 Kubernetes 集群](14_kubernetes_setup/systemd.md)
-* [14.7 部署 Dashboard](14_kubernetes_setup/dashboard.md)
-* [14.8 Kubernetes 命令行 kubectl](14_kubernetes_setup/kubectl.md)
+* [14.1 使用 kubeadm 部署 Kubernetes (CRI 使用 containerd)](14_kubernetes_setup/14.1_kubeadm.md)
+* [14.2 使用 kubeadm 部署 Kubernetes (使用 Docker)](14_kubernetes_setup/14.2_kubeadm-docker.md)
+* [14.3 Docker Desktop 使用](14_kubernetes_setup/14.3_docker-desktop.md)
+* [14.4 Kind - Kubernetes IN Docker](14_kubernetes_setup/14.4_kind.md)
+* [14.5 K3s - 轻量级 Kubernetes](14_kubernetes_setup/14.5_k3s.md)
+* [14.6 一步步部署 Kubernetes 集群](14_kubernetes_setup/14.6_systemd.md)
+* [14.7 部署 Dashboard](14_kubernetes_setup/14.7_dashboard.md)
+* [14.8 Kubernetes 命令行 kubectl](14_kubernetes_setup/14.8_kubectl.md)
* [本章小结](14_kubernetes_setup/summary.md)
* [第十五章 Etcd 项目](15_etcd/README.md)
-* [15.1 简介](15_etcd/intro.md)
-* [15.2 安装](15_etcd/install.md)
-* [15.3 集群](15_etcd/cluster.md)
-* [15.4 使用 etcdctl](15_etcd/etcdctl.md)
+* [15.1 简介](15_etcd/15.1_intro.md)
+* [15.2 安装](15_etcd/15.2_install.md)
+* [15.3 集群](15_etcd/15.3_cluster.md)
+* [15.4 使用 etcdctl](15_etcd/15.4_etcdctl.md)
* [本章小结](15_etcd/summary.md)
* [第十六章 容器与云计算](16_cloud/README.md)
-* [16.1 简介](16_cloud/intro.md)
-* [16.2 腾讯云](16_cloud/tencentCloud.md)
-* [16.3 阿里云](16_cloud/alicloud.md)
-* [16.4 亚马逊云](16_cloud/aws.md)
-* [16.6 多云部署策略](16_cloud/multicloud.md)
+* [16.1 简介](16_cloud/16.1_intro.md)
+* [16.2 腾讯云](16_cloud/16.2_tencentCloud.md)
+* [16.3 阿里云](16_cloud/16.3_alicloud.md)
+* [16.4 亚马逊云](16_cloud/16.4_aws.md)
+* [16.6 多云部署策略](16_cloud/16.6_multicloud.md)
* [本章小结](16_cloud/summary.md)
* [第十七章 容器其它生态](17_ecosystem/README.md)
-* [17.1 Fedora CoreOS 简介](17_ecosystem/coreos_intro.md)
-* [17.2 Fedora CoreOS 安装](17_ecosystem/coreos_install.md)
-* [17.3 podman - 下一代 Linux 容器工具](17_ecosystem/podman.md)
+* [17.1 Fedora CoreOS 简介](17_ecosystem/17.1_coreos_intro.md)
+* [17.2 Fedora CoreOS 安装](17_ecosystem/17.2_coreos_install.md)
+* [17.3 podman - 下一代 Linux 容器工具](17_ecosystem/17.3_podman.md)
* [本章小结](17_ecosystem/summary.md)
## 第四部分实战篇
* [第十八章 安全](18_security/README.md)
-* [18.1 内核命名空间](18_security/kernel_ns.md)
-* [18.2 控制组](18_security/control_group.md)
-* [18.3 服务端防护](18_security/daemon_sec.md)
-* [18.4 内核能力机制](18_security/kernel_capability.md)
-* [18.5 其它安全特性](18_security/other_feature.md)
+* [18.1 内核命名空间](18_security/18.1_kernel_ns.md)
+* [18.2 控制组](18_security/18.2_control_group.md)
+* [18.3 服务端防护](18_security/18.3_daemon_sec.md)
+* [18.4 内核能力机制](18_security/18.4_kernel_capability.md)
+* [18.5 其它安全特性](18_security/18.5_other_feature.md)
* [本章小结](18_security/summary.md)
* [第十九章 容器监控与日志](19_observability/README.md)
-* [19.1 Prometheus](19_observability/prometheus.md)
-* [19.2 ELK 套件](19_observability/elk.md)
+* [19.1 Prometheus](19_observability/19.1_prometheus.md)
+* [19.2 ELK 套件](19_observability/19.2_elk.md)
* [本章小结](19_observability/summary.md)
* [第二十章 实战案例 - 操作系统](20_cases_os/README.md)
-* [20.1 Busybox](20_cases_os/busybox.md)
-* [20.2 Alpine](20_cases_os/alpine.md)
-* [20.3 Debian Ubuntu](20_cases_os/debian.md)
-* [20.4 CentOS Fedora](20_cases_os/centos.md)
+* [20.1 Busybox](20_cases_os/20.1_busybox.md)
+* [20.2 Alpine](20_cases_os/20.2_alpine.md)
+* [20.3 Debian Ubuntu](20_cases_os/20.3_debian.md)
+* [20.4 CentOS Fedora](20_cases_os/20.4_centos.md)
* [本章小结](20_cases_os/summary.md)
* [第二十一章 实战案例 - Devops](21_case_devops/README.md)
-* [21.1 DevOps 完整工作流](21_case_devops/devops_workflow.md)
-* [21.2 GitHub Actions](21_case_devops/github_actions.md)
-* [21.3 Drone](21_case_devops/drone.md)
-* [21.4 Drone Demo](21_case_devops/drone_demo.md)
-* [21.5 IDE 中使用 Docker](21_case_devops/ide.md)
-* [21.6 VS Code](21_case_devops/vsCode.md)
+* [21.1 DevOps 完整工作流](21_case_devops/21.1_devops_workflow.md)
+* [21.2 GitHub Actions](21_case_devops/21.2_github_actions.md)
+* [21.3 Drone](21_case_devops/21.3_drone.md)
+* [21.4 Drone Demo](21_case_devops/21.4_drone_demo.md)
+* [21.5 IDE 中使用 Docker](21_case_devops/21.5_ide.md)
+* [21.6 VS Code](21_case_devops/21.6_vsCode.md)
* [本章小结](21_case_devops/summary.md)
## 附录
@@ -184,13 +183,13 @@
* [附录](appendix/README.md)
* [附录一常见问题与错误速查](appendix/faq/README.md)
* [附录二热门镜像介绍](appendix/repo/README.md)
-* [Ubuntu](appendix/repo/ubuntu.md)
-* [CentOS](appendix/repo/centos.md)
+* [Ubuntu](appendix/repo/3.1_ubuntu.md)
+* [CentOS](appendix/repo/3.4_centos.md)
* [Nginx](appendix/repo/nginx.md)
* [PHP](appendix/repo/php.md)
* [Node.js](appendix/repo/nodejs.md)
* [MySQL](appendix/repo/mysql.md)
-* [WordPress](appendix/repo/wordpress.md)
+* [WordPress](appendix/repo/11.8_wordpress.md)
* [MongoDB](appendix/repo/mongodb.md)
* [Redis](appendix/repo/redis.md)
* [Minio](appendix/repo/minio.md)

View File

@@ -26,7 +26,7 @@
应该保证在一个容器中只运行一个进程将多个应用解耦到不同容器中保证了容器的横向扩展和复用例如 web 应用应该包含三个容器web 应用数据库缓存
-如果容器互相依赖你可以使用 [Docker 自定义网络](../09_network/custom_network.md)来把这些容器连接起来
+如果容器互相依赖你可以使用 [Docker 自定义网络](../09_network/9.3_custom_network.md)来把这些容器连接起来
#### 镜像层数尽可能少

View File

@@ -1,21 +0,0 @@
import os
import re
count = 0
for root, dirs, files in os.walk("/Users/baohua/Github/books/docker_practice"):
    if ".git" in root or "node_modules" in root:
        continue
    for file in files:
        if file.endswith(".md"):
            filepath = os.path.join(root, file)
            with open(filepath, "r", encoding="utf-8") as f:
                lines = f.readlines()
            for i, line in enumerate(lines):
                # match optional spaces, then exactly one dash, then no space and no dash
                m = re.match(r'^(\s*)-([^- \t\n].*)$', line)
                if m:
                    print(f"{filepath}:{i+1}:{line.rstrip()}")
                    count += 1
print(f"Total found: {count}")

View File

@@ -1,109 +0,0 @@
import os
import re
def check_file(filepath):
issues = []
try:
with open(filepath, 'r', encoding='utf-8') as f:
lines = f.readlines()
except Exception as e:
return [f"Could not read file: {e}"]
in_code_block = False
for i, line in enumerate(lines):
line_stripped = line.strip()
# Code block tracking
if line_stripped.startswith('```'):
in_code_block = not in_code_block
if in_code_block:
continue
# 1. Full-width parentheses `` ``
if '' in line or '' in line:
if line_stripped.startswith('#'):
issues.append(f"Line {i+1}: Header contains full-width parentheses '' or ''")
else:
issues.append(f"Line {i+1}: Text contains full-width parentheses '' or ''")
# 2. Missing intro text after headers
if line_stripped.startswith('#'):
j = i + 1
while j < len(lines) and lines[j].strip() == '':
j += 1
if j < len(lines):
next_line = lines[j].strip()
if next_line.startswith('```'):
issues.append(f"Line {i+1}: Header immediately followed by code block without text")
elif next_line.startswith('|') and len(next_line.split('|')) > 2:
issues.append(f"Line {i+1}: Header immediately followed by table without text")
elif next_line.startswith('#') and next_line.count('#') == line_stripped.count('#') + 1:
issues.append(f"Line {i+1}: Header immediately followed by sub-header (missing text between)")
elif next_line.startswith('!['):
issues.append(f"Line {i+1}: Header immediately followed by image without text")
# 3. Missing blank line before list item
# Is this line a list item?
is_list_item = re.match(r'^(\s*[-*+]\s|\s*\d+\.\s)', line)
if is_list_item and i > 0:
prev_line = lines[i-1]
prev_line_stripped = prev_line.strip()
# If prev line is not empty, and not already a list item, header, quote, or HTML comment
if prev_line_stripped and not prev_line_stripped.startswith('#') and not prev_line_stripped.startswith('>'):
if not re.match(r'^(\s*[-*+]\s|\s*\d+\.\s)', prev_line) and not prev_line_stripped.startswith('<!--') and not prev_line_stripped.startswith('|'):
issues.append(f"Line {i+1}: Missing blank line before list item")
# Check EOF newlines
if set(lines) == {'\n'}:
pass
elif len(lines) > 0 and not lines[-1].endswith('\n') and not lines[-1] == '':
# Note: readlines() keeps trailing newlines, so a simple endswith('\n') check might be enough
pass
if len(lines) > 1 and lines[-1] == '\n' and lines[-2] == '\n':
issues.append("EOF: Multiple empty lines at end of file")
return issues
def main():
summary_path = 'SUMMARY.md'
if not os.path.exists(summary_path):
print(f"Error: {summary_path} not found in {os.getcwd()}")
return
with open(summary_path, 'r', encoding='utf-8') as f:
content = f.read()
# Find all .md files in SUMMARY.md
md_files = re.findall(r'\(([^)]*\.md)\)', content)
md_files = list(dict.fromkeys(md_files)) # deduplicate
total_issues = 0
summary_out = open('format_report.txt', 'w', encoding='utf-8')
for md_file in md_files:
filepath = os.path.join(os.path.dirname(summary_path), md_file)
if os.path.exists(filepath):
issues = check_file(filepath)
if issues:
print(f"--- {md_file} ---")
summary_out.write(f"--- {md_file} ---\n")
for issue in issues:
print(issue)
summary_out.write(issue + "\n")
print()
summary_out.write("\n")
total_issues += len(issues)
else:
print(f"Warning: File not found {filepath}")
summary_out.write(f"Total issues found: {total_issues}\n")
summary_out.close()
print(f"Total issues found: {total_issues}. Report saved to format_report.txt.")
if __name__ == '__main__':
main()
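The "header immediately followed by code block" rule above can be isolated into a small sketch; unlike `check_file()` this version skips code-fence tracking for brevity, and the function name and sample lines are ours:

```python
def headers_without_intro(lines):
    """Return 1-based line numbers of headers whose next non-blank line
    opens a code fence (simplified version of the check in check_file())."""
    bad = []
    for i, line in enumerate(lines):
        if line.strip().startswith('#'):
            j = i + 1
            # Skip blank lines between the header and the next content
            while j < len(lines) and lines[j].strip() == '':
                j += 1
            if j < len(lines) and lines[j].strip().startswith('```'):
                bad.append(i + 1)
    return bad

sample = ["## Usage", "", "```bash", "docker run nginx", "```", "## Notes", "Some text."]
```

Here only the first header is reported, since "## Notes" is followed by prose.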

View File

@@ -1,34 +0,0 @@
import os
import re
count = 0
for root, _, files in os.walk("/Users/baohua/Github/books/docker_practice"):
if ".git" in root or "node_modules" in root: continue
for f in files:
if f.endswith(".md"):
path = os.path.join(root, f)
try:
with open(path, "r", encoding="utf-8") as file:
lines = file.readlines()
except Exception:
continue
in_code_block = False
for i, line in enumerate(lines):
if line.strip().startswith("```"):
in_code_block = not in_code_block
continue
if not in_code_block:
# Look for lines starting with space(s) and a single dash, followed by word char, * or _
# skip html comment '<!--' or horizontal rule '---' or yaml '---'
m = re.match(r'^(\s*)-([*_\w\u4e00-\u9fa5].*)$', line)
if m:
print(f"{path}:{i+1}:{line.rstrip()}")
count += 1
if count > 50:
print("More than 50 found. Stopping.")
import sys
sys.exit(0)
print(f"Total outside code blocks: {count}")
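The character class in this walker deliberately allows CJK ideographs right after the dash; a quick standalone check (sample strings invented):

```python
import re

# Same character class as above: word characters, *, _, or CJK ideographs
# (\u4e00-\u9fa5) may sit directly after the dash.
CJK_LIST = re.compile(r'^(\s*)-([*_\w\u4e00-\u9fa5].*)$')

hits = [s for s in ["-镜像", "-**加粗**", "- 镜像", "--- 分隔线"] if CJK_LIST.match(s)]
```

Properly spaced items and horizontal rules fall through, while bare Chinese or bold-marker list items are caught.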

View File

@@ -1,66 +0,0 @@
import os
import re
def process_file(filepath):
try:
with open(filepath, "r", encoding="utf-8") as f:
lines = f.readlines()
except Exception as e:
print(f"Error reading {filepath}: {e}")
return
in_code_block = False
in_frontmatter = False
changed = False
for i in range(len(lines)):
line = lines[i]
# Checking for yaml frontmatter
if i == 0 and line.strip() == "---":
in_frontmatter = True
continue
if in_frontmatter and line.strip() == "---":
in_frontmatter = False
continue
if in_frontmatter:
continue
# Checking for code block
if line.strip().startswith("```"):
in_code_block = not in_code_block
continue
if not in_code_block:
# We want to find lines like:
# -foo
# -**bold**
# Match optional whitespace, then a single dash, then something not space, not -, not <
m = re.match(r'^(\s*)-([^\s\-<].*)$', line)
if m:
# Command-line flags like -p or -v could legitimately appear outside code
# blocks (though usually they should not), so this rewrite is slightly risky.
# Since the goal is to fix every bare list marker "-", we simply insert a space.
new_line = m.group(1) + "- " + m.group(2) + "\n"
lines[i] = new_line
changed = True
if changed:
try:
with open(filepath, "w", encoding="utf-8") as f:
f.writelines(lines)
print(f"Fixed: {filepath}")
except Exception as e:
print(f"Error writing {filepath}: {e}")
count = 0
for root, dirs, files in os.walk("/Users/baohua/Github/books/docker_practice"):
# ALWAYS modify dirs in-place to prevent os.walk from entering them
dirs[:] = [d for d in dirs if d not in (".git", "node_modules", "dist", "build")]
for file in files:
if file.endswith(".md"):
process_file(os.path.join(root, file))
print("Done fixing.")
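The substitution performed by `process_file()` reduces to a single regex rewrite; a standalone sketch (the function name is ours):

```python
import re

def add_space_after_dash(line):
    """Insert the missing space after a bare list dash, mirroring process_file().

    Lines starting with a second dash (rules) or '<' (HTML comments) are left
    alone, exactly as the regex above excludes them.
    """
    m = re.match(r'^(\s*)-([^\s\-<].*)$', line)
    if m:
        return m.group(1) + "- " + m.group(2)
    return line
```

Indentation is preserved via the first capture group, so nested list items keep their depth.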

View File

@@ -1,195 +0,0 @@
import os
import re
def fix_bold_spaces(line):
parts = line.split("**")
if len(parts) >= 3 and len(parts) % 2 == 1:
for i in range(1, len(parts), 2):
inner = parts[i]
if inner.strip() != "":
parts[i] = inner.strip()
line = "**".join(parts)
return line
def fix_trailing_newline(content):
if not content:
return content
return content.rstrip('\n') + '\n'
def process_file(filepath):
with open(filepath, 'r', encoding='utf-8') as f:
content = f.read()
lines = content.split('\n')
filename = os.path.basename(filepath)
is_readme = filename.lower() == 'readme.md'
is_summary = filename.lower() == 'summary.md'
is_section = bool(re.match(r'^\d+\.\d+_.*\.md$', filename))
for i in range(len(lines)):
lines[i] = fix_bold_spaces(lines[i])
# Pass 1 & 2: First Header Level & Hierarchy
changed = True
safe = 100
while changed and safe > 0:
safe -= 1
changed = False
headers = []
in_code_block = False
for i, line in enumerate(lines):
if line.startswith('```'):
in_code_block = not in_code_block
if not in_code_block:
m = re.match(r'^(#{1,6})\s+(.*)', line)
if m:
headers.append({'line': i, 'level': len(m.group(1)), 'text': m.group(2)})
if headers:
first_h = headers[0]
expected = None
if is_readme: expected = 1
elif is_summary: expected = 2
elif is_section: expected = 2
if expected and first_h['level'] != expected:
lines[first_h['line']] = '#' * expected + ' ' + first_h['text']
changed = True
for j in range(len(headers) - 1):
curr_level = headers[j]['level']
next_level = headers[j+1]['level']
if next_level > curr_level + 1:
new_level = curr_level + 1
lines[headers[j+1]['line']] = '#' * new_level + ' ' + headers[j+1]['text']
changed = True
# Pass 3: Parentheses
headers = []
in_code_block = False
for i, line in enumerate(lines):
if line.startswith('```'):
in_code_block = not in_code_block
if not in_code_block:
m = re.match(r'^(#{1,6})\s+(.*)', line)
if m:
headers.append({'line': i, 'level': len(m.group(1)), 'text': m.group(2)})
for h in headers:
line_idx = h['line']
level = h['level']
text = h['text']
# Drop parenthesized English/digits from the heading: full-width parens first, then half-width
new_text = re.sub(r'([A-Za-z\s0-9]+)', '', text)
new_text = re.sub(r'\([A-Za-z\s0-9]+\)', '', new_text)
if new_text != text:
lines[line_idx] = '#' * level + ' ' + new_text.strip()
# Pass 4: Single Child Headers Loop
headers = []
in_code_block = False
for i, line in enumerate(lines):
if line.startswith('```'):
in_code_block = not in_code_block
if not in_code_block:
m = re.match(r'^(#{1,6})\s+(.*)', line)
if m:
headers.append({'line': i, 'level': len(m.group(1)), 'text': m.group(2)})
inserts = []
for j in range(len(headers)):
level = headers[j]['level']
children = []
for k in range(j+1, len(headers)):
if headers[k]['level'] <= level:
break
if headers[k]['level'] == level + 1:
children.append(headers[k])
if len(children) == 1:
child = children[0]
inserts.append((child['line'], level + 1))
# Remove duplicates and sort descending
inserts = list(set(inserts))
inserts.sort(key=lambda x: x[0], reverse=True)
for (line_idx, lvl) in inserts:
# We must insert BEFORE the ONLY child
lines.insert(line_idx, '')
lines.insert(line_idx, '总体概述了以下内容。')
lines.insert(line_idx, '')
lines.insert(line_idx, '#' * lvl + ' 概述')
# Pass 5: Output structure (Bridge text & Content Intro)
out_lines = []
in_code_block = False
i = 0
while i < len(lines):
line = lines[i]
if line.startswith('```'):
in_code_block = not in_code_block
is_header = bool(re.match(r'^#{1,6}\s+.*', line)) and not in_code_block
out_lines.append(line)
if is_header:
m = re.match(r'^(#{1,6})\s+(.*)', line)
curr_level = len(m.group(1))
k = i + 1
while k < len(lines) and lines[k].strip() == '':
k += 1
out_lines.append('') # Ensure ONE blank line follows the header
if k < len(lines):
next_content = lines[k].strip()
next_m = re.match(r'^(#{1,6})\s+.*', next_content)
if next_m and len(next_m.group(1)) > curr_level:
# Bridge text
out_lines.append('本节涵盖了相关内容与详细描述主要探讨以下几个方面')
out_lines.append('')
elif next_content.startswith('```'):
# codeblock intro
out_lines.append('如下代码块所示展示了相关示例')
out_lines.append('')
elif "![" in next_content and "](" in next_content:
# image intro
out_lines.append('下图直观地展示了本节内容')
out_lines.append('')
# Set cursor to process next actual content line correctly
i = k - 1
i += 1
content = '\n'.join(out_lines)
content = fix_trailing_newline(content)
with open(filepath, 'w', encoding='utf-8') as f:
f.write(content)
def main():
md_files = []
for root, dirs, files in os.walk('.'):
if 'node_modules' in root or '.git' in root or '.vuepress' in root:
continue
for f in files:
if f.endswith('.md') and f != 'book_rule.md':
md_files.append(os.path.join(root, f))
for f in md_files:
try:
process_file(f)
except Exception as e:
print(f"Error processing {f}: {e}")
if __name__ == '__main__':
main()
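The `fix_bold_spaces()` helper above hinges on `split("**")` producing an odd part count when the markers are balanced; a standalone copy (renamed here) shows the effect:

```python
def trim_bold_spaces(line):
    """Standalone copy of fix_bold_spaces(): strip padding inside **...** pairs."""
    parts = line.split("**")
    # An odd part count of at least 3 means the ** markers are balanced.
    if len(parts) >= 3 and len(parts) % 2 == 1:
        for i in range(1, len(parts), 2):
            # Odd-indexed parts are the text between a ** pair
            if parts[i].strip() != "":
                parts[i] = parts[i].strip()
        line = "**".join(parts)
    return line
```

Unbalanced markers (an even part count) leave the line untouched, which avoids mangling literal `**` in code-like prose.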

115
fixer.py
View File

@@ -1,115 +0,0 @@
import os
import re
def fix_file(filepath):
try:
with open(filepath, 'r', encoding='utf-8') as f:
lines = f.readlines()
except Exception as e:
print(f"Could not read file {filepath}: {e}")
return False
if not lines:
return False
new_lines = []
in_code_block = False
for i, line in enumerate(lines):
line_stripped = line.strip()
# Code block tracking
if line_stripped.startswith('```'):
in_code_block = not in_code_block
# 1. Full-width parentheses "(" and ")"
# We replace them with space + half-width parenthesis except if it's already spaced
if not in_code_block and ('(' in line or ')' in line):
# Replace left parenthesis
line = re.sub(r'([^\s])(', r'\1 (', line)
line = re.sub(r'\s*(', r' (', line)
# Replace right parenthesis
line = re.sub(r')([^\s.,;?!:,。;?!:])', r') \1', line)
line = re.sub(r'\s*)', r')', line)
# Also a quick hack to replace any leftover
line = line.replace('(', '(').replace(')', ')')
# 3. Missing blank line before list item
is_list_item = re.match(r'^(\s*[-*+]\s|\s*\d+\.\s)', line)
if not in_code_block and is_list_item and i > 0:
prev_line = lines[i-1]
prev_line_stripped = prev_line.strip()
# If prev line is not empty, and not already a list item, header, quote, HTML, or table
if prev_line_stripped and not prev_line_stripped.startswith('#') and not prev_line_stripped.startswith('>'):
if not re.match(r'^(\s*[-*+]\s|\s*\d+\.\s)', prev_line) and not prev_line_stripped.startswith('<!--') and not prev_line_stripped.startswith('|'):
# Insert a blank line
if new_lines and new_lines[-1] != '\n':
new_lines.append('\n')
new_lines.append(line)
# 2. Missing intro text after headers
if not in_code_block and line_stripped.startswith('#') and not line_stripped.startswith('# '):
pass # might be a comment in code, but we already handled code blocks. Should be a real header.
if not in_code_block and re.match(r'^#{1,6}\s', line):
# Check what comes next
j = i + 1
while j < len(lines) and lines[j].strip() == '':
j += 1
if j < len(lines):
next_line = lines[j].strip()
if next_line.startswith('```'):
new_lines.append('\n示例代码如下\n')
elif next_line.startswith('|') and len(next_line.split('|')) > 2:
new_lines.append('\n相关信息如下表\n')
elif next_line.startswith('!['):
new_lines.append('\n相关图示如下\n')
# Only keep one EOF newline
while len(new_lines) > 1 and new_lines[-1] == '\n' and new_lines[-2] == '\n':
new_lines.pop()
# Make sure text ends with newline
if new_lines and new_lines[-1] != "" and not new_lines[-1].endswith('\n'):
new_lines[-1] += '\n'
# Did anything change?
if new_lines != lines:
try:
with open(filepath, 'w', encoding='utf-8') as f:
f.writelines(new_lines)
return True
except Exception as e:
print(f"Could not write file {filepath}: {e}")
return False
return False
def main():
summary_path = 'SUMMARY.md'
if not os.path.exists(summary_path):
print(f"Error: {summary_path} not found in {os.getcwd()}")
return
with open(summary_path, 'r', encoding='utf-8') as f:
content = f.read()
md_files = re.findall(r'\(([^)]*\.md)\)', content)
md_files = list(dict.fromkeys(md_files))
fixed_count = 0
for md_file in md_files:
filepath = os.path.join(os.path.dirname(summary_path), md_file)
if os.path.exists(filepath):
if fix_file(filepath):
print(f"Fixed: {md_file}")
fixed_count += 1
print(f"\nTotal files fixed: {fixed_count}")
if __name__ == '__main__':
main()

View File

@@ -1,250 +0,0 @@
import os
import re
ENG_ALLOWLIST = {
'DOCKER', 'KUBERNETES', 'XML', 'LLM', 'RAG', 'LINUX', 'UBUNTU', 'MAC', 'MACOS',
'WINDOWS', 'API', 'JSON', 'YAML', 'REGISTRY', 'HUB', 'REPOSITORY', 'TAG', 'IMAGE',
'CONTAINER', 'DEBIAN', 'FEDORA', 'CENTOS', 'RASPBERRY', 'PI', 'PULL', 'LIST',
'RM', 'COMMIT', 'BUILD', 'RUN', 'DAEMON', 'STOP', 'NEXUS', 'VOLUMES', 'TMPFS',
'DNS', 'PORT', 'BUILDX', 'BUILDKIT', 'COMPOSE', 'DJANGO', 'RAILS', 'WORDPRESS',
'LNMP', 'NAMESPACE', 'CGROUPS', 'UFS', 'PODMAN', 'PROMETHEUS', 'ELK', 'BUSYBOX',
'ALPINE', 'DEVOPS', 'ACTIONS', 'DRONE', 'IDE', 'VS', 'CODE', 'NGINX', 'PHP',
'NODE.JS', 'MYSQL', 'MONGODB', 'REDIS', 'MINIO', 'DOCKERD', 'TENCENTCLOUD',
'ALICLOUD', 'AWS', 'COREOS', 'KUBEADM', 'CONTAINERD', 'DESKTOP', 'KIND', 'K3S',
'SYSTEMD', 'DASHBOARD', 'KUBECTL', 'ETCD', 'ETCDCTL', 'VM', 'VAGRANT', 'LXC',
'GITHUB', 'GOOGLE', 'CLOUD', 'NPM', 'MAVEN', 'ACR', 'TCR', 'ECR', 'HARBOR',
'CNCF', 'SIGSTORE', 'NOTATION', 'SCOUT', 'TRIVY', 'CMD', 'ENTRYPOINT', 'ENV', 'ARG',
'VOLUME', 'EXPOSE', 'WORKDIR', 'USER', 'HEALTHCHECK', 'ONBUILD', 'LABEL', 'SHELL',
'COPY', 'ADD', 'DOCKERFILE', 'CI', 'CD', 'OS'
}
def parse_summary():
if not os.path.exists('SUMMARY.md'):
return {}
with open('SUMMARY.md', 'r', encoding='utf-8') as f:
content = f.read()
file_to_context = {}
chapter_idx = 0
section_idx = 0
is_appendix = False
for line in content.split('\n'):
if '## 附录' in line or ('附录' in line and line.startswith('## ')):
is_appendix = True
m_chap = re.match(r'^\* \[([一二三四五六七八九十百]+[^\]]*)\]\((.*?)\)', line)
if m_chap:
title = m_chap.group(1).replace(' ', '', 1)
if '' not in title:
title = title.replace('章', '')
filepath = m_chap.group(2)
chapter_idx += 1
section_idx = 0
file_to_context[filepath] = {
'level': 1,
'title': title,
'chap_num': chapter_idx,
'is_app': False
}
continue
m_sec = re.match(r'^\s+\* \[(.*?)\]\((.*?)\)', line)
if m_sec:
title = m_sec.group(1)
filepath = m_sec.group(2)
section_idx += 1
if is_appendix or 'appendix' in filepath:
file_to_context[filepath] = {
'level': 2,
'title': title,
'is_app': True
}
else:
file_to_context[filepath] = {
'level': 2,
'title': title,
'chap_num': chapter_idx,
'sec_num': section_idx,
'is_app': False
}
m_app = re.match(r'^\* \[(附录[^\]]*)\]\((.*?)\)', line)
if m_app:
title = m_app.group(1)
filepath = m_app.group(2)
file_to_context[filepath] = {
'level': 1,
'title': title,
'is_app': True
}
continue
return file_to_context
def check_english(title):
words = re.findall(r'[a-zA-Z\.]+', title)
for w in words:
if w.upper() not in ENG_ALLOWLIST and w.upper() != 'DOCKER':
print(f" [!] Notice: English word '{w}' in title: {title}")
def process_file(filepath, context):
try:
with open(filepath, 'r', encoding='utf-8') as f:
lines = f.readlines()
except Exception as e:
print(f"Error reading {filepath}: {e}")
return False
headings = []
in_code_block = False
for i, line in enumerate(lines):
line_stripped = line.strip()
if line_stripped.startswith('```'):
in_code_block = not in_code_block
if not in_code_block:
match = re.match(r'^(#{1,6})\s+(.*)', line)
if match:
level = len(match.group(1))
title = match.group(2).strip()
headings.append({'level': level, 'title': title, 'line_idx': i, 'children': []})
for i, h in enumerate(headings):
level = h['level']
for j in range(i+1, len(headings)):
if headings[j]['level'] <= level:
break
if headings[j]['level'] == level + 1:
h['children'].append(j)
actions = {}
def has_text_between(start_idx, end_idx):
for text_ln in range(start_idx + 1, end_idx):
content = lines[text_ln].strip()
if content and not content.startswith('#'):
return True
return False
is_app = context.get('is_app', False)
chap_num = context.get('chap_num', 0)
sec_num = context.get('sec_num', 0)
h2_counter = sec_num if sec_num > 0 else 0
h3_counter = 0
for i, h in enumerate(headings):
level = h['level']
title = h['title']
ln = h['line_idx']
original_title = title
check_english(title)
if level == 1:
if not is_app and chap_num > 0:
pass
elif is_app:
title = re.sub(r'^[\d\.]+\s*', '', title)
m = re.match(r'^(附录[一二三四五六七八九十]*)\s*(.*)', title)
if m:
p1 = m.group(1).strip()
p2 = m.group(2).strip()
if p2.startswith(':') or p2.startswith(':'):
p2 = p2[1:].strip()
title = f"{p1}{p2}" if p2 else p1
elif level == 2:
if not is_app:
clean_title = re.sub(r'^[\d\.]+\s*', '', title)
title = f"{chap_num}.{h2_counter} {clean_title}" if h2_counter > 0 else clean_title
else:
title = re.sub(r'^[\d\.]+\s*', '', title)
h3_counter = 0
elif level == 3:
h3_counter += 1
if not is_app:
clean_title = re.sub(r'^[\d\.]+\s*', '', title)
if h2_counter > 0:
title = f"{chap_num}.{h2_counter}.{h3_counter} {clean_title}"
else:
title = re.sub(r'^[\d\.]+\s*', '', title)
elif level >= 4:
m = re.match(r'^([\d\.]+)\s+(.*)', title)
if m:
nums = m.group(1)
rest = m.group(2)
if '.' in nums.strip('.'):
title = rest
if title != original_title:
actions[ln] = f"{'#' * level} {title}\n"
h['title'] = title
children_indices = h['children']
if len(children_indices) == 1:
child_idx = children_indices[0]
child_h = headings[child_idx]
child_ln = child_h['line_idx']
child_title = child_h['title']
if child_ln in actions:
modified_line = actions[child_ln]
m_child = re.match(r'^(#{1,6})\s+(.*)', modified_line)
if m_child:
child_title = m_child.group(2).strip()
actions[child_ln] = f"**{child_title}**\n\n"
elif len(children_indices) >= 2:
child_idx = children_indices[0]
child_ln = headings[child_idx]['line_idx']
if not has_text_between(ln, child_ln):
if level < 4:
if ln in actions:
actions[ln] = actions[ln].rstrip() + "\n\n涵盖了如下重点内容\n\n"
else:
actions[ln] = lines[ln].rstrip() + "\n\n涵盖了如下重点内容\n\n"
if not actions:
return False
new_lines = []
for i, line in enumerate(lines):
if i in actions:
if actions[i].startswith('**'):
pass
new_lines.append(actions[i])
else:
new_lines.append(line)
with open(filepath, 'w', encoding='utf-8') as f:
f.writelines(new_lines)
return True
if __name__ == "__main__":
file_contexts = parse_summary()
modified = 0
for filepath, context in file_contexts.items():
if os.path.exists(filepath):
if process_file(filepath, context):
modified += 1
print(f" -> MODIFIED: {filepath}")
for root, dirs, files in os.walk('.'):
if '.git' in root or 'node_modules' in root or '.gemini' in root:
continue
for file in files:
if file.endswith('.md') and file not in ['SUMMARY.md', 'README.md', 'CONTRIBUTING.md', 'CHANGELOG.md']:
filepath = os.path.join(root, file)
clean_path = filepath.replace('./', '')
if clean_path not in file_contexts:
if process_file(clean_path, {'is_app': True}):
modified += 1
print(f" -> MODIFIED: {clean_path}")
print(f"\nTotal Modified {modified} files")
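The level-2 branch above keeps reapplying the same prefix rewrite: strip any stale `N.N` numbering, then prepend the chapter and section counters. In isolation it looks like this (function name and sample titles are ours):

```python
import re

def renumber_heading(title, chap, sec):
    """Strip a stale numeric prefix and apply 'chap.sec', as the level-2 branch does."""
    clean = re.sub(r'^[\d\.]+\s*', '', title)  # remove e.g. "7.19 " if present
    return f"{chap}.{sec} {clean}"
```

Titles without a numeric prefix simply gain one, so the rewrite is idempotent across repeated runs of the script.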