Commit 8cfa124

docs: add Remote WAL configuration documentation (#1887)

Signed-off-by: WenyXu <[email protected]>

1 parent ddb3820 commit 8cfa124

5 files changed: +366 −0 lines changed
docs/user-guide/deployments-administration/deploy-on-kubernetes/common-helm-chart-configurations.md

Lines changed: 23 additions & 0 deletions

@@ -529,4 +529,27 @@ meta:

```yaml
meta:
  enableRegionFailover: true
  configData: |
    allow_region_failover_on_local_wal = true
```

### Enable Remote WAL

To enable Remote WAL, both Metasrv and Datanode must be properly configured. Before proceeding, make sure to read the [Remote WAL Configuration](/user-guide/deployments-administration/wal/remote-wal/configuration.md) documentation for a complete overview of configuration options and important considerations.

```yaml
remoteWal:
  enabled: true
  kafka:
    brokerEndpoints: ["kafka.kafka-cluster.svc:9092"]
meta:
  configData: |
    [wal]
    provider = "kafka"
    replication_factor = 1
    auto_prune_interval = "300s"
datanode:
  configData: |
    [wal]
    provider = "kafka"
    overwrite_entry_start_id = true
```
docs/user-guide/deployments-administration/wal/remote-wal/configuration.md

Lines changed: 161 additions & 0 deletions

@@ -0,0 +1,161 @@
---
keywords: [GreptimeDB Remote WAL, configuration, Kafka, Metasrv, Datanode, GreptimeDB]
description: This section describes how to configure Remote WAL for a GreptimeDB cluster.
---

# Configuration

The configuration of Remote WAL consists of two parts:

- Metasrv Configuration
- Datanode Configuration

If you are using Helm Chart to deploy GreptimeDB, you can refer to [Common Helm Chart Configurations](/user-guide/deployments-administration/deploy-on-kubernetes/common-helm-chart-configurations.md) to learn how to configure Remote WAL.

## Metasrv Configuration

On the Metasrv side, Remote WAL is primarily responsible for managing Kafka topics and periodically pruning stale WAL data.

```toml
[wal]
provider = "kafka"
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]

# WAL data pruning options
auto_prune_interval = "0s"
auto_prune_parallelism = 10

# Topic creation options
auto_create_topics = true
num_topics = 64
replication_factor = 1
topic_name_prefix = "greptimedb_wal_topic"
```
### Options

| Configuration Option     | Description                                                                                |
| ------------------------ | ------------------------------------------------------------------------------------------ |
| `provider`               | Set to `"kafka"` to enable Remote WAL via Kafka.                                            |
| `broker_endpoints`       | List of Kafka broker addresses.                                                             |
| `auto_prune_interval`    | Interval at which stale WAL data is automatically pruned. Set to `"0s"` to disable.         |
| `auto_prune_parallelism` | Maximum number of concurrent pruning tasks.                                                 |
| `auto_create_topics`     | Whether to create Kafka topics automatically. If `false`, topics must be created manually.  |
| `num_topics`             | Number of Kafka topics used for WAL.                                                        |
| `replication_factor`     | Kafka replication factor for each topic.                                                    |
| `topic_name_prefix`      | Prefix for Kafka topic names. Must match the regex `[a-zA-Z_:-][a-zA-Z0-9_:\-\.@#]*`.       |
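Pruning is disabled in the sample above (`auto_prune_interval = "0s"`). A minimal sketch of a Metasrv configuration with pruning enabled — the `"300s"` interval mirrors the Helm example earlier in this commit and is an illustrative value, not a recommendation:

```toml
[wal]
provider = "kafka"
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]

# Prune stale WAL data every 5 minutes, running at most 10 pruning tasks concurrently.
auto_prune_interval = "300s"
auto_prune_parallelism = 10
```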
#### Topic Setup and Kafka Permissions

To ensure Remote WAL works correctly with Kafka, please check the following:

- If `auto_create_topics = false`:
  - All required topics must be created manually **before** starting Metasrv (see the sketch after this list).
  - Topic names must follow the pattern `{topic_name_prefix}_{index}`, where `index` ranges from `0` to `{num_topics - 1}`. For example, with the default prefix `greptimedb_wal_topic` and `num_topics = 64`, you need to create the topics `greptimedb_wal_topic_0` through `greptimedb_wal_topic_63`.
  - Topics must be configured to support **LZ4 compression**.
- The Kafka user must have the following permissions:
  - **Append** records to WAL topics (requires LZ4 compression support).
  - **Read** records from WAL topics (requires LZ4 compression support).
  - **Delete** records from WAL topics.
  - **Create** topics (only required if `auto_create_topics = true`).
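One way to pre-create the topics is with the third-party `kafka-python` client — a sketch under stated assumptions: the library choice and the single partition per topic are illustrative assumptions on my part, not requirements stated by this documentation; the broker address and topic parameters come from the sample config above.

```python
from kafka.admin import KafkaAdminClient, NewTopic

# Broker address from the sample config; match your broker_endpoints setting.
admin = KafkaAdminClient(bootstrap_servers="kafka.kafka-cluster.svc:9092")

# Create greptimedb_wal_topic_0 .. greptimedb_wal_topic_63 with LZ4 compression.
topics = [
    NewTopic(
        name=f"greptimedb_wal_topic_{i}",
        num_partitions=1,            # assumption: one partition per WAL topic
        replication_factor=1,        # matches replication_factor in the sample
        topic_configs={"compression.type": "lz4"},
    )
    for i in range(64)               # num_topics = 64
]
admin.create_topics(new_topics=topics)
```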
## Datanode Configuration

On the Datanode side, Remote WAL writes WAL entries to Kafka topics and reads them back during replay.

```toml
[wal]
provider = "kafka"
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]
max_batch_bytes = "1MB"
overwrite_entry_start_id = true
```

### Options

| Configuration Option       | Description                                                                                                              |
| -------------------------- | ------------------------------------------------------------------------------------------------------------------------ |
| `provider`                 | Set to `"kafka"` to enable Remote WAL via Kafka.                                                                           |
| `broker_endpoints`         | List of Kafka broker addresses.                                                                                            |
| `max_batch_bytes`          | Maximum size of each Kafka producer batch.                                                                                 |
| `overwrite_entry_start_id` | If `true`, the Datanode skips missing entries during WAL replay. This prevents out-of-range errors but may hide data loss. |
#### Required Settings and Limitations

:::warning IMPORTANT: Kafka Retention Policy Configuration
Configure the Kafka retention policy very carefully to avoid data loss. GreptimeDB automatically recycles WAL data it no longer needs, so in most cases you don't need to set a retention policy at all. However, if you do set one, ensure the following:

- **Size-based retention**: typically not needed, as the database manages its own data lifecycle.
- **Time-based retention**: if you choose to set this, make sure it is **significantly greater than the auto-flush interval** to prevent premature deletion (a sketch follows the list below).

Improper retention settings can lead to data loss if WAL data is deleted before GreptimeDB has processed it.
:::

- If you set `overwrite_entry_start_id = true`:
  - Ensure that `auto_prune_interval` is enabled in Metasrv to allow automatic WAL pruning.
  - Kafka topics **must not use size-based retention policies**.
  - If time-based retention is enabled, the retention period should be significantly greater than the auto-flush interval, preferably at least twice its value.
- Ensure the Kafka user used by the Datanode has the following permissions:
  - **Append** records to WAL topics (requires LZ4 compression support).
  - **Read** records from WAL topics (requires LZ4 compression support).
- Ensure that `max_batch_bytes` does not exceed Kafka's maximum message size (typically 1MB by default).
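If you do need time-based retention, a sketch applying it to the WAL topics with the third-party `kafka-python` client — the 24-hour value is purely illustrative and assumes an auto-flush interval far below 12 hours; `retention.bytes = -1` keeps size-based retention disabled:

```python
from kafka.admin import KafkaAdminClient, ConfigResource, ConfigResourceType

admin = KafkaAdminClient(bootstrap_servers="kafka.kafka-cluster.svc:9092")

# Illustrative values: 24h time-based retention, size-based retention disabled.
# Caveat: the AlterConfigs API replaces a topic's existing dynamic config entries.
resources = [
    ConfigResource(
        ConfigResourceType.TOPIC,
        f"greptimedb_wal_topic_{i}",
        configs={"retention.ms": str(24 * 60 * 60 * 1000), "retention.bytes": "-1"},
    )
    for i in range(64)
]
admin.alter_configs(config_resources=resources)
```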
## Kafka Authentication Configuration

Kafka authentication settings apply to both Metasrv and Datanode under the `[wal]` section.

### SASL

Kafka supports several SASL mechanisms: `PLAIN`, `SCRAM-SHA-256`, and `SCRAM-SHA-512`.

```toml
[wal]
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]

[wal.sasl]
type = "SCRAM-SHA-512"
username = "user"
password = "secret"
```

### TLS

You can enable TLS encryption for Kafka connections by configuring the `[wal.tls]` section. There are three common modes:

#### Using System CA Certificate

To use system-wide trusted CAs, enable TLS without providing any certificate paths:

```toml
[wal]
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]

[wal.tls]
```

#### Using Custom CA Certificate

If your Kafka cluster uses a private CA, specify the server CA certificate explicitly:

```toml
[wal]
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]

[wal.tls]
server_ca_cert_path = "/path/to/server.crt"
```

#### Using Mutual TLS (mTLS)

To enable mutual authentication, provide both the client certificate and private key along with the server CA:

```toml
[wal]
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]

[wal.tls]
server_ca_cert_path = "/path/to/server_cert"
client_cert_path = "/path/to/client_cert"
client_key_path = "/path/to/key"
```
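SASL and TLS can also be combined when the brokers expose a listener that requires both (e.g. a SASL_SSL listener) — a minimal sketch merging the two sections above; whether your deployment needs this depends on the broker's listener configuration:

```toml
[wal]
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]

[wal.sasl]
type = "SCRAM-SHA-512"
username = "user"
password = "secret"

# Empty section uses the system CA store; add server_ca_cert_path for a private CA.
[wal.tls]
```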

i18n/zh/docusaurus-plugin-content-docs/current/user-guide/deployments-administration/deploy-on-kubernetes/common-helm-chart-configurations.md

Lines changed: 23 additions & 0 deletions

@@ -529,4 +529,27 @@ meta:

```yaml
meta:
  enableRegionFailover: true
  configData: |
    allow_region_failover_on_local_wal = true
```

### Enable Remote WAL

Before enabling it, make sure to read the [Remote WAL Configuration](/user-guide/deployments-administration/wal/remote-wal/configuration.md) documentation for a complete overview of the configuration options and important considerations.

```yaml
remoteWal:
  enabled: true
  kafka:
    brokerEndpoints: ["kafka.kafka-cluster.svc:9092"]
meta:
  configData: |
    [wal]
    provider = "kafka"
    replication_factor = 1
    auto_prune_interval = "300s"
datanode:
  configData: |
    [wal]
    provider = "kafka"
    overwrite_entry_start_id = true
```
i18n/zh/docusaurus-plugin-content-docs/current/user-guide/deployments-administration/wal/remote-wal/configuration.md

@@ -0,0 +1,158 @@

---
keywords: [GreptimeDB Remote WAL, configuration, Kafka, Metasrv, Datanode, GreptimeDB]
description: This section describes how to configure Remote WAL for a GreptimeDB cluster.
---

# Configuration

GreptimeDB supports Kafka as the storage backend for Remote WAL. To enable Remote WAL, Metasrv and Datanode must each be configured.

If you deploy GreptimeDB with Helm Chart, you can refer to [Common Helm Chart Configurations](/user-guide/deployments-administration/deploy-on-kubernetes/common-helm-chart-configurations.md) to learn how to configure Remote WAL.

## Metasrv Configuration

Metasrv manages the Kafka topics and automatically prunes stale WAL data.

```toml
[wal]
provider = "kafka"
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]

# WAL data pruning options
auto_prune_interval = "0s"
auto_prune_parallelism = 10

# Topic creation options
auto_create_topics = true
num_topics = 64
replication_factor = 1
topic_name_prefix = "greptimedb_wal_topic"
```

### Options

| Configuration Option     | Description                                                                           |
| ------------------------ | ------------------------------------------------------------------------------------- |
| `provider`               | Set to `"kafka"` to enable Remote WAL.                                                 |
| `broker_endpoints`       | List of Kafka broker addresses.                                                        |
| `auto_prune_interval`    | Interval for automatically pruning stale WAL data; `"0s"` disables it.                 |
| `auto_prune_parallelism` | Maximum number of concurrent pruning tasks.                                            |
| `auto_create_topics`     | Whether to create Kafka topics automatically; if `false`, create them in advance.      |
| `num_topics`             | Number of Kafka topics used for WAL.                                                   |
| `replication_factor`     | Replication factor for each topic.                                                     |
| `topic_name_prefix`      | Prefix of Kafka topic names; must match the regex `[a-zA-Z_:-][a-zA-Z0-9_:\-\.@#]*`.   |

#### Kafka Topic and Permission Requirements

Make sure the following settings are correct so that Remote WAL works properly:

- If `auto_create_topics = false`:
  - All WAL topics must be created manually **before** starting Metasrv;
  - Topic names must follow the pattern `{topic_name_prefix}_{index}`, where `index` ranges from `0` to `{num_topics - 1}`. For example, with the default prefix `greptimedb_wal_topic` and `num_topics = 64`, you need to create the topics `greptimedb_wal_topic_0` through `greptimedb_wal_topic_63`.
  - Topics must be configured to support **LZ4 compression**.
- The Kafka user needs the following permissions:
  - Append records to the topics;
  - Read records from the topics;
  - Delete records from the topics;
  - Create topics (only required when `auto_create_topics = true`).

## Datanode Configuration

The Datanode writes data to and reads data from the Kafka topics.

```toml
[wal]
provider = "kafka"
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]
max_batch_bytes = "1MB"
overwrite_entry_start_id = true
```

### Options

| Configuration Option       | Description                                                                                                             |
| -------------------------- | ------------------------------------------------------------------------------------------------------------------------ |
| `provider`                 | Set to `"kafka"` to enable Remote WAL.                                                                                     |
| `broker_endpoints`         | List of Kafka broker addresses.                                                                                            |
| `max_batch_bytes`          | Maximum size of each producer batch; must not exceed Kafka's per-message limit (typically 1MB).                            |
| `overwrite_entry_start_id` | If `true`, missing entries are skipped during WAL replay, avoiding out-of-range errors (but possibly hiding data loss).    |

#### Notes and Limitations

:::warning IMPORTANT: Kafka Retention Policy Configuration
Configure the Kafka retention policy very carefully to avoid data loss. GreptimeDB automatically recycles WAL data it no longer needs, so in most cases you don't need to set a retention policy. However, if you do set one, ensure the following:

- **Size-based retention**: usually unnecessary, because the database manages its own data lifecycle.
- **Time-based retention**: if you choose to set this, make sure the retention period is **much longer than the auto-flush interval** to prevent premature deletion.

Improper retention settings can lead to data loss if WAL data is deleted before GreptimeDB has processed it.
:::

- If `overwrite_entry_start_id = true`:
  - Make sure `auto_prune_interval` is enabled in Metasrv so that WAL data is pruned automatically;
  - Kafka topics must not use **size-based retention policies**;
  - If time-based retention is enabled, make sure the retention period is **much longer than the auto-flush interval**, at least twice its value.
- Make sure the Kafka user used by the Datanode has the following permissions:
  - Append records to the topics;
  - Read records from the topics.
- Make sure `max_batch_bytes` does not exceed the topic's maximum message size (typically 1MB).

## Kafka Authentication Configuration

Kafka authentication settings are configured in the `[wal]` section of both Metasrv and Datanode.

### SASL

Supported SASL mechanisms include `PLAIN`, `SCRAM-SHA-256`, and `SCRAM-SHA-512`.

```toml
[wal]
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]

[wal.sasl]
type = "SCRAM-SHA-512"
username = "user"
password = "secret"
```

### TLS

To enable TLS, configure the `[wal.tls]` section. The following modes are supported:

#### Using System CA Certificate

No certificate paths are needed; the system-trusted CAs are used automatically:

```toml
[wal]
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]

[wal.tls]
```

#### Using Custom CA Certificate

For Kafka clusters that use a private CA:

```toml
[wal]
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]

[wal.tls]
server_ca_cert_path = "/path/to/server.crt"
```

#### Using Mutual TLS (mTLS)

Provide both the client certificate and private key:

```toml
[wal]
broker_endpoints = ["kafka.kafka-cluster.svc:9092"]

[wal.tls]
server_ca_cert_path = "/path/to/server_cert"
client_cert_path = "/path/to/client_cert"
client_key_path = "/path/to/key"
```

sidebars.ts

Lines changed: 1 addition & 0 deletions

@@ -305,6 +305,7 @@ const sidebars: SidebarsConfig = {
       items: [
         'user-guide/deployments-administration/wal/remote-wal/quick-start',
         'user-guide/deployments-administration/wal/remote-wal/cluster-deployment',
+        'user-guide/deployments-administration/wal/remote-wal/configuration',
       ]
     },
   ],