Commit a122c31

Authored by OpenShift Bot
Merge pull request #553 from richm/mux-client-tag-not-handled
Merged by openshift-bot
2 parents 67744e9 + 5ac036c

13 files changed: +216 -141 lines

deployer/templates/fluentd.yaml

Lines changed: 3 additions & 13 deletions
@@ -153,12 +153,6 @@ objects:
           value: ${JOURNAL_SOURCE}
         - name: "JOURNAL_READ_FROM_HEAD"
           value: ${JOURNAL_READ_FROM_HEAD}
-        - name: "USE_MUX"
-          value: ${USE_MUX}
-        - name: "USE_MUX_CLIENT"
-          value: ${USE_MUX_CLIENT}
-        - name: "MUX_ALLOW_EXTERNAL"
-          value: ${MUX_ALLOW_EXTERNAL}
         - name: "BUFFER_QUEUE_LIMIT"
           value: ${BUFFER_QUEUE_LIMIT}
         - name: "BUFFER_SIZE_LIMIT"
@@ -329,13 +323,9 @@ parameters:
   name: USE_MUX
   value: "false"
 -
-  description: 'Configure MUX CLIENT (false|true).'
-  name: USE_MUX_CLIENT
-  value: "false"
--
-  description: 'Configure MUX SERVER (false|true).'
-  name: MUX_ALLOW_EXTERNAL
-  value: "false"
+  description: 'Configure MUX CLIENT MODE (minimal|maximal).'
+  name: MUX_CLIENT_MODE
+  value: ""
 -
   description: 'Fluentd buffer queue limit (0.12 only)'
   name: BUFFER_QUEUE_LIMIT
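With this change, the mode is chosen through the single `MUX_CLIENT_MODE` template parameter instead of the three removed booleans. An illustrative invocation (not part of the commit; the template path is the one shown in this diff) might be:

```
oc process -f deployer/templates/fluentd.yaml -p MUX_CLIENT_MODE=maximal | oc apply -f -
```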

docs/mux-logging-service-diag4.dia

4.22 KB
Binary file not shown.

docs/mux-logging-service-diag4.png

91.5 KB

docs/mux-logging-service.md

Lines changed: 117 additions & 47 deletions
@@ -41,30 +41,37 @@ There are environment variables for the logging-mux DC config:
   on each node. This allows the per-node Fluentd collectors to read and ship
   logs off the node as fast as possible to the logging-mux service, which will
   normalize and enrich with Kubernetes metadata, and store in Elasticsearch.
-* `MUX_ALLOW_EXTERNAL` - `true`/`false` - default is `false` - if `true`, the
-  logging-mux service will be able to handle log records sent from outside of
-  the cluster, as described below.
+  This will also be able to process records sent from outside of the cluster,
+  if the service has exposed an externalIP.
 * `FORWARD_LISTEN_PORT` - default `24284` - port to listen for secure_forward
   protocol log messages
 * `FORWARD_LISTEN_HOST` - hostname to listen for secure_forward protocol log
   messages - this is the same as the FQDN in the mux server cert
 
 These are environment variables for the logging-fluentd daemonset config:
 
-* `USE_MUX_CLIENT` - `true`/`false` - default is `false` - if `true`, fluentd
-  will not send logs directly to Elasticsearch, instead it will use
-  `secure_forward` to send logs to mux.
-* `MUX_CLIENT_MODE` - `minimal`/`full_no_k8s_meta` - default is unset - if
-  `minimal`, this is the same behavior as `USE_MUX_CLIENT true`. If
-  `full_no_k8s_meta`, then fluentd will perform *all* of the data processing
-  and filtering *except* the Kubernetes metadata annotation. The records will
-  be sent to mux for Kubernetes metadata annotation before sending the logs to
-  Elasticsearch. In this case, it is assumed you want to deploy mux to be as
-  lightweight as possible, and move the processing burden to the individual
-  nodes.
-
-logging-mux must be deployed with `MUX_ALLOW_EXTERNAL true` in order to receive
-records sent from outside the cluster.
+* `MUX_CLIENT_MODE` - `minimal`/`maximal` - default is unset - If this is not
+  set, Fluentd will perform all of the processing, including Kubernetes
+  metadata processing, and will send the records directly to Elasticsearch.
+  If `maximal`, then fluentd will do as much processing as possible at the node
+  before sending the records to mux. This is the currently recommended way to
+  use mux due to current scaling issues. In this case, it is assumed you want
+  to deploy mux to be as lightweight as possible, and move as much of the
+  processing burden as possible to the individual Fluentd collector pods
+  running on each node.
+  If `minimal`, then fluentd will perform *no* processing and send the raw logs
+  to mux for processing. We do not currently recommend using this mode, and
+  ansible will warn you about this.
+
+At the present time, the only scalable way to use mux is with
+`MUX_CLIENT_MODE=maximal`, which pushes as much of the log processing burden as
+possible on to every node running Fluentd. With `minimal`, mux quickly becomes
+overwhelmed with processing all of the log records, and requires scaling up a
+large number of pods to keep up with the load. However, this may change in the
+future.
+
+The logging-mux Service must be configured with an externalIP in order to
+receive records sent from outside the cluster.
 
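The daemonset consumes `MUX_CLIENT_MODE` as an ordinary container environment variable; a sketch of the corresponding env-list entry (following the `name`/`value` pattern used in the fluentd.yaml template diff above, not a line from this commit):

```yaml
        - name: "MUX_CLIENT_MODE"
          value: "maximal"
```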
 ### Ansible Configuration ###
 
@@ -73,12 +80,18 @@ There are several new Ansible parameters which can be used with the
 
 * `openshift_logging_use_mux` - `True`/`False` - default `False` - if
   `True`, create the mux Service, DeploymentConfig, Secrets, ConfigMap, etc.
-* `openshift_logging_use_mux_client` - `True`/`False` - default `False` - if
-  `True`, configure the Fluentd collectors running on each node to send the raw
-  logs to mux instead of directly to Elasticsearch
-* `openshift_logging_use_mux_service` - `True`/`False` - default `False` - if
-  `True`, expose the mux service external to the cluster, and configure the mux
-  fluentd to accept logs from outside the cluster
+* `openshift_logging_mux_client_mode` - values `minimal`/`maximal` - default
+  is unset. If this value is unset, Fluentd will perform all of the
+  processing, including Kubernetes metadata processing, and will send the
+  records directly to Elasticsearch.
+  The value `maximal` means that Fluentd will do as much processing as possible
+  at the node before sending the records to mux. This is the currently
+  recommended way to use mux due to current scaling issues.
+  The value `minimal` means that Fluentd will do *no* processing at all, and
+  send the raw logs to mux for processing. We do not currently recommend using
+  this mode, and ansible will warn you about this.
+* `openshift_logging_mux_allow_external` - `True`/`False` - default `False` - if
+  `True`, expose the mux service external to the cluster
 * `openshift_logging_mux_hostname` - default
   "mux."`openshift_master_default_subdomain`. This is the hostname that
   clients outside the cluster will use, the one used in the mux TLS server cert
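As a worked example, an inventory fragment enabling mux with the recommended client mode could look like the following (a sketch using only the variables described above; `[OSEv3:vars]` is the usual inventory section name, and the hostname is a placeholder):

```
[OSEv3:vars]
openshift_logging_use_mux=True
openshift_logging_mux_client_mode=maximal
openshift_logging_mux_allow_external=False
openshift_logging_mux_hostname=mux.apps.example.com
```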
@@ -193,12 +206,59 @@ namespace templates such as found in the
 mux Fluentd Config Details
 --------------------------
 
+### About the Diagrams ###
+
+A Fluentd node collector agent can read from either the systemd journal or from
+log files. `Input /var/log` and `Input journald` mean that Fluentd will read
+from whichever of these it is configured to read from.
+
+`Filter k8s` takes the json-file or journald container log input and formats it
+using the ViaQ data model (i.e. the `kubernetes` and `docker` namespaces).
+
+`Filter syslog` takes the `/var/log/messages` or journald system log input and
+formats it using the ViaQ data model (i.e. the `systemd` namespace).
+
+`Filter k8s meta` looks up Kubernetes metadata for container log records from
+the Kubernetes server, such as namespace\_uuid, pod\_uuid, labels, and
+annotations.
+
+`Filter viaq` removes empty fields and makes sure the time field is
+`@timestamp`.
+
+`Filter mux` determines what type of processing the log record needs - full
+processing, k8s meta only, or no further processing - and sends the log record
+to the appropriate processing pipeline, with the appropriate tagging and
+annotations.
+
+`Filter pre mux` prepares records received by mux and processed in the `Filter
+mux` step for Kubernetes metadata annotation.
+
+`Filter post mux` removes any temporary log record annotations added by
+previous mux filtering steps.
+
+A `raw` log is a log that is directly from the source, with no filtering or
+formatting applied to it yet.
+
+`viaq format` means that the log record is in the ViaQ format at this stage in
+the pipeline, ready to be sent to Elasticsearch, or via secure_forward to an
+external logging system.
+
+`secure_forward` is the Fluentd secure_forward protocol. A dashed or broken
+line in the box around it means it is an optional component, e.g. for shipping
+logs out of the cluster to another logging system.
+
+The `Elasticsearch` output inside of a larger box means the Elasticsearch
+output plugin. The large, standalone `Elasticsearch` means the Elasticsearch
+storage cluster component of OpenShift logging.
+
 ### Basic Flow ###
 
-This is what the flow looks like normally, with no mux:![Normal Flow](mux-logging-service-diag1.png)
+This is what the flow looks like normally, when `MUX_CLIENT_MODE` is
+unset: mux is not used at all, Fluentd does all of the processing and sends
+logs directly to Elasticsearch:![Normal Flow](mux-logging-service-diag1.png)
 
-With Fluentd configured with `USE_MUX_CLIENT true`, and with mux configured
-With `USE_MUX true` and `MUX_ALLOW_EXTERNAL false`:
+With Fluentd configured with `MUX_CLIENT_MODE minimal`, and with mux configured
+with `USE_MUX true` (`minimal` is not currently recommended):
 
 OpenShift Fluentd node collector sends raw records, collected from `json-file`
 or `journald` or both, to the mux service via secure_forward.
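For reference, the collector side of this hop is a standard fluent-plugin-secure-forward output; a minimal client sketch (the hostnames, shared key, and cert path are placeholders, not the generated OpenShift config) looks like:

```
<match **>
  @type secure_forward
  self_hostname forwarder.example.com
  shared_key mux_shared_key
  secure yes
  ca_cert_path /etc/fluent/muxkeys/ca
  <server>
    host mux.example.com
    port 24284
  </server>
</match>
```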
@@ -207,7 +267,7 @@ and sends the logs to Elasticsearch.
 
 Flow:![Internal Cluster mux](mux-logging-service-diag2.png)
 
-With `USE_MUX true` and `MUX_ALLOW_EXTERNAL true`:
+With `USE_MUX true`:
 
 In addition to the above, mux will examine the tags of the incoming
 `secure_forward` records, which can come from `secure_forward` clients outside
@@ -223,12 +283,14 @@ container records are written.
 
 Flow:![External Cluster mux](mux-logging-service-diag3.png)
 
-With Fluentd configured with `MUX_CLIENT_MODE full_no_k8s_meta`:
+With Fluentd configured with `MUX_CLIENT_MODE maximal`:
+
+Fluentd will perform as much of the processing and formatting as possible of
+log records read from files or journald. Mux will perform the Kubernetes
+metadata annotation before submitting the records to Elasticsearch. `maximal`
+mode is the currently recommended mode to use with mux.
 
-Fluentd will perform *all* processing and formatting of log records read from
-files or journald *except* that it will not perform Kubernetes metadata
-processing. Mux will perform the Kubernetes metadata annotation before
-submitting the records to Elasticsearch.
+Flow:![mux with MUX_CLIENT_MODE=maximal](mux-logging-service-diag4.png)
 
 ### Details ###
 
@@ -268,31 +330,39 @@ is much easier to add/rewrite a field than to rewrite a tag, which is why the
 k8s meta filter plugin is configured to use the `journald` fields rather than
 the `json-file` Fluentd tags.
 
-The next thing is this match:
+The next stage in the pipeline is this match:
 
     <match journal system.var.log.messages system.var.log.messages.** kubernetes.var.log.containers.**>
 
 This uses the `relabel` plugin to redirect the standard OpenShift logging tags
 to their usual destination via the `@INGRESS` label.
 
-This is all of the processing done if `USE_MUX true` and
-`MUX_ALLOW_EXTERNAL false`.
+The next stage in the pipeline is this match:
 
-If `MUX_ALLOW_EXTERNAL true`, there is some additional processing for tags
-matching `project.**`, and for tags which do not match any of the above.
+    <match journal.container** journal.system>
 
-The first thing is this filter:
+Records with these tags are operations logs and have already been processed and
+formatted by an OpenShift Fluentd. These are retagged with `mux.ops` and sent
+to the `@INGRESS` label, where they will skip the filtering stages and go to
+the output stages.
+
+The next stage in the pipeline is this filter:
 
     <filter **>
 
 This examines the tag to see if it is of the form
 `project.namespacename.whatever`, and sets the field `mux_namespace_name` to
-the value of `namespacename` if it exists, or `mux-undefined` otherwise. This
-also examines the record to see if the `kubernetes.metadata_uuid` field exists
-and has a value, to see if the record can bypass the k8s metadata plugin, and
-adds the field `mux_need_k8s_meta` with a value of `true` or `false`.
-
-The next thing is a rewrite tag filter:
+the value of `namespacename` if it is in that form, or `mux-undefined`
+otherwise. This also examines the record to see if the
+`kubernetes.metadata_uuid` field exists and has a value, to see if the record
+can bypass the k8s metadata plugin, and adds the field `mux_need_k8s_meta` with
+a value of `true` or `false`. This also handles logs coming from inside the
+cluster that have been sent from a fluentd running in
+`MUX_CLIENT_MODE=maximal`. In that case, the records will not have a
+`project.*` tag, but will have a `kubernetes.*` tag and/or fields
+such as `CONTAINER_NAME` which will be used to perform the k8s metadata lookup.
+
+The next stage in the pipeline is a rewrite tag filter:
 
     <match **>
 
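The namespace and metadata decisions made by the `<filter **>` stage above can be modeled in plain Ruby. This is a standalone sketch of the documented behavior, not the actual record_transformer expressions; `record` and `tag` stand in for the Fluentd event, and `MUX_UNDEFINED_NAMESPACE` mirrors the environment fallback:

```ruby
# Derive the namespace for a record: prefer an explicit namespace_name
# field, then a project.<name>.** tag, then the configured fallback.
def mux_namespace_name(record, tag)
  tag_parts = tag.split('.')
  record['namespace_name'] ||
    (tag_parts[0] == 'project' && tag_parts[1]) ||
    ENV['MUX_UNDEFINED_NAMESPACE'] ||
    'mux-undefined'
end

# The k8s metadata lookup can be skipped only when the record already
# carries a namespace id (e.g. it was processed by another fluentd or mux).
def mux_need_k8s_meta(record)
  k8s = record.fetch('kubernetes', {})
  (record['namespace_uuid'] || k8s['namespace_id']) ? 'false' : 'true'
end
```

For a record tagged `project.myproject.whatever` with no existing metadata, this yields a `mux_namespace_name` of `myproject` and `mux_need_k8s_meta` of `true`.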
@@ -310,13 +380,13 @@ and will match that plugin's tag match pattern (if for some reason it is using t
 record filters.
 
 Then all records are passed to the `@INGRESS` label, where there are two
-additional filters. The first one is this:
+additional mux related filters. The first one is this:
 
     <filter kubernetes.mux.var.log.containers.mux-mux.mux-mux_**>
 
 This filter will add the `CONTAINER_NAME` and `CONTAINER_ID_FULL` fields to the
-record if they do not already exists so that the k8s meta filter plugin will
-add the `kubernetes.namespace_uuid` to the record.
+record if they do not already exist so that the k8s meta filter plugin will
+add the Kubernetes `namespace_uuid`, labels, and annotations.
 
 The next filter is this:
 
fluentd/configs.d/filter-pre-mux.conf

Lines changed: 2 additions & 1 deletion
@@ -2,7 +2,8 @@
   @type record_transformer
   enable_ruby
   <record>
-    # add these in case k8s-meta is configured to look for journald metadata
+    # mux always runs with USE_JOURNAL=true, so make sure CONTAINER_NAME and
+    # CONTAINER_ID_FULL are set correctly for the k8s meta plugin
     # mux_namespace_name added in input-post-forward-mux.conf
     CONTAINER_NAME ${record.fetch('CONTAINER_NAME', 'k8s_mux-mux.mux-mux_mux_' + record['mux_namespace_name'] + '_mux_01234567')}
     CONTAINER_ID_FULL ${record.fetch('CONTAINER_ID_FULL', '0123456789012345678901234567890123456789012345678901234567890123')}
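The effect of the `record.fetch` default above can be shown with a small Ruby sketch (a hypothetical helper, not plugin code): when `CONTAINER_NAME` is absent, mux synthesizes one that embeds `mux_namespace_name`, so the k8s meta plugin running in journald mode can parse the namespace out of it.

```ruby
# Mirror the record_transformer default: keep an existing CONTAINER_NAME,
# otherwise build a synthetic one carrying the mux namespace.
def synthetic_container_name(record)
  record.fetch('CONTAINER_NAME',
               'k8s_mux-mux.mux-mux_mux_' + record['mux_namespace_name'] + '_mux_01234567')
end
```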

fluentd/configs.d/input-post-forward-mux.conf

Lines changed: 33 additions & 3 deletions
@@ -47,7 +47,37 @@
   rewriterule1 message .+ mux.ops
 </match>
 
-# if mux is operating as an external facing service, there is more
-# processing required
-@include mux-post-input-*.conf
+# If we got here, then either these are k8s container logs read from the journal by
+# an openshift fluentd, or they are external logs.
+# If they are k8s container logs, they should have a CONTAINER_NAME field which will
+# be used by filter-pre-mux.conf. If they are external logs, the namespace to use
+# will be encoded in either the namespace_name field, or in the tag. The filter below
+# will set the values/tag needed in filter-pre-mux.conf.
+# If the record already has k8s metadata, there will be a kubernetes.namespace_uuid
+# field. If not, then the record will be tagged to indicate that k8s metadata
+# processing is needed.
+# This filter will also ensure that the record has some sort of time field.
+<filter **>
+  @type record_transformer
+  enable_ruby
+  <record>
+    mux_namespace_name ${record['namespace_name'] || (tag_parts[0] == "project" && tag_parts[1]) || ENV["MUX_UNDEFINED_NAMESPACE"] || "mux-undefined"}
+    mux_need_k8s_meta ${(record['namespace_uuid'] || record.fetch('kubernetes', {})['namespace_id'].nil?) ? "true" : "false"}
+    kubernetes {"namespace_name":"${record['namespace_name'] || (tag_parts[0] == 'project' && tag_parts[1]) || ENV['MUX_UNDEFINED_NAMESPACE'] || 'mux-undefined'}","namespace_id":"${record['namespace_uuid'] || record.fetch('kubernetes', {})['namespace_id']}"}
+    time ${record['@timestamp'] || record['time'] || time.utc.to_datetime.rfc3339(6)}
+  </record>
+</filter>
+
+# if the record already has k8s metadata (e.g. a record forwarded from another
+# openshift or mux) then tag so that k8s meta will be skipped
+# the `mux` tag will skip all operation and app specific filtering
+# the kubernetes.mux.** tag will match the k8s-meta filter but no other ops and apps filtering
+# the kubernetes.mux.** tag will be processed by filter-pre-mux.conf
+# This tag was chosen because it will be processed by the k8s meta plugin, but will
+# bypass all other filtering.
+<match **>
+  @type rewrite_tag_filter
+  @label @INGRESS
+  rewriterule1 mux_need_k8s_meta ^false$ mux
+  rewriterule2 mux_namespace_name (.+) kubernetes.mux.var.log.containers.mux-mux.mux-mux_$1_mux-0123456789012345678901234567890123456789012345678901234567890123.log
+</match>
 </label>
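The retagging decision made by the final `rewrite_tag_filter` can be sketched as follows (a standalone Ruby illustration of the two `rewriterule`s, not the plugin itself):

```ruby
# rewriterule1: records that already have k8s metadata get the bare `mux`
# tag and skip the k8s meta filter; rewriterule2: everything else is
# retagged so that only the k8s meta filter will process it.
def mux_retag(record)
  return 'mux' if record['mux_need_k8s_meta'] == 'false'
  'kubernetes.mux.var.log.containers.mux-mux.mux-mux_' +
    record['mux_namespace_name'] +
    '_mux-0123456789012345678901234567890123456789012345678901234567890123.log'
end
```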

fluentd/configs.d/mux-post-input-filter-tag.conf

Lines changed: 0 additions & 22 deletions
This file was deleted.

0 commit comments
