Description
Problem statement: We are pushing data from Fluentd to two OpenSearch instances, as shown above. Whenever the Fluentd pod in Kubernetes restarts, for example after a change to the Fluentd configuration, data is slow to appear in OpenSearch Dashboards. The data we push is not visible for at least 45 minutes to 1 hour, and only shows up on the dashboard after that time.
Question 1: Is this expected behaviour for Fluentd, i.e. that after a restart the data takes some time to become visible in OpenSearch? If so, why does this happen? (We need an explanation to understand it.)
Question 2: If the answer above is no, how can we avoid the delay? Can you help us with an updated Fluentd configuration?
Question 3: When Fluentd restarts, we also sometimes observe a loss of 1-2 data points.
Our current Fluentd configuration, which pushes the data to OpenSearch, is shown below.
containers.input.conf: |-
  <source>
    @id edge-tls-http-endpoint
    @type http
    port 4000
    <parse>
      @type json
      time_key nil
    </parse>
  </source>
output.conf: |-
  <match **>
    @type copy
    <store ignore_error>
      @type relabel
      @label @primary
    </store>
    {{- if .Values.opensearch.sampletestingSecondary.enabled }}
    <store ignore_error>
      @type relabel
      @label @secondary
    </store>
    {{- end }}
  </match>
  <label @primary>
    <match **>
      @id elasticsearch_sampletesting_pm
      @type elasticsearch
      @log_level info
      verify_es_version_at_startup false
      default_elasticsearch_version 7
      max_retry_putting_template 5
      fail_on_putting_template_retry_exceed false
      scheme https
      ssl_verify false
      ssl_version TLSv1_2
      suppress_type_name true
      host {{ include "getsampletestingEndpointURL" . }}
      port 443
      user {{ required "Required sampletesting username" .Values.opensearch.sampletesting.creds.username | b64dec }}
      password {{ required "Required sampletesting password" .Values.opensearch.sampletesting.creds.password | b64dec }}
      templates { "sampletesting_default_template": "/etc/fluent/template/sampletesting-default-template", "sampletesting_cm_template": "/etc/fluent/template/sampletesting-cm-template", "sampletesting_2_shard_template": "/etc/fluent/template/sampletesting-2-shard-template", "sampletesting_3_shard_template": "/etc/fluent/template/sampletesting-3-shard-template", "sampletesting_4_shard_template": "/etc/fluent/template/sampletesting-4-shard-template", "sampletesting_5_shard_template": "/etc/fluent/template/sampletesting-5-shard-template", "sampletesting_6_shard_template": "/etc/fluent/template/sampletesting-6-shard-template"}
      template_overwrite true
      write_operation upsert
      target_index_key @target_index
      index_name defaulsampletestingtindex
      type_name _doc
      id_key request_id
      remove_keys request_id
      reconnect_on_error true
      reload_on_failure true
      reload_connections false
      request_timeout 45s
      bulk_message_request_threshold -1
      <buffer>
        @type file
        path /var/log/fluentd-edge-sampletesting-pm-buffers/sampletesting.system.buffer.pm
        flush_mode interval
        flush_thread_count 4
        flush_interval 60s
        retry_type periodic
        retry_max_times 20
        retry_wait 20s
        chunk_limit_size 64M
        queued_chunks_limit_size 100
        overflow_action throw_exception
      </buffer>
    </match>
  </label>
  {{- if .Values.opensearch.sampletestingSecondary.enabled }}
  <label @secondary>
    <filter **>
      @type record_modifier
      <record>
        @target_index ${record["@target_index"]}_${(((Time.at(time).strftime("%j").to_i - 1) / 3) * 3 + 1)}
      </record>
    </filter>
    <match **>
      @id secondary_elasticsearch_sampletesting_pm
      @type elasticsearch
      @log_level info
      verify_es_version_at_startup false
      default_elasticsearch_version 7
      max_retry_putting_template 5
      fail_on_putting_template_retry_exceed false
      scheme https
      ssl_verify false
      ssl_version TLSv1_2
      suppress_type_name true
      host {{ include "getsampletestingSecondaryEndpointURL" . }}
      port 443
      user {{ required "Required sampletesting username" .Values.opensearch.sampletestingSecondary.creds.username | b64dec }}
      password {{ required "Required sampletesting password" .Values.opensearch.sampletestingSecondary.creds.password | b64dec }}
      templates { "sampletesting_default_template": "/etc/fluent/template/sampletesting-default-template", "sampletesting_cm_template": "/etc/fluent/template/sampletesting-cm-template", "sampletesting_2_shard_template": "/etc/fluent/template/sampletesting-2-shard-template", "sampletesting_3_shard_template": "/etc/fluent/template/sampletesting-3-shard-template", "sampletesting_4_shard_template": "/etc/fluent/template/sampletesting-4-shard-template", "sampletesting_5_shard_template": "/etc/fluent/template/sampletesting-5-shard-template", "sampletesting_6_shard_template": "/etc/fluent/template/sampletesting-6-shard-template"}
      template_overwrite true
      write_operation upsert
      target_index_key @target_index
      index_name duplicatedefaulsampletestingtindex
      type_name _doc
      id_key request_id
      remove_keys request_id
      reconnect_on_error true
      reload_on_failure true
      reload_connections false
      request_timeout 45s
      bulk_message_request_threshold -1
      <buffer>
        @type file
        path /var/log/fluentd-edge-sampletesting-pm-buffers/small.sampletesting.system.buffer.pm
        flush_mode interval
        flush_thread_count 4
        flush_interval 60s
        retry_type periodic
        retry_max_times 20
        retry_wait 20s
        chunk_limit_size 64M
        queued_chunks_limit_size 100
        overflow_action throw_exception
      </buffer>
    </match>
  </label>
  {{- end }}
{{- end }}
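For reference, the suffix that the secondary `record_modifier` filter appends to `@target_index` can be reproduced in plain Ruby. This is only a minimal sketch of what the expression appears to compute (the first day-of-year of a 3-day bucket). `bucket_suffix` is a hypothetical helper name that is not part of our config, and the sketch assumes UTC, whereas the filter uses the pod's local time zone.

```ruby
# Hypothetical helper mirroring the Ruby expression in the record_modifier
# filter: ((day_of_year - 1) / 3) * 3 + 1
# Assumption: UTC is used here for determinism; the filter itself calls
# Time.at(time) without .utc, so it follows the pod's local time zone.
def bucket_suffix(epoch_seconds)
  day_of_year = Time.at(epoch_seconds).utc.strftime("%j").to_i
  # Integer division groups days 1-3, 4-6, 7-9, ... into one bucket each,
  # and the result is the first day-of-year of that bucket.
  ((day_of_year - 1) / 3) * 3 + 1
end

# Example: events from 1 to 3 January share one suffix, 4 January starts the next.
[1, 2, 3, 4].each do |day|
  ts = Time.utc(2024, 1, day).to_i
  puts "2024-01-#{format('%02d', day)} -> #{bucket_suffix(ts)}"
end
```

Under these assumptions the secondary index name changes every three days, so an event is routed to an index such as `<target_index>_1` or `<target_index>_4` depending on its timestamp.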
How to reproduce: restart Fluentd while it is pushing data and observe the data in OpenSearch Dashboards.
Can someone please help here? @daipom