Commit 8de96d8

Update documentation for tracking field usage
1 parent 5ab5546 commit 8de96d8

1 file changed

+12
-11
lines changed

docs/index.asciidoc

Lines changed: 12 additions & 11 deletions
@@ -117,18 +117,21 @@ When <<plugins-{type}s-{plugin}-tracking_field>> is set, the plugin will record
 a file (location defaults to <<plugins-{type}s-{plugin}-last_run_metadata_path>>).
 
 The user can then inject this value in the query using the placeholder `:last_value`. The value will be injected into the query
-before execution, and the updated after the query completes, assuming new data was found.
+before execution, and the updated after the query completes if new data was found.
 
 This feature works best when:
-* the query sorts by the tracking field
-* the field type has enough resolution so that two events are unlikely to have the same value for the field
 
-The plugin also offers another placeholder called `:present` used to inject the nano-second based value of "now-30s".
+. the query sorts by the tracking field;
+. the timestamp field is added by {es};
+. the field type has enough resolution so that two events are unlikely to have the same value.
 
-A suggestion is to use a tracking field that has nanosecond second precision, like
-https://www.elastic.co/guide/en/elasticsearch/reference/current/date_nanos.html[date nanoseconds] field type.
+It is recommended to use a tracking field whose type is https://www.elastic.co/guide/en/elasticsearch/reference/current/date_nanos.html[date nanoseconds].
+If the tracking field is of this data type, an extra placeholder called `:present` can be used to inject the nano-second based value of "now-30s".
+This placeholder is useful as the right-hand side of a range filter, allowing the collection of
+new data but leaving partially-searcheable bulk request data to the next scheduled job.
 
-A good use case for this feature is to track new data in an index, which can be achieved by:
+Below is a series of steps to help set up the "tailing" of data being written to a set of indices, using a date nanosecond field
+added by an Elasticsearch ingest pipeline, and the `tracking_field` capability of this plugin.
 
 . create ingest pipeline that adds Elasticsearch's `_ingest.timestamp` field to the documents as `event.ingested`:
 
@@ -205,9 +208,7 @@ A good use case for this feature is to track new data in an index, which can be
 index => 'test-*'
 query => '{ "query": { "range": { "event.ingested": { "gt": ":last_value", "lt": ":present"}}}, "sort": [ { "event.ingested": {"order": "asc", "format": "strict_date_optional_time_nanos", "numeric_type" : "date_nanos" } } ] }'
 tracking_field => "[event][ingested]"
-# set a seed value to a value known to be older than any value of `event.ingested`
-tracking_field_seed => "1980-01-01T23:59:59.999999999Z"
-slices => 5 # optional use of slices to speed data processing, should be less than number of primary shards
+slices => 5 # optional use of slices to speed data processing, should be equal to or less than number of primary shards
 schedule => '* * * * *' # every minute
 schedule_overlap => false # don't accumulate jobs if one takes longer than 1 minute
 }
@@ -216,7 +217,7 @@ A good use case for this feature is to track new data in an index, which can be
 With this setup, as new documents are indexed an `test-*` index, the next scheduled run will:
 
 . select all new documents since the last observed value of the tracking field;
-. use PIT+search_after to paginate through all the data;
+. use <<point-in-time-api,Point in time (PIT)>> + <<search-after, Search after>> to paginate through all the data;
 . update the value of the field at the end of the pagination.
 
 [id="plugins-{type}s-{plugin}-options"]
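
The placeholder behavior documented in this change can be illustrated outside the plugin. The following is a minimal Python sketch, not the plugin's implementation (the plugin itself is written in Ruby): the helper names `present_value`, `inject`, and `run_once`, and the `search` callable, are hypothetical and exist only for illustration. It assumes the query returns hits sorted ascending by the tracking field, as the documentation recommends.

```python
from datetime import datetime, timedelta, timezone


def present_value(now: datetime, lag_seconds: int = 30) -> str:
    """Format `now - lag_seconds` as a UTC timestamp with nine fractional digits."""
    t = (now - timedelta(seconds=lag_seconds)).astimezone(timezone.utc)
    # Python datetimes only carry microseconds, so the last three digits are padding.
    return t.strftime("%Y-%m-%dT%H:%M:%S.") + f"{t.microsecond:06d}000Z"


def inject(query_template: str, last_value: str, present: str) -> str:
    """Substitute the two placeholders into the query text before execution."""
    return query_template.replace(":last_value", last_value).replace(":present", present)


def run_once(query_template: str, last_value: str, now: datetime, search) -> str:
    """Run one scheduled job and return the tracking value to persist.

    `search` is a hypothetical callable that takes the final query string and
    returns a list of documents sorted ascending by `event.ingested`.
    """
    query = inject(query_template, last_value, present_value(now))
    hits = search(query)
    if hits:
        # Only advance the stored value when new data was found.
        last_value = hits[-1]["event"]["ingested"]
    return last_value
```

Because the upper bound lags the current time by 30 seconds, documents from bulk requests that are not yet fully searchable fall outside the range and are picked up by a later run rather than skipped.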
