Closed
Description
I've noticed that if a field is an `Instant` with the `epoch_millis` format:
@Field(type = FieldType.Date, format = DateFormat.epoch_millis)
private Instant timestamp;
spring-data-elasticsearch serializes this object to JSON with a string value like `"timestamp":"1644234181000"` rather than a long value `"timestamp":1644234181000`. After digging into the code, I found that `DateFormatter#format` only returns a string, so the `Instant` timestamp is converted into a string rather than a long.
- Although a string value (literally a long) for `epoch_millis` is accepted by Elasticsearch, this is not mentioned in the docs.
- Worse, we saved/updated our values as long for `epoch_millis` before (without using spring-data-elasticsearch), so now, after adopting spring-data-elasticsearch, both string and long values exist for the `timestamp` field.
- Additionally, we also use elasticsearch-hadoop to read data from Elasticsearch, and it can only read `epoch_millis` as long or as string, not both.
Any ideas for supporting conversion of `epoch_millis` and `epoch_second` date values to long rather than string? Or at least provide an option to choose between long and string, rather than always using string regardless of the actual date type?
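To make the two representations concrete, here is a minimal plain-Java sketch of the string and numeric forms described above. `EpochMillisForms` and its methods are hypothetical names used only for illustration; they are not part of spring-data-elasticsearch and do not reproduce its internals.

```java
import java.time.Instant;

// Hypothetical helper for illustration only; not a spring-data-elasticsearch class.
final class EpochMillisForms {

    // Renders the value as a JSON string, the form described in this issue.
    static String asStringJson(String field, Instant value) {
        return "\"" + field + "\":\"" + value.toEpochMilli() + "\"";
    }

    // Renders the value as a JSON number, the form this issue asks for.
    static String asLongJson(String field, Instant value) {
        return "\"" + field + "\":" + value.toEpochMilli();
    }

    public static void main(String[] args) {
        Instant timestamp = Instant.ofEpochMilli(1644234181000L);
        System.out.println(asStringJson("timestamp", timestamp)); // "timestamp":"1644234181000"
        System.out.println(asLongJson("timestamp", timestamp));   // "timestamp":1644234181000
    }
}
```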
Activity
Title changed from "Why save `epoli_millis` as string?" to "Why save `epoch_millis` as string?"
sothawo commented on Oct 1, 2022
The documentation you already linked explicitly states:
Elasticsearch stores the values in the `_source` the way they came in, and when returning the `_source` in a query Elasticsearch will return what came in.
But when fields are retrieved, for example, with the `fields` option or with the `docvalue_fields` option, they are returned as strings, no matter how they were sent in.
Consider this mapping for two fields with the same date format:
We store this document:
Then search it with `field` values (normally you'd set `"_source": false` when using fields):
The response is:
In the `_source` the mixed notation is returned, but in the `fields` the values are returned as strings. Elasticsearch takes whatever it gets and internally uses a numeric instant value, but whenever returning it (besides in the `_source`) it represents the date as a string, as documented.
If Spring Data Elasticsearch converted `Instant` properties to numeric values, it would fail on reading responses when users do not request the full document source but only selected fields, so there's no point in changing that behaviour.
If you have mixed data in the `_source` of your documents, you'd probably be better off using `fields` in your queries to get a consistent representation (which would be string).
One possibility would be to add a new format value `epoch_millis_long` which would explicitly convert to/from a Long value.
puppylpg commented on Oct 3, 2022
Thanks very much for your detailed response! It really helps me a lot.
Does spring-data-elasticsearch support queries with the `fields`/`stored_fields`/`docvalue_fields` options? I haven't found any clues about that in the docs or the code so far.
sothawo commented on Oct 3, 2022
Support for `fields` has been in Spring Data Elasticsearch from the beginning; since 4.4 it is available on every `QueryBuilder` with the `withFields()` methods. In older versions I think you had to set it directly on the `Query` instance.
Support for `stored_fields` was added in 4.4 (#2004) to the `NativeSearchQuery`. In version 5 this is moved to the `BaseQueryBuilder` (#2250), so it's available for all queries then.
For `docvalue_fields` there is the open issue #2316.
puppylpg commented on Oct 3, 2022
Thanks~ I'll consider using these in the future.
Appreciated!
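For readers who land here with the same mixed `_source` data, the `epoch_millis_long` idea mentioned earlier in the thread can be sketched in plain Java. This is only an assumption about what such a conversion might look like: `EpochMillisLongConverter`, `write` and `read` are hypothetical names, not an existing spring-data-elasticsearch API. The sketch writes an `Instant` as a `Long` and reads back either a number or a numeric string.

```java
import java.time.Instant;

// Hypothetical converter for illustration only; not a spring-data-elasticsearch class.
final class EpochMillisLongConverter {

    // Writes the instant as a numeric epoch-millis value.
    static Long write(Instant value) {
        return value.toEpochMilli();
    }

    // Reads a value that may arrive as a number (from _source) or as a
    // numeric string (from _source or from fields); anything else is rejected.
    static Instant read(Object raw) {
        if (raw instanceof Number) {
            return Instant.ofEpochMilli(((Number) raw).longValue());
        }
        if (raw instanceof String) {
            return Instant.ofEpochMilli(Long.parseLong(((String) raw).trim()));
        }
        throw new IllegalArgumentException("unsupported epoch_millis value: " + raw);
    }
}
```

Accepting both forms on the read side keeps documents written before and after the switch readable, while the write side always emits a long.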