We're trying to find distinct inner objects in Elasticsearch. Here is a minimal example of our case. We're stuck with something like the following mapping (changing types or indices, or adding new fields, wouldn't be a problem, but the structure should remain as it is):
{
  "building": {
    "properties": {
      "street": {
        "type": "string",
        "store": "yes",
        "index": "not_analyzed"
      },
      "house number": {
        "type": "string",
        "store": "yes",
        "index": "not_analyzed"
      },
      "city": {
        "type": "string",
        "store": "yes",
        "index": "not_analyzed"
      },
      "people": {
        "type": "object",
        "store": "yes",
        "index": "not_analyzed",
        "properties": {
          "firstName": {
            "type": "string",
            "store": "yes",
            "index": "not_analyzed"
          },
          "lastName": {
            "type": "string",
            "store": "yes",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}
Assume we have this example data:
{
  "buildings": [
    {
      "street": "Baker Street",
      "house number": "221 B",
      "city": "London",
      "people": [
        {
          "firstName": "John",
          "lastName": "Doe"
        },
        {
          "firstName": "Jane",
          "lastName": "Doe"
        }
      ]
    },
    {
      "street": "Baker Street",
      "house number": "5",
      "city": "London",
      "people": [
        {
          "firstName": "John",
          "lastName": "Doe"
        }
      ]
    },
    {
      "street": "Garden Street",
      "house number": "1",
      "city": "London",
      "people": [
        {
          "firstName": "Jane",
          "lastName": "Smith"
        }
      ]
    }
  ]
}
When we query for the street "Baker Street" (plus whatever additional options are needed), we expect to get the following list:
[
  {
    "firstName": "John",
    "lastName": "Doe"
  },
  {
    "firstName": "Jane",
    "lastName": "Doe"
  }
]
The format does not matter too much, as long as we can parse the first and last name from it. However, because our actual data set is much larger, we need the entries to be distinct.
We are using Elasticsearch 1.7.
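To make the expected result concrete, here is a minimal Python sketch of the semantics we are after, applied client-side to the example data above. It is only a reference for the desired output, not an Elasticsearch query, and the helper name `distinct_people` is made up for this illustration.

```python
# Illustration only: a plain-Python reference for the result we expect.
# This is NOT an Elasticsearch query; `distinct_people` is a hypothetical helper.

buildings = [
    {"street": "Baker Street", "house number": "221 B", "city": "London",
     "people": [{"firstName": "John", "lastName": "Doe"},
                {"firstName": "Jane", "lastName": "Doe"}]},
    {"street": "Baker Street", "house number": "5", "city": "London",
     "people": [{"firstName": "John", "lastName": "Doe"}]},
    {"street": "Garden Street", "house number": "1", "city": "London",
     "people": [{"firstName": "Jane", "lastName": "Smith"}]},
]


def distinct_people(buildings, street):
    """Return each (firstName, lastName) pair once, in first-seen order,
    across all buildings whose street matches exactly."""
    seen = set()
    result = []
    for building in buildings:
        if building["street"] != street:
            continue
        for person in building.get("people", []):
            # A person is identified by the pair of names kept together,
            # so "John Doe" and "Jane Doe" count as different entries.
            key = (person["firstName"], person["lastName"])
            if key not in seen:
                seen.add(key)
                result.append({"firstName": key[0], "lastName": key[1]})
    return result


print(distinct_people(buildings, "Baker Street"))
```

For "Baker Street" this yields John Doe and Jane Doe exactly once each, even though John Doe lives in two matching buildings. What we are looking for is a way to get the same result from Elasticsearch itself, without pulling every matching document to the client.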