I'm using this query to find out which values occur in a single field (in SQL terms, something like SELECT field, count(field) GROUP BY field).

In order to do that I'm sending this request to ES:

{
  "query" : {
    "bool" : {
      "must" : {
        "exists" : {
          "field" : "metainfos.ceeaacceaeaaccebeaacceceaaccedeaac"
        }
      }
    }
  },
  "aggregations" : {
    "followUpActivity.metainfo.metainfos.ceeaacceaeaaccebeaacceceaaccedeaac" : {
      "terms" : {
        "field" : "metainfos.ceeaacceaeaaccebeaacceceaaccedeaac",
        "missing" : "null"
      }
    }
  }
}

There's only one document in this index:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "living_v1",
      "_type" : "fuas",
      "_id" : "a2cb0ba1-8955-11e6-8a00-0242ac110007",
      "_score" : 1.0,
      "_routing" : "user2",
      "_source" : {
        "user" : "user2",
        "timestamp" : "2016-10-03T11:08:30.074Z",
        "startTimestamp" : "2016-10-03T11:08:30.074Z",
        "dueTimestamp" : null,
        "closingTimestamp" : null,
        "matter" : "Fua 1",
        "comment" : null,
        "status" : 0,
        "backlogStatus" : 20,
        "metainfos" : {
          "ceeaacceaeaaccebeaacceceaaccedeaac" : [ "Living Digital" ]
        },
        "resources" : [ ],
        "notes" : null
      }
    } ]
  }
}

As you can see, doc.metainfos.ceeaacc... = ["Living Digital"]. This is the response I get to the aggregation request:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "living_v1",
      "_type" : "fuas",
      "_id" : "a2cb0ba1-8955-11e6-8a00-0242ac110007",
      "_score" : 1.0,
      "_routing" : "user2",
      "_source":{"user":"user2","timestamp":"2016-10-03T11:08:30.074Z","startTimestamp":"2016-10-03T11:08:30.074Z","dueTimestamp":null,"closingTimestamp":null,"matter":"Fua 1","comment":null,"status":0,"backlogStatus":20,"metainfos":{"ceeaacceaeaaccebeaacceceaaccedeaac":["Living Digital"]},"resources":[],"notes":null}
    } ]
  },
  "aggregations" : {
    "followUpActivity.metainfo.metainfos.ceeaacceaeaaccebeaacceceaaccedeaac" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "digital",
        "doc_count" : 1
      }, {
        "key" : "living",
        "doc_count" : 1
      } ]
    }
  }
}

ES returns two buckets: one for "living" and another one for "digital". I'd like the aggregation to use the whole value, "Living Digital".

The mapping scheme is:

{
  "living_v1" : {
    "mappings" : {
      "fuas" : {
        "properties" : {
          "backlogStatus" : {
            "type" : "long"
          },
          "comment" : {
            "type" : "string"
          },
          "matter" : {
            "type" : "string"
          },
          "metainfos" : {
            "properties" : {
              "ceeaacceaeaaccebeaacceceaaccedeaac" : {
                "type" : "string"
              }
            }
          },
          "startTimestamp" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
          },
          "status" : {
            "type" : "long"
          },
          "timestamp" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
          },
          "user" : {
            "type" : "string",
            "index" : "not_analyzed"
          }
        }
      }
    }
  }
}

As you can see:

"metainfos" : {
    "properties" : {
        "ceeaacceaeaaccebeaacceceaaccedeaac" : {
             "type" : "string"
         }
     }
 }

The problem is that "ceeaacceaeaaccebeaacceceaaccedeaac" is a property created on demand by users, and I don't know how to make every metainfos.* field not_analyzed.

EDIT

I've tested with:

#curl -XPUT 'http://localhost:9200/living_v1/' -d '
{
  "mappings": {
    "fuas": {
      "dynamic_templates": [
        {
          "metainfos": {
            "path_match":   "metainfos.*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      ]
    }
  }
}
'

It tells me that the living_v1 index already exists. As far as I've been able to figure out from here, I need to send a PUT against the index:

{
    "error":{
        "root_cause":[
            {
                "type":"index_already_exists_exception",
                "reason":"already exists",
                "index":"living_v1"
            }
        ],
        "type":"index_already_exists_exception",
        "reason":"already exists",
        "index":"living_v1"
    },
    "status":400
}
  • I think you are looking for dynamic index templates: stackoverflow.com/a/23370138/693546 Commented Oct 3, 2016 at 12:01
  • You can't update the mapping of a field that already has data. You may need to create a new index with the fixed mapping, then reindex your data into it. Then you could delete the old index and use its name as an alias for the new index. Commented Apr 25, 2019 at 10:23
  • Another option is to add an additional field (elastic.co/guide/en/elasticsearch/reference/6.4/…) (call it "raw", for instance). Then you can aggregate on "ceeaacceaeaaccebeaacceceaaccedeaac.raw" instead, while preserving the mapping of "ceeaacceaeaaccebeaacceceaaccedeaac". I think this will only affect documents that are indexed after you change the mappings, however. Commented Apr 25, 2019 at 10:27
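
To make the "raw" sub-field idea from the last comment concrete, here is a minimal sketch in Python that only builds the JSON bodies. It assumes the 2.x-style string mapping used in the question; the function names and the template name "metainfos_with_raw" are made up for illustration, and nothing here has been run against a cluster.

```python
import json

def raw_subfield_template(path_pattern):
    """Dynamic template: keep the analyzed string field and add a
    not_analyzed sub-field called "raw" (hypothetical name) next to it."""
    return {
        "metainfos_with_raw": {  # template name is arbitrary
            "path_match": path_pattern,
            "match_mapping_type": "string",
            "mapping": {
                "type": "string",
                "fields": {
                    "raw": {"type": "string", "index": "not_analyzed"}
                },
            },
        }
    }

def terms_on_raw(field):
    """Terms aggregation that targets the exact-value sub-field."""
    return {"aggregations": {field: {"terms": {"field": field + ".raw"}}}}

template = raw_subfield_template("metainfos.*")
agg = terms_on_raw("metainfos.ceeaacceaeaaccebeaacceceaaccedeaac")
print(json.dumps(template, indent=2))
print(json.dumps(agg, indent=2))
```

The aggregation then runs on the ".raw" sub-field, while full-text search on the main field keeps working as before.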

2 Answers

As you already noticed, the search behaviour is caused by the mapping that was applied by default. By default, every string field without an explicit mapping is analyzed, which is why "Living Digital" is indexed as the two terms "living" and "digital".
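
To illustrate the effect, here is a rough Python sketch. It is not Elasticsearch code: approximate_standard_analyzer is only a crude stand-in for the default analyzer (split on non-word characters, lowercase), and both function names are invented for this example.

```python
import re
from collections import Counter

def approximate_standard_analyzer(text):
    """Crude approximation of the default analyzer:
    split on non-word characters and lowercase each token."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

def terms_buckets(docs, analyzed):
    """Count documents per indexed term, similar to a terms aggregation.
    Each element of `docs` is the list of field values of one document."""
    counts = Counter()
    for doc_values in docs:
        terms = set()  # a document counts once per distinct term
        for value in doc_values:
            if analyzed:
                terms.update(approximate_standard_analyzer(value))
            else:
                terms.add(value)  # not_analyzed: the whole value is one term
        counts.update(terms)
    return dict(counts)

docs = [["Living Digital"]]
analyzed_buckets = terms_buckets(docs, analyzed=True)
exact_buckets = terms_buckets(docs, analyzed=False)
print(analyzed_buckets)
print(exact_buckets)
```

With analysis the single document lands in two buckets; without it, the bucket key is the original value.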

So if you don't know in advance which properties (keys) the metainfos object will contain, you can use the dynamic templates feature, as described here and here, to define the mapping applied to these fields and so override the default behaviour of analyzing string fields.

You could apply a mapping that looks a bit like this (not tested):

{
  "mappings": {
    "fuas": {
      "dynamic_templates": [
        {
          "metainfos": {
            "path_match":   "metainfos.*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      ]
    }
  }
}
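
If you prefer to generate the body rather than write it by hand, a minimal Python sketch could look like the following. The helper name dynamic_template_mapping is made up, and the code only builds and serializes the body; sending it when creating a new index is left out.

```python
import json

def dynamic_template_mapping(doc_type, path_pattern):
    """Build an index-creation body whose dynamic template maps every
    string field matching `path_pattern` as not_analyzed."""
    return {
        "mappings": {
            doc_type: {
                "dynamic_templates": [
                    {
                        "metainfos": {
                            "path_match": path_pattern,
                            "match_mapping_type": "string",
                            "mapping": {
                                "type": "string",
                                "index": "not_analyzed",
                            },
                        }
                    }
                ]
            }
        }
    }

body = dynamic_template_mapping("fuas", "metainfos.*")
payload = json.dumps(body, indent=2)  # always syntactically valid JSON
print(payload)
```

Serializing with json.dumps guards against syntax slips that are easy to make in hand-written JSON.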

1 Comment

Thanks a lot @Andreas. I've tried it, but I ran into some issues. What's the difference between index templates and your approach? Are they the same?

As other people have pointed out, dynamic templates are the way to go. The only problem is that you can't change the mapping of a field once documents have been indexed into it, so a template added now won't affect the existing field. You will need to recreate the index (delete the index, create the mapping, feed the documents again).
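
As a variation on delete-and-refeed, the sequence suggested in the comments under the question (new index, reindex, delete, alias) could be sketched as a list of requests. This is only a plan in Python, not tested against a cluster: it assumes a version that has the _reindex API (2.3 or later), and "living_v2" and recreate_index_plan are hypothetical names.

```python
def recreate_index_plan(old_index, new_index, create_body):
    """Return the (method, path, body) requests, in order, for moving
    data to a new index with a corrected mapping."""
    return [
        # 1. create the new index with the corrected mapping
        ("PUT", "/" + new_index, create_body),
        # 2. copy the documents over (assumes the _reindex API is available)
        ("POST", "/_reindex",
         {"source": {"index": old_index}, "dest": {"index": new_index}}),
        # 3. drop the old index so that its name becomes free
        ("DELETE", "/" + old_index, None),
        # 4. point the old name at the new index as an alias
        ("POST", "/_aliases",
         {"actions": [{"add": {"index": new_index, "alias": old_index}}]}),
    ]

plan = recreate_index_plan("living_v1", "living_v2", {"mappings": {}})
for method, path, body in plan:
    print(method, path, body)
```

Step 3 has to come before step 4 because an alias cannot share its name with an existing index. Verify the document count in the new index before deleting anything.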

2 Comments

Ok @oldbam, I got it. Is there some straightforward way to recreate the index from index_v1 to index_v2?
You may want to look at the answers at stackoverflow.com/questions/28626803/… . Whenever I changed an index template, I just deleted the index and started feeding the documents again.
