I have imported JSON data into Elasticsearch via Logstash. The import worked, and I can see the data in Elasticsearch-Head.

My problem is querying the data. I can search on a field, but the query returns the entire type within the index as a single search result. I have tried a few variations without any luck.

Here is the logstash shipper file:

input {
   exec {
     type => "recom_db"
     command => "curl -s -X GET http://www.test.com/api/edselastic/recom_db.json"
     interval => 86400
     codec => "json"
   }
   exec {
     type => "recom_ki"
     command => "curl -s -X GET http://www.test.com/api/edselastic/recom_ki.json"
     interval => 86400
     codec => "json"
   }
   exec {
     type => "recom_un"
     command => "curl -s -X GET http://www.test.com/api/edselastic/recom_un.json"
     interval => 86400
     codec => "json"
   }
}
output {
        elasticsearch {
                host => localhost
                index => "lib-recommender-%{+yyyy.MM.dd}"
                template_name => "recommender-template"
        }
}

and the indexed document in Elasticsearch has the following form:

{
    "_index": "lib-recommender-2015.06.11",
    "_type": "recom_un",
    "_id": "qoZE4aF-SkS--tq_8MhH4A",
    "_version": 1,
    "_score": 1,
    "_source": {
        "item": [{
            "name": "AAM219 -- reading lists",
            "link": "http://www.test.com/modules/aam219.html",
            "description": "AAM219 -- reading lists",
            "terms": {
                "term": ["AAM219"]
            }
        },
        {
            "name": "AAR410 -- reading lists",
            "link": "http://www.test.com/modules/aar410.html",
            "description": "AAR410 -- reading lists",
            "terms": {
                "term": ["AAR410"]
            }
        }
        ...

I have tried querying the data in the various ways shown in the Elasticsearch documentation, but I am unable to get the desired result. Here is one of the many queries I have tried:

curl -XPOST "http://localhost:9200/lib-recommender/recom_un/_search" -d'
{
    "fields": ["item.name", "item.link"],
    "query": {
        "term": {
            "item.terms.term": "AAM219"
        }
    }
}'

but it returns every item of the type in a single hit (only the requested fields are selected, but they are flattened into separate lists that cover all items):

{
    "took": 13,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.006780553,
        "hits": [{
            "_index": "lib-recommender-2015.06.11",
            "_type": "recom_un",
            "_id": "qoZE4aF-SkS--tq_8MhH4A",
            "_score": 0.006780553,
            "fields": {
                "item.link": ["http://www.test.com/modules/aam219.html",
                "http://www.test.com/modules/aar410.html",
                "http://www.test.com/modules/ac1201.html",
                "http://www.test.com/modules/aca401.html",

I am after the following result:

{
    "took": 5,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.006780553,
        "hits": [{
            "_index": "lib-recommender-2015.06.11",
            "_type": "recom_un",
            "_id": "qoZE4aF-SkS--tq_8MhH4A",
            "_score": 0.006780553,
            "_source": {
                "item": [{
                    "name": "AAM219 -- reading lists",
                    "link": "http://www.test.com/modules/aam219.html",
                    "description": "AAM219 -- reading lists",
                    "terms": {
                        "term": ["AAM219"]
                    }
                }]
            }
        }]
    }
}

What am I missing? Is the mapping of the index wrong for this kind of search, so that I should manually create a mapping for Elasticsearch before importing the data? Or is there a parameter I am missing in the query? I have been looking for an answer but feel like I am running around in circles. I am guessing it is something simple that I am overlooking.

2 Answers

Yes, for this kind of use case you need to create a custom mapping and make sure your item structure is of type nested. Otherwise, all the fields in your item will be collapsed together, as you saw in the results you showed.

So the mapping needs to be like this:

{
  "recom_un": {
    "properties": {
      "item": {
        "type": "nested",
        "properties": {
          "name": {
            "type": "string"
          },
          "link": {
            "type": "string"
          },
          "description": {
            "type": "string"
          },
          "terms": {
            "properties": {
              "term": {
                "type": "string"
              }
            }
          }
        }
      }
    }
  }
}

Then you can modify your query to use a nested query instead, like this. Note that I'm also including inner_hits so that your results only include the matching nested document:

curl -XPOST "http://localhost:9200/lib-recommender/recom_un/_search" -d'
{
  "fields": [
    "item.name",
    "item.link"
  ],
  "query": {
    "nested": {
      "path": "item",
      "query": {
        "term": {
          "item.terms.term": "AAM219"
        }
      },
      "inner_hits": {}
    }
  }
}'
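
To illustrate the collapsing described at the top of this answer, here is a minimal Python sketch. It is only a simplified model of how an array of plain objects ends up as multi-valued dotted fields, not Elasticsearch's actual implementation. The function name flatten_object_array is made up for this example, and the sample items are taken from the question.

```python
# Simplified model (an assumption for illustration) of how an array of
# objects is stored when the field is a plain "object" rather than "nested".
from collections import defaultdict


def flatten_object_array(field, objects):
    """Collapse a list of objects into multi-valued dotted fields."""
    flat = defaultdict(list)

    def walk(prefix, value):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(f"{prefix}.{key}", child)
        elif isinstance(value, list):
            for child in value:
                walk(prefix, child)
        else:
            flat[prefix].append(value)

    walk(field, objects)
    return dict(flat)


items = [
    {"name": "AAM219 -- reading lists",
     "link": "http://www.test.com/modules/aam219.html",
     "terms": {"term": ["AAM219"]}},
    {"name": "AAR410 -- reading lists",
     "link": "http://www.test.com/modules/aar410.html",
     "terms": {"term": ["AAR410"]}},
]

flat = flatten_object_array("item", items)
# Each dotted field now holds the values of ALL items, so nothing records
# which link belongs to which term any more.
print(flat["item.terms.term"])
print(flat["item.link"])
```

In this model a search on AAM219 matches the whole document, and the selected fields come back as lists covering every item, which is the shape of the result shown in the question.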

4 Comments

I have tried that but am now getting no results. I have double checked the mapping and template and the "item" objects now have the "type": "nested" parameter. Is there anything else I can try or could I have a configuration issue perhaps?
Is item the only property of your document or are there others? In the former case, why not simply index each item as a standalone document, so you could get rid of the nested type altogether?
Yes, you are right about indexing each item; it will be the only property. I wanted to do a nested search without changing the data much, as I will have to search nested data anyway (item.terms.term, for example). For the record, your answer above is correct, but I needed to change the "term" to "match". Your way would work if I had the index for the field set to "not_analyzed".
Actually I needed to set the nested type for terms also. I will post the full solution I have below to help guide other people who have the same issue in future. Thanks Val for pointing me in the right direction.
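
The suggestion in the comments of indexing each item as its own document could be sketched roughly as follows. This is only an illustration of the idea: the helper name items_to_bulk is hypothetical, the index and type names are copied from the question, and the output follows the newline-delimited format of the bulk API (one action line followed by one source line per document).

```python
import json


def items_to_bulk(index, doc_type, items):
    """Build a bulk API body that indexes every item as a separate document."""
    lines = []
    for item in items:
        # Action line: tells Elasticsearch where the next document goes.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the item itself becomes the whole document.
        lines.append(json.dumps(item))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


items = [
    {"name": "AAM219 -- reading lists",
     "link": "http://www.test.com/modules/aam219.html",
     "description": "AAM219 -- reading lists",
     "terms": {"term": ["AAM219"]}},
    {"name": "AAR410 -- reading lists",
     "link": "http://www.test.com/modules/aar410.html",
     "description": "AAR410 -- reading lists",
     "terms": {"term": ["AAR410"]}},
]

body = items_to_bulk("lib-recommender-2015.06.11", "recom_un", items)
print(body)
```

With one document per item, an ordinary query returns only the matching item as a hit, and the outer nested level on item is no longer needed.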

Props to Val's answer above. It was mostly what I needed, but with another level of nesting. Here is the mapping:

{
  "recom_un": {
    "properties": {
      "item": {
        "type": "nested",
        "properties": {
          "name": {
            "type": "string"
          },
          "link": {
            "type": "string"
          },
          "description": {
            "type": "string"
          },
          "terms": {
            "type": "nested",
            "properties": {
              "term": {
                "type": "string"
              }
            }
          }
        }
      }
    }
  }
}

The search query I used to get what I wanted:

curl -XPOST "http://localhost:9200/lib-recommender/recom_un/_search" -d'
{
  "_source": false,
  "query": {
    "filtered": {
      "filter": {
        "nested": {
          "path": "item",
          "query": {
            "nested": {
              "path": "item.terms",
              "query": {
                "match": {
                  "term": "AAM219"
                }
              }
            }
          },
          "inner_hits": { }
        }
      }
    }
  }
}'
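
As for why "match" worked where "term" did not on an analyzed string field, here is a rough Python sketch. The analyze function is only a crude stand-in for the standard analyzer (split on non-word characters, then lowercase), and the function names are invented for this example.

```python
import re


def analyze(text):
    """Crude approximation of the standard analyzer: tokenize and lowercase."""
    return re.findall(r"\w+", text.lower())


# What ends up in the index for an analyzed string field holding "AAM219".
indexed_terms = set(analyze("AAM219"))


def term_query(value):
    # term: the value is looked up exactly as given, with no analysis.
    return value in indexed_terms


def match_query(value):
    # match: the query text is analyzed first, then each token is looked up.
    return any(token in indexed_terms for token in analyze(value))


print(term_query("AAM219"))   # False: the index holds "aam219"
print(match_query("AAM219"))  # True: the query is lowercased before lookup
print(term_query("aam219"))   # True: exact match on the indexed token
```

With the field set to not_analyzed, the indexed value would stay as AAM219 and the original term query would match it directly.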
