I'm having trouble writing an address search query in Elasticsearch.

The address is stored in ES with the following structure:
Address { street, city, zipcode }

Here is a sample query, followed by the hits it returns:

GET /adr-address/_search
{   
  "query": {
    "multi_match": {
      "query":       "mainstreet, houston",
      "type":        "most_fields",
      "fields":      [ "street", "city", "zipcode"]
    }
  }
}

"hits": [
 {
      "_source": {
       "id": "S6v4xyO8UE5NRcWtmMATPQ==",
       "street": "Houston 2nd Avenue",
       "zipcode": "8032",
       "city": "Houston"
    }
 },
 {
    "_source": {
       "id": "aLgQFrO8zCT8m88lAnYZPQ==",
       "street": "Houston 1st Avenue",
       "zipcode": "8044",
       "city": "Houston"
    }
 },
 {
    "_source": {
       "id": "aLgQFrO8zCT8m88lAnYZPQ==",
       "street": "mainstreet",
       "zipcode": "8044",
       "city": "Houston"
    }
 },

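The multi_match query works fine most of the time, except when the street contains the city name as well. Elasticsearch ranks those results higher: for "mainstreet, houston", the "Houston 2nd Avenue" and "Houston 1st Avenue" documents come back ahead of the "mainstreet" one. That is understandable, but not acceptable for an address search.

Here is the output of the validate API with explain: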

GET /adr-address/_validate/query?explain
{
  "query": {
    "multi_match": {
      "query":       "mainstreet, houston",
      "type":        "most_fields",
      "fields":      [ "street", "city", "zipcode" ]
    }
  }
}

{
   "valid": true,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "explanations": [
      {
         "index": "adr-address",
         "valid": true,
         "explanation": "(zipcode:mainstreet zipcode:houston) (street:mainstreet street:houston) (city:mainstreet city:houston)"
      }
   ]
}

Note that the Google Maps API returns accurate results for the same query.

Assumptions/conditions so far:

  1. Tokenizers split on: space, comma, numbers, etc.
  2. The input can contain a multi-word street name, zip code, or city in any order

Any suggestions on how I could improve the search results?

  • I don't know, but have you tried to change the order: [ "city", "zipcode", "street" ] ? Commented Apr 25, 2016 at 11:54
  • Yes, but it didn't help. The explain output also shows that it searches for all the terms in every field. Commented Apr 25, 2016 at 13:26
  • I guess the copy_to option is what I need: copy all the values to a new field and run the search there. elastic.co/guide/en/elasticsearch/guide/current/… I should know whether this works by tomorrow. Commented Apr 25, 2016 at 14:31

1 Answer

Try using cross_fields instead of most_fields as the type of the multi_match query.

From the docs:

The cross_fields type is particularly useful with structured documents where multiple fields should match. For instance, when querying the first_name and last_name fields for “Will Smith”, the best match is likely to have “Will” in one field and “Smith” in the other.

The most_fields type that you are using, by contrast, is meant for querying several fields that contain the same text analyzed in different ways.
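To make the difference concrete, here is a toy sketch in Python. It is not how Elasticsearch computes relevance: it only counts matching terms, and the helper names (tokenize, most_fields_score, cross_fields_score) are made up for illustration. It assumes most_fields behaves roughly like "score each field against every term and add the field scores up", while cross_fields behaves roughly like "score each term once, using its best field".

```python
# Toy model only: counts matching terms instead of computing real relevance scores.

def tokenize(text):
    """Lowercase and split on commas/whitespace (a stand-in for the real analyzer)."""
    return text.lower().replace(",", " ").split()


def most_fields_score(query, doc, fields):
    """Field-centric: each field is checked against all terms, field scores are summed."""
    terms = tokenize(query)
    return sum(
        sum(1 for term in terms if term in tokenize(doc[field]))
        for field in fields
    )


def cross_fields_score(query, doc, fields):
    """Term-centric: each term counts once, taking its best match across the fields."""
    terms = tokenize(query)
    return sum(
        max(1 if term in tokenize(doc[field]) else 0 for field in fields)
        for term in terms
    )


FIELDS = ["street", "city", "zipcode"]
docs = [
    {"street": "Houston 2nd Avenue", "city": "Houston", "zipcode": "8032"},
    {"street": "mainstreet", "city": "Houston", "zipcode": "8044"},
]
query = "mainstreet, houston"

for doc in docs:
    print(
        doc["street"],
        most_fields_score(query, doc, FIELDS),
        cross_fields_score(query, doc, FIELDS),
    )
```

In this toy model, the field-centric count gives both documents the same score, because "houston" is counted once for street and once for city in the first document. The term-centric count rewards the document that covers both query terms. Real scores also depend on term frequency, field length and index statistics, so treat this only as intuition.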

Example query:

GET /adr-address/_search
{   
  "query": {
    "multi_match": {
      "query":       "mainstreet, houston",
      "type":        "cross_fields",
      "fields":      [ "street", "city", "zipcode"]
    }
  }
}
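If you build the request from application code, the body can be assembled as a plain dictionary. The sketch below is only an illustration: build_address_query is a hypothetical helper, not part of any Elasticsearch client, and it only produces the JSON body without sending it. The optional minimum_should_match parameter is a standard multi_match option; whether a value such as 2 suits your data is something to verify against your own index.

```python
import json


def build_address_query(text, fields, minimum_should_match=None):
    """Build a multi_match request body of type cross_fields (hypothetical helper)."""
    multi_match = {
        "query": text,
        "type": "cross_fields",
        "fields": list(fields),
    }
    # Only add the parameter when the caller asks for it.
    if minimum_should_match is not None:
        multi_match["minimum_should_match"] = minimum_should_match
    return {"query": {"multi_match": multi_match}}


body = build_address_query(
    "mainstreet, houston",
    ["street", "city", "zipcode"],
    minimum_should_match=2,
)
# Serialize to the JSON that would be sent to /adr-address/_search.
print(json.dumps(body, indent=2))
```

Calling it without minimum_should_match yields the same body as the example query above.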

2 Comments

Yup, that's exactly what I'm trying now and it looks very promising. I'll mark this as the accepted answer once I'm done tomorrow.
Works perfectly! Here is my final query: GET /adr-address/_validate/query?explain { "query": { "multi_match": { "query": "mainstreet, houston", "type": "cross_fields", "minimum_should_match": 2, "fields": [ "street", "city", "zipcode", "state" ] } } }
