
I have added 15k records to the Elasticsearch index products_idx1 with type product.

The records contain product names such as apple iphone 6, but when I search for iphone6 the result is empty.

Here is my PHP Elasticsearch code:

<?php

use Elasticsearch\ClientBuilder;

require 'vendor/autoload.php';

$client = ClientBuilder::create()->build();

// Fields searched by the multi_match clause
$values = ['name', 'name.prefix', 'name.suffix', 'sku'];

$params = [
    'index'  => 'products_idx1',
    'client' => ['verify' => 1, 'connect_timeout' => 5],
    'from'   => 0,
    'size'   => 25,
    'body'   => [
        'query' => [
            'bool' => [
                'should' => [
                    [
                        'multi_match' => [
                            'query'    => 'iphone6',
                            'type'     => 'cross_fields',
                            'fields'   => $values,
                            'operator' => 'OR',
                        ],
                    ],
                    [
                        'match' => [
                            'all' => [
                                'query'     => 'iphone6',
                                'operator'  => 'OR',
                                'fuzziness' => 'AUTO',
                            ],
                        ],
                    ],
                ],
            ],
        ],
        'sort' => ['_score' => ['order' => 'desc']],
    ],
];

$response = $client->search($params);
echo "<pre>";
print_r($response);
  • Do you get results for just "iphone"? Commented Sep 8, 2020 at 15:00
  • No, I just gave that as an example. If someone searches for something like appleiphone, it should also return results. Should I create a search analyzer for that? Commented Sep 8, 2020 at 15:08
  • @Nate Right now I get zero results when I search for 'iphone6'. Commented Sep 8, 2020 at 15:09
  • You need tokens that match, and because ES splits text into tokens on whitespace by default, I think that's why you get 0 results with those queries. Prefix queries might help with some of this, and you could set up another field variant where all spaces are removed as an alternate analysis. There are a lot of options. Commented Sep 8, 2020 at 16:47
  • @Nate I do not know the correct method for that. If you give me a reference link, I can research it and implement it in my project. Commented Sep 8, 2020 at 17:09

2 Answers

Using the shingle and pattern_replace token filters, it is possible to get results for all three search terms mentioned in the question and comments: iphone, iphone6 and appleiphone. A complete example follows.

As explained in the comments, the tokens generated from the search term at search time must match the tokens generated from the indexed document at index time in order to get a result. I achieved this by creating a custom analyzer.
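
To make the token-matching idea concrete, here is a rough pure-PHP simulation of the analyzer chain used in this answer. It is only a sketch: it assumes a plain whitespace split in place of the standard tokenizer and two-word shingles (the shingle filter's default size), and the function names `simulateTextAnalyzer` and `wouldMatch` are made up for illustration. It does not call Elasticsearch at all.

```php
<?php
// Approximates: standard tokenizer -> shingle -> lowercase -> remove spaces.
// Assumption: whitespace splitting stands in for the standard tokenizer.
function simulateTextAnalyzer(string $text): array
{
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $tokens = [];
    foreach ($words as $i => $word) {
        $tokens[] = $word;                             // single-word token
        if (isset($words[$i + 1])) {
            $tokens[] = $word . ' ' . $words[$i + 1];  // two-word shingle
        }
    }
    // Lowercase each token, then drop the space that joins a shingle.
    $tokens = array_map(function ($token) {
        return str_replace(' ', '', strtolower($token));
    }, $tokens);

    return array_values(array_unique($tokens));
}

// A document can match when at least one search token is also an index token.
function wouldMatch(string $indexedText, string $query): bool
{
    $shared = array_intersect(
        simulateTextAnalyzer($query),
        simulateTextAnalyzer($indexedText)
    );
    return count($shared) > 0;
}

foreach (['iphone', 'iphone6', 'appleiphone', 'appleiphone6'] as $term) {
    echo $term . ': ' . (wouldMatch('apple iphone 6', $term) ? 'match' : 'no match') . PHP_EOL;
}
```

Under these assumptions the first three terms match, while appleiphone6 does not, because no three-word shingle is produced.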

Index mapping

{
  "settings": {
    "analysis": {
      "analyzer": {
        "text_analyzer": {
          "tokenizer": "standard",
          "filter": [
            "shingle",
            "lowercase",
            "space_filter"
          ]
        }
      },
      "filter": {
        "space_filter": {
          "type": "pattern_replace",
          "pattern": " ",
          "replacement": "",
          "preserve_original": true
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "text_analyzer"
      }
    }
  }
}
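
For readers using the PHP client from the question, the same settings and mapping can be expressed as a request array for `$client->indices()->create()`. This is a sketch, not a tested setup: the helper name `buildCreateIndexParams` and the index name `products_idx2` are hypothetical. The optional type name reflects the fact that Elasticsearch 6.x expects mappings to be nested under a type by default, which may be the error reported on 6.8.0 in the comments on this answer.

```php
<?php
// Builds the create-index request with the analyzer settings shown above.
// $typeName: pass a type (e.g. 'product') on Elasticsearch 6.x, null on 7.x+.
function buildCreateIndexParams(string $index, ?string $typeName = null): array
{
    $mappings = [
        'properties' => [
            'title' => ['type' => 'text', 'analyzer' => 'text_analyzer'],
        ],
    ];
    if ($typeName !== null) {
        $mappings = [$typeName => $mappings]; // 6.x: nest under the type name
    }

    return [
        'index' => $index,
        'body'  => [
            'settings' => [
                'analysis' => [
                    'analyzer' => [
                        'text_analyzer' => [
                            'tokenizer' => 'standard',
                            'filter'    => ['shingle', 'lowercase', 'space_filter'],
                        ],
                    ],
                    'filter' => [
                        'space_filter' => [
                            'type'              => 'pattern_replace',
                            'pattern'           => ' ',
                            'replacement'       => '',
                            'preserve_original' => true,
                        ],
                    ],
                ],
            ],
            'mappings' => $mappings,
        ],
    ];
}

// Usage with a client from ClientBuilder::create()->build()
// (index name is a placeholder):
// $client->indices()->create(buildCreateIndexParams('products_idx2', 'product'));
```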

Index your sample doc

{
  "title" : "apple iphone 6" 
}

Search query for appleiphone with result

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": "appleiphone"
          }
        }
      ]
    }
  }
}

Result

"hits": [
      {
        "_index": "ana",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.3439677,
        "_source": {
          "title": "apple iphone 6",
          "title_normal": "apple iphone 6"
        }
      }
    ]

Search query for iphone6 with result

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": "iphone6"
          }
        }
      ]
    }
  }
}

Result

 "hits": [
      {
        "_index": "ana",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.3439677,
        "_source": {
          "title": "apple iphone 6",
          "title_normal": "apple iphone 6"
        }
      }
    ]

And last but not least, the search query for iphone

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": "iphone"
          }
        }
      ]
    }
  }
}

Result

"hits": [
      {
        "_index": "ana",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.3439677,
        "_source": {
          "title": "apple iphone 6",
          "title_normal": "apple iphone 6"
        }
      }
    ]
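
The three queries above could be issued from PHP roughly as follows. This is a sketch assuming the elasticsearch-php client from the question. The helper names are hypothetical, and a plain `match` is used instead of wrapping it in `bool`/`should`, which is equivalent for a single clause.

```php
<?php
// Builds a search request for one term against the analyzed "title" field.
function buildTitleSearchParams(string $index, string $term, int $size = 25): array
{
    return [
        'index' => $index,
        'size'  => $size,
        'body'  => [
            'query' => [
                'match' => ['title' => $term],
            ],
        ],
    ];
}

// Runs the three example terms and returns the hits for each one.
// $client is expected to be an elasticsearch-php client instance.
function searchExampleTerms($client, string $index): array
{
    $results = [];
    foreach (['appleiphone', 'iphone6', 'iphone'] as $term) {
        $response = $client->search(buildTitleSearchParams($index, $term));
        $results[$term] = $response['hits']['hits'];
    }
    return $results;
}

// Usage ('ana' is the index name that appears in the results above):
// print_r(searchExampleTerms($client, 'ana'));
```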

6 Comments

Thanks for your effort, but when I search for 'iphone6' it still returns no data. If I add a new doc 'apple iphone x' and then search for 'iphonex', it returns 'apple iphone x'. Why does it not find iphone6?
But I have created a new index and inserted a new document into it.
@NirajPatel This is strange. You can see in my answer that it works, and I've personally tried it locally. I hope you indexed apple iphone 6 before searching for iphone6?
I accepted your answer, gave it the 50 points and upvoted it, but I am still unable to search for iphone6.
I am using Elasticsearch 6.8.0, and when I ran your index analyzer code it gave me an error, so I added some code before "properties".

As my first answer is already very long, I am adding the information about the analyze API in a separate answer for readability, and for readers who are not very familiar with analyzers in Elasticsearch and how they work.

As @Niraj mentioned in a comment on my previous answer, other documents work but the iphone6 query does not. The analyze API is very useful for debugging this kind of issue.

First, check the index-time tokens for the document that you think should match your search query, in this case apple iphone 6:

POST http://{{hostname}}:{{port}}/{{index}}/_analyze

{
  "text" : "apple iphone 6",
  "analyzer" : "text_analyzer"
}

And the generated tokens:

{
  "tokens": [
    {
      "token": "apple",
      "start_offset": 0,
      "end_offset": 5,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "appleiphone",
      "start_offset": 0,
      "end_offset": 12,
      "type": "shingle",
      "position": 0,
      "positionLength": 2
    },
    {
      "token": "iphone",
      "start_offset": 6,
      "end_offset": 12,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "iphone6", //note this carefully
      "start_offset": 6,
      "end_offset": 14,
      "type": "shingle",
      "position": 1,
      "positionLength": 2
    },
    {
      "token": "6",
      "start_offset": 13,
      "end_offset": 14,
      "type": "<NUM>",
      "position": 2
    }
  ]
}

As you can see, the analyzer we use also creates iphone6 as a token. Now check the search-time tokens:

{
  "text" : "iphone6",
  "analyzer" : "text_analyzer"
}

And the tokens:

{
    "tokens": [
        {
            "token": "iphone6",
            "start_offset": 0,
            "end_offset": 7,
            "type": "<ALPHANUM>",
            "position": 0
        }
    ]
}

Notice that the search-time analysis also produces iphone6 as a token, which is present among the index-time tokens as well. That is why the document matches the search query, as shown in the complete example in my first answer.
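
The same comparison can be scripted from PHP. The sketch below assumes the elasticsearch-php client, whose `indices()->analyze()` method takes an `index` and a `body`. The helper names and the index name `my_index` are hypothetical, and the sample arrays are abbreviated copies of the analyze output above, so the block runs without a cluster.

```php
<?php
// Request parameters for $client->indices()->analyze() in elasticsearch-php.
function buildAnalyzeParams(string $index, string $text, string $analyzer = 'text_analyzer'): array
{
    return [
        'index' => $index,
        'body'  => ['analyzer' => $analyzer, 'text' => $text],
    ];
}

// Pulls the plain token strings out of an _analyze response.
function tokenStrings(array $analyzeResponse): array
{
    return array_column($analyzeResponse['tokens'] ?? [], 'token');
}

// Search-time tokens that also exist among the index-time tokens.
function sharedTokens(array $indexResponse, array $searchResponse): array
{
    return array_values(array_intersect(
        tokenStrings($searchResponse),
        tokenStrings($indexResponse)
    ));
}

// Abbreviated copies of the two analyze responses shown above.
$indexTime = ['tokens' => [
    ['token' => 'apple'], ['token' => 'appleiphone'], ['token' => 'iphone'],
    ['token' => 'iphone6'], ['token' => '6'],
]];
$searchTime = ['tokens' => [['token' => 'iphone6']]];

print_r(sharedTokens($indexTime, $searchTime));

// Against a live cluster (index name is a placeholder):
// $indexTime  = $client->indices()->analyze(buildAnalyzeParams('my_index', 'apple iphone 6'));
// $searchTime = $client->indices()->analyze(buildAnalyzeParams('my_index', 'iphone6'));
```

An empty result from `sharedTokens` would mean the query cannot match that document with this analyzer, which is a quick way to spot a mismatch between the index that was analyzed and the index that is actually being searched.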
