7

I want to search in Elasticsearch using a regular expression that contains whitespace. I have already set my field to not_analyzed, and its mapping looks like this:

"type1": {
   "properties": {
      "field1": {
         "type": "string",
         "index": "not_analyzed",
         "store": true
      }
   }
}

I indexed two values for testing:

"field1":"XXX YYY ZZZ"
"field1":"XXX ZZZ YYY"

Then I ran a regex query, /XXX YYY/
(I want this query to find record1 but not record2):

{
    "query": {
        "query_string": {
           "query": "/XXX YYY/"
        }
    }
}

But it returns 0 results.

However, if I search without the regex (that is, without the forward slashes '/'), both record1 and record2 are returned.

Is it the case that, in Elasticsearch, I cannot use a regex query that contains a space?

1

3 Answers

1

What you need is a "term" query, which doesn't tokenise the search query by breaking it down into smaller parts. More about the term query here: https://www.elastic.co/guide/en/elasticsearch/reference/2.0/query-dsl-term-query.html

There is a special kind of term-level query, the regexp query, that lets you use regular expressions. It should match whitespace as well: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html


0

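You can keep using your query_string query; your regexp is just missing a small part, namely the .* at the end. Lucene regular expressions are anchored, so the pattern has to match the whole term, and /XXX YYY/ on its own only matches a value that is exactly XXX YYY. If you run the following, you'll get the single result you expect: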

{
    "query": {
        "query_string": {
           "query": "/XXX YYY.*/"
        }
    }
}
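To see why the trailing .* matters, here is a minimal Python sketch of the anchoring behaviour. It is only an emulation under stated assumptions: Python's re.fullmatch stands in for Lucene's anchored matching, the helper name anchored_matches is made up for illustration, and each value is treated as one whole term because field1 is not_analyzed. Lucene's regex syntax differs from Python's in general, but these simple patterns mean the same thing in both.

```python
import re

# The two test values from the question. With a not_analyzed field,
# each value is stored as one single term (assumption of this sketch).
terms = ["XXX YYY ZZZ", "XXX ZZZ YYY"]


def anchored_matches(pattern, values):
    """Return the values that the pattern matches in full (anchored at both ends)."""
    compiled = re.compile(pattern)
    return [value for value in values if compiled.fullmatch(value)]


# Pattern without a wildcard: must equal the whole term, so nothing matches.
print(anchored_matches("XXX YYY", terms))

# Pattern with a trailing .*: matches terms that start with "XXX YYY".
print(anchored_matches("XXX YYY.*", terms))
```

The first call returns no values and the second returns only "XXX YYY ZZZ", which mirrors the 0 results and the single result described above.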

4 Comments

This does not work for my use-case. "/XXX YYY.*/" will match a string like XXX YYY blah blah blah. However, "/XXX YYY .* something else/" will not match XXX YYY blah blah something else... No clue yet as to why...
@Tabbernaut feel free to create a new question for your specific issue
Figured out a way: "/XXX YYY.*/" AND "/.*something else/" works. It just can't be done in a single clause, I guess.
@Tabbernaut not sure about your use case, feel free to create a new question
-1

You can use a regexp query to achieve this. Mind you, query performance may be slow. The query below searches for all documents in which the value of field1 contains "XXX YYY".

POST <index_name>/type1/_search
{
   "query": {
      "regexp": {
         "field1": ".*XXX YYY.*"
      }
   }
}
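If you build this request from code, a small sketch along these lines may help. The helper name build_contains_regexp_query is hypothetical, and it deliberately does not escape regex special characters in the text, so it is only safe for plain literals such as the one in the question.

```python
import json


def build_contains_regexp_query(field, text):
    """Build a regexp query body that matches terms containing `text`.

    Hypothetical helper for illustration; `text` is inserted as-is,
    so regex special characters are NOT escaped.
    """
    return {"query": {"regexp": {field: ".*" + text + ".*"}}}


body = build_contains_regexp_query("field1", "XXX YYY")

# Serialize to JSON; this is the body to POST to <index_name>/type1/_search.
print(json.dumps(body, indent=3))
```

The leading and trailing .* are what turn the anchored pattern into a "contains" search.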

2 Comments

I don't think that recognizes the space character. I also tried \s, but that did not work.
@AbtPst Why do you say that? Note that the field "field1" is marked as "not_analyzed", so Elasticsearch will not tokenize it on whitespace. So my query will work.
