I've inserted a document with a raw_id field equal to 1.2.3.04ABC, and I'm trying to construct a regular expression query to search the document in ES. I'm using the following query:

curl -X POST 'http://localhost:9200/hello/world/_search' -d '{
  "query": {
    "regexp": {
      "raw_id": "1\\.2\\.3\\.04ABC"
    }
  }
}'

This returns an empty result:

{
    "took":1,
    "timed_out":false,
    "_shards": {
        "total":5,
        "successful":5,
        "failed":0
    },
    "hits": {
        "total":0,
        "max_score":null,
        "hits":[]
    }
}

On the other hand, when I use the slightly modified query

curl -X POST 'http://localhost:9200/hello/world/_search' -d '{
  "query": {
    "regexp": {
      "raw_id": "1\\.2\\.3.*"
    }
  }
}'

I get the nonempty result:

{
    "_shards": {
        "failed": 0,
        "successful": 5,
        "total": 5
    },
    "hits": {
        "hits": [
            {
                "_id": "adfafadfafa",
                "_index": "hello",
                "_score": 1.0,
                "_source": {
                    "raw_id": "1.2.3.04ABC"
                },
                "_type": "world"
            }
        ],
        "max_score": 1.0,
        "total": 1
    },
    "timed_out": false,
    "took": 2
}

Can someone please help me understand why the first query doesn't work?

Comment (Jul 24, 2015 at 20:40): Not gonna lie, according to the docs that should work. The only thing I can think of is that regex indexing isn't enabled for raw_id.

1 Answer

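My guess is that your raw_id field is an analyzed string, while it should be not_analyzed. A regexp query is not analyzed and has to match an entire indexed term, and an analyzed string field is lowercased by the default analyzer, so the indexed term would most likely be 1.2.3.04abc, which the uppercase ABC in your pattern cannot match. To check this, I used the following mapping with an analyzed string field id and a not_analyzed string field raw_id: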

curl -XPUT 'http://localhost:9200/hello' -d '{
  "mappings": {
    "world": {
      "properties": {
        "id": {
          "type": "string"
        },
        "raw_id": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}'

Then I indexed the following document:

curl -XPUT 'http://localhost:9200/hello/world/1' -d '{
  "id": "1.2.3.04ABC",
  "raw_id": "1.2.3.04ABC"
}'

Now taking your query above, if I search against the id field, I get no hits:

curl -XPOST 'http://localhost:9200/hello/world/_search' -d '{
  "query": {
    "regexp": {
      "id": "1\\.2\\.3\\.04ABC"
    }
  }
}'
=> 0 hits KO

However, I do get one hit when I search against the raw_id field:

curl -XPOST 'http://localhost:9200/hello/world/_search' -d '{
  "query": {
    "regexp": {
      "raw_id": "1\\.2\\.3\\.04ABC"
    }
  }
}'
=> 1 hit OK
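
A side note on the doubled backslashes in these queries: they are JSON string escaping, so the regular expression Elasticsearch actually receives is 1\.2\.3\.04ABC. If you build the request body in code instead of typing it by hand, a JSON serializer adds that escaping for you. A minimal sketch in Python, where the helper name regexp_query is made up for illustration and is not part of any Elasticsearch client:

```python
import json


def regexp_query(field, pattern):
    """Return a JSON request body for a regexp query on a single field.

    `regexp_query` is a hypothetical helper for this example only.
    """
    return json.dumps({"query": {"regexp": {field: pattern}}})


# The raw string holds the regex with single backslashes;
# json.dumps doubles them when it serializes the string.
body = regexp_query("raw_id", r"1\.2\.3\.04ABC")
print(body)
```

The printed body can then be passed to curl with -d, or sent with whatever HTTP client you already use.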

With your second query I get a hit with each field:

curl -XPOST 'http://localhost:9200/hello/world/_search' -d '{
  "query": {
    "regexp": {
      "id": "1\\.2\\.3.*"
    }
  }
}'
=> 1 hit OK

curl -XPOST 'http://localhost:9200/hello/world/_search' -d '{
  "query": {
    "regexp": {
      "raw_id": "1\\.2\\.3.*"
    }
  }
}'
=> 1 hit OK
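
One plausible reading of these four results: the regexp pattern is not analyzed and must match a whole indexed term, while an analyzed string field lowercases its text at index time. Below is a rough emulation in Python, not Elasticsearch itself. It assumes the analyzed id field holds the single lowercased term 1.2.3.04abc (an assumption consistent with the results above, not verified analyzer output), and it uses Python's re module as a stand-in for Lucene's regex engine, which is close enough for these simple patterns.

```python
import re

DOC_VALUE = "1.2.3.04ABC"

# Assumed indexed terms, for illustration only:
# the analyzed field is lowercased, the not_analyzed field is stored verbatim.
INDEXED_TERMS = {
    "id": [DOC_VALUE.lower()],
    "raw_id": [DOC_VALUE],
}


def regexp_hits(field, pattern):
    """Count indexed terms that the pattern matches in full (whole-term match)."""
    return sum(1 for term in INDEXED_TERMS[field] if re.fullmatch(pattern, term))


for field in ("id", "raw_id"):
    for pattern in (r"1\.2\.3\.04ABC", r"1\.2\.3.*"):
        print(field, pattern, "=>", regexp_hits(field, pattern), "hit(s)")
```

Under that assumption the emulation reproduces the pattern seen above: only the full pattern against the analyzed field comes back empty, because the uppercase ABC has nothing to match.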