I have documents in Elasticsearch and can't work out how to write a script query that returns a document if any of its attachments has no uuid or a null uuid. The Elasticsearch version is 5.2. The mapping of the documents:

"mappings": {
    "documentType": {
        "properties": {
            "attachment": {
                "properties": {
                    "uuid": {
                        "type": "text"
                    },
                    "path": {
                        "type": "text"
                    },
                    "size": {
                        "type": "long"
                    }
                }
            }
        }
    }
}

In Elasticsearch the documents look like this:

{
    "_index": "documents",
    "_type": "documentType",
    "_id": "1",
    "_score": 1.0,
    "_source": {
        "attachment": [
            {
                "uuid": "21321321",
                "path": "../uploads/somepath",
                "size": 1231
            },
            {
                "path": "../uploads/somepath",
                "size": 1231
            }
        ]
    }
},
{
    "_index": "documents",
    "_type": "documentType",
    "_id": "2",
    "_score": 1.0,
    "_source": {
        "attachment": [
            {
                "uuid": "223645641321321",
                "path": "../uploads/somepath",
                "size": 1231
            },
            {
                "uuid": "22341424321321",
                "path": "../uploads/somepath",
                "size": 1231
            }
        ]
    }
},
{
    "_index": "documents",
    "_type": "documentType",
    "_id": "3",
    "_score": 1.0,
    "_source": {
        "attachment": [
            {
                "uuid": "22789789341321321",
                "path": "../uploads/somepath",
                "size": 1231
            },
            {
                "path": "../uploads/somepath",
                "size": 1231
            }
        ]
    }
}

As a result I want to get the documents with _id 1 and 3, but instead I get a script error. This is the query I tried:

{
    "query": {
        "bool": {
            "must": [
                {
                    "exists": {
                        "field": "attachment"
                    }
                },
                {
                    "script": {
                        "script": {
                            "inline": "for (item in doc['attachment'].value) { if (item['uuid'] == null) { return true}}",
                            "lang": "painless"
                        }
                    }
                }
            ]
        }
    }
}

The error is:

 "root_cause": [
            {
                "type": "script_exception",
                "reason": "runtime error",
                "script_stack": [
                    "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:77)",
                    "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:36)",
                    "for (item in doc['attachment'].value) { ",
                    "                 ^---- HERE"
                ],
                "script": "for (item in doc['attachment'].value) { if (item['uuid'] == null) { return true}}",
                "lang": "painless"
            }
        ],

Is it possible to select documents in which at least one attachment object doesn't contain a uuid?

2 Answers

Iterating arrays of objects is not as trivial as one would expect. I've written extensively about it here and here.

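Since your attachments are not defined as nested, ES will internally represent them as flattened lists of values (also called "doc values"). For instance, attachment.uuid in doc#2 will become ["223645641321321", "22341424321321"], and attachment.size will turn into [1231, 1231]. This is also why doc['attachment'] fails in your script: the attachment object itself is not a field that a script can read, only its leaf fields such as attachment.uuid and attachment.size are.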

This means that you can simply compare the .length of these flattened representations! I assume attachment.size will always be present and can be thus taken as the comparison baseline.
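
To make the idea concrete, here is a minimal Python sketch that mimics the flattening for the three sample documents from the question. It only models the length comparison, not how Elasticsearch stores values internally (analysis and fielddata details are ignored), and the helper names flatten_field and has_attachment_without_uuid are made up for illustration.

```python
# Attachment arrays of the three sample documents, keyed by _id.
docs = {
    "1": [
        {"uuid": "21321321", "path": "../uploads/somepath", "size": 1231},
        {"path": "../uploads/somepath", "size": 1231},
    ],
    "2": [
        {"uuid": "223645641321321", "path": "../uploads/somepath", "size": 1231},
        {"uuid": "22341424321321", "path": "../uploads/somepath", "size": 1231},
    ],
    "3": [
        {"uuid": "22789789341321321", "path": "../uploads/somepath", "size": 1231},
        {"path": "../uploads/somepath", "size": 1231},
    ],
}


def flatten_field(attachments, field):
    """Collect the non-null values of one sub-field across all attachment objects."""
    return [a[field] for a in attachments if a.get(field) is not None]


def has_attachment_without_uuid(attachments):
    """True when there are fewer uuid values than size values (size is the baseline)."""
    uuid_values = flatten_field(attachments, "uuid")
    size_values = flatten_field(attachments, "size")
    return len(uuid_values) < len(size_values)


matching_ids = [doc_id for doc_id, atts in docs.items() if has_attachment_without_uuid(atts)]
print(matching_ids)  # ['1', '3']
```

Documents 1 and 3 each yield one uuid value against two size values, so they match, while document 2 yields two of each and is skipped.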

One more thing: to read these values for a text field from a script, one small mapping change is required, namely enabling fielddata on attachment.uuid:

PUT documents/documentType/_mappings
{
  "properties": {
    "attachment": {
      "properties": {
        "uuid": {
          "type": "text",
          "fielddata": true
        },
        "path": {
          "type": "text"
        },
        "size": {
          "type": "long"
        }
      }
    }
  }
}

Once that's done, reindex your docs. This little Update By Query trick does it in place:

POST documents/_update_by_query

You can then use the following script query:

POST documents/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "attachment"
          }
        },
        {
          "script": {
            "script": {
              "inline": "def size_field_length = doc['attachment.size'].length; def uuid_field_length =  doc['attachment.uuid'].length; return uuid_field_length < size_field_length",
              "lang": "painless"
            }
          }
        }
      ]
    }
  }
}

3 Comments

Thank you for your response. I will check it. In the future I will probably reindex and change the attachment type to nested.
You're welcome. Making the attachments nested is a good approach. Just keep in mind that scripting with nested fields is a chapter of its own. The links at the top of my answer should point you in the right direction, though!
It works. I have reindexed and run your script and it works. Thank you a lot.

Just to supplement this answer: if the mapping for the uuid field was created automatically, Elasticsearch defines it like this:

"uuid": {
    "type": "text",
    "fields": {
        "keyword": {
            "type": "keyword",
            "ignore_above": 256
        }
    }
}

Then the script could look like this:

POST documents/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "exists": {
                        "field": "attachment"
                    }
                },
                {
                    "script": {
                        "script": {
                            "inline": "doc['attachment.size'].length > doc['attachment.uuid.keyword'].length",
                            "lang": "painless"
                        }
                    }
                }
            ]
        }
    }
}
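
Since the two answers differ only in which field the script reads, the request body can be generated from a small helper. The following is a sketch in Python: build_missing_uuid_query is a hypothetical name, the body simply mirrors the queries shown in the answers, and sending it to the cluster is left out.

```python
import json


def build_missing_uuid_query(uuid_field="attachment.uuid.keyword",
                             baseline_field="attachment.size"):
    """Build the search body; baseline_field is assumed to exist on every attachment."""
    script = "doc['{0}'].length > doc['{1}'].length".format(baseline_field, uuid_field)
    return {
        "query": {
            "bool": {
                "must": [
                    {"exists": {"field": "attachment"}},
                    {"script": {"script": {"inline": script, "lang": "painless"}}},
                ]
            }
        }
    }


body = build_missing_uuid_query()
print(json.dumps(body, indent=2))
```

Passing uuid_field="attachment.uuid" instead gives the variant that relies on fielddata being enabled on the text field.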

1 Comment

You are right, your suggestion also works fine, even without reindexing.
