
I am trying to filter on the cluster field and on whether the podName field has a value. I then want to exclude documents whose fields have specific values, but I also get results for clusters other than the one specified.

For example, the following query also returns documents for cluster2 and cluster3.

I can't figure out what the correct syntax is.

{
   "size":50,
   "query":{
      "bool":{
         "must":[
            {
               "range":{
                  "timestamp":{
                     "gte":"now-1h"
                  }
               }
            },
            {
               "query_string":{
                  "query":"(podstatus.podName:* AND cluster:cluster1) AND NOT podstatus.containerStatus:true AND NOT podstatus.phase:Running AND NOT podstatus.phase:Succeeded AND NOT podstatus.started: true"
               }
            }
         ]
      }
   }
}

Sample document

{
    "timestamp":  "2020-07-09T17:30:04",
    "cluster":  "cluster1",
    "namespace":  "kube-system",
    "podstatus.podName":  "cronjob-kubernetes-resource-monitor-1594233600-4frbc",
    "podstatus.containerStatus":  "false",
    "podstatus.restartCount":  0,
    "podstatus.started":  "false",
    "podstatus.phase":  "Succeeded"
}

Mapping

{
    "cluster-resources-cluster1-2020.07.08-000001" : {
      "mappings" : {
        "properties" : {
          "allocated" : {
            "properties" : {
              "pods-percent" : {
                "type" : "float"
              }
            }
          },
          "capacity" : {
            "properties" : {
              "cpu" : {
                "type" : "long"
              },
              "mem" : {
                "type" : "long"
              },
              "pods" : {
                "type" : "long"
              }
            }
          },
          "cluster" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "depstatus" : {
            "properties" : {
              "availableReplicas" : {
                "type" : "long"
              },
              "deploymentName" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "readyReplicas" : {
                "type" : "long"
              },
              "replicas" : {
                "type" : "long"
              },
              "unavailableReplicas" : {
                "type" : "long"
              },
              "updatedReplicas" : {
                "type" : "long"
              }
            }
          },
          "namespace" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "podstatus" : {
            "properties" : {
              "containerStatus" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "phase" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "podName" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "restartCount" : {
                "type" : "long"
              },
              "started" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              }
            }
          },
          "requests" : {
            "properties" : {
              "cpu" : {
                "type" : "long"
              },
              "cpu-percent" : {
                "type" : "float"
              },
              "mem" : {
                "type" : "long"
              },
              "mem-percent" : {
                "type" : "float"
              },
              "pods" : {
                "type" : "long"
              }
            }
          },
          "timestamp" : {
            "type" : "date"
          }
        }
      }
    }
  }

  • Not sure, but there is an extra parenthesis ) at the very end. Could you remove that and give it a try? Commented Jul 9, 2020 at 17:19
  • Whoops, yeah, that wasn't it. I tried a number of combinations. Commented Jul 9, 2020 at 17:23
  • Got it. Also, could you let me know whether podstatus is a nested type or just an object type? It would help if you could also share the mapping. Commented Jul 9, 2020 at 17:26
  • It's not and all the fields are strings. Commented Jul 9, 2020 at 17:31
  • @Zucchini Can you add a sample document and your mapping for the fields as well? Commented Jul 9, 2020 at 18:17

1 Answer

Your query seems to be working correctly. However, I'm posting the steps below so that you can repeat them on your side and report anything that behaves differently.

I've taken your mapping, created sample documents, run the query you shared, and included the response I get.

Mapping:

PUT cluster_index_001
{
  "mappings" : {
    "properties" : {
      "allocated" : {
        "properties" : {
          "pods-percent" : {
            "type" : "float"
          }
        }
      },
      "capacity" : {
        "properties" : {
          "cpu" : {
            "type" : "long"
          },
          "mem" : {
            "type" : "long"
          },
          "pods" : {
            "type" : "long"
          }
        }
      },
      "cluster" : {
        "type" : "text",
        "fields" : {
          "keyword" : {
            "type" : "keyword",
            "ignore_above" : 256
          }
        }
      },
      "depstatus" : {
        "properties" : {
          "availableReplicas" : {
            "type" : "long"
          },
          "deploymentName" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "readyReplicas" : {
            "type" : "long"
          },
          "replicas" : {
            "type" : "long"
          },
          "unavailableReplicas" : {
            "type" : "long"
          },
          "updatedReplicas" : {
            "type" : "long"
          }
        }
      },
      "namespace" : {
        "type" : "text",
        "fields" : {
          "keyword" : {
            "type" : "keyword",
            "ignore_above" : 256
          }
        }
      },
      "podstatus" : {
        "properties" : {
          "containerStatus" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "phase" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "podName" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "restartCount" : {
            "type" : "long"
          },
          "started" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      },
      "requests" : {
        "properties" : {
          "cpu" : {
            "type" : "long"
          },
          "cpu-percent" : {
            "type" : "float"
          },
          "mem" : {
            "type" : "long"
          },
          "mem-percent" : {
            "type" : "float"
          },
          "pods" : {
            "type" : "long"
          }
        }
      },
      "timestamp" : {
        "type" : "date"
      }
    }
  }
}

Sample Documents:

POST cluster_index_001/_doc/1
{
    "timestamp":  "2020-07-09T17:30:04",
    "cluster":  "cluster1",
    "namespace":  "kube-system",
    "podstatus.podName":  "cronjob-kubernetes-resource-monitor-1594233600-4frbc",
    "podstatus.containerStatus":  "false",
    "podstatus.restartCount":  0,
    "podstatus.started":  "false",
    "podstatus.phase":  "Failed"
}

POST cluster_index_001/_doc/2
{
    "timestamp":  "2020-07-10T17:30:04",
    "cluster":  "cluster1",
    "namespace":  "kube-system",
    "podstatus.podName":  "cronjob-kubernetes-resource-monitor-1594233600-4frbc",
    "podstatus.containerStatus":  "false",
    "podstatus.restartCount":  0,
    "podstatus.started":  "false",
    "podstatus.phase":  "Failed"
}

POST cluster_index_001/_doc/3
{
    "timestamp":  "2020-07-10T17:30:04",
    "cluster":  "cluster2",
    "namespace":  "kube-system",
    "podstatus.podName":  "cronjob-kubernetes-resource-monitor-1594233600-4frbc",
    "podstatus.containerStatus":  "false",
    "podstatus.restartCount":  0,
    "podstatus.started":  "false",
    "podstatus.phase":  "Failed"
}

Sample Query:

POST cluster_index_001/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "timestamp": {
              "gte": "now-2d"
            }
          }
        },
        {
          "query_string": {
            "query":"(podstatus.podName:* AND cluster:cluster1) AND NOT podstatus.containerStatus:true AND NOT podstatus.phase:Running AND NOT podstatus.phase:Succeeded AND NOT podstatus.started:true"
          }
        }
      ]
    }
  }
}

Because cluster is mapped as text, its values are analyzed. For exact matches you could use the cluster.keyword sub-field in the query above instead, like this: cluster.keyword:cluster1.

Response:

{
  "took" : 86,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 2.4700036,
    "hits" : [
      {
        "_index" : "cluster_index_001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 2.4700036,
        "_source" : {
          "timestamp" : "2020-07-09T17:30:04",
          "cluster" : "cluster1",
          "namespace" : "kube-system",
          "podstatus.podName" : "cronjob-kubernetes-resource-monitor-1594233600-4frbc",
          "podstatus.containerStatus" : "false",
          "podstatus.restartCount" : 0,
          "podstatus.started" : "false",
          "podstatus.phase" : "Failed"
        }
      },
      {
        "_index" : "cluster_index_001",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 2.4700036,
        "_source" : {
          "timestamp" : "2020-07-10T17:30:04",
          "cluster" : "cluster1",
          "namespace" : "kube-system",
          "podstatus.podName" : "cronjob-kubernetes-resource-monitor-1594233600-4frbc",
          "podstatus.containerStatus" : "false",
          "podstatus.restartCount" : 0,
          "podstatus.started" : "false",
          "podstatus.phase" : "Failed"
        }
      }
    ]
  }
}

Note that the query works correctly and returns the expected set of documents: documents 1 and 2 (cluster1) are returned, and document 3 (cluster2) is not.
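
If exact matching on cluster is what you need, the same conditions can also be expressed without query_string. The following is only a minimal, untested sketch: it assumes the .keyword sub-fields from the mapping above and that the stored values are exactly "true", "Running" and "Succeeded" (term queries are not analyzed, so they are case-sensitive). It is a request body for POST cluster_index_001/_search:

```json
{
  "size": 50,
  "query": {
    "bool": {
      "filter": [
        { "range": { "timestamp": { "gte": "now-1h" } } },
        { "term": { "cluster.keyword": "cluster1" } },
        { "exists": { "field": "podstatus.podName" } }
      ],
      "must_not": [
        { "term": { "podstatus.containerStatus.keyword": "true" } },
        { "terms": { "podstatus.phase.keyword": ["Running", "Succeeded"] } },
        { "term": { "podstatus.started.keyword": "true" } }
      ]
    }
  }
}
```

Clauses under filter and must_not are not scored, so the hits are not ranked by relevance. That is usually fine for this kind of yes/no filtering.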

Additional Debugging and Further Info:

These steps will help you find out why a document that was not supposed to be returned is being returned.

For example, the third sample document did not show up in the response for me. To find out why, use the Explain API and put the ID of the document in question (3 here) at the end of the request path:

GET cluster_index_001/_explain/3
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "timestamp": {
              "gte": "now-2d"
            }
          }
        },
        {
          "query_string": {
            "query":"podstatus.podName:* AND cluster:cluster1 AND NOT podstatus.containerStatus:true AND NOT podstatus.phase:Running AND NOT podstatus.started:true"
          }
        }
      ]
    }
  }
}

The response I see is below:

{
  "_index" : "cluster_index_001",
  "_type" : "_doc",
  "_id" : "3",
  "matched" : false,
  "explanation" : {
    "value" : 0.0,
    "description" : "Failure to meet condition(s) of required/prohibited clause(s)",
    "details" : [
      {
        "value" : 1.0,
        "description" : "ConstantScore(DocValuesFieldExistsQuery [field=timestamp])",
        "details" : [ ]
      },
      {
        "value" : 0.0,
        "description" : "no match on required clause (+ConstantScore(NormsFieldExistsQuery [field=podstatus.podName]) +cluster:cluster1 -podstatus.containerStatus:true -podstatus.phase:running -podstatus.started:true)",
        "details" : [
          {
            "value" : 0.0,
            "description" : "Failure to meet condition(s) of required/prohibited clause(s)",
            "details" : [
              {
                "value" : 1.0,
                "description" : "ConstantScore(NormsFieldExistsQuery [field=podstatus.podName])",
                "details" : [ ]
              },
              {
                "value" : 0.0,
                "description" : "no match on required clause (cluster:cluster1)",
                "details" : [
                  {
                    "value" : 0.0,
                    "description" : "no matching term",
                    "details" : [ ]
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}

Note how the description in the above response clearly states this:

"description" : "Failure to meet condition(s) of required/prohibited clause(s)"

Also note the below:

"description" : "no match on required clause (+ConstantScore(NormsFieldExistsQuery [field=podstatus.podName]) +cluster:cluster1 -podstatus.containerStatus:true -podstatus.phase:running -podstatus.started:true)",

As a result, you now know why document 3 is not returned in the response: its cluster field has no term matching cluster1.

Further, if you still cannot figure out the issue, check the following points:
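
When you do not know in advance which documents are the unexpected ones, one way to see which cluster values a query actually matches is a terms aggregation on cluster.keyword. This is a sketch under the same assumptions as the sample index above; the aggregation name matched_clusters is made up for illustration. It is a request body for POST cluster_index_001/_search:

```json
{
  "size": 0,
  "query": {
    "query_string": {
      "query": "podstatus.podName:* AND cluster:cluster1"
    }
  },
  "aggs": {
    "matched_clusters": {
      "terms": { "field": "cluster.keyword" }
    }
  }
}
```

Any bucket other than cluster1 points to documents worth running through the Explain API.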

  • Make sure you are not using an alias and that you focus on a single index at a time. If you are using an alias, narrow down which index is causing the issue.
  • Also make sure that the document in question does not have multiple values in the field, e.g. "cluster": "cluster2, cluster1".
  • If the above two points are clear, open http://<your_host_name>:<port>/cluster-resources-cluster1-2020.07.08-000001/_settings in your browser and check whether any custom analyzers have been configured, e.g. Edge Ngrams or Ngrams, and whether the standard analyzer has been overridden.
  • Open http://<your_host_name>:<port>/cluster-resources-cluster1-2020.07.08-000001/_stats?pretty and see whether anything looks peculiar.
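
To check how the analyzer on the cluster field tokenizes a value, the Analyze API can also help. As a sketch, send the body below to GET cluster_index_001/_analyze, substituting your real index name and one of your real cluster values (cluster1 is only the sample value used here):

```json
{
  "field": "cluster",
  "text": "cluster1"
}
```

If a real value comes back split into several tokens, that could explain matches across clusters on the text field. Querying cluster.keyword avoids the analysis step altogether.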

Take it one step at a time, share your observations, and we can work out what the issue is.


1 Comment

Using cluster.keyword made me get the right matches. Thanks a lot for a very detailed and helpful answer.
