In Elasticsearch I have several hundred thousand documents with roughly this structure:
{
"script": "/index.html",
"query": {
"ab": "hello",
"cd": "world",
"ef": "123"
}
}
The URL "http://localhost/index.html?ab=hello&cd=world&ef=123" is parsed into it. "script" contains only the path and the target script, with no query string at all. The "query" object does not always contain the same set of keys, and the values differ as well, but the values don't matter at the moment.
I know I can get a distinct list of "script" values with:
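For illustration, the parsing described above could be sketched in Python roughly like this. This is only an assumption about how the documents are produced, not the actual indexing code, and `url_to_doc` is a hypothetical helper name:

```python
from urllib.parse import urlsplit, parse_qsl


def url_to_doc(url):
    """Split a URL into the {"script": ..., "query": {...}} shape (hypothetical helper)."""
    parts = urlsplit(url)
    # keep_blank_values=True keeps keys with empty values, e.g. "?ab=".
    # dict() keeps only the last value if a key is repeated in the URL.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    return {"script": parts.path, "query": query}


doc = url_to_doc("http://localhost/index.html?ab=hello&cd=world&ef=123")
print(doc)
```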
{
"aggregations": {
"my_agg": {
"terms": {
"field": "script.raw"
}
}
}
}
which results in multiple buckets like:
"buckets": [
{
"key": "/index.html",
"doc_count": 123456
},
{
"key": "/hello.html",
"doc_count": 1456
},
...
My question: Is there a way to additionally get, for each "script", a list of all query keys that occur in its URLs, together with the number of documents containing each key?
Something like:
"buckets": [
{
"key": "/index.html",
"doc_count": 123456,
"query_key_count": {
"ab": 33456,
"cd": 3456,
"ef": 456,
"gh": 56,
"ij": 6
}
},
{
"key": "/hello.html",
"doc_count": 1456,
"query_key_count": {
"zy": 156,
"gh": 6
}
},
...
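To make the desired result concrete, here is the same computation done client-side in Python over a few made-up sample documents. It is not an Elasticsearch query, only a sketch of the numbers I am after; `query_key_counts` and `sample_docs` are hypothetical names:

```python
from collections import Counter, defaultdict


def query_key_counts(docs):
    """Count documents per script and, per script, documents containing each query key."""
    doc_count = Counter()
    key_count = defaultdict(Counter)
    for doc in docs:
        script = doc["script"]
        doc_count[script] += 1
        # Each key is counted once per document, whatever its value is.
        key_count[script].update(doc.get("query", {}).keys())
    # Scripts with the most documents come first.
    return [
        {"key": script, "doc_count": n, "query_key_count": dict(key_count[script])}
        for script, n in doc_count.most_common()
    ]


# Made-up sample data, only for illustration.
sample_docs = [
    {"script": "/index.html", "query": {"ab": "hello", "cd": "world"}},
    {"script": "/index.html", "query": {"ab": "hi"}},
    {"script": "/hello.html", "query": {"zy": "1", "gh": "2"}},
]
buckets = query_key_counts(sample_docs)
print(buckets)
```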
Thanks a lot!