I am using NEST 2.3.2 and trying to build a query with nested aggregations. I have an index of logs, each with a timestamp and a result code. I want to first group the logs into one-minute buckets and then, within each bucket, classify them by result code.

I have the following F# code for generating the query:

/// Generate an aggregation to put buckets by result code
let generateAggregationByResultCode () =
    let resultAggregationName = "result_aggregation"
    let aggregationByResults = new TermsAggregation(resultAggregationName)
    aggregationByResults.Field <- new Field(Name = "Result")
    aggregationByResults.ExecutionHint <- new Nullable<TermsAggregationExecutionHint>(TermsAggregationExecutionHint.GlobalOrdinals);
    aggregationByResults.MinimumDocumentCount <- new Nullable<int>(0);
    aggregationByResults.Size <- new Nullable<int>(bucketSize);
    aggregationByResults.Missing <- "-128"
    aggregationByResults

/// Generate an aggregation to classify into buckets by minutes and then by result code
let generateNewDateHistogramByMinute () =
    let dateHistogramByMinute = new DateHistogramAggregation("by_minute")
    dateHistogramByMinute.Field <- new Field(Name = "OperationTime")
    dateHistogramByMinute.Interval <- new Union<DateInterval, Time>(DateInterval.Minute) // can also use TimeSpan.FromMinutes(1.0)
    dateHistogramByMinute.MinimumDocumentCount <- new Nullable<int>(0)
    dateHistogramByMinute.Format <- "strict_date_hour_minute"
    let innerAggregations = new AggregationDictionary()
    innerAggregations.[resultInnerAggregationName] <- new AggregationContainer(Terms = generateAggregationByResultCode ())
    dateHistogramByMinute.Aggregations <- innerAggregations
    dateHistogramByMinute

I then attach this aggregation to the request:

let dateHistogram = generateNewDateHistogramByMinute ()
let aggregations = new AggregationDictionary()
aggregations.[histogramName] <- new AggregationContainer(DateHistogram = dateHistogram)
(* ... code omitted ... *)
dslRequest.Aggregations <- aggregations

When I print the request, the aggregation part looks like this:

"aggs": {
    "BucketsByMinutes": {
      "date_histogram": {
        "field": "OperationTime",
        "interval": "minute",
        "format": "strict_date_hour_minute",
        "min_doc_count": 0
      }
    }
  }

The inner aggregation is completely lost. Does anyone know how I should construct the request properly? And how do I retrieve the inner buckets when the response is returned? I didn't find appropriate properties or methods for that, and the documentation is basically non-existent.

  • Did you see the 2.x documentation at elastic.co/guide/en/elasticsearch/client/net-api/2.x/index.html ? Commented Jun 14, 2016 at 11:08
  • @RussCam I came across that while I was searching for solutions, but it didn't help much. Thank you for the link. Commented Jun 15, 2016 at 1:47

1 Answer

I'm not sure why you're not seeing the inner aggregation on the request; I'm seeing it with the following, slightly modified version of what you have:

open System
open System.Text
open Nest
open Elasticsearch.Net

type Document () =
    member val Name = "" with get, set

let pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"))
let settings = new ConnectionSettings(pool, new InMemoryConnection())

settings.DisableDirectStreaming()
        .PrettyJson()
        .OnRequestCompleted(fun response ->
            if (response.RequestBodyInBytes <> null)
            then
                Console.WriteLine("{0} {1} \n{2}\n", response.HttpMethod, response.Uri, Encoding.UTF8.GetString(response.RequestBodyInBytes));
            else Console.WriteLine("{0} {1} \n", response.HttpMethod, response.Uri);

            if (response.ResponseBodyInBytes <> null)
            then
                Console.WriteLine("Status: {0}\n{1}\n{2}\n", response.HttpStatusCode, Encoding.UTF8.GetString(response.ResponseBodyInBytes), new String('-', 30));
            else Console.WriteLine("Status: {0}\n{1}\n", response.HttpStatusCode, new String('-', 30));
        ) |> ignore

let client = new ElasticClient(settings)

/// Generate an aggregation to put buckets by result code
let generateAggregationByResultCode () =
    let bucketSize = 10
    let resultAggregationName = "result_aggregation"
    let aggregationByResults = new TermsAggregation(resultAggregationName)
    aggregationByResults.Field <- Field.op_Implicit("Result")
    aggregationByResults.ExecutionHint <- new Nullable<TermsAggregationExecutionHint>(TermsAggregationExecutionHint.GlobalOrdinals);
    aggregationByResults.MinimumDocumentCount <- new Nullable<int>(0);
    aggregationByResults.Size <- new Nullable<int>(bucketSize);
    aggregationByResults.Missing <- "-128"
    aggregationByResults

/// Generate an aggregation to classify into buckets by minutes and then by result code
let generateNewDateHistogramByMinute () =
    let dateHistogramByMinute = new DateHistogramAggregation("by_minute")
    dateHistogramByMinute.Field <- Field.op_Implicit("OperationTime")
    dateHistogramByMinute.Interval <- new Union<DateInterval, Time>(DateInterval.Minute) // can also use TimeSpan.FromMinutes(1.0)
    dateHistogramByMinute.MinimumDocumentCount <- new Nullable<int>(0)
    dateHistogramByMinute.Format <- "strict_date_hour_minute"
    dateHistogramByMinute.Aggregations <- AggregationDictionary.op_Implicit(generateAggregationByResultCode())
    dateHistogramByMinute

let request = new SearchRequest<Document>()
request.Aggregations <- (AggregationDictionary.op_Implicit(generateNewDateHistogramByMinute()))

let response = client.Search<Document>(request)

This yields the following in the console:

POST http://localhost:9200/_search?pretty=true 
{
  "aggs": {
    "by_minute": {
      "date_histogram": {
        "field": "OperationTime",
        "interval": "minute",
        "format": "strict_date_hour_minute",
        "min_doc_count": 0
      },
      "aggs": {
        "result_aggregation": {
          "terms": {
            "field": "Result",
            "size": 10,
            "min_doc_count": 0,
            "execution_hint": "global_ordinals",
            "missing": "-128"
          }
        }
      }
    }
  }
}

Status: 200
------------------------------

The above may be useful while you're developing; when you're ready to execute against Elasticsearch, remove the InMemoryConnection from the ConnectionSettings constructor and also remove the calls to .DisableDirectStreaming(), .PrettyJson() and .OnRequestCompleted(fun) on ConnectionSettings.
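As a minimal sketch of that production setup, assuming the same single-node pool and the localhost URI used above (adjust the address for your cluster), the settings reduce to:

```fsharp
open System
open Nest
open Elasticsearch.Net

// Same single-node pool as in the development snippet; the URI is an assumption
// carried over from the example above and will likely differ in your environment.
let pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"))

// No InMemoryConnection argument, so requests are sent over HTTP to the node.
// DisableDirectStreaming, PrettyJson and OnRequestCompleted are left out as well.
let settings = new ConnectionSettings(pool)

let client = new ElasticClient(settings)
```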

Comments

Thank you very much. I replaced aggregations.[histogramName] <- new AggregationContainer() with request.Aggregations <- (AggregationDictionary.op_Implicit(some_aggregation)) and it works. Apparently it was wrong to try to add the aggregation to the AggregationDictionary myself. I assumed that it behaved like a normal key-value dictionary and was unaware of the implicit conversion.
Indexing into the AggregationDictionary should also work as you had it (I just tried it), but that route is a bit more cumbersome because you need to index the aggregation under the name that you gave it; the implicit conversion does this for you.
I don't know why, but replacing the two lines works. Now I am having trouble getting the inner buckets from the response. I can see them if I print the response as plain text, but I don't know how to access them programmatically. I have "aggregations": {"BucketsByMinutes": {"items": [{ ..."aggregations": {"result_aggregation": {"items": [{"key": "2003", "docCount": 4},...]}}}]}} Thanks.
OK, I figured out how to do it. I first use bucket.Aggregations.["result_aggregation"] to get the aggregate and then cast it to BucketAggregate.
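An alternative to the manual cast is to use the typed helpers on the response. The following is only a sketch, under the assumption that NEST 2.x exposes Aggs.DateHistogram, a Buckets collection on the result, and a per-bucket Terms helper; the function name printBucketsByMinute is hypothetical, and the aggregation names are the ones from the answer above.

```fsharp
open Nest

/// Print each minute bucket followed by the result-code counts nested inside it.
/// Hypothetical helper: call it with response.Aggs from a search response.
/// The names "by_minute" and "result_aggregation" match the answer's request.
let printBucketsByMinute (aggs : AggregationsHelper) =
    let byMinute = aggs.DateHistogram("by_minute")
    // The helper returns null when the aggregation is absent from the response.
    if not (isNull byMinute) then
        for minuteBucket in byMinute.Buckets do
            printfn "%s: %O documents" minuteBucket.KeyAsString minuteBucket.DocCount
            // Assumption: each bucket offers the same typed helpers as response.Aggs.
            let byResult = minuteBucket.Terms("result_aggregation")
            if not (isNull byResult) then
                for resultBucket in byResult.Buckets do
                    printfn "    %O: %O" resultBucket.Key resultBucket.DocCount
```

With the client from the answer, this would be called as printBucketsByMinute response.Aggs. Against the InMemoryConnection it prints nothing, because that connection returns no aggregation data.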