
I'm trying to achieve the following query in CosmosDB:

SELECT * FROM c
WHERE c.timestamp = (SELECT VALUE MAX(c.timestamp) FROM c )

However, the subquery doesn't seem to be evaluated first, and the query returns all rows.

Is this possible?

  • This query would be very expensive to run. What is it you're trying to do? Get the most recent item? Have you looked at the Change Feed? Commented Dec 7, 2018 at 21:56
  • @ChrisAnderson-MSFT I was actually just testing this, and the behaviour he reported is true. The query runs fine, but you get all the records, not the one that matches the nested query. Commented Dec 7, 2018 at 21:58
  • Yeah, I'm looping in the query folks to answer why this doesn't just fail. It shouldn't work. :) But it's still a bad design pattern, so I can try to help with that first. :) Commented Dec 7, 2018 at 22:02
  • @ChrisAnderson-MSFT - aside from nested queries being unsupported, why would such a query be expensive? If timestamp is indexed, wouldn’t MAX(timestamp) be fairly efficient since it’s a service-side aggregation? Commented Dec 8, 2018 at 0:28
  • Cross-partition aggregation requires going to every partition, getting its max, then finding the max of maxes client-side, so it takes a lot of requests. Krishnan has some good suggestions below. I'd still recommend looking at the change feed for finding the most recent items. It is very cost-effective. Commented Dec 10, 2018 at 21:43

1 Answer


I am from the CosmosDB Engineering team.

CosmosDB queries support only correlated subqueries, so a subquery can refer only to items from the parent collection. For example, you could use aggregates on nested attributes in a document like so:

SELECT TOP 1000 
    c.id, 
    MaxNutritionValue,
    MinNutritionValue,
    AvgNutritionValue
FROM c
JOIN (SELECT VALUE Max(n.nutritionValue) FROM n IN c.nutrients) MaxNutritionValue
JOIN (SELECT VALUE Min(n.nutritionValue) FROM n IN c.nutrients) MinNutritionValue
JOIN (SELECT VALUE Avg(n.nutritionValue) FROM n IN c.nutrients) AvgNutritionValue

assuming a document structure like so:

{
     "id":"someId",
     "nutrients":[
            { 
               "item": "pizza",
               "nutritionValue": 20 
            },
            { 
               "item": "burger",
               "nutritionValue": 30 
            }
     ] 
}

To achieve what you want, you could do something like this:

SELECT TOP 1 * FROM c ORDER BY c.timestamp DESC 

Though not applicable to all aggregates, this approach can help with Max or Min.
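
Note that TOP 1 returns a single item, whereas the query in the question would return every item that shares the latest timestamp. If you need all of them, one possible workaround (a sketch under assumptions, not something the query engine does for you) is to split the work into two separate queries and let the application pass the result of the first into the second. The parameter name @maxTimestamp is a placeholder chosen for illustration, and the comment lines are annotations only; send each query on its own:

```sql
-- Step 1 (first request): fetch the latest timestamp as a scalar value.
SELECT VALUE MAX(c.timestamp) FROM c

-- Step 2 (second request): fetch every item carrying that timestamp.
-- @maxTimestamp is a hypothetical query parameter, bound by the
-- application to the value returned in step 1.
SELECT * FROM c WHERE c.timestamp = @maxTimestamp
```

As noted in the comments, the aggregate in step 1 has to visit every partition when the container is partitioned, so this may cost noticeably more than the TOP 1 query above.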


1 Comment

Hey Krishnan, just to make sure I understand what you're saying: if I have a container called Table1, I cannot run a subquery on it to select the max in CosmosDB? Why would there be such a limitation?
