22

I've seen various questions around SO about how to get the total row count of an Azure storage table, but I want to know how to get the number of rows within a single partition.

How can I do this while loading a minimal amount of entity data into memory?

6 Answers

24

As you may already know, there is no Count-like functionality in Azure Tables. To get the total number of entities (rows) in a partition (or a table), you have to fetch all of the entities.

You can reduce the response payload by using a technique called query projection. A query projection lets you specify the list of entity attributes (columns) that you want the table service to return. Since you're only interested in the total count of entities, I would recommend that you fetch only PartitionKey. You may find this blog post helpful for understanding query projection:

Windows Azure Tables: Introducing Upsert and Query Projection

September 15, 2011
[...]

Query Projection Feature

Projection refers to querying a subset of an entity or entities’ properties. This is analogous to selecting a subset of the columns/properties of a certain table when querying in LINQ. It is a mechanism that would allow an application to reduce the amount of data returned by a query by specifying that only certain properties are returned in the response.


2 Comments

I suppose you meant RowKey instead of PartitionKey?
The link is dead now, it just redirects to azure.microsoft.com/en-us/blog for me
24

Azure Storage Explorer (https://azure.microsoft.com/en-gb/features/storage-explorer/) lets you define a query, and you can use the Table Statistics toolbar item to get the total row count for the whole table or for your query.


4 Comments

IMHO this function is useless. It counts the items of the query...what is also written at the bottom of the page.
The number at the bottom of the page only counts to 1000. The current version will not let you request more. So, this is a useful way around that.
Exactly what I was looking for.. nice one
After clicking the button was waiting for something to happen. It turned out the result was being printed on the Activities window below. Thanks for this answer @Nigel.
4

I tested the speed with Stopwatch by fetching and counting 100,000 entities in a partition, each with three fields in addition to the standard TableEntity properties.

I select just the PartitionKey and use a resolver so that I end up with a list of strings, which I count once the entire partition has been retrieved.

The fastest I have got it is around 6000 to 6500 ms. Here is the function:

public static async Task<int> GetCountOfEntitiesInPartition(string tableName, string partitionKey)
{
    // tableClient is assumed to be a CloudTableClient available in the enclosing class.
    CloudTable table = tableClient.GetTableReference(tableName);

    // Filter to one partition and project only PartitionKey to keep the payload small.
    TableQuery<DynamicTableEntity> tableQuery = new TableQuery<DynamicTableEntity>()
        .Where(TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, partitionKey))
        .Select(new string[] { "PartitionKey" });

    EntityResolver<string> resolver = (pk, rk, ts, props, etag) => props.ContainsKey("PartitionKey") ? props["PartitionKey"].StringValue : null;

    List<string> entities = new List<string>();

    // Follow continuation tokens until the whole partition has been read.
    TableContinuationToken continuationToken = null;
    do
    {
        TableQuerySegment<string> tableQueryResult =
            await table.ExecuteQuerySegmentedAsync(tableQuery, resolver, continuationToken);

        continuationToken = tableQueryResult.ContinuationToken;

        entities.AddRange(tableQueryResult.Results);
    } while (continuationToken != null);

    return entities.Count;
}

This is a generic function; all you need is the tableName and partitionKey.
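Since only the count is needed, the loop could also add each segment's result count to a running total instead of keeping the results in a list. The following is a minimal sketch of that idea, not the code that was timed above. To keep it runnable without a storage account, the segmented query is replaced by a hypothetical `fetchSegment` delegate; `PartitionCounter` and `CountAsync` are illustrative names as well.

```csharp
using System;
using System.Threading.Tasks;

public static class PartitionCounter
{
    // fetchSegment is a hypothetical stand-in for one segmented query call.
    // It receives a continuation token (null for the first call) and returns
    // the number of results in that segment plus the next token (null when done).
    public static async Task<long> CountAsync(
        Func<string, Task<(int Count, string Next)>> fetchSegment)
    {
        long total = 0;
        string token = null;
        do
        {
            var (count, next) = await fetchSegment(token);
            total += count;   // add the segment size; no entities are retained
            token = next;
        } while (token != null);
        return total;
    }
}
```

In the function above, the delegate would presumably wrap the call to ExecuteQuerySegmentedAsync and return tableQueryResult.Results.Count together with the continuation token.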

1 Comment

Since you are only after the count you do not need to add the entities fetched to a list, rather just immediately increment a counter with the results count for that segment.
3

You could achieve this fairly efficiently by leveraging the atomic batch operations of the Azure Table storage service. For every partition, keep an additional entity with the same partition key and a specific row key such as "PartitionCount". That entity has a single int (or long) property, Count.

Every time you insert a new entity, do an atomic batch operation that also increments the Count property of the partition counter entity. The counter entity has the same partition key as your data entity, which allows both to be changed in one atomic batch operation with guaranteed consistency.

Every time you delete an entity, decrement the Count property of the partition counter entity, again in a batch operation so that the two operations are consistent.

If you just want to read the partition count, all you need is a single point query for the partition counter entity; its Count property tells you the current count for that partition.
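As a rough illustration of the pattern, here is an in-memory sketch; it does not call the real storage SDK. The table service has no server-side increment, so the sketch assumes the client reads the counter, computes the new value, and sends it back in the same batch as the insert, with the batch rejected if the counter's ETag has changed in the meantime. `FakePartition`, `CounterPattern`, and their members are hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for one table partition, for illustration only.
// It mimics two behaviours the pattern relies on: a batch is all-or-nothing,
// and a replace succeeds only if the caller's ETag matches the stored one.
public sealed class FakePartition
{
    public const string CounterKey = "PartitionCount";

    private readonly Dictionary<string, (long Value, int ETag)> rows =
        new Dictionary<string, (long Value, int ETag)>();

    public FakePartition() { rows[CounterKey] = (0, 0); }

    // Point query for the counter entity.
    public (long Count, int ETag) ReadCounter() => rows[CounterKey];

    // "Batch": insert rowKey and replace the counter, or change nothing.
    public bool TryInsertWithCount(string rowKey, long newCount, int expectedETag)
    {
        if (rows.ContainsKey(rowKey))
            throw new InvalidOperationException("Entity already exists: " + rowKey);

        var counter = rows[CounterKey];
        if (counter.ETag != expectedETag) return false;   // stale read, nothing applied

        rows[rowKey] = (0, 0);
        rows[CounterKey] = (newCount, counter.ETag + 1);
        return true;
    }
}

public static class CounterPattern
{
    // Read the counter, then retry the batch until the ETag check passes.
    public static void Insert(FakePartition partition, string rowKey)
    {
        while (true)
        {
            var (count, etag) = partition.ReadCounter();
            if (partition.TryInsertWithCount(rowKey, count + 1, etag)) return;
        }
    }
}
```

Deleting an entity would follow the same shape with count - 1. Under heavy concurrent writes to one partition, the retry loop may run several times per insert, which is the cost of keeping the counter consistent.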

13 Comments

Azure Storage Table does not have atomic operations. Every "atomic" operation would require multiple request for read and merge.
well lets start building up knowledge first before we post up. see here: learn.microsoft.com/en-us/rest/api/storageservices/… and the comment "Operations within a change set are processed atomically; that is, all operations in the change set either succeed or fail. Operations are processed in the order they are specified in the change set."
If you never used Azure Storage, please at least read documentation carefully. Azure storage does not have any atomic batch or increment operations over single item. That "either succeed or fail" means you have to repeat retrieve, increment, merge continuously till success, which in concurrent environment means to increment one item you have to send tens of request.
Surely you are not reading or getting the one line I pasted from the documentation ironically. Just search for the substring atomic in that sentence. From client side a batch operation is atomic simply explaining this to you either all operations succeed or all fail. And that is a general terminology used by the official documentation and industry to refer batch operations. I don't think you have ever used any batch operation because this would be simple to understand then.
Ironically you did not read your own "prove". You did not read what is batch operation. Obviously you've never used Azure Storage. Batch operations are limited only to Storage operations and one entity can be only once in one batch. There is no Azure Storage operation incrementing or modifying existing values, only replacing whole. You cannot read value, increment it and update in one atomic batch.
2

This can be done more concisely than in @NickBrooks's answer.

public static async Task<int> GetCountOfEntitiesInPartition<T>(
    string tableName, string partitionKey)
    where T : class, ITableEntity, new()
{
    // tableServiceClient is assumed to be an existing TableServiceClient.
    var tableClient = tableServiceClient.GetTableClient(tableName);
    var results = tableClient.QueryAsync<T>(t => t.PartitionKey == partitionKey,
        select: new[] { "PartitionKey" });
    return await results.CountAsync();
}

The results.CountAsync() comes from System.Linq.Async, a NuGet package which is officially supported by dotnet.


-2

I think you can directly use the .Count in C#. You can use either this technique:

var tableStorageData = await table.ExecuteQuerySegmentedAsync(azQuery, null);
int count = tableStorageData.Count();

or

TableQuery<UserDetails> tableQuery = new TableQuery<UserDetails>();
var tableStorageData = table.ExecuteQuery(tableQuery,null);          
count = tableStorageData.Count();

The count variable will hold the total number of rows, depending on the query.

1 Comment

No, it won't. The query will have a continuation token. Each query returns only up to 1000 records.
