
Let's assume I have some data I want to store in my DynamoDB table. I want to use the following structure as the primary key: {timestamp}_{short_uuid}, e.g. "1643207769_123423-ab31d-12345d-12355". I want to ensure good distribution of these items across the partitions.

I'm wondering if "enforcing" sharding of the data by introducing the hash-range key with a specific range (like 1-20) is a good idea? This means my Primary Key would consist of:

"Partition Key" = "range(1-20)" and "Sort Key": "{timestamp}_{short_uuid}".

In other words, will the hash-range key provide better distribution than a simple partition key (regardless of the high cardinality, as in my example)? Ultimately, I'm not interested in which partition an item ends up on; I just want to avoid a potential hot partition problem.
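To make the two options concrete, here is a minimal sketch of how each key could be built. It only constructs the key dictionaries and does not talk to DynamoDB. The attribute names `pk` and `sk`, the helper names, and the use of CRC32 to pick a shard are assumptions for illustration; `uuid.uuid4()` stands in for the short UUID.

```python
import time
import uuid
import zlib

NUM_SHARDS = 20  # the 1-20 range from the question


def make_value(now=None):
    """Build the '{timestamp}_{uuid}' string (uuid4 stands in for short_uuid)."""
    ts = int(time.time()) if now is None else int(now)
    return f"{ts}_{uuid.uuid4()}"


def simple_key(value):
    """Option A: the high-cardinality value is the partition key by itself."""
    return {"pk": value}


def sharded_key(value, num_shards=NUM_SHARDS):
    """Option B: a shard number in 1..num_shards is the partition key,
    and the original value becomes the sort key.

    The shard is derived deterministically from the value (hypothetical
    choice: CRC32 modulo num_shards), so the same value always maps to
    the same shard.
    """
    shard = zlib.crc32(value.encode("utf-8")) % num_shards + 1
    return {"pk": str(shard), "sk": value}


value = make_value()
print(simple_key(value))
print(sharded_key(value))
```

Note that with option B there are only 20 distinct partition key values, and reading items back in time order means querying each of the 20 values and merging the results.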


2 Answers


With thanks to Alex DeBrie's "Everything you need to know about DynamoDB Partitions" for much of this information.

Some NoSQL databases expose the partition hashing algorithm and/or the cluster topology, but DynamoDB does not. So, you don't know what it is and you can't control it.

Prior to 2018 you needed to be much more aware of how your items were sharded because DynamoDB shared your table's provisioned read/write capacity evenly across all partitions.

In 2018, AWS introduced adaptive capacity and made it instant in May 2019. Now your table's provisioned read/write capacity shifts to the partitions where it's needed. DynamoDB can also add new partitions as needed and split highly active partitions to provide consistent performance.

The upshot is that as long as you stay within an individual partition's size and throughput limits, you should not worry about primary keys too much.
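As a rough illustration of what staying within the throughput limits means, the sketch below estimates how many shards a single hot partition key value would need. It assumes the per-partition figures AWS documents at the time of writing (1,000 write capacity units and 3,000 read capacity units per second); check the current documentation before relying on them. The function name and constants are hypothetical.

```python
import math

# Assumed per-partition limits, taken from AWS documentation; verify before use.
MAX_WCU_PER_PARTITION = 1000
MAX_RCU_PER_PARTITION = 3000


def shards_needed(peak_wcu, peak_rcu):
    """Estimate how many partition key values one logical key must be
    spread over so that no single value exceeds the assumed limits."""
    by_write = math.ceil(peak_wcu / MAX_WCU_PER_PARTITION)
    by_read = math.ceil(peak_rcu / MAX_RCU_PER_PARTITION)
    return max(1, by_write, by_read)


print(shards_needed(peak_wcu=500, peak_rcu=1000))
print(shards_needed(peak_wcu=2500, peak_rcu=1000))
```

With a key like {timestamp}_{short_uuid}, every item has its own partition key value, so traffic to any single value is tiny and this estimate stays at 1. Manual sharding only becomes relevant when many writes target the same partition key value.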



DynamoDB's hash function (which AWS has not disclosed) will distribute your items better than you can, because the service is aware of its own topology. In addition, the partition key you propose has low cardinality.

Not sure about your usage but if you want sorting, then use a sort key.

2 Comments

I have the very same feeling. However, when you look at some AWS blog posts and docs, they do recommend "Using Write Sharding to Distribute Workloads Evenly" (docs.aws.amazon.com/amazondynamodb/latest/developerguide/…). That's why I'm asking about it.
I see why you are asking. The key you use is already random, unlike the keys in their examples, so there is no need to randomise it.
