
I have a DynamoDB table with a primary partition key column, foo_id, and no sort key. I have a list of foo_id values and want to get the observations associated with this list of ids.

I figured the best way to do this (?) is to use batch_get_item(), but it's not working out for me.

    # python code
    import boto3
    client = boto3.client('dynamodb')

    # ppk_values = list of `foo_id` values (strings) (< 100 in this example)
    x = client.batch_get_item(
        RequestItems={
            'my_table_name':
                {'Keys': [{'foo_id': {'SS': [id for id in ppk_values]}}]}
        })

I'm using SS because I'm passing a list of strings (list of foo_id values), but I'm getting:

    ClientError: An error occurred (ValidationException) when calling the
    BatchGetItem operation: The provided key element does not match the
    schema

So I assume that means DynamoDB thinks foo_id contains list values instead of string values, which is wrong.

Is that interpretation right? What's the best way to batch query for a bunch of primary partition key values?

5 Answers


Boto3 now has a version of batch_get_item that lets you pass in the keys in a more natural Pythonic way without specifying the types.

You can find a complete and working code example in https://github.com/awsdocs/aws-doc-sdk-examples. That example deals with some additional nuances around retries, but here's a digest of the parts of the code that answer this question:

import logging
import boto3

dynamodb = boto3.resource('dynamodb')
logger = logging.getLogger(__name__)

movie_table = dynamodb.Table('Movies')
actor_table = dynamodb.Table('Actors')

# movie_list holds (year, title) pairs; actor_list holds name strings
batch_keys = {
    movie_table.name: {
        'Keys': [{'year': movie[0], 'title': movie[1]} for movie in movie_list]
    },
    actor_table.name: {
        'Keys': [{'name': actor} for actor in actor_list]
    }
}

response = dynamodb.batch_get_item(RequestItems=batch_keys)

for response_table, response_items in response['Responses'].items():
    logger.info("Got %s items from %s.", len(response_items), response_table)
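One nuance the full AWS example handles that this digest omits: BatchGetItem accepts at most 100 keys per request, so a longer key list has to be split into chunks first. A minimal pure-Python sketch of such a splitter (the helper name is illustrative, not part of boto3):

```python
def chunk_keys(keys, size=100):
    """Split a list of key dicts into sublists of at most `size` entries,
    matching the BatchGetItem per-request limit of 100 items."""
    return [keys[i:i + size] for i in range(0, len(keys), size)]
```

Each chunk would then be sent as its own batch_get_item call, and any keys reported back under UnprocessedKeys in a response should be re-requested.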

3 Comments

Thank you, I was looking for a resource-style query using the table.name feature.
What are some reasonable read/write provisioned parameters if you expected to read 1k-10k items per API request and need low latency but only expect <1000 requests per day. (No write operations in production.)
Is there such a thing for the client?

The keys should be given as shown below; they can't be specified as 'SS'.

The DynamoDB String data type corresponds to a plain string (i.e. 'S', not 'SS', which is a string set). Each item is requested by its own key; this is not like an IN clause in an SQL query.

'Keys': [
            {
                'foo_id': key1
            },
            {
                'foo_id': key2
            }
], 

Sample code:

You may need to change the table name and key values.

from __future__ import print_function # Python 2/3 compatibility
import boto3
import json
import decimal
from boto3.dynamodb.conditions import Key, Attr
from botocore.exceptions import ClientError

# Helper class to convert a DynamoDB item to JSON.
class DecimalEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, decimal.Decimal):
            if o % 1 > 0:
                return float(o)
            else:
                return int(o)
        return super(DecimalEncoder, self).default(o)

dynamodb = boto3.resource("dynamodb", region_name='us-west-2', endpoint_url="http://localhost:8000")

email1 = "[email protected]"
email2 = "[email protected]"

try:
    response = dynamodb.batch_get_item(
        RequestItems={
            'users': {
                'Keys': [
                    {
                        'email': email1
                    },
                    {
                        'email': email2
                    },
                ],            
                'ConsistentRead': True            
            }
        },
        ReturnConsumedCapacity='TOTAL'
    )
except ClientError as e:
    print(e.response['Error']['Message'])
else:
    item = response['Responses']
    print("BatchGetItem succeeded:")
    print(json.dumps(item, indent=4, cls=DecimalEncoder))

3 Comments

The above answer works only after we change dynamodb.batch_get_item to dynamodb.meta.client.batch_get_item, as the method batch_get_item exists only on a client not on a resource.
The above answer actually no longer works at all. I tried the equivalent and got an error. The inner payloads need type information: {'Keys': [{'email': {'S': email1}}, {'email': {'S': email2}}]}
For anyone struggling with this, there is an important point I had missed: "primary key" means the full key, so if your primary key consists of a partition key and a sort key, you have to provide both (which sadly rendered my use case useless). Otherwise you'll get a ValidationException: The provided key element does not match the schema.

The accepted answer no longer works.

For me the working call format was like so:

import boto3
client = boto3.client('dynamodb')

# ppk_values = list of `foo_id` values (strings) (< 100 in this example)
x = client.batch_get_item(
    RequestItems={
        'my_table_name': {
            'Keys': [{'foo_id': {'S': id}} for id in ppk_values]
        }
    }
)

The type information was required. For me it was "S" for string keys. Without it I got an error saying the libraries found a str but expected a dict. That is, they wanted {'foo_id': {'S': id}} instead of the simpler {'foo_id': id} that I tried first.
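Building those typed keys from a plain list of strings is a one-liner worth isolating. A minimal helper (the name is illustrative, not part of boto3) that wraps plain string values in the low-level {'S': ...} descriptor:

```python
def typed_string_keys(attr_name, values):
    """Wrap plain string values in the low-level DynamoDB {'S': ...}
    type descriptor expected by the boto3 client API."""
    return [{attr_name: {'S': v}} for v in values]
```

For attributes of mixed types, boto3 also ships boto3.dynamodb.types.TypeSerializer and TypeDeserializer, which convert between plain Python values and these descriptors.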

1 Comment

That is because that answer uses the boto3 resource while you're using the client.

If you have a Primary Key which consists of a partition key and a sort key, you will need to provide both. This code works for me:

keys = [{'review_id':  id, 'place_id': place_id} for id in review_ids]
print(keys)
# Set up the batch_get_item request
request_items = {
    table_name: {
        'Keys': keys,
        'ConsistentRead': True
    }
}
response = dynamodb.batch_get_item(RequestItems=request_items)
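One pitfall when building keys from a denormalized list like this: BatchGetItem rejects a request whose key list contains duplicates with a ValidationException. A small order-preserving dedup helper (illustrative, not part of boto3) avoids that:

```python
def dedup_keys(keys):
    """Drop duplicate key dicts while preserving first-seen order,
    since BatchGetItem rejects requests containing duplicate keys."""
    seen = set()
    unique = []
    for key in keys:
        marker = tuple(sorted(key.items()))  # hashable fingerprint of the dict
        if marker not in seen:
            seen.add(marker)
            unique.append(key)
    return unique
```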



Here is a Java solution using DynamoDB SDK version 2.15.0 (with static imports of java.util.Collections.singletonMap and java.util.stream.Collectors.toList), assuming foo_id is a string and there are fewer than 100 keys. You can break a longer list into batches of the required size.

private void queryTable(List<String> keys){

    List<Map<String, AttributeValue>> keysBatch = keys.stream()
            .map(key -> singletonMap("foo_id", AttributeValue.builder().s(key).build()))
            .collect(toList());
    KeysAndAttributes keysAndAttributes = KeysAndAttributes.builder()
            .keys(keysBatch)
            .build();
    Map<String, KeysAndAttributes> requestItems = new HashMap<>();
    requestItems.put("tableName", keysAndAttributes);
    BatchGetItemRequest batchGet = BatchGetItemRequest.builder()
            .requestItems(requestItems)
            .build();
    Map<String, List<Map<String, AttributeValue>>> responses = dbClient.batchGetItem(batchGet).responses();
    responses.entrySet().stream().forEach(entry -> {
        System.out.println("Table : " + entry.getKey());
        entry.getValue().forEach(v -> {
            System.out.println("value: "+v);
        });
    });
}

