
I have the following Python code to do an Elasticsearch batch update. When I get back the response object, it is something very simple and doesn't look right to me. Is anything wrong here?

...
# build one action per document for the bulk helper
actions = []
for item in data:
    actions.append({"_index": index,
                    "_type": doc_type,
                    "_id": item['id'],
                    "_source": item})

print("batching result")
response = helpers.bulk(self.es.conn, actions)
print(response)

Here is the output, but I was expecting something more detailed.

batching result
(2, [])
  • This is normal. From the docs: "It returns a tuple with summary information - number of successfully executed actions and either list of errors or number of errors if stats_only is set to True" (see the sketch after these comments). Commented Apr 27, 2018 at 17:14
  • What if we want to get the same response as when the bulk API is called from command line? Commented Feb 10, 2020 at 20:16
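A minimal sketch of what that tuple looks like when unpacked (assuming a plain es client in place of the question's self.es.conn, and the same actions list; stats_only is an optional keyword of helpers.bulk):

from elasticsearch import helpers

# helpers.bulk returns (number of successful actions, list of per-item errors)
success_count, errors = helpers.bulk(es, actions)
print(success_count)   # e.g. 2
print(errors)          # [] when every action succeeded

# with stats_only=True the second element is just a count of errors
success_count, error_count = helpers.bulk(es, actions, stats_only=True)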

1 Answer


As written in the documentation:

It returns a tuple with summary information - number of successfully executed actions and either list of errors or number of errors if stats_only is set to True [...] If you need to process a lot of data and want to ignore/collect errors please consider using the streaming_bulk() helper which will just return the errors and not store them in memory.

With streaming_bulk() you control error handling through the raise_on_error parameter. If you need to process a lot of data, I suggest using parallel_bulk(), which is faster and more intuitive.
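A minimal sketch of collecting the per-item responses (assuming a plain es client and the same actions list as in the question; streaming_bulk() and parallel_bulk() both yield one (ok, item) tuple per action, and raise_on_error=False keeps failures in the loop instead of raising):

from elasticsearch import helpers

# streaming_bulk yields the full per-action response, similar to the REST bulk API
for ok, item in helpers.streaming_bulk(es, actions, raise_on_error=False):
    if not ok:
        print("failed:", item)
    else:
        print("indexed:", item)

# parallel_bulk yields the same (ok, item) tuples, using a thread pool
for ok, item in helpers.parallel_bulk(es, actions, thread_count=4):
    if not ok:
        print("failed:", item)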

