
I'm using the Elasticsearch bulk Python API. Does it provide both sync and async APIs?

Using refresh=True with the call Derlin suggested got me the sync behavior I was expecting: helpers.bulk(es, insert_actions, refresh=True). This is on Elasticsearch version 1.7.5. Commented Jun 8, 2016 at 1:07

1 Answer


If by sync you mean a blocking operation

In Python, the bulk functions are synchronous. The easiest way to go is through the helper:

elasticsearch.helpers.bulk(client, actions, stats_only=False, **kwargs)

It returns a tuple with summary information; it is thus synchronous.
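As an illustration, here is a minimal sketch of calling the helper and reading the summary it returns; the local cluster address, index name, and document shape are assumptions made for the example:

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch()  # assumes a node reachable on localhost:9200

# hypothetical actions targeting an index named "my-index"
actions = [
    {"_index": "my-index", "_type": "doc", "_id": i, "_source": {"value": i}}
    for i in range(1000)
]

# blocks until every chunk has been sent and answered, then returns
# (number of successful actions, list of errors)
success, errors = helpers.bulk(es, actions, stats_only=False)
print(success, errors)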

If by sync you mean consistency

From the bulk api:

When making bulk calls, you can require a minimum number of active shards in the partition through the consistency parameter

In Python, the bulk function has a consistency parameter, allowing you to specify how many shards must have acknowledged the change before the method returns.
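As a sketch, assuming an Elasticsearch 1.x-era cluster (where the bulk API still accepted consistency; later releases replaced it with wait_for_active_shards), the keyword argument is forwarded by the helper to the underlying bulk call:

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch()

# hypothetical documents, for illustration only
actions = [{"_index": "my-index", "_type": "doc", "_source": {"value": i}} for i in range(10)]

# require a quorum of shard copies to acknowledge each bulk request
# before the call returns
helpers.bulk(es, actions, consistency="quorum")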

If by timeout you mean a way to stop the operation after a while

If you need to limit the duration of a bulk operation, again the low level bulk() function is your friend. It takes a timeout parameter to add an explicit operation timeout.
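For instance, a rough sketch against the low-level client (the newline-delimited body and index name are made up for the example):

from elasticsearch import Elasticsearch

es = Elasticsearch()

# newline-delimited bulk body: an action line followed by a source line
bulk_body = (
    '{"index": {"_index": "my-index", "_type": "doc"}}\n'
    '{"value": 1}\n'
)

# timeout is the explicit operation timeout applied server side, as opposed
# to request_timeout, which caps the whole HTTP round trip
es.bulk(body=bulk_body, timeout="5s")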

Even more generally,

Global timeout can be set when constructing the client (see Connection's timeout parameter) or on a per-request basis using request_timeout (float value in seconds) as part of any API call

For example:

from elasticsearch import Elasticsearch
es = Elasticsearch()
# only wait for 1 second, regardless of the client's default
es.cluster.health(wait_for_status='yellow', request_timeout=1)

As a side note, I searched for the bulk() call in Java, and especially bulk().await(), but couldn't find anything. May I ask you for your source?


2 Comments

Thanks a lot for your great answer. Just a quick follow-up question: does the read() function have a consistency parameter? I didn't see it, but I assume it should take a number; say, a read counts as successful once one shard returns, or only once a quorum/all shards have returned.
For read operations, consistency is somewhat controlled by the preference parameter. From the docs: "specify the node or shard the operation should be performed on (default: random)". For more info, have a look at this thread: stackoverflow.com/questions/14080808/…. Hope it helps.
