sync/async insert or update ElasticSearch in Python

Question

I'm using the Elasticsearch bulk API from Python. Does it provide both a sync and an async API?

Using refresh=True with the call Derlin suggested got me the sync behavior I was expecting: helpers.bulk(es, insert_actions, refresh=True). This is on Elasticsearch version 1.7.5.
– Jesse Aldridge, Commented Jun 8, 2016 at 1:07

Derlin · Accepted Answer · 2016-02-01 08:15:08Z

If by sync you mean a blocking operation

In Python, the bulk functions are synchronous. The easiest way to go is through the helper:

elasticsearch.helpers.bulk(client, actions, stats_only=False, **kwargs)

It returns a tuple with summary information, so the call only returns once the operation has finished. It is thus synchronous.
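A minimal sketch of that blocking call, assuming an already-connected client `es`. The index name, the document layout and the function names `build_actions` and `bulk_index` are made up for illustration, and older server versions may also require a `_type` key in each action:

```python
def build_actions(docs, index_name):
    """Yield one bulk action per document (index name and fields are illustrative)."""
    for doc_id, doc in docs.items():
        yield {
            "_op_type": "index",   # insert, or overwrite if the id already exists
            "_index": index_name,
            "_id": doc_id,
            "_source": doc,
        }


def bulk_index(es, docs, index_name="my-index"):
    """Blocking call: control comes back only after Elasticsearch has answered."""
    # Imported lazily so build_actions can be used without the client installed.
    from elasticsearch import helpers

    # With stats_only=False (the default) the helper returns
    # (number of successful actions, list of errors).
    success_count, errors = helpers.bulk(es, build_actions(docs, index_name))
    return success_count, errors
```

Calling `bulk_index(es, {"1": {"title": "hello"}})` would block until the server has replied and then hand back the summary tuple.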

If by sync you mean consistency

From the bulk api:

When making bulk calls, you can require a minimum number of active shards in the partition through the consistency parameter

In Python, the bulk function has a consistency parameter that lets you specify how many shards must have acknowledged the change before the method returns.
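As a rough sketch of how that could look with the low-level call, assuming a client and server version that still accept `consistency` (later Elasticsearch releases replaced it with `wait_for_active_shards`). The helper names `to_bulk_body` and `bulk_with_consistency`, the index name and the `doc` type are assumptions for illustration:

```python
import json


def to_bulk_body(docs, index_name, doc_type="doc"):
    """Serialize documents into the newline-delimited format the bulk endpoint expects."""
    lines = []
    for doc_id, doc in docs.items():
        # One action line, followed by the document source on the next line.
        lines.append(json.dumps(
            {"index": {"_index": index_name, "_type": doc_type, "_id": doc_id}}
        ))
        lines.append(json.dumps(doc))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


def bulk_with_consistency(es, docs, index_name, consistency="quorum"):
    """Low-level bulk call that asks for a write consistency level.

    Values used by the older API are "one", "quorum" and "all".
    """
    return es.bulk(body=to_bulk_body(docs, index_name), consistency=consistency)
```

The call still blocks; the parameter only changes how many shard copies must be available before the write is accepted.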

If by timeout you mean a way to stop the operation after a while

If you need to limit the duration of a bulk operation, the low-level bulk() function is again your friend: it takes a timeout parameter that sets an explicit operation timeout.

Even more generally,

Global timeout can be set when constructing the client (see Connection's timeout parameter) or on a per-request basis using request_timeout (float value in seconds) as part of any API call

For example:

from elasticsearch import Elasticsearch
es = Elasticsearch()
# only wait for 1 second, regardless of the client's default
es.cluster.health(wait_for_status='yellow', request_timeout=1)
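Applied to a bulk request, the two kinds of timeout could be combined roughly like this. It is only a sketch: `bulk_with_timeouts` and its default values are made up, `es` is assumed to be an Elasticsearch client, and `body` a newline-delimited bulk payload:

```python
def bulk_with_timeouts(es, body, server_timeout="5s", client_timeout=10):
    """Send one bulk request with a server-side and a client-side time limit."""
    return es.bulk(
        body=body,
        timeout=server_timeout,          # explicit operation timeout, enforced by Elasticsearch
        request_timeout=client_timeout,  # seconds the Python client waits for the response
    )
```

A client-wide default can be set instead when constructing the client, for example `Elasticsearch(timeout=30)`.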

As a side note, I searched for the bulk() call in Java, and especially for bulk().await(), but couldn't find anything. May I ask for your source?


2 Comments

Jack Over a year ago

Thanks a lot for your great answer. Just a quick follow-up question: does the read() function have a consistency setting? I didn't see the parameter, but I assume it would be a number: for example, treat the read as successful once a single shard has returned, or require a quorum or all shards before it counts as a successful read.

Derlin Over a year ago

For read operations, consistency is somewhat controlled by the preference parameter. From the docs: "Specify the node or shard the operation should be performed on (default: random)". For more info, have a look at this thread: stackoverflow.com/questions/14080808/…. Hope it helps.
