I write data to Elasticsearch with Python's parallel_bulk helper, but the performance is very low: indexing 10,000 documents takes about 180 s. I have set these index settings:
"settings": {
"number_of_shards": 5,
"number_of_replicas": 0,
"refresh_interval": "30s",
"index.translog.durability": "async",
"index.translog.sync_interval": "30s"
}
and in elasticsearch.yml I set:

bootstrap.memory_lock: true
indices.memory.index_buffer_size: 20%
indices.memory.min_index_buffer_size: 96mb
# Search pool
thread_pool.search.size: 5
thread_pool.search.queue_size: 100
thread_pool.bulk.queue_size: 300
thread_pool.index.queue_size: 300
indices.fielddata.cache.size: 40%
discovery.zen.fd.ping_timeout: 120s
discovery.zen.fd.ping_retries: 6
discovery.zen.fd.ping_interval: 30s
But none of this improves the performance. What else can I do? I am running Elasticsearch 6.5.4 on Windows 10 with a single node, and I yield the data from Oracle into Elasticsearch.