# Create an Elasticsearch inference endpoint

**PUT /_inference/{task_type}/{elasticsearch_inference_id}**

Create an inference endpoint to perform an inference task with the `elasticsearch` service.

> info
> Your Elasticsearch deployment contains preconfigured ELSER and E5 inference endpoints. You only need to create endpoints with this API if you want to customize the settings.

If you use the ELSER or the E5 model through the `elasticsearch` service, the API request will automatically download and deploy the model if it isn't downloaded yet.

> info
> You might see a 502 Bad Gateway error in the response when using the Kibana Console. This error usually reflects a timeout while the model downloads in the background. You can check the download progress in the Machine Learning UI. If you use the Python client, you can set the timeout parameter to a higher value.

After creating the endpoint, wait for the model deployment to complete before using it.
To verify the deployment status, use the get trained model statistics API.
Look for `"state": "fully_allocated"` in the response and ensure that the `"allocation_count"` matches the `"target_allocation_count"`.
Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
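
The deployment check described above can be sketched in Python. This is a minimal sketch, not an official client helper: the function name `is_fully_allocated` is hypothetical, and the nesting of the response (`trained_model_stats`, `deployment_stats`, `allocation_status`) is an assumption about the shape of the get trained model statistics response. Only `state`, `allocation_count`, and `target_allocation_count` are named on this page. The sample payload is hand-written for illustration.

```python
from typing import Any, Dict


def is_fully_allocated(stats_response: Dict[str, Any], model_id: str) -> bool:
    """Return True when every stats entry for `model_id` is fully allocated.

    Assumes (not confirmed by this page) that the response looks like:
    {"trained_model_stats": [{"model_id": ..., "deployment_stats":
        {"allocation_status": {"state": ..., "allocation_count": ...,
                               "target_allocation_count": ...}}}]}
    """
    entries = [
        entry
        for entry in stats_response.get("trained_model_stats", [])
        if entry.get("model_id") == model_id
    ]
    if not entries:
        # No stats for this model: treat as not deployed yet.
        return False

    for entry in entries:
        status = entry.get("deployment_stats", {}).get("allocation_status", {})
        count = status.get("allocation_count")
        target = status.get("target_allocation_count")
        if status.get("state") != "fully_allocated":
            return False
        if count is None or count != target:
            return False
    return True


# Hand-written sample payload (illustrative only, not real server output).
sample_response = {
    "trained_model_stats": [
        {
            "model_id": ".elser_model_2",
            "deployment_stats": {
                "allocation_status": {
                    "state": "fully_allocated",
                    "allocation_count": 1,
                    "target_allocation_count": 1,
                }
            },
        }
    ]
}

print(is_fully_allocated(sample_response, ".elser_model_2"))
```

A caller would typically poll the statistics API and pass each parsed JSON response to this function until it returns `True`, then start sending inference requests.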

## Required authorization

* Cluster privileges: `manage_inference`


## Servers
- http://api.example.com


## Authentication methods
- Api key auth


## Parameters

### Path parameters
- **task_type** (string)
  The type of the inference task that the model will perform.
- **elasticsearch_inference_id** (string)
  The unique identifier of the inference endpoint.
  It must not match the `model_id`.

### Query parameters
- **timeout** (string)
  Specifies the amount of time to wait for the inference endpoint to be created.

### Body: application/json (object)

- **chunking_settings** (object)
  The chunking configuration object.
  Applies only to the `sparse_embedding` and `text_embedding` task types.
  Not applicable to the `rerank`, `completion`, or `chat_completion` task types.
- **service** (string)
  The type of service supported for the specified task type. In this case, `elasticsearch`.
- **service_settings** (object)
  Settings used to install the inference model. These settings are specific to the `elasticsearch` service.
- **task_settings** (object)
  Settings to configure the inference task.
  These settings are specific to the task type you specified.
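
To show how the path parameters, the `timeout` query parameter, and the body fields fit together, here is a minimal sketch that assembles a request without sending it. The helper name `build_create_request` is hypothetical. The `service_settings` keys used in the example (`model_id`, `num_allocations`, `num_threads`), the endpoint name `my-elser-endpoint`, and the `chunking_settings` key `strategy` are assumptions for illustration; this page does not list the fields inside those objects.

```python
import json
from typing import Any, Dict, Optional, Tuple
from urllib.parse import quote, urlencode

# Task types for which this page says chunking_settings applies.
CHUNKING_TASK_TYPES = {"sparse_embedding", "text_embedding"}


def build_create_request(
    task_type: str,
    inference_id: str,
    service_settings: Dict[str, Any],
    task_settings: Optional[Dict[str, Any]] = None,
    chunking_settings: Optional[Dict[str, Any]] = None,
    timeout: Optional[str] = None,
) -> Tuple[str, str, Dict[str, Any]]:
    """Return (method, path, body) for PUT /_inference/{task_type}/{id}."""
    # The inference ID must not match the model ID.
    model_id = service_settings.get("model_id")  # assumed key name
    if model_id is not None and inference_id == model_id:
        raise ValueError("inference ID must not match the model_id")

    if chunking_settings is not None and task_type not in CHUNKING_TASK_TYPES:
        raise ValueError(f"chunking_settings does not apply to {task_type!r}")

    body: Dict[str, Any] = {
        "service": "elasticsearch",
        "service_settings": service_settings,
    }
    if task_settings is not None:
        body["task_settings"] = task_settings
    if chunking_settings is not None:
        body["chunking_settings"] = chunking_settings

    path = f"/_inference/{quote(task_type, safe='')}/{quote(inference_id, safe='')}"
    if timeout is not None:
        path += "?" + urlencode({"timeout": timeout})
    return "PUT", path, body


method, path, body = build_create_request(
    task_type="sparse_embedding",
    inference_id="my-elser-endpoint",  # hypothetical endpoint name
    service_settings={  # assumed field names, for illustration only
        "model_id": ".elser_model_2",
        "num_allocations": 1,
        "num_threads": 1,
    },
    timeout="30s",
)
print(method, path)
print(json.dumps(body, indent=2))
```

Keeping request construction separate from sending makes the validation easy to test and lets the same body be pasted into the Kibana Console or passed to any HTTP client.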


## Responses
### 200


#### Body: application/json (object)
- **chunking_settings** (object)
  The chunking configuration object.
  Applies only to the `sparse_embedding` and `text_embedding` task types.
  Not applicable to the `rerank`, `completion`, or `chat_completion` task types.
- **service** (string)
  The service type.
- **service_settings** (object)
  Settings specific to the service.
- **task_settings** (object)
  Task settings specific to the service and task type.
- **inference_id** (string)
  The inference ID.
- **task_type** (string)
  The task type.

