# Create a custom inference endpoint

**PUT /_inference/{task_type}/{custom_inference_id}**

The custom service gives you more control over how to interact with external inference services that aren't explicitly supported through dedicated integrations. It lets you define the headers, URL, query parameters, request body, and secrets.

The custom service supports template replacement: you can define a template that is replaced with the value associated with its key. Templates are portions of a string that start with `${` and end with `}`. The `secret_parameters` and `task_settings` parameters are checked for keys for template replacement. Template replacement is supported in the `request`, `headers`, `url`, and `query_parameters` fields. If the definition (key) for a template is not found, an error message is returned.

For example, given an endpoint definition like the following:

```
PUT _inference/text_embedding/test-text-embedding
{
  "service": "custom",
  "service_settings": {
    "secret_parameters": {
      "api_key": ""
    },
    "url": "...endpoints.huggingface.cloud/v1/embeddings",
    "headers": {
      "Authorization": "Bearer ${api_key}",
      "Content-Type": "application/json"
    },
    "request": "{\"input\": ${input}}",
    "response": {
      "json_parser": {
        "text_embeddings": "$.data[*].embedding[*]"
      }
    }
  }
}
```

To replace `${api_key}`, the `secret_parameters` and `task_settings` are checked for a key named `api_key`.

> info
> Templates should not be surrounded by quotes.

Pre-defined templates:

* `${input}` refers to the array of input strings that comes from the `input` field of the subsequent inference requests.
* `${input_type}` refers to the input type translation values.
* `${query}` refers to the query field used specifically for reranking tasks.
* `${top_n}` refers to the `top_n` field available when performing rerank requests.
* `${return_documents}` refers to the `return_documents` field available when performing rerank requests.
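The lookup behavior described above can be sketched in a few lines of Python. This is a hypothetical illustration of the semantics, not the actual Elasticsearch implementation: each `${key}` placeholder is resolved against `secret_parameters` first, then `task_settings`, and a missing key is an error.

```python
import re

def replace_templates(text: str, secret_parameters: dict, task_settings: dict) -> str:
    """Replace each ${key} in `text` with the value associated with that key.

    Both `secret_parameters` and `task_settings` are checked; if neither
    defines the key, raise (the service returns an error message instead).
    """
    def lookup(match: re.Match) -> str:
        key = match.group(1)
        if key in secret_parameters:
            return str(secret_parameters[key])
        if key in task_settings:
            return str(task_settings[key])
        raise KeyError(f"No definition found for template key: {key}")

    # Templates start with ${ and end with }.
    return re.sub(r"\$\{(\w+)\}", lookup, text)

# Example: fill in the Authorization header from secret_parameters.
header = replace_templates(
    "Bearer ${api_key}",
    secret_parameters={"api_key": "my-key"},
    task_settings={},
)
# header == "Bearer my-key"
```

Note that the replacement is a plain string substitution, which is why templates should not be surrounded by quotes: in `"request": "{\"input\": ${input}}"`, the value substituted for `${input}` already carries its own JSON formatting.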
## Required authorization

* Cluster privileges: `manage_inference`

## Servers

- http://api.example.com

## Authentication methods

- Api key auth
- Basic auth
- Bearer auth

## Parameters

### Path parameters

- **task_type** (string): The type of the inference task that the model will perform.
- **custom_inference_id** (string): The unique identifier of the inference endpoint.

### Body: application/json (object)

- **chunking_settings** (object): The chunking configuration object.
- **service** (string): The type of service supported for the specified task type. In this case, `custom`.
- **service_settings** (object): Settings used to install the inference model. These settings are specific to the `custom` service.
- **task_settings** (object): Settings to configure the inference task. These settings are specific to the task type you specified.

## Responses

### 200

#### Body: application/json (object)

- **chunking_settings** (object): The chunking configuration object. Applies only to the `sparse_embedding` and `text_embedding` task types. Not applicable to the `rerank`, `completion`, or `chat_completion` task types.
- **service** (string): The service type.
- **service_settings** (object): Settings specific to the service.
- **task_settings** (object): Task settings specific to the service and task type.
- **inference_id** (string): The inference ID.
- **task_type** (string): The task type.