I'm running TorchServe in WSL2. There are three issues with the metrics:

- Even if the `metrics_config` parameter in `ts.config` points to a non-existing file, everything works without any problems. It looks like the parameter has no effect in my case.
- Even if I comment out or remove some metrics (`ts_metrics` or `model_metrics`), I still see these metrics in `ts_metrics.log` or `model_metrics.log`.
- I can't get most of the ts metrics via the metrics API; only the `ts_queue_latency_microseconds` and `ts_inference_requests_total` metrics show up.
ts.config:

```
models={\
  "doc_model": {\
    "1.0": {\
        "defaultVersion": true,\
        "minWorkers": 1,\
        "maxWorkers": 1,\
        "batchSize": 1\
    }\
  }\
}
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
metrics_mode=prometheus
metrics_config=./metrics.yaml
number_of_netty_threads=32
job_queue_size=1000
model_store=/home/model-server/model-store
workflow_store=/home/model-server/wf-store
```
metrics.yaml:

```yaml
dimensions:
  - &model_name "ModelName"
  - &worker_name "WorkerName"
  - &level "Level"
  - &device_id "DeviceId"
  - &hostname "Hostname"

ts_metrics:
  counter:
    # - name: Requests2XX
    #   unit: Count
    #   dimensions: [*level, *hostname]
    - name: Requests4XX
      unit: Count
      dimensions: [*level, *hostname]
    # - name: Requests5XX
    #   unit: Count
    #   dimensions: [*level, *hostname, *model_name]
    - name: ts_inference_requests_total
      unit: Count
      dimensions: [*level, "model_name", "model_version", "hostname"]
    - name: ts_inference_latency_microseconds
      unit: Microseconds
      dimensions: ["model_name", "model_version", "hostname"]
    - name: ts_queue_latency_microseconds
      unit: Microseconds
      dimensions: ["model_name", "model_version", "hostname"]
  histogram:
    - name: NameOfHistogramMetric
      unit: ms
      dimensions: [*model_name, *level]
  gauge:
    - name: QueueTime
      unit: Milliseconds
      dimensions: [*level, *hostname]
    - name: WorkerThreadTime
      unit: Milliseconds
      dimensions: [*level, *hostname]
    - name: WorkerLoadTime
      unit: Milliseconds
      dimensions: [*worker_name, *level, *hostname]
    - name: CPUUtilization
      unit: Percent
      dimensions: [*level, *hostname]
    - name: MemoryUsed
      unit: Megabytes
      dimensions: [*level, *hostname]
    - name: MemoryAvailable
      unit: Megabytes
      dimensions: [*level, *hostname]
    - name: MemoryUtilization
      unit: Percent
      dimensions: [*level, *hostname]
    - name: DiskUsage
      unit: Gigabytes
      dimensions: [*level, *hostname]
    - name: DiskUtilization
      unit: Percent
      dimensions: [*level, *hostname]
    - name: DiskAvailable
      unit: Gigabytes
      dimensions: [*level, *hostname]
    - name: GPUMemoryUtilization
      unit: Percent
      dimensions: [*level, *device_id, *hostname]
    - name: GPUMemoryUsed
      unit: Megabytes
      dimensions: [*level, *device_id, *hostname]
    - name: GPUUtilization
      unit: Percent
      dimensions: [*level, *device_id, *hostname]

model_metrics:
  # Dimension "Hostname" is automatically added for model metrics in the backend
  gauge:
    - name: HandlerTime
      unit: ms
      dimensions: [*model_name, *level]
    - name: PredictionTime
      unit: ms
      dimensions: [*model_name, *level]
```
metrics API output:

```
# HELP ts_inference_latency_microseconds Cumulative inference duration in microseconds
# TYPE ts_inference_latency_microseconds counter
ts_inference_latency_microseconds{uuid="57984a8f-c19a-4c93-b9cd-cc7eb8b1fa55",model_name="doc_model",model_version="default",} 4925586.4
# HELP ts_inference_requests_total Total number of inference requests.
# TYPE ts_inference_requests_total counter
ts_inference_requests_total{uuid="57984a8f-c19a-4c93-b9cd-cc7eb8b1fa55",model_name="doc_model",model_version="default",} 3.0
# HELP ts_queue_latency_microseconds Cumulative queue duration in microseconds
# TYPE ts_queue_latency_microseconds counter
ts_queue_latency_microseconds{uuid="57984a8f-c19a-4c93-b9cd-cc7eb8b1fa55",model_name="doc_model",model_version="default",} 291.4
```
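To compare what the API exposes against what metrics.yaml configures, I list the metric families from the `# TYPE` comment lines of the Prometheus text output (a minimal stdlib-Python sketch; `sample` is the output above, abbreviated to the comment lines):

```python
# Minimal sketch: list metric families from a Prometheus text exposition
# by reading the "# TYPE <name> <type>" comment lines.
sample = """\
# TYPE ts_inference_latency_microseconds counter
# TYPE ts_inference_requests_total counter
# TYPE ts_queue_latency_microseconds counter
"""

def metric_families(text):
    """Return {metric_name: metric_type} parsed from '# TYPE' lines."""
    families = {}
    for line in text.splitlines():
        if line.startswith("# TYPE "):
            name, mtype = line[len("# TYPE "):].split()
            families[name] = mtype
    return families

print(sorted(metric_families(sample)))
```

Only three counter families show up; none of the configured gauges (`Requests4XX`, `QueueTime`, etc.) appear in the exposition.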
Here is how I build the `.mar`:

```
torch-model-archiver \
  --model-name doc_model \
  --version 1.0 \
  --serialized-file model/pytorch_model.bin \
  --handler ./src/transformers_vectorizer_handler.py \
  --extra-files "./model/config.json,./tokenizer" \
  -f

mkdir -p model_store && mv doc_model.mar model_store/
```
And how I start TorchServe:

```
torchserve \
  --start \
  --model-store model_store \
  --models doc_model=doc_model.mar \
  --ncs \
  --ts-config ./ts.config
```
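Since a bad `metrics_config` path fails silently for me (issue 1), I now check the path myself before starting the server (a tiny stdlib-Python sketch; `./metrics.yaml` is the path from my ts.config and must be resolved relative to the directory I start TorchServe from):

```python
# Minimal guard: warn loudly if the metrics_config path from ts.config
# does not exist, since TorchServe starts fine either way in my setup.
from pathlib import Path

def check_metrics_config(path):
    """Return True if the metrics config file exists, else False."""
    return Path(path).is_file()

if not check_metrics_config("./metrics.yaml"):
    print("warning: metrics_config points to a missing file")
```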