
My production query suffers from a strange I/O hit, and I can't figure out where the problem is. I mean, it's a simple query without any massive joins or anything like that...

My table has around 300K rows and grows by about 15-20K rows per day. Here is the table definition:

CREATE TABLE public.tv_smartdevicemeasurement_modbus
(
    measurement_id integer NOT NULL DEFAULT nextval('tv_smartdevicemeasurement_modbus_measurement_id_seq'::regclass),
    insert_time timestamp with time zone NOT NULL,
    data jsonb NOT NULL,
    parent_job_id integer NOT NULL,
    smart_device_id integer NOT NULL,
    CONSTRAINT tv_smartdevicemeasurement_modbus_pkey PRIMARY KEY (measurement_id),
    CONSTRAINT tv_smartdevicemeasur_parent_job_id_1ac4609e_fk_tv_measur FOREIGN KEY (parent_job_id)
        REFERENCES public.tv_measurementjobs (job_id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT tv_smartdevicemeasur_smart_device_id_62c12ed0_fk_tv_smartd FOREIGN KEY (smart_device_id)
        REFERENCES public.tv_smartdevice_modbus (device_id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

CREATE INDEX tv_smartdevicemeasurement_modbus_parent_job_id_1ac4609e
    ON public.tv_smartdevicemeasurement_modbus USING btree
    (parent_job_id)
    TABLESPACE pg_default;

CREATE INDEX tv_smartdevicemeasurement_modbus_smart_device_id_62c12ed0
    ON public.tv_smartdevicemeasurement_modbus USING btree
    (smart_device_id)
    TABLESPACE pg_default;

Here is the query:

EXPLAIN (ANALYZE, BUFFERS)
SELECT "tv_smartdevicemeasurement_modbus"."measurement_id",
       "tv_smartdevicemeasurement_modbus"."smart_device_id",
       "tv_smartdevicemeasurement_modbus"."parent_job_id",
       "tv_smartdevicemeasurement_modbus"."insert_time",
       "tv_smartdevicemeasurement_modbus"."data",
  (SELECT DATA->> 'VLN_AVG') AS "VLN_AVG",
  (SELECT DATA->> 'VLN3') AS "VLN3",
  (SELECT DATA->> 'VLN2') AS "VLN2",
  (SELECT DATA->> 'VLN1') AS "VLN1",
  (SELECT DATA->> 'VL1-2') AS "VL1-2",
  (SELECT DATA->> 'VL2-3') AS "VL2-3",
  (SELECT DATA->> 'VL3-1') AS "VL3-1",
  (SELECT DATA->> 'VLL_AVG') AS "VLL_AVG",
  (SELECT DATA->> 'IL_AVG') AS "IL_AVG",
  (SELECT DATA->> 'IL1') AS "IL1",
  (SELECT DATA->> 'IL2') AS "IL2",
  (SELECT DATA->> 'IL3') AS "IL3",
  (SELECT DATA->> 'PL_SUM') AS "PL_SUM",
  (SELECT DATA->> 'PL2') AS "PL2",
  (SELECT DATA->> 'PL3') AS "PL3",
  (SELECT DATA->> 'PL1') AS "PL1",
  (SELECT DATA->> 'kWh_L2') AS "kWh_L2",
  (SELECT DATA->> 'kWh_L3') AS "kWh_L3",
  (SELECT DATA->> 'kWh_L1') AS "kWh_L1",
  (SELECT DATA->> 'QL3') AS "QL3",
  (SELECT DATA->> 'QL1') AS "QL1",
  (SELECT DATA->> 'QL2') AS "QL2",
  (SELECT DATA->> 'QL_SUM') AS "QL_SUM",
  (SELECT DATA->> 'SL1') AS "SL1",
  (SELECT DATA->> 'SL2') AS "SL2",
  (SELECT DATA->> 'SL3') AS "SL3",
  (SELECT DATA->> 'SL_SUM') AS "SL_SUM",
  (SELECT DATA->> 'KVARh_L1') AS "KVARh_L1",
  (SELECT DATA->> 'KVARh_L2') AS "KVARh_L2",
  (SELECT DATA->> 'KVARh_LSUM') AS "KVARh_LSUM",
  (SELECT DATA->> 'KVARh_L3') AS "KVARh_L3",
  (SELECT DATA->> 'KVAh_L3') AS "KVAh_L3",
  (SELECT DATA->> 'KVAh_L1') AS "KVAh_L1",
  (SELECT DATA->> 'KVAh_L2') AS "KVAh_L2",
  (SELECT DATA->> 'KVAh_LSUM') AS "KVAh_LSUM",
  (SELECT DATA->> 'Cos_LSUM') AS "Cos_LSUM",
  (SELECT DATA->> 'Cos_L1') AS "Cos_L1",
  (SELECT DATA->> 'Cos_L2') AS "Cos_L2",
  (SELECT DATA->> 'Cos_L3') AS "Cos_L3"
FROM "tv_smartdevicemeasurement_modbus"
WHERE ("tv_smartdevicemeasurement_modbus"."insert_time" > '2019-02-07 12:15:00+01:00'
       AND "tv_smartdevicemeasurement_modbus"."insert_time" < '2019-02-07 12:31:00+01:00'
       AND "tv_smartdevicemeasurement_modbus"."smart_device_id" = 6)

And here is the execution plan from EXPLAIN (ANALYZE, BUFFERS): https://explain.depesz.com/s/d3f

Very, very strange!

1 Comment
Unrelated, but: (SELECT DATA ->> 'Cos_L3') can be simplified to DATA ->> 'Cos_L3'. Commented Feb 8, 2019 at 9:23
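
For illustration, a simplified select list might look like this (a sketch based on the comment above, showing only a few of the JSON keys):

SELECT measurement_id,
       smart_device_id,
       parent_job_id,
       insert_time,
       data,
       data ->> 'VLN_AVG' AS "VLN_AVG",
       data ->> 'VLN1'    AS "VLN1",
       data ->> 'Cos_L3'  AS "Cos_L3"   -- ...and so on for the remaining keys
FROM tv_smartdevicemeasurement_modbus
WHERE insert_time > '2019-02-07 12:15:00+01:00'
  AND insert_time < '2019-02-07 12:31:00+01:00'
  AND smart_device_id = 6;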

1 Answer


That is not strange or abnormal at all.

Since you don't have an index on insert_time, PostgreSQL can only use the index on smart_device_id to speed up the query.

But that condition is not very selective: it matches 30270 rows, each of which has to be checked against the additional filter condition on insert_time. That filter removes 30258 of those rows, leaving only 12 behind.

To fetch these 30270 rows from the table, PostgreSQL had to read 21442 blocks of 8 KB each from disk, which is where your time is spent.
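
For reference, a quick way to see that imbalance for yourself (a sketch, not part of the original post):

-- Rows matched by the only usable index condition:
SELECT count(*)
FROM tv_smartdevicemeasurement_modbus
WHERE smart_device_id = 6;

-- Rows that survive the additional time filter:
SELECT count(*)
FROM tv_smartdevicemeasurement_modbus
WHERE smart_device_id = 6
  AND insert_time > '2019-02-07 12:15:00+01:00'
  AND insert_time < '2019-02-07 12:31:00+01:00';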

The best index would be:

CREATE INDEX ON tv_smartdevicemeasurement_modbus (smart_device_id, insert_time);

but I suspect that the following index, which is more widely applicable and smaller, will also give good response times:

CREATE INDEX ON tv_smartdevicemeasurement_modbus (insert_time);
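
For example, after creating the two-column index (the index name below is just for illustration), re-running the query under EXPLAIN (ANALYZE, BUFFERS) should show an index scan on the new index and far fewer buffers read:

CREATE INDEX IF NOT EXISTS smart_device_insert_time_idx
    ON tv_smartdevicemeasurement_modbus (smart_device_id, insert_time);
ANALYZE tv_smartdevicemeasurement_modbus;  -- refresh planner statistics

EXPLAIN (ANALYZE, BUFFERS)
SELECT measurement_id, insert_time, data
FROM tv_smartdevicemeasurement_modbus
WHERE smart_device_id = 6
  AND insert_time > '2019-02-07 12:15:00+01:00'
  AND insert_time < '2019-02-07 12:31:00+01:00';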

2 Comments

Hi mates, I've analyzed my queries a little bit more and solved my problem with this approach: removing the unnecessary ORDER BY (it seems that sorting with itertools is faster than ORDER BY) and adding the index as you suggested, but with ASC sort order (this is the way the data is accessed from my web application). Here is the index definition: CREATE INDEX insert_time_idx ON tv_smartdevicemeasurement_modbus (insert_time DESC). Many thanks, dude! Have a great day!
There was no ORDER BY in the query you posted. ASC is the default, but it doesn't matter in this case, because indexes can be scanned in both directions.
