I ran a simple performance test comparing plain PostgreSQL with TimescaleDB. Here are my results:

Total rows 403,204

With Postgres

Fetch Time For Aggregation Query 176 rows : 203ms - 240ms

Fetch Time For Join Query 102 rows : 660ms - 720ms

With TimescaleDB

Fetch Time For Aggregation Query 176 rows : 175ms - 200ms

Fetch Time For Join Query 102 rows : 614ms - 650ms

CREATE TABLE public.sensors(
  id SERIAL PRIMARY KEY,
  type VARCHAR(50),
  location VARCHAR(50)
);

-- Postgres table
CREATE TABLE sensor_data (
  time TIMESTAMPTZ NOT NULL,
  sensor_id INTEGER,
  temperature DOUBLE PRECISION,
  cpu DOUBLE PRECISION,
  FOREIGN KEY (sensor_id) REFERENCES sensors (id)
);

--drop table public.sensor_data;

-- TimescaleDB table
CREATE TABLE sensor_data_ts (
  time TIMESTAMPTZ NOT NULL,
  sensor_id INTEGER,
  temperature DOUBLE PRECISION,
  cpu DOUBLE PRECISION,
  FOREIGN KEY (sensor_id) REFERENCES sensors (id)
);
SELECT create_hypertable('sensor_data_ts', 'time');

-- Insert Data

INSERT INTO sensors (type, location) VALUES
('a','floor'),
('a', 'ceiling'),
('b','floor'),
('b', 'ceiling');

-- Postgres

INSERT INTO sensor_data (time, sensor_id, cpu, temperature)
SELECT
  time,
  sensor_id,
  random() AS cpu,
  random()*100 AS temperature
FROM generate_series(now() - interval '50 week', now(), interval '5 minute') AS g1(time), generate_series(1,4,1) AS g2(sensor_id);

-- TimescaleDB
INSERT INTO sensor_data_ts (time, sensor_id, cpu, temperature)
SELECT
  time,
  sensor_id,
  random() AS cpu,
  random()*100 AS temperature
FROM generate_series(now() - interval '50 week', now(), interval '5 minute') AS g1(time), generate_series(1,4,1) AS g2(sensor_id);

--truncate table public.sensor_data;
--truncate table public.sensor_data_ts;

select count(*) from public.sensor_data sd;
select count(*) from public.sensor_data_ts sd;

--Postgres

--Aggregate queries
SELECT 
  floor(extract(epoch from "time")/(60*60*24*2)) as period,
  AVG(temperature) AS avg_temp, 
  AVG(cpu) AS avg_cpu 
FROM sensor_data 
GROUP BY period;
--ORDER BY PERIOD;

--Join Queries
SELECT 
  sensors.location,
  floor(extract(epoch from "time")/(60*60*24*7)) as period,
  AVG(temperature) AS avg_temp, 
  last(temperature, time) AS last_temp, -- note: last() is a TimescaleDB function, not plain Postgres
  AVG(cpu) AS avg_cpu 
FROM sensor_data JOIN sensors on sensor_data.sensor_id = sensors.id
GROUP BY period, sensors.location;

--Timescale DB

--Aggregate Queries
SELECT 
  time_bucket('2 day', time) AS period, 
  AVG(temperature) AS avg_temp, 
  AVG(cpu) AS avg_cpu 
FROM sensor_data_ts 
GROUP BY period;
--ORDER BY PERIOD;

--Join Queries
SELECT 
  sensors.location,
  time_bucket('1 week', time) AS period, 
  AVG(temperature) AS avg_temp, 
  last(temperature, time) AS last_temp, 
  AVG(cpu) AS avg_cpu 
FROM sensor_data_ts JOIN sensors on sensor_data_ts.sensor_id = sensors.id
GROUP BY period, sensors.location;

I was expecting a tangible boost in query performance. What else can I do to improve query performance?

1 Answer

A few things:

  1. I'm a little confused. time_bucket is a TimescaleDB function, not a Postgres function, so a query that calls it is probably running some of our code even on the plain Postgres table.
  2. You are still performing a full table scan of all your data, so there isn't much to optimize here. The dataset is also small (about 400K rows), so it will fit entirely in the buffer cache. To see a difference in insert/query performance, you likely need (a) much more data and (b) more complex types of queries.
  3. But TimescaleDB also has other features. For example, turn on compression and you'll likely find these "full table scans" to be quicker (albeit only once you get into disk-bound workloads). Or turn on continuous aggregates so you can continuously/incrementally materialize these results to serve, e.g., user-facing dashboards.
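
Points 2 and 3 could be tried out roughly as follows. This is only a sketch, assuming TimescaleDB 2.x and the sensor_data_ts hypertable from the question. The view name sensor_data_2day and every interval below are hypothetical choices, and option names may differ between TimescaleDB versions, so check them against the documentation for your release.

```sql
-- Sketch only: assumes TimescaleDB 2.x and the sensor_data_ts hypertable
-- from the question. sensor_data_2day and all intervals are hypothetical.

-- Point 2: inspect the plan. With ~400K rows this will likely show
-- sequential scans of every chunk, served mostly from shared buffers.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
  time_bucket('2 day', time) AS period,
  AVG(temperature) AS avg_temp,
  AVG(cpu) AS avg_cpu
FROM sensor_data_ts
GROUP BY period;

-- Point 3, compression: segment by sensor and order by time within a segment.
ALTER TABLE sensor_data_ts SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'sensor_id',
  timescaledb.compress_orderby = 'time DESC'
);

-- Compress chunks automatically once they are older than 7 days.
SELECT add_compression_policy('sensor_data_ts', INTERVAL '7 days');

-- Point 3, continuous aggregate: materialize the 2-day averages.
CREATE MATERIALIZED VIEW sensor_data_2day
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('2 days', time) AS period,
  AVG(temperature) AS avg_temp,
  AVG(cpu) AS avg_cpu
FROM sensor_data_ts
GROUP BY period;

-- Refresh the materialization hourly over the most recent month of data.
SELECT add_continuous_aggregate_policy('sensor_data_2day',
  start_offset      => INTERVAL '1 month',
  end_offset        => INTERVAL '1 hour',
  schedule_interval => INTERVAL '1 hour');

-- Dashboards then read the small materialized result, not the raw rows.
SELECT period, avg_temp, avg_cpu
FROM sensor_data_2day
ORDER BY period;
```

Comparing the EXPLAIN output before and after is a more reliable check than wall-clock fetch times, since it separates planning, execution, and buffer hits.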

1 Comment

Thank you for pointing out the incorrect time_bucket usage. I believe it was running some of the TimescaleDB code. I have updated the queries (no TimescaleDB functions) and do see some boost. I will look into point 3 for further optimization.
