I'm working on an IoT application where a bunch of devices send readings roughly every minute to an AWS Aurora MySQL 5.6 (InnoDB) instance, a db.t2.medium (2 vCPUs, 4 GB RAM). SELECT queries that fetch by device id and sensor type have been taking longer and longer, and I'm guessing it's because we're outgrowing the instance size.
The table we query has about 60 million rows, and because of the way our graphing feature displays the data, we fetch all historical data rather than paginating; I suspect this is also part of the problem. An example query looks like this, returning about 500K rows in around 5-8 seconds:
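SELECT *
FROM readings
WHERE device_id = 1234
  AND sensor_type = 'pressure'
  AND time >= 1644092837
  AND time <= 1646684837;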
The readings table has four columns: device_id, sensor_type, time (a Unix timestamp stored as an int), and value. There is a composite index (the primary key) on device_id, sensor_type, and time.
My main question is: how have people handled returning a large number of rows from an already large table? This table is only going to keep growing given how frequently the sensors send data. I've considered having a readings table per device (sketched below), but I'm not comfortable with potentially thousands of tables, especially if we ever have to add or edit a column.
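Roughly what the per-device idea would look like, with a hypothetical table name derived from the current schema:

CREATE TABLE readings_device_1234 (
  -- one table per device, so device_id is dropped from the key
  sensor_type char(5) CHARACTER SET ascii NOT NULL DEFAULT '',
  time int(11) unsigned NOT NULL,
  value float NOT NULL,
  PRIMARY KEY (sensor_type, time)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;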
I'm also wondering how people have handled scaling up a database in an IoT use case, because I'm concerned our AWS bill is going to get very expensive if we just keep increasing the instance size / RAM.
Table definition (added from a comment):
CREATE TABLE readings (
  device_id int(11) unsigned NOT NULL AUTO_INCREMENT,
  sensor_type char(5) CHARACTER SET ascii NOT NULL DEFAULT '',
  time int(11) unsigned NOT NULL,
  value float NOT NULL,
  PRIMARY KEY (device_id,sensor_type,time)
) ENGINE=InnoDB AUTO_INCREMENT=48025983 DEFAULT CHARSET=latin1