
I have a PostgreSQL database with millions of rows of date/time data. The time range of the data is from the year 2004 until now. The time zone is always UTC. Due to limited disk space on my webhosting account, I want to reduce the database size as much as possible. I know that Microsoft SQL Server has a SMALLDATETIME datatype with a size of 4 bytes.

Is there something equivalent to SMALLDATETIME in Postgres?

(Note: I have looked at the Postgres manual, but all the 'date and time' types are 8 bytes, and the 4-byte 'date' datatype doesn't suit my needs because I need resolution in minutes.)

1 Answer


No, there isn't such a type. It'd be nice for a few use cases, but it isn't supported.

What you can do as a compromise is store the number of seconds since a custom epoch of your own choosing:

extract(epoch from mydate) - extract(epoch from TIMESTAMP '2004-01-01 00:00:00')

in an integer column, then reconstruct the date on the fly. This will have a performance impact and, more importantly, hurt readability, but it'll work:

SELECT to_timestamp(mydatesecondssince2004 + extract(epoch from TIMESTAMP '2004-01-01 00:00:00')) FROM mytable
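Putting the two expressions together, a minimal end-to-end sketch might look like this (the table and column names are made up for illustration, and `real`/`integer` column choices are assumptions):

```sql
-- Hypothetical table: a 4-byte integer holds seconds since 2004-01-01 UTC.
CREATE TABLE mytable (
    station_id       integer,
    secs_since_2004  integer   -- seconds since 2004-01-01 00:00:00 UTC
);

-- Encode on write:
INSERT INTO mytable (station_id, secs_since_2004)
VALUES (1, (extract(epoch from TIMESTAMP '2010-06-15 12:30:00')
          - extract(epoch from TIMESTAMP '2004-01-01 00:00:00'))::integer);

-- Decode on read:
SELECT station_id,
       to_timestamp(secs_since_2004
                    + extract(epoch from TIMESTAMP '2004-01-01 00:00:00'))
           AT TIME ZONE 'UTC' AS mydate
FROM mytable;
```

One caveat: a signed 4-byte integer of seconds overflows about 68 years after the chosen epoch. Since you only need minute resolution, you could store minutes instead of seconds (divide by 60 on write, multiply by 60 on read), which extends the range 60-fold.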

Alternately, you could be the person who decides to add the 'smalldate' and 'smalltimestamp' types; PostgreSQL is open source, and that means people need to do things if they want them to happen.


3 Comments

It actually looks like old versions of postgres used a 4-byte date/time called abstime, but it's deprecated now. I studied a bit more about Postgres's disk storage requirements. It appears that there's a 24 byte row overhead! So saving 4 bytes in the datetime won't make a big difference. I'm surprised how 'hungry for disk space' Postgres is, MSSQL's row overhead is only 7 bytes ...
@jirikadlec2 Does MS SQL keep a rollback log / redo log, and replace rows in the main table? If so, like Oracle that means it doesn't need to store transaction visibility information in the master table, which greatly reduces row sizes - at the cost of making queries over tables that are changing while they're being read considerably slower when they have to scan the rollback log as well. I'm not convinced PostgreSQL's trade-off is the right one, but it's not easy to change at this point...
Probably my best bet at saving disk space in PostgreSQL could be merging my 'narrow' tables into one 'wide' table. Right now I have around 10 separate tables such as TEMPERATURE(station_id, time, value) or RAINFALL(station_id, time, value). Instead I would create one table HOURLY_DATA(station_id, time, temperature_value, rainfall_value), thereby reducing the number of rows required for storing my data.
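The table merge described in the last comment could be sketched like this (table names come from the comment; column types and the join strategy are assumptions):

```sql
-- Hypothetical wide table replacing the per-variable narrow tables.
-- 'real' is a 4-byte float; use double precision if you need more precision.
CREATE TABLE hourly_data (
    station_id         integer,
    time               timestamp,
    temperature_value  real,
    rainfall_value     real
);

-- Populate it from the old narrow tables. A FULL OUTER JOIN keeps
-- readings that exist in only one of the source tables; with USING,
-- the join columns are merged automatically.
INSERT INTO hourly_data (station_id, time, temperature_value, rainfall_value)
SELECT station_id, time,
       t.value AS temperature_value,
       r.value AS rainfall_value
FROM temperature t
FULL OUTER JOIN rainfall r USING (station_id, time);
```

Since each row carries roughly 24 bytes of overhead regardless of width, collapsing ten rows per reading into one wide row saves far more than shaving bytes off the timestamp column would.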
