
I have a table (InnoDB) with 1 million new inserts (20 GB) per week. I only need the data for one week, so I delete rows after 7 days; each day we delete around 3 GB and insert 3 GB of new data. That table is already in a separate database from the rest.

The problem is that disk space is only freed after an OPTIMIZE query, so we run it every few weeks at night. It works, but it takes 30 minutes and freezes the whole database server for that time, not just the particular database.

Is there any way to optimize faster?

If we run an OPTIMIZE every time we delete the data, will it be faster than running it every few weeks instead? I thought it might be faster to run it when just 3 GB of deleted rows need to be removed from disk; if we run it after 20 days, it's 60 GB. Is that right? And is there another way to optimize the OPTIMIZE?

6 Comments
  • What is your MySQL version? Do you have enough free hard disk space? Optimize should not freeze your whole database. Also, while not freeing at the time of the delete, it will reuse the space. So if you delete daily, it should stay at about 8-day-size, it will just not drop to 7-day-size, growing to 8-day-size during the day, drop back to 7-day-size on delete. (If you delete weekly, it should stay at about 14 to 15-day-size). Did you test that/would that be a problem? Commented Jul 21, 2019 at 8:44
  • One suggestion here but never used it. And from this one - try to drop indexes, optimize and then re-create Commented Jul 21, 2019 at 9:15
  • Are you sure that the optimization is really necessary? IIRC then InnoDB marks space to be reusable after deletion, so newly added rows would be inserted into that space. So the only result of running the optimizer is that the areas marked as free by InnoDB get also released on the system itself. Did you check if you really have a growing DB usage over let's say a week if you don't run that optimization script. Commented Jul 21, 2019 at 11:01
  • MySQL version is 5.0.11. We have enough disk space, but we need to run the optimize regularly, because InnoDB somehow doesn't reuse the disk space it fails to free up after deletes. The size of the db grows continuously. Just tried to run it again; it took 55 minutes. Table/database size (we moved that table to an individual db) is 20GB // 400k rows. Does updating MySQL help? Any other ideas? Commented Jul 21, 2019 at 16:35
  • What the heck is in the table? It sounds like the average row is 20KB; this is abnormally large and must involve "off-record" storage. There may be a better way to handle the TEXT/BLOB(s) involved. Commented Jul 21, 2019 at 16:39

3 Answers


Instead of worrying about speeding up OPTIMIZE TABLE, let's get rid of the need for it.

PARTITION BY RANGE(TO_DAYS(...)) ...

Then DROP PARTITION nightly; this is much faster than using DELETE, and avoids the need for OPTIMIZE.

Be sure to have innodb_file_per_table=ON.
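You can check the setting before creating the partitioned table (a sketch; note that changing the global only affects tables created afterwards):

```sql
-- Check whether each InnoDB table gets its own .ibd file
SELECT @@innodb_file_per_table;          -- 1 means ON

-- Turn it on for tables created from now on; existing tables keep their
-- storage layout until they are rebuilt (e.g. ALTER TABLE ... ENGINE=InnoDB)
SET GLOBAL innodb_file_per_table = ON;
```

With file-per-table on, dropping a partition deletes its `.ibd` file and returns the space to the OS immediately.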

Also nightly, use REORGANIZE PARTITION to split the future partition into tomorrow's partition and a new, empty future partition.

Details here: https://mysql.rjweb.org/doc.php/partitionmaint

Note that each PARTITION is effectively a separate table so DROP PARTITION is effectively a drop table.

There should be 10 partitions:

  • 1 starter partition to avoid the overhead of a glitch when partitioning by DATETIME.
  • 7 daily partitions
  • 1 extra day, so that there will always be a full 7 days' worth.
  • 1 empty future partition just in case your nightly script fails to run.
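Put together, the scheme might look like the following sketch. The table name, columns, and partition names are hypothetical; the dates are examples, and the daily partitions in between are omitted for brevity:

```sql
CREATE TABLE emails (
    id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    created DATETIME NOT NULL,
    body    MEDIUMTEXT,
    PRIMARY KEY (id, created)   -- the partition column must be in every unique key
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(created)) (
    PARTITION p_start   VALUES LESS THAN (0),   -- starter partition (DATETIME glitch)
    PARTITION p20190714 VALUES LESS THAN (TO_DAYS('2019-07-15')),
    -- ... one partition per day ...
    PARTITION p20190721 VALUES LESS THAN (TO_DAYS('2019-07-22')),
    PARTITION p_future  VALUES LESS THAN (MAXVALUE)
);

-- Nightly: drop the oldest day (near-instant; no DELETE, no OPTIMIZE needed)
ALTER TABLE emails DROP PARTITION p20190714;

-- Nightly: split p_future into tomorrow's partition plus a new empty future one
ALTER TABLE emails REORGANIZE PARTITION p_future INTO (
    PARTITION p20190722 VALUES LESS THAN (TO_DAYS('2019-07-23')),
    PARTITION p_future  VALUES LESS THAN (MAXVALUE)
);
```

Since `p_future` is empty when it is reorganized, the REORGANIZE is also near-instant; no data is copied.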

4 Comments

Sorry. PARTITIONing is not available until 5.1. 5.0 is a decade old. It is past time to upgrade! The longer you wait to upgrade, the harder it will be. Since then, there has been 5.1, 5.5, 5.6, 5.7, 8.0.
Ouch! 5.0.11 dates to 2005-08-06. The final version (5.0.96) dates back to 2012-03-21.
We're running MariaDB 10.2.25 on another database server; we could move it there. It seems this version will support PARTITIONing.
Yes, MariaDB 10.2 has PARTITIONing; my link works fine with it. (MariaDB forked off MySQL at about 5.1, and has maintained a lot of compatibility since then.)

Since you have an antique version that does not have PARTITIONing, here is another solution:

  • Compress the HTML and store it into a BLOB (instead of TEXT).
  • Do the compression and decompression in the client.
  • This technique will shrink the disk footprint by upwards of 3:1.

That won't eliminate the OPTIMIZE issue, but it will

  • Use less disk space.
  • Be faster (due to having less data to shovel around).

But, as already mentioned, InnoDB does reuse the freed space somewhat. I suspect that the table does not grow past 2x its size after an OPTIMIZE? Normally a BTree that starts with no free space degrades to about 69% full after a lot of churn, but then it stays at that ratio.

Emails, HTML, text, code -- all of these shrink about 3:1 with any decent compression library (zlib, PHP's gzcompress(), etc.). Most image formats and PDFs are already compressed; they don't benefit from a second compression.
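You can gauge the achievable ratio on your own data with MySQL's built-in COMPRESS() before committing to the change. A sketch, with hypothetical table and column names; the answer recommends compressing in the client, so COMPRESS() is used here only to estimate:

```sql
-- Estimate the compression ratio on a sample of existing rows.
-- COMPRESS() uses zlib, so the ratio should be close to what a
-- client-side zlib library achieves (plus a 4-byte length header).
SELECT AVG(LENGTH(body)) / AVG(LENGTH(COMPRESS(body))) AS est_ratio
FROM (SELECT body FROM emails LIMIT 1000) AS sample;
```

If `est_ratio` comes out near 3, the 3:1 estimate above holds for your data.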

2 Comments

We need to search through the emails using a FULLTEXT index on that column, and that won't work when the data is compressed. I think moving to an up-to-date database version that supports partitioning will keep the full functionality and still solve the issue.
@DeveloperJano - True; FT needs uncompressed data. (And you are many versions away from ROW_FORMAT=COMPRESSED.)

MySQL is not designed for that volume... try a warehouse (columnar) database engine like AWS Redshift; it will feel like a 4 MB database again :) If you can't use it, you can install Postgres and add the plugin for compressed columnar tables (it should be similar to Redshift).

2 Comments

I beg to differ. Inserting a million rows per day is not a problem for MySQL. Nor is a terabyte of data. What can be a problem is poorly indexed or inefficiently written queries. As for DW, one must manually build and maintain summary tables for a large DW application to work well on MySQL.
According to one study, 7% of MySQL tables are bigger than 7 million rows.
