1

I have a table with a large text column, approx. 3 MB per row. The table uses the InnoDB storage engine, and the binlog size is an issue.

The binlog is turned on and configured with the ROW format.

I know about the log_bin_compress option, but before I try it as a solution, turn it on server-wide, and pay for its impact, I would like to know whether there is a more specific solution just for this table.

I also know about setting binlog_row_image to minimal, but it seems the application changes the column every time, so it cannot save any space for me.

Question

If I configure column compression on this text column, will it affect the size of the logged binlog event?

If this is not an option, will any kind of InnoDB compression (either row or page) affect the size of the logged binlog event?

  • I guess you are using MariaDB, not MySQL, because log_bin_compress is not a feature in MySQL. It was developed for MariaDB. Commented Jan 25, 2023 at 17:15
  • I would be surprised if the binlog events were not smaller when you use column compression on the relevant columns. I've never tested it, but it would be silly for the MariaDB server to decompress columns before replicating / sending to the binlog. Commented Jan 25, 2023 at 18:18
  • First, some info... Why do you even have the binlog turned on? Replication, point-in-time recovery, other? Are you worried about disk space? In what situation? What is in the MEDIUMTEXT column? Maybe there is a different way to deal with it. Is there a FULLTEXT index on the column? Commented Jan 26, 2023 at 1:58
  • @BillKarwin, your guess that I am using MariaDB is correct, as the very first word in the OP title is 'MariaDB' :-) Commented Jan 26, 2023 at 4:28
  • @RickJames, the binlog is there to keep PITR possible. Commented Jan 26, 2023 at 4:31

3 Answers

1

Binary log format is independent of storage engine.

I have not tested it, but I am pretty sure that no storage engine options will result in different binary log storage. For example if you store the table with InnoDB compression, that won't compress the value in the binary logs.

You could use gzip to compress strings or blobs in your application before storing them to the database.

If binary log size is a problem, you should also adjust the binary log retention, with expire_logs_days or binlog_expire_logs_seconds (I prefer the latter if you are using MariaDB 10.6 or later).

If the binary log size is still a problem, then you should just get larger storage volumes.

  • Many thanks. I do know about log retention; however, I must archive all logs to satisfy the PITR requirement. (Column compression is not a storage engine feature, although you are right with respect to InnoDB row or page compression.) Commented Jan 26, 2023 at 5:30
  • You need to maintain PITR for all time? Normally you only need to keep binary logs back to the most recent full or incremental backup. Commented Jan 26, 2023 at 6:00
  • Many thanks for taking the time to get to know the actual use case. The PITR is for audit reasons, so keeping all historical logs is mandatory. I know I can solve the issue completely outside of the RDBMS by compressing the binlog files as OS files and decompressing them when needed. However, I would like to understand more deeply what is going on in MariaDB, and having a smaller binlog at the MariaDB level may have other benefits regarding disk I/O or when the binlog is used in master/slave replication. Commented Jan 26, 2023 at 6:22
  • Very well. I think you know the options by now. If you need the binary logs for archiving or auditing purposes, you still don't need to store all historical binlogs on the production database server. You can copy them to a file server or cloud storage, then let them expire on the database server. Commented Jan 26, 2023 at 6:41
  • "Get larger storage" is not an appropriate answer in many cases, and not because of the storage cost: my concern is usually not the storage size but the increased I/O load. I would like to save I/O bandwidth in every case where there is a reasonable option. Commented Jan 27, 2023 at 4:47
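
The OS-level approach mentioned in the comments above (copy finished binlog files off the database server in compressed form, and decompress them only when needed) could look roughly like the sketch below. It is a minimal illustration, not a tested archiving tool: the function names `archive_binlog` and `restore_binlog` and the file name `mysql-bin.000001` are hypothetical, and it assumes you only archive binlog files the server has already finished writing.

```python
import gzip
import shutil
from pathlib import Path


def archive_binlog(src: Path, dest_dir: Path) -> Path:
    """Write a gzip-compressed copy of a closed binlog file into dest_dir.

    The source file is left untouched, so the server's own retention
    setting can expire it later.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / (src.name + ".gz")
    with src.open("rb") as fin, gzip.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # streams in chunks, no full read into memory
    return dest


def restore_binlog(archive: Path, dest_dir: Path) -> Path:
    """Decompress an archived binlog back to its original file name."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / archive.stem  # "mysql-bin.000001.gz" -> "mysql-bin.000001"
    with gzip.open(archive, "rb") as fin, dest.open("wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

Note that this only reduces archive storage. It does not reduce the I/O the server spends writing the binlog in the first place, which is the concern raised in the last comment.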
1

When storing large text columns, I recommend compressing them in the client, then storing them in a MEDIUMBLOB. This will shrink all copies of the data, including those in the binlog. And, in some configurations, it will speed things up.
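
As a rough sketch of what client-side compression could look like, here is a minimal Python example using the standard `zlib` module. The helper names `pack_text` and `unpack_text` are made up for illustration, and the database call itself is left out: the idea is only that the application passes the returned bytes as the parameter for the MEDIUMBLOB column and reverses the step after reading.

```python
import zlib


def pack_text(text: str, level: int = 6) -> bytes:
    """Compress text into bytes suitable for a BLOB/MEDIUMBLOB parameter."""
    return zlib.compress(text.encode("utf-8"), level)


def unpack_text(blob: bytes) -> str:
    """Reverse pack_text after reading the column back."""
    return zlib.decompress(blob).decode("utf-8")


if __name__ == "__main__":
    # Artificial, highly repetitive sample; real data will compress less well.
    sample = "status=OK; payload=lorem ipsum dolor sit amet\n" * 50_000
    blob = pack_text(sample)
    assert unpack_text(blob) == sample
    print(len(sample.encode("utf-8")), "->", len(blob), "bytes")
```

Because the server only ever sees the compressed bytes, the row image written to the binlog is the compressed value as well. The trade-off is that the column can no longer be searched or indexed as text on the server side.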

  • Unfortunately, as DBAs, the application implementation is not something we have influence over. Changing the app implementation is very hard and, most importantly, slow, and such changes are generally rejected by developers for fear of regressions. As DBAs we have to react within a much smaller time window (days), whereas an app implementation change request takes months. Commented Jan 26, 2023 at 4:36
  • Alas, I disagree with the corporate policy of strictly segregating DBAs and Programmers. "Month"! That reminds me of the "old" days. Or Governments. Commented Jan 26, 2023 at 20:30
  • In my experience this applies to any conservative enterprise. In the financial industry this process is even more conservative and slower than in government. Commented Jan 27, 2023 at 4:42
1

It seems that column compression has a measurable effect on the binlog size (when the binlog format is ROW).

I've created a POC whose scope is limited and does not allow generic conclusions, but the results may provide some understanding.

I compared the same TEXT column with and without column compression. I stored approx. 5 MB of highly compressible data in each row; the data was a representative sample of the production load. The results were consistent, giving the same ratio regardless of whether 100, 1000 or 10000 rows were used.

Column compression was set to ZLIB (with level 9 in the config file).

The InnoDB file size was approx. 30-40% smaller, and the binlog size was approx. 40-50% smaller, close to half the size.

It is interesting that column compression has a measurably greater effect on the binlog size than on the InnoDB tablespace size. I expected less effect on the binlog size, if any.
