I want to migrate my project from MySQL to postgres, I have one table in MySQL, in which utf8mb4 set for particular column in a table, what alternative is there in postgres to set in column for encoding?
1 Answer
utf8mb4 is MySQL's way to represent 4-byte UTF characters, however, as the documentation clarifies:
Requires a maximum of four bytes per multibyte character.
So, actually not all characters are stored in four bytes. The OS is also not using up all the 4 available bytes for each characters, so you should be able to migrate your utf8mb4 characters into a UTF-8 encoded target field (MySQL - PostgreSQL) without problems, at least in theory.
But you never know whether this fits practice, so it is advisable to first create a backup of your MySQL database (so you will not be afraid of doing changes to it if for some reason you decide that the initial database needs some changes), export your database and modify your table's/column's definition to no longer use utf8mb4 as an encoding, but rather leave it unspecified (if you can rely on the fact that PostgreSQL has UTF-8 as the default encoding) or specify a UTF-8 encoding explicitly and run the inserts. Take a few samples of data from the original database and compare them to what PostgreSQL returns to them. If it works out of the box, then the theory was fitting the practice. If not, then you will need to research for the cause of the problem you experience.
utf8mb4really does, but a proper UTF-8 encoding should include thatutf8mb4is MySQL's way to represent 4-byte UTF8 encoding.utf8mb4is essentially a standard UTF-8 encoding (whereasutf8mb3would be non-standard)