3

I want to migrate my project from MySQL to postgres, I have one table in MySQL, in which utf8mb4 set for particular column in a table, what alternative is there in postgres to set in column for encoding?

8
  • 1
    Well, Postgres uses UTF-8 encoding by default. Not sure what utf8mb4 really does, but a proper UTF-8 encoding should include that Commented Sep 17, 2022 at 10:08
  • 1
    @a_horse_with_no_name utf8mb4 is MySQL's way to represent 4-byte UTF8 encoding. Commented Sep 17, 2022 at 10:30
  • @LajosArpad: but that's how UTF-8 is defined: encode characters in 1 to 4 bytes. Or is that a fixed width encoding? So the correct name would be UTF-32 then? Commented Sep 17, 2022 at 11:26
  • 1
    @LajosArpad: so that means utf8mb4 is essentially a standard UTF-8 encoding (whereas utf8mb3 would be non-standard) Commented Sep 17, 2022 at 16:55
  • 1
    Furthermore, MySQL's utf8mb3 and utf8mb4 are, technically, subsets of UTF-8. UTF-8 can have 5-bytes characters, but none are defined yet. MySQL can't handle such, yet. (I don't know about Postgres.) Commented Sep 17, 2022 at 17:50

1 Answer 1

2

utf8mb4 is MySQL's way to represent 4-byte UTF characters, however, as the documentation clarifies:

Requires a maximum of four bytes per multibyte character.

So, actually not all characters are stored in four bytes. The OS is also not using up all the 4 available bytes for each characters, so you should be able to migrate your utf8mb4 characters into a UTF-8 encoded target field (MySQL - PostgreSQL) without problems, at least in theory.

But you never know whether this fits practice, so it is advisable to first create a backup of your MySQL database (so you will not be afraid of doing changes to it if for some reason you decide that the initial database needs some changes), export your database and modify your table's/column's definition to no longer use utf8mb4 as an encoding, but rather leave it unspecified (if you can rely on the fact that PostgreSQL has UTF-8 as the default encoding) or specify a UTF-8 encoding explicitly and run the inserts. Take a few samples of data from the original database and compare them to what PostgreSQL returns to them. If it works out of the box, then the theory was fitting the practice. If not, then you will need to research for the cause of the problem you experience.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.