I have some data which has been imported into Postgres for use in a Rails application. However, somehow the accented characters have become strangely encoded:
ä appears as √§
á appears as √°
é appears as √©
ó appears as √≥
I'm pretty sure the problem is with the integrity of the data rather than with Rails. The strings don't seem to match any encoding I try:
# Replacing "cp1252" with any other encoding has no effect
"Troll√§ttan".encode("cp1252").force_encoding("UTF-8") #-> junk
If anyone can identify what kind of encoding mix-up I'm suffering from, that would be great.
As a last resort, I may have to manually replace each corrupted accented character, but if anyone can suggest a programmatic solution (or even a starting point for fixing this, as I've found it very hard to debug), I'd be very grateful.
The database encoding is UTF8 (collation en_US.UTF-8). The data went through quite a complex import process (originally CSV, then Google Refine, and then a bunch more transformations). It won't be very easy to reimport the data, so an in-place fix would be ideal.
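For reference, the kind of brute-force check I have in mind looks roughly like the sketch below. It assumes the damage is a single round of "UTF-8 bytes read as some other encoding", which may not be true for my data. The helper name candidate_encodings is made up for illustration, and the sample string is a synthetic Latin-1 mix-up, not one of my real rows.

```ruby
# Hypothetical helper: return every encoding for which re-encoding the
# garbled string and reinterpreting the resulting bytes as UTF-8 gives
# back the expected text.
def candidate_encodings(garbled, expected)
  Encoding.list.select do |enc|
    begin
      garbled.encode(enc).force_encoding("UTF-8") == expected
    rescue StandardError
      false # this encoding cannot represent the garbled characters
    end
  end
end

# Synthetic sample (an assumption, not my actual data): take the UTF-8
# bytes of the correct text and misread them as ISO-8859-1.
expected = "Trollättan"
garbled  = expected.dup.force_encoding("ISO-8859-1").encode("UTF-8")

puts candidate_encodings(garbled, expected).map(&:name)
```

Running the same helper against a real corrupted value and its known correct spelling should list the encodings that would explain the mix-up, if it really is a single-step one. An empty list would suggest the data was mangled more than once.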