I'm working on a Ruby script (using Sequel) to transfer data from an old database to a new one. Because of an encoding problem with the old database, I get values like "MÃ¼nchen" instead of "München".

DB = Sequel.mysql2 'db_name', user: 'name', password: '***', host: '127.0.0.1' # , encoding: Encoding::CP1252.name) # doesn't work
city = DB[:users].first[:city] # => "MÃ¼nchen" (Sequel rows are keyed by symbols)
city.encoding # => #<Encoding:UTF-8>
city.encode(Encoding::UTF_8, Encoding::CP1252) # => "MÃƒÂ¼nchen" (garbled a second time)

The old database's encoding is set to CP1252; the new one uses UTF-8.

I tried to #gsub the broken umlauts, but that doesn't work:

umlauts = {
  'Ã¤' => 'ä',
  'Ã¶' => 'ö',
  'Ã¼' => 'ü',
  'ÃŸ' => 'ß'
}

city.gsub(/[#{umlauts.keys.join}]/, umlauts) # => "Mnchen"

I'm completely clueless about how to work with encodings correctly. Do you know how I can get 'München'?

What was it that didn't work about your gsub? Mine works fine: string = "MÃ¼nchen"; string.gsub("Ã¼", "ü") # => "München" Commented Apr 16, 2014 at 15:12

1 Answer

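Turns out the way I used #gsub was wrong (thanks Mike H-R!). A character class ([...]) matches one character at a time, so 'Ã' and '¼' were each looked up in the hash on their own, found no entry, and were replaced with an empty string. Joining the keys with "|" matches the whole two-character sequences instead, so this works: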

umlauts = {
  'Ã¤' => 'ä',
  'Ã¶' => 'ö',
  'Ã¼' => 'ü',
  'ÃŸ' => 'ß'
}

city.gsub(/#{umlauts.keys.join("|")}/, umlauts) # => "München"
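
A possible alternative that avoids listing every character, sketched under the assumption that each affected value is UTF-8 text that was decoded as CP1252 exactly once: re-encoding the string to CP1252 recovers the original bytes, and relabeling those bytes as UTF-8 yields the intended text. `repair_mojibake` is a hypothetical helper name, not something from Sequel or the Ruby standard library; the `valid_encoding?` check and the rescue are there so that values which do not fit the pattern are returned unchanged.

```ruby
# Hypothetical helper: undo "UTF-8 bytes read as CP1252" mojibake.
# Assumes the input is a UTF-8 string that was mis-decoded only once.
def repair_mojibake(text)
  # Transcode back to CP1252 to recover the original byte sequence,
  # then relabel those bytes as UTF-8 without converting them.
  repaired = text.encode(Encoding::CP1252).force_encoding(Encoding::UTF_8)

  # If the bytes are not valid UTF-8, the input was probably not mojibake.
  repaired.valid_encoding? ? repaired : text
rescue Encoding::UndefinedConversionError
  # The text contains characters that CP1252 cannot represent; leave it alone.
  text
end

repair_mojibake("MÃ¼nchen") # => "München"
```

Unlike the lookup table, this should also cover other accented characters, but it is worth checking against a sample of the real data before running a full migration.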