Our Rails 3 app needs to accept non-ASCII characters like ä and こ and save them to our MySQL database, whose character_set is 'utf8'.
One of our models runs a validation that strips all non-word characters from its name before it is saved. In Ruby 1.8.7 and Rails 2, the following was sufficient:
def strip_non_words(string)
  string.gsub!(/\W/,'')
end
This stripped out unwanted characters but preserved things like 'ä', 'こ', and '3'. With Ruby 1.9's new encoding handling, however, that statement no longer works: it now removes those characters along with the ones we don't want. I am trying to find a way to strip only the unwanted characters while keeping the non-ASCII ones.
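A minimal reproduction of the behavior described above, assuming Ruby 1.9 or later and a UTF-8 source file. In those versions `\W` only treats ASCII letters, digits, and underscore as word characters, while the Unicode property class `\p{Word}` is shown purely for comparison. The helper name `strip_non_words_unicode` and the sample string are hypothetical, not taken from the model in question.

```ruby
# encoding: utf-8

# Hypothetical helper for comparison; not part of the original model.
def strip_non_words_unicode(string)
  # \p{Word} matches letters, marks, digits and connector punctuation
  # in any script, so 'ä' and 'こ' are kept.
  string.gsub(/[^\p{Word}]/, '')
end

sample = "こäè-3!"

# \W is ASCII-only in Ruby 1.9+, so every non-ASCII letter is removed too.
ascii_only = sample.gsub(/\W/, '')

# The Unicode-aware class removes only '-' and '!'.
unicode_aware = strip_non_words_unicode(sample)

puts ascii_only
puts unicode_aware
```

This sketch uses the non-mutating `gsub` so the sample string can be reused for both calls; it says nothing about how the database will treat the result.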
Changing the gsub to something like this:
def strip_non_words(string)
  string.gsub!(/[[:punct:]]/,'')
end
lets the string pass through fine, but then the database kicks up the following error:
Mysql2::Error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation
Running the string through Iconv to try and convert it, like so:
def strip_non_words(string)
  Iconv.conv('LATIN1', 'UTF8', string)
  string.gsub!(/[[:punct:]]/,'')
end
results in this error:
Iconv::IllegalSequence: "こäè" # "こäè" being a test string
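The Iconv failure can be illustrated with a small sketch. It assumes Ruby 1.9 or later and swaps in `String#encode` for Iconv, since Iconv was deprecated in 1.9.3 and later removed from the standard library; the variable names are hypothetical. It suggests the error is about the target encoding: Latin-1 has a byte for 'ä' and 'è' but none for 'こ'.

```ruby
# encoding: utf-8

test_string = "こäè"

# 'ä' and 'è' exist in Latin-1, so this conversion succeeds.
latin1_ok = "äè".encode('ISO-8859-1')

# 'こ' has no Latin-1 representation, so converting the full
# test string raises instead of returning a result.
conversion_failed = false
begin
  test_string.encode('ISO-8859-1')
rescue Encoding::UndefinedConversionError
  conversion_failed = true
end

puts latin1_ok.bytes.to_a.inspect
puts conversion_failed
```

Under these assumptions, any conversion of the name to Latin-1 will fail for input such as Japanese text, whichever library performs it.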
I'm basically at my wits' end here. Does anyone know of a way to do what I need?