Our Rails 3 app needs to accept non-ASCII characters like ä and こ and save them to our MySQL db, whose character set is 'utf8'.

One of our models runs a validation that strips all non-word characters out of its name before it is saved. In Ruby 1.8.7 and Rails 2, the following was sufficient:

def strip_non_words(string)
  string.gsub!(/\W/,'')
end

This stripped out bad characters, but preserved things like 'ä', 'こ', and '3'. With Ruby 1.9's new encodings, however, that statement no longer works: it now removes those characters along with the ones we don't want. I am trying to find a way to get the old behavior back.
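To make the change concrete, here is a minimal sketch of the behavior, assuming Ruby 1.9 or later and a UTF-8 source file. The method name strip_ascii_non_words and the sample string are made up for illustration; the point is only that \W treats every non-ASCII character as a non-word character.

```ruby
# encoding: utf-8

# Hypothetical helper: same pattern as above, but non-mutating.
# In Ruby 1.9+, \w only covers [a-zA-Z0-9_], so \W also matches ä and こ.
def strip_ascii_non_words(string)
  string.gsub(/\W/, '')
end

puts strip_ascii_non_words("こäè 3!")  # prints "3"
```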

Changing the gsub to something like this:

def strip_non_words(string)
  string.gsub!(/[[:punct:]]/,'')
end

lets the string pass through fine, but then the database kicks up the following error:

Mysql2::Error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation

Running the string through Iconv to try and convert it, like so:

def strip_non_words(string)
  Iconv.conv('LATIN1', 'UTF8', string)
  string.gsub!(/[[:punct:]]/,'')
end

results in this error:

Iconv::IllegalSequence: "こäè" # "こäè" being a test string
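The failure can be reproduced without Iconv. The sketch below uses String#encode instead (a swapped-in technique, since Iconv was removed from the standard library in Ruby 2.0) and assumes a UTF-8 source file; to_latin1 is a hypothetical helper name. Latin-1 has a slot for ä and è but none for こ, so the conversion cannot succeed for the test string.

```ruby
# encoding: utf-8

# Hypothetical helper: transcode a UTF-8 string to Latin-1 (ISO-8859-1).
def to_latin1(string)
  string.encode('ISO-8859-1')
end

# ä and è exist in Latin-1, so this works and yields one byte per character.
p to_latin1("äè").bytes.to_a

# こ has no Latin-1 equivalent, so the conversion raises.
begin
  to_latin1("こäè")
rescue Encoding::UndefinedConversionError => e
  puts "cannot convert: #{e.class}"
end
```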

I'm basically at my wits' end here. Does anyone know of a way to do what I need?

1 Answer

This ended up being a bit of an interesting fix.

I discovered that Ruby has a regex I could use, but only for ASCII strings. So I had to relabel the string as ASCII-8BIT with force_encoding, run the regex, then relabel it again for submission to the db. The end result looks like this:

def strip_non_words(string)
  string_encoded = string.force_encoding(Encoding::ASCII_8BIT)
  string_encoded.gsub!(/\p{Word}+/, '') # non-word characters
  string_reencoded = string_encoded.force_encoding('ISO-8859-1')
  string_reencoded #return
end
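Note that force_encoding only changes the encoding label on the string; the bytes stay the same. A minimal sketch of the difference from String#encode, which actually transcodes, assuming a UTF-8 input string (byte_views is a hypothetical helper name):

```ruby
# encoding: utf-8

# Hypothetical helper: show the bytes of a string as-is, after relabeling,
# and after transcoding to Latin-1.
def byte_views(string)
  {
    :original   => string.bytes.to_a,
    # force_encoding relabels the same bytes as binary; nothing is converted.
    :relabeled  => string.dup.force_encoding(Encoding::ASCII_8BIT).bytes.to_a,
    # encode produces new bytes in the target encoding.
    :transcoded => string.encode('ISO-8859-1').bytes.to_a
  }
end

views = byte_views("ä")
p views[:original]    # [195, 164]
p views[:relabeled]   # [195, 164]
p views[:transcoded]  # [228]
```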

Turns out you have to encode things separately due to how Ruby handles changing a character encoding: http://ablogaboutcode.com/2011/03/08/rails-3-patch-encoding-bug-while-action-caching-with-memcachestore/
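For comparison, here is a sketch of a different approach, not the one used above: leave the string as UTF-8 and strip with a negated \p{Word} class. This assumes Ruby 1.9 or later and a string that is already valid UTF-8; strip_non_words_unicode is a hypothetical name. On a UTF-8 string \p{Word} matches Unicode letters, digits, and underscore, so the negated class matches everything else.

```ruby
# encoding: utf-8

# Hypothetical alternative: remove anything that is not a Unicode word
# character, without changing the string's encoding.
def strip_non_words_unicode(string)
  string.gsub(/[^\p{Word}]/, '')
end

puts strip_non_words_unicode("こäè 3!")  # prints "こäè3"
```

Whether the result then saves cleanly still depends on the connection and column character sets on the MySQL side, which this sketch does not address.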
