
I know this has been asked several times, but something strange is happening in my case:

I have an index view where rendering certain characters (letters with accent) causes Rails to raise the exception

incompatible character encodings: ASCII-8BIT and UTF-8

So I checked the encoding of my strings, and it is actually ASCII-8BIT everywhere, even though I set the encoding to UTF-8 in my application.rb

config.encoding = "utf-8"

and in my environment.rb

Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8

and my database reports:

character_set_database = utf-8

as suggested in some guides.

Strings are entered through a textarea field and are not concatenated with any previously inserted string.

The strange things are:

  • This happens only in the index view, not in the show view of the same resource.
  • This happens only for this model (an email, with subject and body, but that shouldn't affect anything).
  • In my development environment everything works once I call str.force_encoding('utf-8'), whereas in production it does not. (Development runs Ruby 2.0.0 and production Ruby 2.1.0; both use Rails 4 and MySQL.)
  • Setting the view file with # encoding utf-8 doesn't work either.
  • Trying str.force_encoding('ascii-8bit').encode('utf-8') raises Encoding::UndefinedConversionError "\xC3" from ASCII-8BIT to UTF-8, where "\xC3" is the first byte of an à. Using body.force_encoding('ascii-8bit').encode('UTF-8', :invalid => :replace, :undef => :replace, :replace => '?') replaces all accented characters with a ?. And str.force_encoding('iso-8859-1').encode('utf-8') obviously generates the wrong character (a ?).
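
The outcomes in the last bullet can be reproduced in plain Ruby, which may help separate relabelling (force_encoding) from transcoding (encode). This is a minimal sketch, assuming the stored bytes are valid UTF-8 that has merely been tagged ASCII-8BIT; the variable names are illustrative and not taken from the application.

```ruby
# The UTF-8 bytes of "à" (0xC3 0xA0), tagged as binary (ASCII-8BIT).
# Assumption: this mirrors what the model attribute holds.
raw = "\xC3\xA0".b

# Transcoding from ASCII-8BIT fails, because a binary byte >= 0x80
# has no defined character to convert from.
transcode_error =
  begin
    raw.dup.encode("UTF-8")
    nil
  rescue Encoding::UndefinedConversionError => e
    e.class
  end

# With replacement options, each undefined byte becomes its own "?".
replaced = raw.dup.encode("UTF-8", invalid: :replace, undef: :replace, replace: "?")

# Relabelling keeps the bytes and only changes the encoding tag.
relabelled = raw.dup.force_encoding("UTF-8")

# Reading the same bytes as ISO-8859-1 and transcoding yields two
# unrelated characters (U+00C3 and U+00A0) instead of one "à".
mojibake = raw.dup.force_encoding("ISO-8859-1").encode("UTF-8")

puts transcode_error
puts replaced
puts relabelled.valid_encoding?
puts mojibake.length
```

If relabelling produces a string for which valid_encoding? is false, the bytes are probably not UTF-8 at all, and the sketch's assumption does not hold.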

So I have two questions:

  • Why is Rails setting the string encoding to ASCII-8BIT?
  • How can I solve this issue?

I've already checked these questions (the most recent ones for Rails 4):

Rails View Encoding Issues

"\xC2" to UTF-8 in conversion from ASCII-8BIT to UTF-8

How to convert a string to UTF8 in Ruby

Encoding::UndefinedConversionError: "\xE4" from ASCII-8BIT to UTF-8

and other resources as well, but nothing worked.

  • Are you entering the accented characters into the view using a text editor? Commented Mar 7, 2014 at 12:33
  • it's a textarea_field Commented Mar 7, 2014 at 15:03
  • Are the source code files all UTF-8, or is your text editor perhaps saving them as ASCII-8BIT? Commented Mar 7, 2014 at 20:55
  • These strings are not generated by my text editor... anyway, the files are in UTF-8. Commented Mar 8, 2014 at 10:38
  • I'm struggling with a similar issue (getting incompatible character encodings: ASCII-8BIT and UTF-8 errors from user entered data). Were you able to solve your problem? Did you find a way to replicate it in a test? Commented Feb 7, 2017 at 5:50

1 Answer


You probably have a string literal somewhere in your source code to which you then concatenate another string. For instance:

some_string = "this is a string"

or even

some_string = "" #empty string

Those strings, stored in some_string, will be marked ASCII-8BIT, and if you then later do something like:

some_string = some_string + unicode_string

Then you'll get the error. That is, those strings will be marked ASCII-8BIT unless you add, to the top of the file where the string literals are created:

#encoding: utf-8

That declaration determines the default encoding that string literals in source code will have.
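
To see the exception outside of Rails, here is a minimal sketch of the concatenation pattern. It builds the binary-tagged string explicitly with String#b, which is an assumption made for illustration; where such a string actually originates in the application is not known from the question. The variable names are hypothetical.

```ruby
# Non-ASCII bytes tagged ASCII-8BIT (assumed stand-in for the problem string).
binary_text  = "caf\xC3\xA9".b
# A UTF-8 string that also contains a non-ASCII character.
unicode_text = " \u00E0 la carte"

# Concatenating the two raises Encoding::CompatibilityError.
message =
  begin
    binary_text + unicode_text
    nil
  rescue Encoding::CompatibilityError => e
    e.message
  end

# Ruby raises only when both operands contain non-ASCII bytes;
# an ASCII-only binary string concatenates without error.
ascii_only_result = "subject: ".b + unicode_text

# Relabelling the binary string as UTF-8 first makes the operands compatible.
fixed = binary_text.dup.force_encoding("UTF-8") + unicode_text

puts message
puts fixed
```

The exact wording of the message can differ between Ruby versions, so it is safer to rescue the exception class than to match on the text.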

I am just guessing, because this pattern is a common source of this problem. To know more for sure, it would take more information than is in your question -- it would take debugging the actual source code, to figure out exactly what string is tagged as ASCII-8BIT when you expect it to be tagged UTF-8 instead, and exactly where that String came from.
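
One way to start that debugging is to list which values carry the binary tag before they reach the view. The helper below is a hypothetical sketch, not part of Rails; it takes any hash of attribute names to values, such as the one a model's attributes method returns.

```ruby
# Hypothetical helper: return the keys whose String values are tagged
# binary. Encoding::BINARY is the same encoding as ASCII-8BIT.
def binary_tagged_keys(attributes)
  attributes.select { |_, value|
    value.is_a?(String) && value.encoding == Encoding::BINARY
  }.keys
end

# Illustrative data only; the field names are assumptions.
sample = { subject: "Hello", body: "\xC3\xA0".b, id: 42 }
suspects = binary_tagged_keys(sample)
puts suspects.inspect
```

Running this on the records rendered by the index view and by the show view could show whether the two views really receive differently tagged strings.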
