Can anyone explain what's going on with this code?

s1 = "\x20".force_encoding 'UTF-8'
s2 = "\x20".force_encoding 'ASCII-8BIT'
puts "s1 == s2: #{s1 == s2}"

s3 = "\xAB".force_encoding 'UTF-8'
s4 = "\xAB".force_encoding 'ASCII-8BIT'
puts "s3 == s4: #{s3 == s4}"

In Ruby 2.0.0p353 it prints:

s1 == s2: true
s3 == s4: false

I don't understand why s3 and s4 are not equal when s1 and s2 are. 0xAB is the ASCII code for '½', which as far as I know is representable in both ASCII-8BIT and UTF-8.

  • \0xAB is also not ½ as a UTF-8 character code. I found this: "\xAB".force_encoding('CP850').encode('UTF-8') - gives ½ . . . en.wikipedia.org/wiki/Code_page_850 - probably a few other MSDOS-based extensions have this mapping too. Commented Feb 8, 2014 at 16:31
  • I don't know where you got your info about that being the ASCII code for 1/2. It is actually the Left-pointing double angle quotation mark, left pointing guillemet. Did you mean \xBD? Commented Feb 8, 2014 at 16:45
  • 0xAB is not ASCII, and [0xAB] is not a valid UTF-8 string. Commented Apr 23, 2014 at 18:31

1 Answer

\xAB in ASCII-8BIT isn't the same as \xAB in UTF-8, because UTF-8 is a multi-byte encoding: bytes in the range \x80 to \xFF only appear as parts of multi-byte sequences that encode code points at or above U+0080. A lone \xAB is therefore not a valid UTF-8 string.

ASCII-8BIT, on the other hand, isn't a specific encoding. It can be treated as a class of encodings based on ASCII, and it is aliased to the BINARY encoding in Ruby. Its bytes from \x80 to \xFF can't be converted to any other encoding either. So it is like an abstraction for ASCII-based code pages.
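To make the difference between the two pairs concrete, here is a small sketch that inspects the four strings from the question. It assumes, as far as I understand String#==, that two strings with different encodings only compare equal when their encodings are compatible for that content, which in practice means the content is 7-bit ASCII. The `.dup` calls are only there so the snippet also runs when string literals are frozen.

```ruby
# The four strings from the question (dup'ed so force_encoding never
# touches a frozen literal).
s1 = "\x20".dup.force_encoding('UTF-8')
s2 = "\x20".dup.force_encoding('ASCII-8BIT')
s3 = "\xAB".dup.force_encoding('UTF-8')
s4 = "\xAB".dup.force_encoding('ASCII-8BIT')

# BINARY is just another name for ASCII-8BIT.
puts Encoding::BINARY == Encoding::ASCII_8BIT  # => true

# \x20 is 7-bit ASCII, \xAB is not.
puts s1.ascii_only?      # => true
puts s3.ascii_only?      # => false

# A lone \xAB byte is not a valid UTF-8 sequence, but any byte is
# valid in ASCII-8BIT.
puts s3.valid_encoding?  # => false
puts s4.valid_encoding?  # => true

# Encoding.compatible? returns an Encoding when the two strings can be
# compared or concatenated, and nil otherwise.
puts Encoding.compatible?(s1, s2).inspect
puts Encoding.compatible?(s3, s4).inspect  # => nil

# The raw bytes are identical; only the encoding tags differ.
puts s3.bytes == s4.bytes  # => true
puts s3.b == s4            # => true (both sides tagged ASCII-8BIT)
```

So s1 and s2 compare equal because both are ASCII-only, while s3 and s4 hold the same byte but have no compatible encoding to compare it in.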

So, if you try to convert a string containing such a byte from ASCII-8BIT to UTF-8, you get a conversion exception:

Encoding::UndefinedConversionError: "\xC9" from ASCII-8BIT to UTF-8
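A minimal way to reproduce this kind of error, as a sketch: `String#b` returns a copy of the string tagged ASCII-8BIT, and transcoding a high byte from there to UTF-8 should raise. The variable names are only illustrative.

```ruby
# A single high byte, tagged ASCII-8BIT (BINARY).
binary = "\xC9".b

raised = false
begin
  # There is no defined mapping from a high ASCII-8BIT byte to UTF-8.
  binary.encode('UTF-8')
rescue Encoding::UndefinedConversionError => e
  raised = true
  # error_char holds the byte that could not be converted.
  puts "could not convert byte(s): #{e.error_char.bytes.inspect}"
end

puts raised  # => true

# ASCII-only content converts without problems.
puts "abc".b.encode('UTF-8') == "abc"  # => true
```

Note that `encode` transcodes the bytes, whereas `force_encoding` only changes the encoding tag and leaves the bytes untouched.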

However, you can handle the ½ symbol properly in an 8-bit encoding by explicitly setting the ISO-8859-1 or CP1252 code page and using the char \xBD, as follows:

"\xBD".force_encoding('ISO-8859-1').encode('UTF-8')
# => "½"
"\xBD".force_encoding('CP1252').encode('UTF-8')
# => "½"
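As a follow-up sketch, this checks what the transcoding step actually produces: the single byte \xBD becomes the two-byte UTF-8 sequence for U+00BD, and the result then compares equal to an ordinary UTF-8 string. The `"\u00BD"` escape is used instead of a literal ½ so the snippet does not depend on the source file's encoding.

```ruby
# Tag the byte as ISO-8859-1, then transcode it to UTF-8.
latin1_half = "\xBD".dup.force_encoding('ISO-8859-1')
utf8_half   = latin1_half.encode('UTF-8')

# One byte in ISO-8859-1, two bytes in UTF-8.
puts latin1_half.bytes.inspect  # => [189]
puts utf8_half.bytes.inspect    # => [194, 189]

# After transcoding, comparison with a UTF-8 string works as expected.
puts utf8_half == "\u00BD"      # => true
puts utf8_half.valid_encoding?  # => true

# The byte from the question, \xAB, maps to U+00AB in ISO-8859-1, the
# left-pointing guillemet mentioned in the comments.
guillemet = "\xAB".dup.force_encoding('ISO-8859-1').encode('UTF-8')
puts guillemet == "\u00AB"      # => true
```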