Can anyone explain what's going on with this code?
s1 = "\x20".force_encoding 'UTF-8'
s2 = "\x20".force_encoding 'ASCII-8BIT'
puts "s1 == s2: #{s1 == s2}"
s3 = "\xAB".force_encoding 'UTF-8'
s4 = "\xAB".force_encoding 'ASCII-8BIT'
puts "s3 == s4: #{s3 == s4}"
In Ruby 2.0.0p353 it prints:
s1 == s2: true
s3 == s4: false
I don't understand why s3 and s4 are not equal when s1 and s2 are. 0xAB is the ASCII code for '½', which as far as I know is representable in both ASCII-8BIT and UTF-8.
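One way to probe the difference is to print what Ruby reports about each pair. The snippet below is only a diagnostic sketch, not an authoritative explanation. It assumes that `String#==` treats strings with different encodings as comparable only when `Encoding.compatible?` returns an encoding for them, which in practice seems to require at least one of the two to be pure 7-bit ASCII. The `.dup` calls are my addition so the code also runs where string literals are frozen.

```ruby
# Rebuild the four strings from the question; .dup avoids mutating literals.
s1 = "\x20".dup.force_encoding('UTF-8')
s2 = "\x20".dup.force_encoding('ASCII-8BIT')
s3 = "\xAB".dup.force_encoding('UTF-8')
s4 = "\xAB".dup.force_encoding('ASCII-8BIT')

[[s1, s2], [s3, s4]].each do |a, b|
  puts "bytes equal: #{a.bytes == b.bytes}"                        # same raw bytes?
  puts "ascii_only?: #{a.ascii_only?} / #{b.ascii_only?}"          # 7-bit only?
  puts "valid?:      #{a.valid_encoding?} / #{b.valid_encoding?}"  # well-formed in its encoding?
  puts "compatible:  #{Encoding.compatible?(a, b).inspect}"        # nil means not comparable
  puts "==:          #{a == b}"
  puts
end
```

For the 0x20 pair both strings are ASCII-only, so the encodings are compatible and the byte comparison decides the result. For the 0xAB pair the bytes are identical, but neither string is ASCII-only, `Encoding.compatible?` returns nil, and `==` is false. Note also that a lone 0xAB byte is not valid UTF-8 at all, so `s3.valid_encoding?` is false.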
\0xAB is also not ½ as a UTF-8 character code. I found this: "\xAB".force_encoding('CP850').encode('UTF-8') - gives ½ . . . en.wikipedia.org/wiki/Code_page_850 - probably a few other MSDOS-based extensions have this mapping too.
\xBD?
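The two byte values mentioned in these comments can be checked directly. This is a small sketch, assuming the Ruby build includes the CP850 and ISO-8859-1 transcoders (standard builds do, as far as I know). The variable names are just for illustration.

```ruby
# 0xAB is the half sign in code page 850; 0xBD is the half sign in ISO-8859-1 (Latin-1).
cp850  = "\xAB".dup.force_encoding('CP850').encode('UTF-8')
latin1 = "\xBD".dup.force_encoding('ISO-8859-1').encode('UTF-8')

puts cp850                                                # ½
puts latin1                                               # ½
puts cp850 == latin1                                      # true
puts cp850.bytes.map { |b| format('%02X', b) }.join(' ')  # C2 BD

# A single 0xAB byte tagged as UTF-8 is not a valid character at all.
puts "\xAB".dup.force_encoding('UTF-8').valid_encoding?   # false
```

So in UTF-8 the character ½ (U+00BD) is the two-byte sequence C2 BD, and neither 0xAB nor 0xBD on its own encodes it.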