Here is the error:

=> ["Mænd med navnet Kim", "30.094", "29.946", "-148", "Kvinder med navnet Kim",
 "341", "345", "4", "Mænd med navnet Kim Hansen", "1.586", "1.573", "-13", "Kvin
der med navnet Kim Hansen", "5", "5", "0", "Mænd og kvinder med efternavnet Hans
en", "226.040", "223.478", "-2.562"]
irb(main):094:0>
irb(main):095:0* @tester.index("Mænd med navnet Kim")
=> nil
irb(main):096:0> @tester.index("Kvinder med navnet Kim")
=> 4
irb(main):097:0> @tester.index("Mænd med navnet Kim Hansen")
=> nil
irb(main):098:0> @tester.index("Kvinder med navnet Kim Hansen")
=> 12
irb(main):099:0> @tester.index("Mænd og kvinder med efternavnet Hansen")
=> nil
irb(main):100:0>

Here is my attempt with the gsub method:

<ap(&:text).map{|d| d.delete "'"}.map{|d| d.gsub("æ", "#844"}
irb(main):113:1> )
SyntaxError: (irb):112: syntax error, unexpected '}', expecting ')'
  • What Ruby version? I can't replicate this on 1.9.3-p194. Commented Jun 9, 2012 at 4:18
  • Among other things, as entered it appears your strings "Kvin der med navnet Kim Hansen" & "Mænd og kvinder med efternavnet Hans en" have newlines in them, and your fourth index call has a ' in it that's not in the string in the array. Also, I assume the array shown is actually in @tester? You don't show the assignment call itself, only the REPL's output. Commented Jun 9, 2012 at 4:23
  • Works correctly on 1.9.2 also..hmm.. Commented Jun 9, 2012 at 4:23
  • Seems to have something to do with encoding; the æ is giving trouble... Commented Jun 9, 2012 at 4:28
  • I think we need to see the @tester assignment. The encoding of the strings in the array might be different from that of the strings you type or paste into irb (just a wild guess). Commented Jun 9, 2012 at 4:30

1 Answer

Since your input strings seem to be UTF-8, the easiest solution is to run your irb session with the same encoding:

irb -EUTF-8

That should make string entry in the irb command prompt default to UTF-8.
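
To see why only the strings containing æ fail, here is a minimal sketch of the mismatch. It assumes the prompt was producing ISO-8859-1 strings, which the question does not confirm; the names tester, typed_name and typed_ascii are made up for illustration.

```ruby
# "\u00E6" is the letter æ; the escape guarantees a UTF-8 string literal.
utf8_name = "M\u00E6nd med navnet Kim"
tester = [utf8_name, "30.094", "Kvinder med navnet Kim", "341"]

# Simulate what the prompt hands back under a non-UTF-8 encoding.
# ISO-8859-1 is an assumption for illustration, not taken from the question.
typed_name  = utf8_name.encode("ISO-8859-1")
typed_ascii = "Kvinder med navnet Kim".encode("ISO-8859-1")

p typed_name.encoding          # the encoding of the "typed" string
p tester.index(typed_name)     # nil: the bytes for æ differ between encodings
p tester.index(typed_ascii)    # 2: ASCII-only strings still compare equal

# Transcoding the typed string to UTF-8 makes the lookup succeed.
p tester.index(typed_name.encode("UTF-8"))   # 0
```

This matches the pattern in the question: the "Kvinder ..." lookups contain only ASCII and are found, while every string with æ returns nil.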

Good resource on Ruby 1.9 encodings:
http://blog.grayproductions.net/articles/understanding_m17n

2 Comments

You could also ensure that your $LANG environment variable is set to a UTF-8 locale (e.g. en_US.UTF-8), as I believe IRB will use whatever that is set to.
@AndrewMarshall Yes, agreed. Best to unify encodings across the system. Thx.
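
A quick way to check what a session actually picked up is to print the relevant values. This is only a sketch: it assumes a Unix-like system where Ruby derives its default external encoding from the locale, and the variable names are illustrative.

```ruby
# Report the encodings this Ruby process picked up from its environment.
lang     = ENV["LANG"]                  # may be nil when the variable is unset
external = Encoding.default_external    # normally derived from the locale, or set by -E

puts "LANG             = #{lang.inspect}"
puts "default_external = #{external.name}"
puts "UTF-8 session?     #{external == Encoding::UTF_8}"
```

If the last line prints false, starting irb with -EUTF-8 or exporting a UTF-8 LANG before launching it should bring the prompt in line with the scraped strings.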
