
Given a character (one letter of a string), how can I identify which language it belongs to? The options are English, Russian, and Hebrew.

Background: this character was entered by a user in a form and then stored in a database.

For example, it could be the first letter of one of these words:

  • Hello
  • Привет
  • שלום

1 Answer


The Unicode standard is divided into "blocks". See:

http://www.unicode.org/charts/

http://en.wikipedia.org/wiki/Unicode_block

http://www.unicode.org/versions/Unicode6.0.0/

and find the Unicode blocks (code point ranges) for each language.

My guess:

So for you it's a matter of a simple numeric comparison on each character's Unicode ordinal value (its code point).
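As a rough illustration of that comparison, here is a minimal Ruby sketch. It assumes Ruby 2.0 or later with UTF-8 strings, and it assumes "English" means only the ASCII letters A-Z and a-z. The ranges are the Basic Latin letters (U+0041 to U+005A, U+0061 to U+007A), the Cyrillic block (U+0400 to U+04FF) and the Hebrew block (U+0590 to U+05FF). The names `BLOCKS` and `detect_language` are made up for this example. Note that the Cyrillic block is shared by other languages (Ukrainian, Bulgarian and so on), so `:russian` here really means "Cyrillic".

```ruby
# Code point ranges taken from the Unicode block charts.
# Assumption: "English" means the ASCII letters A-Z and a-z only.
BLOCKS = {
  english: [0x0041..0x005A, 0x0061..0x007A], # Basic Latin letters
  russian: [0x0400..0x04FF],                 # Cyrillic block (not only Russian)
  hebrew:  [0x0590..0x05FF]                  # Hebrew block
}.freeze

# Hypothetical helper: returns :english, :russian, :hebrew,
# or nil when the character falls outside all of the ranges above.
def detect_language(char)
  code = char.ord # String#ord returns the code point of the first character
  match = BLOCKS.find { |_lang, ranges| ranges.any? { |r| r.cover?(code) } }
  match && match.first
end
```

With the words from the question, `detect_language("Hello"[0])` gives `:english`, `detect_language("Привет"[0])` gives `:russian`, and `detect_language("שלום"[0])` gives `:hebrew`. A digit or punctuation mark gives `nil`, and an empty string makes `ord` raise an `ArgumentError`, so you may want to check for that first.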


2 Comments

@Izap: Great! That's what I was planning to do. Which Ruby function returns the Unicode code point of a character?
