console.log("HİNDİ".toLocaleLowerCase() == "hindi");
console.log("HİNDİ" == "hindi");

console.log("HİNDİ".toLowerCase());


console.log("HİNDİ".toLocaleLowerCase());
console.log("HİNDİ".toLowerCase());

I am building a search feature, but I came across something odd:

"HİNDİ".toLocaleLowerCase() // "hindi"

"hindi" == "HİNDİ".toLocaleLowerCase() //false

What the heck is going on here?

Solution: @pmrotule's answer seems to work:

function to_lower(s)
{
    var n = "";
    for (var i = 0; i < s.length; i++) // do it for one character at a time
    {
        var c = s[i].toLowerCase();

        // call replace() only if the character has a length > 1
        // after toLowerCase()
        n += c.length > 1 ? c[0].replace(/[^ -~]/g,'') : c;
    }
    return n;
}

Thanks,

  • Different encoding on the strings? JavaScript uses UTF-16 internally. You could run "HİNDİ".toLocaleLowerCase() in your console. Also, you can try typing the character code directly, for example "\u90AB". Commented Jun 13, 2016 at 13:36
  • Both are UTF-8. You can try it in the console here on Stack Overflow too. Same result. Commented Jun 13, 2016 at 13:38
  • The current code in the updated question gives me true, false, hindi, hindi, hindi. No problem for me. Possibly locale dependent. Commented Jun 13, 2016 at 13:50
  • encodeURIComponent("HİNDİ".toLocaleLowerCase()) Commented Jun 13, 2016 at 14:01
  • Seems to be browser dependent as well. For me the problem occurs in IE, not in FF. Commented Jun 13, 2016 at 14:07
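
To illustrate the locale dependence suspected in the comments above, here is a minimal sketch that passes an explicit locale tag instead of relying on the environment's default. It assumes the engine ships Turkish ("tr") locale data, which is not guaranteed everywhere; the helper name lowerWithLocale is hypothetical.

```javascript
// Hypothetical helper: lowercase with an explicit BCP 47 locale tag
// rather than the host's default locale.
function lowerWithLocale(s, locale) {
  return s.toLocaleLowerCase(locale);
}

// "HİNDİ", written with escapes so the capital I with dot (U+0130) is unambiguous.
const word = "H\u0130ND\u0130";

// Locale-independent mapping: each U+0130 becomes "i" followed by U+0307.
const plain = word.toLowerCase();
console.log(plain.length); // 7

// With Turkish tailoring, U+0130 is expected to map to a plain "i".
// Assumption: the runtime has "tr" locale data available.
const turkish = lowerWithLocale(word, "tr");
console.log(turkish.length, turkish === "hindi");
```

Note that Turkish rules also lowercase a plain "I" to the dotless "ı", so an explicit "tr" locale is probably only suitable for text known to be Turkish.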

2 Answers

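It is a problem of string format: lowercasing "İ" (U+0130) yields two code units, "i" followed by the combining dot above (U+0307), so the result only looks like "hindi". toLocaleLowerCase is meant for human-readable display only. However, there is still a trick you can use: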

if ("hindi" == "HİNDİ".toLowerCase().replace(/[^ -~]/g,''))
{
    alert("It works!");
}

EDIT

If you want to make it work with all special characters:

function to_lower(s)
{
    var n = "";
    for (var i = 0; i < s.length; i++) // do it for one character at a time
    {
        var c = s[i].toLowerCase();
        
        // call replace() only if the character has a length > 1
        // after toLowerCase()
        n += c.length > 1 ? c.replace(/[^ -~]/g,'') : c;
    }
    return n;
}

console.log("gök" == to_lower("GÖK"));
console.log("hindi" == to_lower("HİNDİ"));

function to_low(s) // shorter version
{
    var n = "";
    for (var i = 0; i < s.length; i++)
    { n += s[i].toLowerCase()[0]; }

    return n;
}

console.log("hindi" == to_low("HİNDİ"));
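
As a variation on the functions above, the following sketch removes only the combining dot above (U+0307) after lowercasing, rather than every non-ASCII character, so letters such as "ö" survive. The name lowerWithoutCombiningDot is hypothetical, and this is an alternative under stated assumptions, not the answer's own method.

```javascript
// Hypothetical helper: lowercase, then drop only the combining dot above
// (U+0307) that lowercasing U+0130 leaves behind.
function lowerWithoutCombiningDot(s) {
  return s.toLowerCase().replace(/\u0307/g, "");
}

// "HİNDİ" and "GÖK", written with escapes to pin down the exact code points.
console.log(lowerWithoutCombiningDot("H\u0130ND\u0130") === "hindi"); // true
console.log(lowerWithoutCombiningDot("G\u00D6K") === "g\u00F6k");     // true
```

The trade-off is that any U+0307 already present in the input for other reasons is removed as well.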

2 Comments

In this case, ("gök" == "GÖK".toLowerCase().replace(/[^ -~]/g,'')) returns false.
@serdem420 I edited my answer to make it work with all special characters, as in your example.

The problem is that, after lowercasing, your character İ is made up of 2 characters.

You have the i and then the combining 'dot' on top (Unicode code point 775 in decimal, U+0307).

Try this:

"HİNDİ".toLocaleLowerCase().split('').forEach(c => console.log(c.charCodeAt(0)))

Compare it with this:

"hindi".toLocaleLowerCase().split('').forEach(c => console.log(c.charCodeAt(0)))

2 Comments

Thanks for the answer, that makes sense. Is there any way to get "true" in this kind of situation?
Yes, there is, but it isn't a nice solution: you can remove every diacritic mark from your string and then compare.
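
The diacritic-removal idea from the last comment could be sketched as follows. It lowercases, decomposes to NFD, and strips the combining marks in the U+0300 to U+036F block; the name foldForSearch is hypothetical, and the choice of that single block is an assumption that covers common Latin accents but not every script.

```javascript
// Hypothetical helper: fold a string for accent-insensitive comparison.
function foldForSearch(s) {
  return s
    .toLowerCase()                    // U+0130 becomes "i" + U+0307
    .normalize("NFD")                 // split precomposed letters into base + marks
    .replace(/[\u0300-\u036f]/g, ""); // drop the combining diacritical marks
}

// "HİNDİ" and "GÖK"/"gök", written with escapes to pin down the code points.
console.log(foldForSearch("H\u0130ND\u0130") === "hindi");                 // true
console.log(foldForSearch("G\u00D6K") === foldForSearch("g\u00F6k"));     // true
console.log(foldForSearch("G\u00D6K"));                                   // "gok"
```

Because "ö" is folded to "o", both the search term and the searched text need to go through the same function before they are compared.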
