Is it safe to write JavaScript source code (to be executed in the browser) that includes UTF-8 character literals?

For example, I would like to use an ellipsis literal in a string, like this:

var foo = "Oops… Something went wrong";

Do "modern" browsers support this? Is there a published browser support matrix somewhere?

  • That won't cause you any problems as long as your JavaScript files are served up with proper content headers. However if you're unsure you can always use hex escapes. Note that three- and four-byte sequences are somewhat of a pain, but 16-bit characters are pretty safe. Commented Mar 4, 2015 at 15:23
  • @Pointy: Make that an answer please so that I can properly upvote it :-) Commented Mar 4, 2015 at 15:25
  • @Bergi well I was hesitant because although I believe that to be true it's not something about which I feel I've got extensive experience or knowledge, but since JavaScript is explicitly a Unicode language I guess it's safe :) Commented Mar 4, 2015 at 15:28

1 Answer

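JavaScript is by specification a Unicode language, so Unicode characters in string literals should be safe. As an alternative you can use Unicode escapes (\u8E24, or \u2026 for the ellipsis in the question). Make sure your script files are served with a proper Content-Type header that declares the charset (for example charset=utf-8), so the browser decodes the file the same way it was saved.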
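To make the escape alternative concrete, here is a minimal sketch. It assumes the file itself is saved and decoded as UTF-8; `escapeNonAscii` is a hypothetical helper name for illustration, not a built-in or part of any library.

```javascript
// The literal form and the escaped form denote the same string.
var literal = "Oops… Something went wrong";
var escaped = "Oops\u2026 Something went wrong";
console.log(literal === escaped);

// Hypothetical helper: rewrite every non-ASCII UTF-16 code unit as a
// \uXXXX escape, so the resulting source text is plain ASCII.
function escapeNonAscii(source) {
  return source.replace(/[^\x00-\x7F]/g, function (unit) {
    var hex = unit.charCodeAt(0).toString(16).toUpperCase();
    // Left-pad to four hex digits, as \u escapes require.
    return "\\u" + ("0000" + hex).slice(-4);
  });
}

console.log(escapeNonAscii(literal));
```

Running text through a helper like this before pasting it into a script is one way to sidestep encoding questions entirely, at the cost of less readable string literals.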

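Note that characters outside the Basic Multilingual Plane (those that do not fit in a single 16-bit code unit) are problematic: a JavaScript string stores each of them as two code units, a surrogate pair. JavaScript regular expressions are also terrible with characters beyond that first plane. (Well maybe not "terrible", but primitive at best.)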
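A short sketch of what "problematic" looks like in practice. The sample character U+1F600 is chosen here purely for illustration and is written as escapes so the snippet does not depend on file encoding.

```javascript
// U+2026 HORIZONTAL ELLIPSIS fits in one 16-bit code unit.
var ellipsis = "\u2026";

// U+1F600 does not: it is stored as the surrogate pair D83D DE00.
var astral = "\uD83D\uDE00";

console.log(ellipsis.length); // 1
console.log(astral.length);   // 2, although it is a single character

// Without further help, "." in a regular expression matches one code
// unit, so a one-character pattern fails on the surrogate pair.
console.log(/^.$/.test(ellipsis)); // true
console.log(/^.$/.test(astral));   // false

// Indexing can also split the pair and yield a lone surrogate.
console.log(astral.charCodeAt(0).toString(16)); // "d83d"
```

For text such as the ellipsis in the question none of this applies; it only matters once strings may contain characters outside the first plane.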

You can also use Unicode letters, Unicode combining marks, and Unicode connector punctuation characters in identifiers, in case you want to impress your friends. Thus

var wavy﹏line = "wow";

is perfectly good JavaScript (but good luck with your bug report if you find a browser where it doesn't work).
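As a hedged aside, Unicode escapes are accepted inside identifiers as well, so the same variable can be spelled in ASCII-only source. This sketch assumes the wavy character above is U+FE4F (WAVY LOW LINE, a connector punctuation character).

```javascript
// Declared with an escape for U+FE4F inside the identifier...
var wavy\uFE4Fline = "wow";

// ...and read back using the literal character: both spellings
// name the same variable.
console.log(wavy﹏line);
```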

Read all about it in the spec, or use it to fall asleep at night :)

2 Comments

Thanks @Bergi! I forgot that part. (The spec says something about the language assuming that the text is normalized Unicode in all cases, but I think I would not rely on that working out properly without correct headers.)
Thanks for a most informative and amusing answer. I wish I could upvote twice ;)
