187
var s = "overpopulation";
var ar = [];
ar = s.split();
alert(ar);

I want to split a word into an array of characters.

The code above doesn't seem to work: it returns "overpopulation" as a single object.

How do I split it into an array of characters if the original string contains no commas or whitespace?

3
  • related: JavaScript access string chars as array Commented Jun 26, 2011 at 14:57
  • 1
    ar is an array variable but alert() takes string variables. :) Commented Jul 25, 2014 at 18:45
  • You should use console.log(arr) here and not alert(arr) Commented Oct 31 at 13:45

8 Answers

289

You can split on an empty string:

var chars = "overpopulation".split('');

If you just want to access a string in an array-like fashion, you can do that without split:

var s = "overpopulation";
for (var i = 0; i < s.length; i++) {
    console.log(s.charAt(i));
}

You can also access each character with its index using normal array syntax. Note, however, that strings are immutable, which means you can't set the value of a character using this method, and that it isn't supported by IE7 (if that still matters to you).

var s = "overpopulation";

console.log(s[3]); // logs 'r'
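
Since strings are immutable, changing one character means building a new string. A minimal sketch, assuming the string holds only BMP characters so that split('') is safe (the names chars and edited are just for illustration):

```javascript
const s = "overpopulation";

// Copy the characters into a real array, which is mutable.
const chars = s.split("");
chars[3] = "R";

// Join the array back into a new string; the original is untouched.
const edited = chars.join("");

console.log(edited); // "oveRpopulation"
console.log(s);      // "overpopulation"
```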

4 Comments

You can also access the string's characters in an array like fashion like so: mystr = "apples"; mystr[0]; // a
This does NOT work for emojis "😒".length #=> 1 "😒".chars #=> ["😒"]
This does not consider wide characters.
This will not handle strings like this: "⬆️⬆️⬇️⬇️⬅️➡️⬅️➡️🅱️🅰️🏁". However, [..."⬆️⬆️⬇️⬇️⬅️➡️⬅️➡️🅱️🅰️🏁"] and Array.from("⬆️⬆️⬇️⬇️⬅️➡️⬅️➡️🅱️🅰️🏁") also won't handle this string correctly. To correctly handle this string you'll need to use a library like this one: npmjs.com/package/grapheme-splitter
200

Do NOT use .split('')

You'll get weird results with non-BMP (non-Basic-Multilingual-Plane) character sets.

The reason is that methods like .split() and .charCodeAt() work on 16-bit code units, so they only handle characters with a code point below 65536 correctly. Higher code points are represented by a pair of (lower-valued) "surrogate" pseudo-characters, and these methods treat each half as a separate character.

'𝟙𝟚𝟛'.length     // —> 6
'𝟙𝟚𝟛'.split('')  // —> ["�", "�", "�", "�", "�", "�"]

'😎'.length      // —> 2
'😎'.split('')   // —> ["�", "�"]
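
To make the surrogate pair visible, here is a small sketch that prints the code units .split('') produces next to the real code point. The variable names are only illustrative; codePointAt() is ES2015.

```javascript
const face = "\u{1F60E}"; // the 😎 emoji, written as an escape

// Each element produced by split('') is one 16-bit code unit.
const units = face.split("").map((u) => u.charCodeAt(0).toString(16));

// codePointAt() reads the full code point across both units.
const codePoint = face.codePointAt(0).toString(16);

console.log(units);     // ["d83d", "de0e"]
console.log(codePoint); // "1f60e"
```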

Use ES2015 (ES6) features where possible:

Using the spread operator:

let arr = [...str];

Or Array.from

let arr = Array.from(str);

Or split with the new u RegExp flag:

let arr = str.split(/(?!$)/u);

Examples:

[...'𝟙𝟚𝟛']        // —> ["𝟙", "𝟚", "𝟛"]
[...'😎😜🙃']     // —> ["😎", "😜", "🙃"]
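
As a quick check, the three ES2015 approaches above should agree on a string that mixes ASCII and non-BMP characters. This is only a sketch with made-up variable names; it also shows that Array.from accepts an optional mapping function.

```javascript
const str = "a\u{1F60E}b"; // "a😎b"

const viaSpread = [...str];
const viaFrom = Array.from(str);
const viaRegex = str.split(/(?!$)/u);

console.log(viaSpread); // ["a", "😎", "b"]
console.log(viaFrom);   // ["a", "😎", "b"]
console.log(viaRegex);  // ["a", "😎", "b"]

// Array.from can map each character while splitting.
const codePoints = Array.from(str, (ch) => ch.codePointAt(0));
console.log(codePoints); // [97, 128526, 98]
```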

For ES5, options are limited:

I came up with this function that internally uses MDN example to get the correct code point of each character.

function stringToArray(str) {
  var i = 0,
    arr = [],
    codePoint;
  while (!isNaN(codePoint = knownCharCodeAt(str, i))) {
    arr.push(String.fromCodePoint(codePoint));
    i++;
  }
  return arr;
}

This requires the knownCharCodeAt() function from the MDN example and, for some browsers, a String.fromCodePoint() polyfill.

if (!String.fromCodePoint) {
// ES6 Unicode Shims 0.1 , © 2012 Steven Levithan , MIT License
    String.fromCodePoint = function fromCodePoint () {
        var chars = [], point, offset, units, i;
        for (i = 0; i < arguments.length; ++i) {
            point = arguments[i];
            offset = point - 0x10000;
            units = point > 0xFFFF ? [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)] : [point];
            chars.push(String.fromCharCode.apply(null, units));
        }
        return chars.join("");
    }
}

Examples:

stringToArray('𝟙𝟚𝟛')     // —> ["𝟙", "𝟚", "𝟛"]
stringToArray('😎😜🙃')  // —> ["😎", "😜", "🙃"]
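
If you would rather not pull in knownCharCodeAt(), a different ES5-only technique is to detect surrogate pairs directly with charCodeAt(). This is a sketch, not the function above: the name splitCodePoints is hypothetical, and like the other code-point approaches it does not group combining marks or joined emoji.

```javascript
// Hypothetical helper: split a string into code points using ES5 only.
function splitCodePoints(str) {
  var out = [];
  for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);
    var next = str.charCodeAt(i + 1); // NaN when i + 1 is out of range
    var isHigh = unit >= 0xD800 && unit <= 0xDBFF;
    var isLow = next >= 0xDC00 && next <= 0xDFFF;
    if (isHigh && isLow) {
      // Keep both halves of the surrogate pair together.
      out.push(str.charAt(i) + str.charAt(i + 1));
      i++;
    } else {
      out.push(str.charAt(i));
    }
  }
  return out;
}

console.log(splitCodePoints("a\uD83D\uDE0Eb")); // ["a", "😎", "b"]
```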

Note: str[index] (ES5) and str.charAt(index) will also return weird results with non-BMP charsets. e.g. '😎'.charAt(0) returns "�".

UPDATE: Read this nice article about JS and unicode.

5 Comments

thanks for teaching me how to make my regexps emoji-friendly, I never knew I needed that 'till now
This should be the accepted answer.
What if you want to use ES6 but split words at a ' ', and not every char?
@AlfaBravo same.
Note that this solution splits some emoji such as 🏳️‍🌈, and splits combining diacritics mark from characters
32

Problem

.split('') splits emojis in half.

Even Onur's solutions only work for some emojis; they can't handle more complex scripts or combined emojis.

Consider this emoji being ruined:

[..."🏳️‍🌈"] // returns ["🏳", "️", "‍", "🌈"]  instead of ["🏳️‍🌈"]

Also consider this Hindi text अनुच्छेद which is split like this:

[..."अनुच्छेद"]  // returns   ["अ", "न", "ु", "च", "्", "छ", "े", "द"]

but should in fact be split like this:

["अ","नु","च्","छे","द"]

This happens because some of the characters are combining marks (think diacritics/accents in European languages).

Solution

You can use the grapheme-splitter library for this:

It does proper standards-based letter split in all the hundreds of exotic edge-cases - yes, there are that many.

Install:
$ npm install --save grapheme-splitter

Usage:

const GraphemeSplitter = require("grapheme-splitter");
const splitter = new GraphemeSplitter();

// plain latin alphabet - nothing spectacular
splitter.splitGraphemes("abcd"); // returns ["a", "b", "c", "d"]

// two-char emojis and six-char combined emoji
splitter.splitGraphemes("🌷🎁💩😜👍🏳️‍🌈"); // returns ["🌷","🎁","💩","😜","👍","🏳️‍🌈"]

// diacritics as combining marks, 10 JavaScript chars
splitter.splitGraphemes("Ĺo͂řȩm̅"); // returns ["Ĺ","o͂","ř","ȩ","m̅"]

// individual Korean characters (Jamo), 4 JavaScript chars
splitter.splitGraphemes("뎌쉐"); // returns ["뎌","쉐"]

// Hindi text with combining marks, 8 JavaScript chars
splitter.splitGraphemes("अनुच्छेद"); // returns ["अ","नु","च्","छे","द"]

// demonic multiple combining marks, 75 JavaScript chars
splitter.splitGraphemes("Z͑ͫ̓ͪ̂ͫ̽͏̴̙̤̞͉͚̯̞̠͍A̴̵̜̰͔ͫ͗͢L̠ͨͧͩ͘G̴̻͈͍͔̹̑͗̎̅͛́Ǫ̵̹̻̝̳͂̌̌͘!͖̬̰̙̗̿̋ͥͥ̂ͣ̐́́͜͞"); // returns ["Z͑ͫ̓ͪ̂ͫ̽͏̴̙̤̞͉͚̯̞̠͍","A̴̵̜̰͔ͫ͗͢","L̠ͨͧͩ͘","G̴̻͈͍͔̹̑͗̎̅͛́","Ǫ̵̹̻̝̳͂̌̌͘","!͖̬̰̙̗̿̋ͥͥ̂ͣ̐́́͜͞"]

2 Comments

["अ", "न", "ु", "च", "्", "छ", "े", "द"] is that right in the sense of the Hindi language, it's the right split
This is great. But the standard way to do this now is with Intl.Segmenter: developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/…
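
Following up on the Intl.Segmenter suggestion, here is a minimal sketch of grapheme splitting without a library. It assumes a runtime that ships Intl.Segmenter (recent browsers and Node.js); the helper name toGraphemes is hypothetical, and the exact clusters for some scripts can vary with the Unicode data the runtime bundles.

```javascript
// Hypothetical helper: split text into user-perceived characters.
function toGraphemes(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  // segment() returns an iterable of objects with a `segment` string.
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

// Rainbow flag: white flag + variation selector + ZWJ + rainbow.
const flag = "\u{1F3F3}\uFE0F\u200D\u{1F308}";

console.log([...flag].length);         // 4 code points
console.log(toGraphemes(flag).length); // 1 grapheme
console.log(toGraphemes("e\u0301x"));  // ["é", "x"]
```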
19

It's as simple as:

s.split("");

The delimiter is an empty string, hence it will break up between each single character.

1 Comment

Does not consider wide characters.
9

The split() method in JavaScript accepts two parameters: a separator and a limit. The separator specifies where to split the string. If you don't specify a separator, you get back an array containing the entire string as its single element, which is what the code in the question does. But if you specify the empty string as the separator, the string is split between each character.

Therefore:

s.split('')

will have the effect you seek.
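
For illustration, a short sketch of the three cases described above (no separator, empty separator, and empty separator with a limit); the variable names are made up:

```javascript
const s = "overpopulation";

// No separator: a one-element array holding the whole string.
const whole = s.split();

// Empty-string separator: one element per character.
const chars = s.split("");

// The second argument caps the number of elements returned.
const firstFour = s.split("", 4);

console.log(whole);        // ["overpopulation"]
console.log(chars.length); // 14
console.log(firstFour);    // ["o", "v", "e", "r"]
```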

More information here

Comments

6

A string in JavaScript can already be read like a character array (it is array-like, though not an actual Array).

You can simply access any character in the array as you would any other array.

var s = "overpopulation";
alert(s[0]) // alerts o.

UPDATE

As is pointed out in the comments below, the above method for accessing a character in a string is part of ECMAScript 5 which certain browsers may not conform to.

An alternative method you can use is charAt(index).

var s = "overpopulation";
alert(s.charAt(0)) // alerts o.
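
One practical difference between the two access styles shows up when the index is out of range. A small sketch (variable names are illustrative):

```javascript
const s = "overpopulation";

// Bracket access past the end yields undefined.
const byIndex = s[99];

// charAt() past the end yields an empty string instead.
const byCharAt = s.charAt(99);

console.log(byIndex);  // undefined
console.log(byCharAt); // ""
```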

2 Comments

This does not work in all browsers though (not in some versions of IE: developer.mozilla.org/en/JavaScript/Reference/Global_Objects/….
Thanks Felix. I've updated my answer to include charAt as defined pre ECMAScript 5.
5

To support emojis use this

('Dragon 🐉').split(/(?!$)/u);

=> ['D', 'r', 'a', 'g', 'o', 'n', ' ', '🐉']

1 Comment

It breaks with 'Flag 🏳️‍🌈'.split(/(?!$)/u) => ['F', 'l', 'a', 'g', ' ', '🏳', '️', '‍', '🌈']
4

You can use the regular expression /(?!$)/:

"overpopulation".split(/(?!$)/)

The negative look-ahead assertion (?!$) will match right in front of every character.
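
Note that without the u flag this regular expression still works on code units, so it has the same limitation as .split(''). A sketch comparing the two forms on a single emoji:

```javascript
const face = "\u{1F60E}"; // 😎

// Without the u flag the split lands between the two surrogate halves.
const plain = face.split(/(?!$)/);

// With the u flag the regex advances by whole code points.
const unicodeAware = face.split(/(?!$)/u);

console.log(plain.length);  // 2
console.log(unicodeAware);  // ["😎"]
console.log("overpopulation".split(/(?!$)/).length); // 14
```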

Comments
