Something worked for me today, but I'm not sure I understand it well enough to be certain that it will keep working in future versions of JavaScript.
I wanted something like string.split() on whitespace, but that would also return the delimiter strings. In other words:
f("abc def ghi")
=> ["abc", " ", "def", " ", "ghi"]
My first attempt was a dozen lines of ugly regexp searches and loops.
Then I had a crazy idea that I figured had low odds of working, but was worth a quick test: do a .split that would match on either delimiter and non-delimiter ranges. To my joy and surprise, this basically worked:
"abc def ghi".split(/([^\s]+|[\s]+)/)
=> ["", "abc", "", " ", "", "def", "", " ", "", "ghi", ""]
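A minimal sketch of where those empty strings seem to come from, assuming the usual split rule that text captured by a group in the separator is spliced into the result. Here the separator matches every run in the input, so nothing is left between matches, and only the captures carry content. The variable names are just for illustration.

```javascript
const input = "abc def ghi";

// Same pattern without the capture group: every character belongs to some
// separator match, so only the empty pieces between matches are returned.
const withoutCapture = input.split(/[^\s]+|[\s]+/);

// With the capture group, each matched run is inserted between those
// empty pieces.
const withCapture = input.split(/([^\s]+|[\s]+)/);

console.log(withoutCapture.length); // 6 (five matches leave six empty pieces)
console.log(withCapture.length);    // 11 (six empty pieces plus five captures)
```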
With one more small tweak, I have exactly what I need:
"abc def ghi".split(/([^\s]+|[\s]+)/).filter(s=>s.length)
=> ["abc", " ", "def", " ", "ghi"]
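For reference, the same idea wrapped in a function, as a sketch only. The name splitKeepingDelimiters is made up for this example; the regexp and the filter are the ones shown above. One easy sanity check is that joining the pieces gives back the original string.

```javascript
// Hypothetical helper name; the regexp and filter are the ones from above.
function splitKeepingDelimiters(text) {
  return text
    .split(/([^\s]+|[\s]+)/)
    .filter(s => s.length > 0); // drop the empty pieces between matches
}

const pieces = splitKeepingDelimiters("abc def ghi");
console.log(pieces);                            // ["abc", " ", "def", " ", "ghi"]
console.log(pieces.join("") === "abc def ghi"); // true

// Leading, trailing, and repeated whitespace are kept as single runs.
console.log(splitKeepingDelimiters("  a  b ")); // ["  ", "a", "  ", "b", " "]
```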
The problem, of course, is that I can imagine JavaScript implementations that would behave differently on this somewhat pathological regexp.
Can I depend on this behavior always working? Why? Where is the spec documented?
For "extra credit" can you give an intuitive argument why this behavior is the most reasonable?
split's behavior pathological, but I also can't explain why it's "the most reasonable" any more than any other JS feature.
You can use "abc def ghi".match(/\s+|\S+/g) if you don't care for the empty strings.
While this seems too broad, I don't think the dupe is accurate, since OP realizes that JS engines exist and they change over time.
str.split(/\b/).
(\s+). The + will match multiple spaces. The () wrapper will make sure the delimiters are also added in the output.
string.match(/([^\s]+|[\s]+)/g)
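A small sketch comparing two of the alternatives suggested above, split on a captured (\s+) and match with the g flag. The function names are made up for illustration. The two agree on the example input but differ at the edges: split leaves an empty string when the input starts or ends with whitespace, and match returns null rather than an empty array when nothing matches.

```javascript
// Hypothetical names for the two suggested approaches.
function viaSplit(text) {
  // Capturing the delimiter makes split include it in the result.
  return text.split(/(\s+)/);
}

function viaMatch(text) {
  // match with the g flag returns null when there are no matches,
  // so fall back to an empty array.
  return text.match(/\s+|\S+/g) || [];
}

console.log(viaSplit("abc def ghi")); // ["abc", " ", "def", " ", "ghi"]
console.log(viaMatch("abc def ghi")); // ["abc", " ", "def", " ", "ghi"]

// Edge case: leading whitespace.
console.log(viaSplit(" abc")); // ["", " ", "abc"]
console.log(viaMatch(" abc")); // [" ", "abc"]
```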