8

I'm trying to turn this:

"This is a test this is a test"

into this:

["This is a", "test this is", "a test"]

I tried this:

const re = /\b[\w']+(?:[^\w\n]+[\w']+){0,2}\b/
const wordList = sample.split(re)
console.log(wordList)

But I got this:

[ '',
  ' ',
  ' ']

Why is this?

(The rule is to split the string every N words.)

3
  • What is the rule to follow to split the string? Commented Nov 26, 2016 at 10:10
  • @A.J I updated the question. Commented Nov 26, 2016 at 10:11
  • .split() doesn't include the delimiter, so it does the opposite of what you want. You need to do a regular regex search (with the g modifier) instead of split. Commented Nov 26, 2016 at 10:12

5 Answers

11

The String#split method treats every match of the regular expression as a delimiter, so the matched text is discarded and only the pieces between matches end up in the result array.

Use the String#match method with a global flag (g) on your regular expression instead:

const sample = "This is a test this is a test";

const re = /\b[\w']+(?:\s+[\w']+){0,2}/g;
const wordList = sample.match(re);
console.log(wordList);

Regex explanation here.
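
Since the rule in the question is "every N words", the same pattern can be built for any chunk size. The following is a minimal sketch, not part of the original answer; the helper name chunkWords is hypothetical, and the comments read the pattern piece by piece.

```javascript
// Hypothetical helper: builds the pattern above for an arbitrary chunk size n (n >= 1).
function chunkWords(text, n) {
  // \b[\w']+            the first word of a chunk (word characters and apostrophes)
  // (?:\s+[\w']+){0,k}  up to k = n - 1 further words, each preceded by whitespace
  const re = new RegExp("\\b[\\w']+(?:\\s+[\\w']+){0," + (n - 1) + "}", "g");
  // match() returns null when nothing matches, so fall back to an empty array.
  return text.match(re) || [];
}

const text = "This is a test this is a test";
console.log(chunkWords(text, 3)); // [ 'This is a', 'test this is', 'a test' ]
console.log(chunkWords(text, 2));
```

With n = 3 the generated pattern is identical to the literal one used in the answer.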


8

As an alternative approach, you can split the string on spaces and then merge the words back together in batches.

function splitByWordCount(str, count) {
  var arr = str.split(' ')
  var r = [];
  while (arr.length) {
    r.push(arr.splice(0, count).join(' '))
  }
  return r;
}

var a = "This is a test this is a test";
console.log(splitByWordCount(a, 3))
console.log(splitByWordCount(a, 2))

3 Comments

Awesome! +1 for splitting text in chunks.
Can you explain to me what arr.length means in the while loop argument?
@MekelIlyasa Every array has a property called length, which holds the number of items in it. while (arr.length) checks whether length is greater than 0, because in JS 0 is falsy. Also, on the next line I'm removing items with .splice, which updates .length. Hope it's clear now.
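
The loop discussed in these comments ends because splice shrinks the array on every pass. As a rough sketch of the same idea without mutating the word list, an index-based loop with slice also works. The name chunkBySlice is hypothetical, and splitting on /\s+/ after trim() is an added assumption so that repeated or surrounding spaces do not produce empty words.

```javascript
// Hypothetical variant of splitByWordCount; count is assumed to be a positive integer.
function chunkBySlice(str, count) {
  const trimmed = str.trim();
  // "".split(/\s+/) would give [""], so handle the empty string explicitly.
  const words = trimmed === "" ? [] : trimmed.split(/\s+/);
  const chunks = [];
  // Step through the words count at a time; slice copies and leaves words untouched.
  for (let i = 0; i < words.length; i += count) {
    chunks.push(words.slice(i, i + count).join(" "));
  }
  return chunks;
}

console.log(chunkBySlice("This is a test this is a test", 3));
console.log(chunkBySlice("  This   is a test  ", 2));
```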
4

Your regex is fine, just not with split: split treats whatever the pattern matches as a delimiter and discards it. For instance:

var arr = "1, 1, 1, 1";
console.log(arr.split(','));  // ["1", " 1", " 1", " 1"]
// but (the separator 1 is coerced to the string "1")
console.log(arr.split(1));    // ["", ", ", ", ", ", ", ""]

Instead, use match or exec, like this:

var x = "This is a test this is a test";
var re = /\b[\w']+(?:[^\w\n]+[\w']+){0,2}\b/g
var y = x.match(re);
console.log(y);
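
The exec alternative mentioned above is not shown in the answer. A minimal sketch of how it might look with the same regex follows; the variable names chunks and m are illustrative only.

```javascript
var x = "This is a test this is a test";
var re = /\b[\w']+(?:[^\w\n]+[\w']+){0,2}\b/g;
var chunks = [];
var m;
// With the g flag, each exec() call resumes from re.lastIndex
// and returns null once there are no more matches.
while ((m = re.exec(x)) !== null) {
  chunks.push(m[0]); // m[0] is the full matched text
}
console.log(chunks); // [ 'This is a', 'test this is', 'a test' ]
```

Compared with match, the exec loop also exposes m.index for each chunk, which can help if the positions are needed.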


1

Use the whitespace character class (\s) and the match function instead of split:

var wordList = sample.match(/\s?(?:\w+\s?){1,3}/g);

split breaks the string wherever the regex matches; match returns the matched text itself. Note that each match here keeps its trailing whitespace, so trim the results if needed.

Check this fiddle.


1

You could still use split by wrapping the pattern in a capturing group, so the matched text is kept in the result:

var str = 'This is a test this is a test';
var wrd = str.split(/((?:\w+\s+){1,3})/);
console.log(wrd);

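But you have to delete the empty elements from the array. Also, because the pattern requires whitespace after every word, the final "a test" comes back as two elements, "a " and "test".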
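
One possible cleanup is sketched below. It is an adaptation rather than the answerer's code: the pattern is changed so a word may also be followed by the end of the string, and the function name splitEveryThreeWords is hypothetical.

```javascript
// Hypothetical cleanup of the split-based approach.
function splitEveryThreeWords(str) {
  // (?:\s+|$) lets the last word close a chunk even without trailing whitespace.
  // The outer capturing group makes split() keep the matched chunks in the result.
  var parts = str.split(/((?:\w+(?:\s+|$)){1,3})/);
  return parts
    .map(function (part) { return part.trim(); })      // drop trailing whitespace
    .filter(function (part) { return part !== ""; });  // drop the empty separators
}

console.log(splitEveryThreeWords("This is a test this is a test"));
// [ 'This is a', 'test this is', 'a test' ]
```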
