2

I am trying to get the first two words of a string in JavaScript using regular expressions.

I am using:

var str = "Reed Hastings, CEO Netflix"; 
var res = str.match(/^\s*(\w+ \w+)/);

Which spits back - Reed Hastings,Reed Hastings

It's kind of working, but can anyone tell me why it is getting duplicated?

2
  • Can your words be separated with punctuation? If they are separated with spaces, use Marin's solution, no regex is required. Commented Apr 27, 2015 at 8:07
  • Sometimes there is a comma after the second word other times there is just a space. Commented Apr 27, 2015 at 8:10

4 Answers

3

...why it is duplicated?

match returns an array where the first entry is the overall match for the entire expression, followed by entries for the contents of each capture group you've defined in the regex. Since you've defined one capture group, your array has two entries. The first entry would include leading whitespace if anything matched the \s* at the beginning; the second wouldn't, because it only has what's in the group.

Here's a simple example:

var rex = /This is a test of (.*)$/;
var str = "This is a test of something really cool";
var match = str.match(rex);
// Entry 0 is the full match; entry 1 is the capture group
match.forEach(function(entry, index) {
  console.log(index + ": '" + entry + "'");
});

Sometimes there is a comma after the second word other times there is just a space

Your expression won't match that; it only allows a space between the words (and only one of them). If you want to allow a comma as well, and perhaps any number of spaces, then:

/^\s*(\w+[,\s]+\w+)/

Or if you only want to allow one comma, possibly with whitespace on either side:

/^\s*(\w+\s*,?\s*\w+)/

You might also consider two capture groups (one for each word):

/^\s*(\w+)\s*,?\s*(\w+)/

Example:

var str = "Reed Hastings, CEO Netflix"; 
var res = str.match(/^\s*(\w+)\s*,?\s*(\w+)/);
if (res) {
  // res[1] and res[2] hold the two capture groups
  console.log("Word 1: '" + res[1] + "'");
  console.log("Word 2: '" + res[2] + "'");
} else {
  console.log("String didn't match");
}
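One caveat with the two-group pattern: every separator in it is optional, so a single word such as "Reed" still matches, split as "Ree" and "d". A minimal sketch of a stricter variation follows, assuming you want at least a comma or one whitespace character between the words. The helper name firstTwoWords and the altered separator are illustrative assumptions, not part of the expressions above.

```javascript
// Hypothetical helper: returns the first two words as an array,
// or null when the string does not start with two separated words.
function firstTwoWords(str) {
  // Separator is either a comma with optional whitespace around it,
  // or at least one whitespace character.
  var res = str.match(/^\s*(\w+)(?:\s*,\s*|\s+)(\w+)/);
  return res ? [res[1], res[2]] : null;
}

console.log(firstTwoWords("Reed Hastings, CEO Netflix")); // ["Reed", "Hastings"]
console.log(firstTwoWords("  Reed , Hastings"));          // ["Reed", "Hastings"]
console.log(firstTwoWords("Reed"));                       // null
```

The (?:...) group is non-capturing, so it does not add an entry to the result array.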

0

A regex solution to grab only words from the beginning of a line (even split by punctuation):

var re = /^([a-z]+)[\s,;:]+([a-z]+)/i; 
var str = 'Reed Hastings, CEO Netflix';
var m;
 
if ((m = re.exec(str)) !== null) {
    document.getElementById("res").innerHTML = m[1] + " " + m[2];
}
<div id="res"></div>

T.J. Crowder explained why you get 2 entries from match. The main point is that the result array always has a 0th entry equal to the full match, so N(stringsInArray) = N(capturing groups) + 1.
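The counting rule can be checked directly. This is a small sketch using variations of the question's expression with zero, one, and two capture groups; the variable names are illustrative only.

```javascript
var str = "Reed Hastings, CEO Netflix";

var noGroups  = str.match(/^\s*\w+ \w+/);      // no capture group
var oneGroup  = str.match(/^\s*(\w+ \w+)/);    // the question's regex
var twoGroups = str.match(/^\s*(\w+) (\w+)/);  // one group per word

console.log(noGroups.length);  // 1
console.log(oneGroup.length);  // 2
console.log(twoGroups.length); // 3

// Converting the two-entry array to a string joins the entries with a
// comma, which is where the "duplicated" output in the question comes from.
console.log(String(oneGroup)); // "Reed Hastings,Reed Hastings"
```

If you only need the matched text, reading entry 0 (or dropping the parentheses) avoids the apparent duplication.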

0
var str = "How are you doing today?";
var wordsArray = str.split(" ");
var result = wordsArray[0] + " " + wordsArray[1];

result will be "How are".
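If the whitespace may include tabs or repeated spaces, or the string may contain fewer than two words, a slightly more defensive version is possible. This is only a sketch: the function name firstTwoWordsBySplit is hypothetical, and it assumes that returning null is an acceptable way to signal "not enough words".

```javascript
// Hypothetical helper: splits on any run of whitespace and returns the
// first two words joined by a single space, or null if there are fewer
// than two words.
function firstTwoWordsBySplit(str) {
  var words = str.trim().split(/\s+/);
  if (words.length < 2 || words[0] === "") {
    return null;
  }
  return words[0] + " " + words[1];
}

console.log(firstTwoWordsBySplit("How are you doing today?")); // "How are"
console.log(firstTwoWordsBySplit("  How \t  are   you"));      // "How are"
console.log(firstTwoWordsBySplit("How"));                      // null
// Punctuation stays attached to the word it follows:
console.log(firstTwoWordsBySplit("Reed Hastings, CEO"));       // "Reed Hastings,"
```

Because splitting only looks at whitespace, any comma after the second word is kept, so strip it separately if that matters for your data.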

3 Comments

Obviously add a bit of error handling in, but I'd go with this over regex.
Depends on the actual data. If the white space can be tabs or if there can be multiple spaces, a regex would be more appropriate.
This doesn't answer the question, however.
0

Remove the ^ in front and make the expression global. ^ means the beginning of the string, so with it the expression only matches Reed Hastings.

str.match(/\s*(\w+ \w+)/g)
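To see what this actually returns, here is a short sketch using the question's string. With the g flag, match returns only the full matches and leaves out the capture groups, which is why the duplicate entry disappears. Without the ^, later word pairs are matched too, and because \s* sits outside the group but inside the full match, the second result keeps its leading space.

```javascript
var str = "Reed Hastings, CEO Netflix";
var all = str.match(/\s*(\w+ \w+)/g);

console.log(all.length); // 2
console.log(all[0]);     // "Reed Hastings"
console.log(all[1]);     // " CEO Netflix"
```

If only the first two words are wanted, take all[0] after checking that all is not null.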
