6

There are already a couple of similar questions:

My situation is a bit different.

I need to count the number of sentences in a string.

The closest answer to what I need would be:

str.replace(/([.?!])\s*(?=[A-Z])/g, "$1|").split("|")

The only problem here is that this RegEx assumes a sentence starts with a capital letter, which may not always be the case.

To be more specific, I would define a sentence as:

  • Starting with a letter (capital or not), a number or even a symbol (such as $ or €).
  • Ending with a punctuation sign, such as a " . ", a " ? " or a " ! ".

However, if a sentence contains a number, which itself contains a " . " or a " , ", then the sentence should be considered as one sentence and not two.

Last but not least, we can assume that, except the first sentence, a sentence is preceded by a space.

Given a random string, how can I count the number of sentences it contains with Javascript (or CoffeeScript for that matter)?

3 Answers

5

One regex to solve your problem is:

\w[.?!](\s|$)

The parts are as follows:

\w - Word character
[.?!] - Punctuation as specified.
(\s|$) - Whitespace character OR the end of the string.
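To use the pattern from JavaScript, something along these lines should work. This is only a minimal sketch: `countSentences` and `sample` are names made up for illustration, and the `|| []` fallback is there because `match()` returns `null` when nothing matches.

```javascript
// Count sentence endings: a word character, then ".", "?" or "!",
// then whitespace or the end of the string.
function countSentences(str) {
  // match() returns null when there is no match, so fall back to an empty array.
  const matches = str.match(/\w[.?!](\s|$)/g) || [];
  return matches.length;
}

// Hypothetical sample input, not taken from the question.
const sample = "it costs $4.50 today. Is that ok? yes!";
console.log(countSentences(sample)); // 3
```

The "." inside "4.50" is not counted because it is followed by a digit rather than by whitespace or the end of the string.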

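You might be tempted to use a character class instead of a group for the final element:

[\s|$]

However, that does not work (it fails on https://regex101.com/): inside a character class, | and $ are literal characters, so the class never matches the end of the string.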
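The difference between the two forms can be checked directly in JavaScript. This is a minimal sketch; the variable names and the sample strings are made up for illustration.

```javascript
const group = /\w[.?!](\s|$)/;     // alternation: whitespace OR end of string
const charClass = /\w[.?!][\s|$]/; // class: whitespace, a literal "|" or a literal "$"

const endsAtString = "Done.";
const endsWithPipe = "Done.|";

console.log(group.test(endsAtString));     // true
console.log(charClass.test(endsAtString)); // false
console.log(charClass.test(endsWithPipe)); // true
```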

Tested on the following:

Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the Renaissance. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet..", comes from a line in section 1.10.32.

It finds six sentences (each match covers only the end of a sentence, not the whole sentence). Note that the extra capturing group might pose a problem if you're depending on the match groups for any reason.


2 Comments

How to add this in JS?
Your last 2 sentences are counted as 1 with the example regex, because it doesn't match a sentence ending in ").". A corrected version is (str.match(/[\w)][.?!](\s|$)/g) || []).length

1

I figured out a much simpler solution.

const padded = text + " "; // trailing space so a final "." is counted too
const count = padded.split(". ").length - 1;
console.log(count);
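To see where this approach stops matching the question's definition, here is a quick sketch that wraps the same idea in a function. The name `countByDotSpace` and the sample strings are hypothetical.

```javascript
// Count occurrences of ". " after padding the text with a trailing space.
function countByDotSpace(text) {
  const padded = text + " "; // so a final "." is followed by a space
  return padded.split(". ").length - 1;
}

console.log(countByDotSpace("One. Two. Three."));   // 3
console.log(countByDotSpace("Really? Yes! Done.")); // 1
```

Sentences ending in "?" or "!" are not counted, since only ". " is used as the separator.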

1 Comment

Hardly covers all cases.

1

This works if each sentence ends with a single punctuation character.

const text = ""; // insert your string here
const re = /[.!?]/;
const parts = text.split(re);
// split() yields one more piece than there are separators
console.log(parts.length - 1);
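A short sketch of how this behaves on a few inputs, including the cases the question worries about. The function name `countByPunctuation` and the sample strings are made up for illustration.

```javascript
// Count every ".", "!" or "?" by splitting on them.
function countByPunctuation(text) {
  return text.split(/[.!?]/).length - 1;
}

console.log(countByPunctuation("Hi there. How are you?")); // 2
console.log(countByPunctuation("Pi is about 3.14."));      // 2, although it is one sentence
console.log(countByPunctuation("Wait..."));                // 3, although it is one sentence
```

Every punctuation character is counted on its own, so decimal numbers and repeated punctuation inflate the result.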

1 Comment

Please specify what exactly your question is.
