3

I am trying to convert a numbered list into an array of items. This is what I have so far:

let input = `1. This is a text\n    where each item can span over multiple lines\n  1.1 this is another item\n 1.2another item\n  2. that I want to\n    extract each seperate\n    item from\n    3. How can I do that?`;

let regex = /(\d+\.\d+|\d+)\s(.*)/g;
let matches = input.match(regex);
console.log(matches);

This only produces the following output:

"1.1 this is another item"

What I would like is something like this:

"1. This is a text"
"1.1 this is another item"
"1.2another item"
...and so on

Why is it matching only one item out of this string? What am I doing wrong and how can I fix it?

2 Answers

3

Your regex does not allow for a dot after a number when no second number follows it (as in "1." or "2."). It also requires whitespace after the number, but "1.2another item" has none, so make that whitespace optional.

Use the s (dotAll) flag so that . also matches newline characters, which lets one item span several lines.

Because a new item does not necessarily start on a new line, you'll need a look-ahead to determine where a match must end: just before the next number followed by a dot, or at the end of the input.

Suggested correction:

let input = `1. This is a text\n    where each item can span over multiple lines\n  1.1 this is another item\n 1.2another item\n  2. that I want to\n    extract each seperate\n    item from\n    3. How can I do that?`;

let regex = /(\d+\.\d*)\s?(.*?)(?=\d+\.|$)/gs;
let matches = input.match(regex);
console.log(matches);
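The matches returned by this pattern still carry the line breaks and indentation that precede the next item. As a minimal sketch (the names items, number and text are illustrative, not part of the answer), matchAll can be used to read the two capture groups separately and tidy the whitespace:

```javascript
const input = `1. This is a text\n    where each item can span over multiple lines\n  1.1 this is another item\n 1.2another item\n  2. that I want to\n    extract each seperate\n    item from\n    3. How can I do that?`;

const regex = /(\d+\.\d*)\s?(.*?)(?=\d+\.|$)/gs;

// matchAll yields one match array per item; index 1 and 2 are the capture groups.
const items = Array.from(input.matchAll(regex), ([, number, text]) => ({
  number,
  // Collapse the line breaks and indentation inside an item into single spaces.
  text: text.replace(/\s+/g, " ").trim(),
}));

console.log(items);
// First entry: { number: "1.", text: "This is a text where each item can span over multiple lines" }
```

Whether collapsing the inner line breaks is desirable depends on how the items are used afterwards; drop the replace call to keep them.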

2 Comments

Thank you! This works on this example, but when I try it on another string such as the one here, it returns a null or just the entire string.
OK, I see. Updated my answer.
0

Another option using a negated character class:

\b\d+\.\D*(?:\d(?!\.)[^.]*)*

Explanation

  • \b\d+\. A word boundary, then match 1+ digits and a dot
  • \D* Match zero or more non-digit characters
  • (?:\d(?!\.)[^.]*)* Optionally repeat: match a digit that is not directly followed by a dot, then zero or more characters other than a dot

let input = `1. This is a text\n    where each item can span over multiple lines\n  1.1 this is another item\n 1.2another item\n  2. that I want to\n    extract each seperate\n    item from\n    3. How can I do that?`;

let regex = /\b\d+\.\D*(?:\d(?!\.)[^.]*)*/g;
let matches = input.match(regex);
console.log(matches);
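One thing to watch for: [^.]* also matches digits, so tracing the pattern against this sample input suggests that the "1.1" match runs on into the leading 1 of "1.2", and "1.2another item" is then not reported as its own item. The sketch below compares the pattern with a hypothetical variant (my assumption, not the answer's pattern) that uses \D* in the repeated group and widens the look-ahead to (?!\d*\.):

```javascript
const input = `1. This is a text\n    where each item can span over multiple lines\n  1.1 this is another item\n 1.2another item\n  2. that I want to\n    extract each seperate\n    item from\n    3. How can I do that?`;

// Pattern as posted above: [^.]* can swallow the digits that start the next item.
const original = /\b\d+\.\D*(?:\d(?!\.)[^.]*)*/g;

// Hypothetical variant: only consume a digit when no dot follows the digit run,
// and only consume non-digits after it.
const variant = /\b\d+\.\D*(?:\d(?!\d*\.)\D*)*/g;

const originalItems = input.match(original).map((s) => s.trim());
const variantItems = input.match(variant).map((s) => s.trim());

console.log(originalItems.length); // 4
console.log(variantItems.length);  // 5
console.log(variantItems[2]);      // "1.2another item"
```

Both versions still start a new match at any digit run followed by a dot, so a decimal such as 1.5 inside the item text would be split off as if it were an item number.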

If an item may only start at the beginning of a line (after optional indentation), you can anchor the match to the start of the line and then keep matching the following lines as long as they do not themselves start with digits and a dot:

^[^\S\n]*\d+\..*(?:\n(?![^\S\n]*\d+\.).*)*

let input = "1. This is a text with a number 1.2 and 3.\n    where each item can span over multiple lines\n  1.1 this is another item\n 1.2another item\n  2. that I want to\n    extract each seperate\n    item from\n    3. How can I do that?";

let regex = /^[^\S\n]*\d+\..*(?:\n(?![^\S\n]*\d+\.).*)*/gm;
let matches = input.match(regex).map(s => s.trim());
console.log(matches);
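Since match returns null when nothing matches, calling .map directly on its result throws for input without any numbered line. A minimal sketch of a wrapper that guards against that and joins the continuation lines of each item (splitNumberedList is a hypothetical helper name, not something from the answer):

```javascript
// Hypothetical helper built around the line-anchored pattern above.
function splitNumberedList(text) {
  // An item starts on a line whose first non-blank token is digits plus a dot;
  // the lines after it belong to the item until the next such line.
  const itemRegex = /^[^\S\n]*\d+\..*(?:\n(?![^\S\n]*\d+\.).*)*/gm;
  // match() returns null when there is no match, so fall back to an empty array.
  const matches = text.match(itemRegex) ?? [];
  // Join continuation lines with a single space and drop the outer indentation.
  return matches.map((item) => item.replace(/\s*\n\s*/g, " ").trim());
}

const sample = "1. Price is 1.5 units.\n   second line\n 1.1 nested\n2. last";
console.log(splitNumberedList(sample));
// → ["1. Price is 1.5 units. second line", "1.1 nested", "2. last"]
console.log(splitNumberedList("no list here"));
// → []
```

Because the number is only recognised at the start of a line, the 1.5 inside the first item does not start a new item here.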
