-4

I have this RegEx in JavaScript:

/^\d{4}-\d{2}-\d{2} ([a-z,]*)((.|\n|\r)*?)/mig

2025-11-13 bb Test.
2025-11-12 bb Mail von Test
testd
trest
Tes
2025-11-12 bb Mail v

Regex101

I need the multiline text of each entry. With ((.|\n|\r)*?) I get nothing; with ((.|\n|\r)*) I get all entries as one match. How can I get the 3 multiline texts?

I expect the result to be an array like ["Test.", "Mail von Test testd trest Tes", "Mail v"].

  • The s flag will make . match newlines, you don't need to use (.|\n|\r) Commented Nov 21 at 17:14
  • @Barmar the problem of using . with s flag is that it will consume the whole string. Commented Nov 21 at 20:53
  • @Eddi That's not from using s, it's from using * in the pattern. Commented Nov 21 at 20:55
  • @Barmar You're right. Commented Nov 21 at 21:23

2 Answers

1

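With ((.|\n|\r)*?) the quantifier is lazy and nothing follows it in the pattern, so it matches as few characters as possible: none. That is why you get nothing.

On the other hand, with ((.|\n|\r)*) you take as many characters as possible (which includes the date lines of the following entries too).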
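
To make the difference concrete, here is a small sketch that runs both variants from the question against its sample data. The variable names (`lazy`, `greedy`, `lazyTexts`, `greedyMatches`) are only illustrative.

```javascript
// Sample data from the question.
const data = `2025-11-13 bb Test.
2025-11-12 bb Mail von Test
testd
trest
Tes
2025-11-12 bb Mail v`;

// Lazy quantifier at the very end of the pattern: nothing forces it to grow.
const lazy = /^\d{4}-\d{2}-\d{2} ([a-z,]*)((.|\n|\r)*?)/gim;
// Greedy quantifier: keeps going until the end of the input.
const greedy = /^\d{4}-\d{2}-\d{2} ([a-z,]*)((.|\n|\r)*)/gim;

const lazyTexts = [...data.matchAll(lazy)].map(m => m[2]);
const greedyMatches = [...data.matchAll(greedy)];

console.log(lazyTexts);            // [ '', '', '' ]
console.log(greedyMatches.length); // 1
```

The lazy pattern finds all three entries but captures an empty text for each; the greedy one finds a single match that swallows the two later entries.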

So you need a condition to stop:

.(?!^\d{4}-\d{2}-\d{2})

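...tells the engine to match a character only if it is not immediately followed by the start of a line that begins with a date (^\d{4}-\d{2}-\d{2}, which relies on the m flag). Repeated with *, it consumes the text up to the line break before the next entry.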

As @Barmar said, you can simply use . with the s flag in place of (.|\n|\r). Your code:

/^\d{4}-\d{2}-\d{2} ([a-z,]*)(.(?!^\d{4}-\d{2}-\d{2}))*/smig
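
As a hedged sketch of how this pattern could produce the array the question asks for: a repeated capturing group keeps only its last iteration, so the version below wraps the repetition in one outer group and makes the inner group non-capturing. The names `entry` and `texts`, and the final step that joins the lines with spaces, are assumptions added for illustration.

```javascript
// Sample data from the question.
const data = `2025-11-13 bb Test.
2025-11-12 bb Mail von Test
testd
trest
Tes
2025-11-12 bb Mail v`;

// Same stop condition as above; the whole repeated part is captured as group 2.
const entry = /^\d{4}-\d{2}-\d{2} ([a-z,]*)((?:.(?!^\d{4}-\d{2}-\d{2}))*)/gims;

const texts = [...data.matchAll(entry)].map(m =>
  // Drop the leading space, then join the lines of one entry with spaces.
  m[2].trim().replace(/\s*\r?\n\s*/g, " ")
);

console.log(texts);
// [ 'Test.', 'Mail von Test testd trest Tes', 'Mail v' ]
```

If the line breaks inside an entry should be preserved, the `replace` call can simply be left out.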

0

An easy way around the newline issue is replacing newline characters with a space character before applying your regular expression, or replacing them with symbols that don't occur in the data naturally, so that you can restore them later if needed.
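
A minimal sketch of the placeholder variant, assuming the control character U+0001 never occurs in the real data. The names `PLACEHOLDER`, `flattened`, `entryPattern`, and `entries` are hypothetical, and the pattern is just one possible way to find the entries in the flattened string.

```javascript
// Sample data from the question.
const data = `2025-11-13 bb Test.
2025-11-12 bb Mail von Test
testd
trest
Tes
2025-11-12 bb Mail v`;

// Assumption: U+0001 does not appear in the data, so it can stand in for a newline.
const PLACEHOLDER = "\u0001";

// Put everything on one line, remembering where the line breaks were.
const flattened = data.replace(/\r?\n/g, PLACEHOLDER);

// An entry ends at a placeholder that is followed by a date, or at the end of the input.
const entryPattern = /\d{4}-\d{2}-\d{2}\s+\S+\s+(.*?)(?=\u0001\d{4}-\d{2}-\d{2}|$)/g;

const entries = [...flattened.matchAll(entryPattern)].map(m =>
  // Restore the original line breaks inside each entry.
  m[1].replaceAll(PLACEHOLDER, "\n")
);

console.log(entries);
// [ 'Test.', 'Mail von Test\ntestd\ntrest\nTes', 'Mail v' ]
```

Because the placeholder is distinct from an ordinary space, the entry boundary can be tied to a real line break followed by a date, and the line breaks inside an entry survive the round trip.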

Beyond that, I also made some adjustments to your regular expression, as it did not seem to do what you describe.

Here is my approach:

let data = `2025-11-13 bb Test.
2025-11-12 bb Mail von Test
testd
trest
Tes
2025-11-12 bb Mail v`;

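// Flatten the data: turn every newline into a space.
data = data.replaceAll("\n", " ");
// Date, one non-whitespace token (the "bb"), then the text up to the next
// whitespace-plus-digit or the end of the input.
const pattern = /\d{4}-\d{2}-\d{2}\s+\S+\s+(.*?)(?=\s+\d|$)/g;
const matches = [...data.matchAll(pattern)];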

matches.forEach(match => {
    console.log(`"${match[1].trim()}"`);
});

I first replace newlines with a space, and then use a regular expression to match the pattern in your data like this:

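\d{4}-\d{2}-\d{2} matches the date format (four digits, two digits, two digits); it does not check that the date is valid

\s+\S+\s+ skips one group of non-whitespace characters as a more robust way of handling the bb's in your data

(.*?)(?=\s+\d|$) matches anything that comes after, until we reach a word that starts with a digit (mainly the date of the next entry) or the end of the input. Note that a word starting with a digit inside an entry's text would also end that entry early.

With that, I get an output that matches your desired output: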

"Test."
"Mail von Test testd trest Tes"
"Mail v"
