Regex filtering groups of value and key fields with optionally empty value fields

Question

I'm having trouble coming up with the correct regex. I need a regex for the following text:

Feld1 = 1134 2000 0101 0202 0303

Name1 = Ein Kleiner Namens Test

Daten1 = 2200220

VWZ =

Name2. =

Daten2 = 1100110

The regex has to find all keys and their corresponding values and store them in the matches. So far so good: ([\s]+(?<Key>[^\s]+)[\s]+=[\s]+(?<Value>[^\r\n]+)) did a very nice job there, with one exception. If a value is empty, the regex doesn't recognize that and treats the key and value of the next line as the value for the current key.

I experimented and found a regex that fixes this problem, but it also captures the space after the = as part of the value. Try as I might, I can't find a regex that handles both situations so that I get just the correct key/value pairs.

What am I doing wrong, and how do I need to modify the regex to achieve my goal?

Why do you think you need a regular expression for that, and why only one regular expression? The regular expression engine can be used in parsing, but it is not a parser. Maybe you could greatly simplify the problem if you look at it rationally. Besides, you did not really describe the set of valid input data. How many blank spaces are allowed? How should the lines be separated? And so on...
– Sergey A Kryukov, Commented Mar 12 at 20:19
@SergeyAKryukov Sorry, I didn't notice that the screenshot didn't show it. Lines are separated by \r\n. There is one space before and one space after the = (but, as can be seen with VWZ, there can be more than one space before it; after it there is always exactly one). I didn't see a much simpler way than a regex, to be honest.
– Thomas, Commented Mar 12 at 21:17
At least split the input into separate lines, and parse each line using the same function. Even if you perfectly match the entire input with a regular expression, you will have trouble finding the keys and values in all those groups. With a single regular expression, you lose the reuse.
– Sergey A Kryukov, Commented Mar 12 at 22:57
Key-value pairs. For each line, split on "="; token[0] is the key; token[1], if any, is the value.
– Gerry Schmitz, Commented Mar 13 at 4:50
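
The line-by-line approach suggested in these comments could look roughly like the sketch below. The question does not name a language, so Python is an assumption here, and `parse_pairs` is a hypothetical helper name. The sample string follows the layout described in the comments: `\r\n` line endings and extra spaces before the `=` on the VWZ line.

```python
def parse_pairs(text):
    """Split each line on the first '=' into a (key, value) pair."""
    pairs = []
    for line in text.splitlines():  # splitlines() handles \r\n as well as \n
        if "=" not in line:
            continue  # skip blank lines and lines without a separator
        key, _, value = line.partition("=")  # split on the first '=' only
        pairs.append((key.strip(), value.strip()))
    return pairs


sample = (
    "Feld1 = 1134 2000 0101 0202 0303\r\n"
    "Name1 = Ein Kleiner Namens Test\r\n"
    "VWZ    = \r\n"
    "Name2. = \r\n"
    "Daten2 = 1100110\r\n"
)

for key, value in parse_pairs(sample):
    print(repr(key), repr(value))
```

Because every line is handled on its own, an empty value can never pick up text from the following line.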

blhsing · Accepted Answer · 2025-03-13 07:03:47Z


The problem is that \s matches a newline as well, so your [\s]+ after = also consumes a newline, leaving the next line as the value.
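
A minimal way to see this effect, assuming Python's `re` module (where named groups are written `(?P<name>...)`) and a simplified pattern that keeps the same `\s+` after the `=`:

```python
import re

# Two lines; the first one has an empty value.
text = "VWZ = \nDaten2 = 1100110\n"

# \s+ after '=' may cross the line break, so the value group
# only starts on the following line.
greedy = re.compile(r"(?P<Key>\S+)\s+=\s+(?P<Value>[^\r\n]+)")

m = greedy.search(text)
print(m.group("Key"), "->", repr(m.group("Value")))
# prints: VWZ -> 'Daten2 = 1100110'
```

The key `VWZ` ends up paired with the whole next line, which is the behaviour described in the question.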

You can instead use =[^\n\S]* to consume non-newline space characters after =:

^\s*(?<Key>\S+)\s*=[^\n\S]*(?<Value>.*)

Demo: https://regex101.com/r/5l0yNA/1
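
As a hedged sketch of how the pattern above behaves, here it is in Python. Python is an assumption (the question does not name a language), and Python's `re` module writes named groups as `(?P<Key>...)` rather than `(?<Key>...)`. The sample text follows the layout given in the question and comments. In Python, `.` also matches `\r`, so the sketch strips a trailing `\r` from each value.

```python
import re

# Same pattern as above, using Python's named-group syntax.
# re.MULTILINE lets ^ match at the start of every line.
PAIR = re.compile(r"^\s*(?P<Key>\S+)\s*=[^\n\S]*(?P<Value>.*)", re.MULTILINE)

sample = (
    "Feld1 = 1134 2000 0101 0202 0303\r\n"
    "Name1 = Ein Kleiner Namens Test\r\n"
    "Daten1 = 2200220\r\n"
    "VWZ    = \r\n"
    "Name2. = \r\n"
    "Daten2 = 1100110\r\n"
)

# '.' matches '\r' in Python, so drop it from the end of each value.
pairs = {
    m.group("Key"): m.group("Value").rstrip("\r")
    for m in PAIR.finditer(sample)
}

for key, value in pairs.items():
    print(repr(key), repr(value))
```

The empty values for `VWZ` and `Name2.` come back as empty strings, and the lines after them are matched as their own pairs.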
