regex: match everything except a certain multi-letter string within a VBA Excel function (regular expression, in spite of, anything but, Visual Basic)

Question

Folks, here is another question on "regex: match everything, but not ...", but so far none of the existing ones seems to fit my simple case.

I need to program my Excel function to separate strings from their preceding enumerators (similar to what is done here: VBA regex: extract multiple strings between strings within Excel cell with custom function).

My first simple string is: "1 Rome; 2 London; 3 Wembley Stadium"

My second string looks like: "1.1 Winner; 2.1 Looser; 3.3 Penalties (always loose, dam)"

I need to extract only the names, not the ranks (e.g. "Rome; London; Wembley Stadium" and "Winner; Looser; Penalties (always loose, dam)").

Using a regex tester (https://extendsclass.com/regex-tester.html), I can easily match the opposite, i.e. the enumerators, with:

([0-9]+\s*)

which matches the numeric prefixes in:

"1 Rome, 2 London, 3 Wembley Stadium".

But how to reverse it? I tried something like:

[^0-9 |;]+[^0-9 |;], but it also excludes the white space that I want to keep (e.g. after the comma and between "Wembley" and "Stadium" in "1 Rome, 2 London, 3 Wembley Stadium"). I guess the "0-9 " part needs to be treated as one continuous string somehow. I tried various brackets, quotation marks and \s*, but nothing has worked yet.

Note: I'm working in a Visual Basic environment, which does not support lookbehinds!
Note: My solution needs to be compatible across Excel versions as far as possible!

If you want to end up with a list of individual names, then splitting on ";" and looping over the parts to remove leading spaces and digits would be a simple way. If you want the names together in a single string, then just match the digit part (\d*(\.?\d+)\s+) and RegEx.Replace it with "".
– Alex K., Commented Jul 23, 2021 at 14:45
You should simply add (?:\.\d+)* to match zero or more occurrences of a . followed by one or more digits: \d+(?:\.\d+)*\s*(.*?)(?=;\s*\d+(?:\.\d+)*\s|$)
– Wiktor Stribiżew, Commented Jul 23, 2021 at 16:22
@Wiktor: Somehow it does not work, even though it looks logical to me. In my VBA function the result still includes the numerical prefix. No idea why.
– MsGISRocker, Commented Jul 26, 2021 at 13:49
Again, use match.SubMatches(0) only. Of course the number will land in the whole match.
– Wiktor Stribiżew, Commented Jul 26, 2021 at 14:48
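
The two suggestions from the comments can be sketched in VBA roughly as follows. This is only a sketch under assumptions: the function names StripEnumerators and ExtractNames are hypothetical, late binding via CreateObject("VBScript.RegExp") is assumed so that no library reference has to be set, and the first pattern is a variation on Alex K.'s idea (it reuses the \d+(?:\.\d+)* number pattern and anchors it to the start of the string or to a ";" separator so that digits inside a name are left alone). Neither pattern uses a lookbehind.

```vba
Option Explicit

' Hypothetical helper: deletes each enumerator ("1 ", "2.1 ", ...) with Replace.
Public Function StripEnumerators(ByVal inputText As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")   ' late binding (assumption)
    re.Global = True                           ' replace every occurrence
    ' Group 1 captures the start of the string or the ";" plus white space,
    ' so it can be written back with $1 while the number is dropped.
    re.Pattern = "(^|;\s*)\d+(?:\.\d+)*\s+"
    StripEnumerators = re.Replace(inputText, "$1")
End Function

' Hypothetical helper: collects only the first capture group of each match.
Public Function ExtractNames(ByVal inputText As String) As String
    Dim re As Object
    Dim m As Object
    Dim result As String

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    ' Pattern from the comment above: number, name (lazy), then a lookahead
    ' for the next enumerator or the end of the string.
    re.Pattern = "\d+(?:\.\d+)*\s*(.*?)(?=;\s*\d+(?:\.\d+)*\s|$)"

    For Each m In re.Execute(inputText)
        If Len(result) > 0 Then result = result & "; "
        ' m.Value would still contain the number; SubMatches(0) is the name only.
        result = result & m.SubMatches(0)
    Next m

    ExtractNames = result
End Function
```

Used as worksheet functions, e.g. =StripEnumerators(A1) or =ExtractNames(A1), both should turn "1 Rome; 2 London; 3 Wembley Stadium" into "Rome; London; Wembley Stadium". Reading m.Value instead of m.SubMatches(0) is what brings the numerical prefix back into the result.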

Echo9k · Accepted Answer · 2021-07-23 15:40:14Z

I tried negating the numerical values and the period as "one continuous string" ([^\d|\.]). This leaves two consecutive spaces in some places.

A group-by-group explanation is available on regexr.

To remove these double spaces, try ([^\d|\.])(?<!; ). Here I'm just adding a negative lookbehind, which might not be supported by all regex interpreters.
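
Since the question rules out lookbehinds, one possible lookbehind-free way to get rid of the double spaces is a second Replace pass. The sketch below is an assumption rather than part of the answer: the function name StripDigitsAndTidy is hypothetical, and instead of collecting the characters matched by [^\d|\.] it deletes their complement, [\d.]+ (inside a character class the "|" is a literal pipe, so it is left out here). It removes digits and periods anywhere in the string, including inside a name.

```vba
Option Explicit

' Hypothetical helper: removes digits and periods, then tidies the white space.
Public Function StripDigitsAndTidy(ByVal inputText As String) As String
    Dim re As Object
    Dim cleaned As String

    Set re = CreateObject("VBScript.RegExp")   ' late binding (assumption)
    re.Global = True

    ' Pass 1: drop every run of digits and periods.
    re.Pattern = "[\d.]+"
    cleaned = re.Replace(inputText, "")

    ' Pass 2: collapse the runs of white space left behind to one space.
    re.Pattern = "\s{2,}"
    cleaned = re.Replace(cleaned, " ")

    ' Remove the leading space left by the first enumerator.
    StripDigitsAndTidy = Trim$(cleaned)
End Function
```

For "1.1 Winner; 2.1 Looser; 3.3 Penalties (always loose, dam)" this should return "Winner; Looser; Penalties (always loose, dam)".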
