Regular Expression to match string which doesn't contain substring

Question

I have a comma-separated list as shown below. The list is actually on one line, but I have split it up to show the syntax and that each unit contains 5 elements. There is no comma at the end of the list.

ro:2581,1309531682152,A,Place,Page,
me:2642,1310989368864,A,Place,Page,
uk:2556,1309267095061,A,Place,Page,
me:2642,1310989380238,D,Place,Page,
me:2642,1334659643627,D,Place,Page,
ro:3562,1378721526696,A,Place,Page,
uk:1319,1309337246675,D,Place,Page,
ro:2581,1379500694666,D,Place,Page,
uk:1319,1309337246675,A,Place,Page

What I am trying to do is remove any unit (full line) that does not begin with uk:. I.e., the results will be:

uk:2556,1309267095061,A,Place,Page,
uk:1319,1309337246675,D,Place,Page,
uk:1319,1309337246675,A,Place,Page

If the string were on separate lines as in my example, I could do this relatively easily, but because it is all on one line, I cannot get it to work. Can anyone point me in the right direction?

Thanks

Just to be clear, does your input look something like this? ro:2581,1309531682152,A,Place,Pageme:2642,1310989368864,A,Place,Page (note: there is no comma between "page" and "me")
– nafas, Commented Aug 22, 2014 at 13:34
Why do you need a regex solution, and what tool/platform are you using for this?
– anubhava, Commented Aug 22, 2014 at 13:35
Lots of confusing negations in your description: "doesn't contain" and "remove ... that does not begin with". You just plainly want all "rows" that begin with uk, right?
– KekuSemau, Commented Aug 22, 2014 at 13:37
@nafas: There IS a comma between "page" and "me". My actual string is ro:2581,1309531682152,A,Place,Page,me:2642,1310989368864,A,Place,Page,uk:2556,1309267095061,A,Place,Page,me:2642,1310989380238,D,Place,Page,me:2642,1334659643627,D,Place,Page,ro:3562,1378721526696,A,Place,Page,uk:1319,1309337246675,D,Place,Page,ro:2581,1379500694666,D,Place,Page,uk:1319,1309337246675,A,Place,Page
– BobbyP, Commented Aug 22, 2014 at 13:37
I'm not sure I understand your question correctly. But maybe you're looking for something like this: \b(?!uk)[a-z]+:\d+,\d+,[a-z]+,[a-z]+,[a-z]+,. See demo.
– Amal, Commented Aug 22, 2014 at 13:38

RevanProdigalKnight · Accepted Answer · 2014-08-22 13:55:52Z

3

This should work:

(uk:\d+,\d+,\w,\w+,\w+)

Demo

It looks for uk: and then it's pretty much comma-counting from there on.
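As a rough illustration of the pattern above, here is a minimal Python sketch. Python is an assumption made for the example only (the question does not name a language), and the variable names are hypothetical. It collects every uk: unit from the one-line string given in the comments and joins the units back together with commas.

```python
import re

# The one-line input, as posted in the question's comments.
data = ",".join([
    "ro:2581,1309531682152,A,Place,Page",
    "me:2642,1310989368864,A,Place,Page",
    "uk:2556,1309267095061,A,Place,Page",
    "me:2642,1310989380238,D,Place,Page",
    "me:2642,1334659643627,D,Place,Page",
    "ro:3562,1378721526696,A,Place,Page",
    "uk:1319,1309337246675,D,Place,Page",
    "ro:2581,1379500694666,D,Place,Page",
    "uk:1319,1309337246675,A,Place,Page",
])

# Pattern from the answer: "uk:", then five comma-separated fields.
UK_UNIT = re.compile(r"(uk:\d+,\d+,\w,\w+,\w+)")

# With a single capture group, findall returns the captured strings.
uk_units = UK_UNIT.findall(data)
result = ",".join(uk_units)
print(result)
```

This keeps the matches rather than removing the non-matches, so it only applies where the tool can return matched text.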

EDIT:

Since OP has now clarified that what they're using can only remove strings:

,?[^u][^k]:\d+,\d+,\w,\w+,\w+

Demo 2

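This looks for an optional comma followed by one character that is not u and one character that is not k, in that order, then a colon (:); the rest of the regex is the same. Note that this also leaves in place any unit whose prefix merely starts with u or ends in k (for example us: or ak:), so it is stricter than "does not begin with uk:".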
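To see how the removal pattern behaves, here is a small Python sketch (Python is assumed for illustration; the OP's CMS language is unknown). It applies the answer's pattern with re.sub and compares it with a negative-lookahead variant. The lookahead variant, the helper name remove_units, and the us: test prefix are assumptions that do not appear in the thread, and the variant only helps if the target engine supports (?!...).

```python
import re

# The one-line input, as posted in the question's comments.
data = ",".join([
    "ro:2581,1309531682152,A,Place,Page",
    "me:2642,1310989368864,A,Place,Page",
    "uk:2556,1309267095061,A,Place,Page",
    "me:2642,1310989380238,D,Place,Page",
    "me:2642,1334659643627,D,Place,Page",
    "ro:3562,1378721526696,A,Place,Page",
    "uk:1319,1309337246675,D,Place,Page",
    "ro:2581,1379500694666,D,Place,Page",
    "uk:1319,1309337246675,A,Place,Page",
])

# Pattern from the answer: one non-'u' character, one non-'k' character.
ANSWER = re.compile(r",?[^u][^k]:\d+,\d+,\w,\w+,\w+")

# Hypothetical variant: reject only the exact prefix "uk:" via a lookahead.
LOOKAHEAD = re.compile(r",?\b(?!uk:)\w+:\d+,\d+,\w,\w+,\w+")


def remove_units(pattern, text):
    """Delete every match, which is all the OP's tool is said to allow."""
    return pattern.sub("", text)


removed = remove_units(ANSWER, data)
print(removed)

# A made-up unit with prefix "us:" shows where the two patterns differ.
other = "us:1,2,A,Place,Page"
print(repr(remove_units(ANSWER, other)))     # left untouched
print(repr(remove_units(LOOKAHEAD, other)))  # removed
```

On the sample string, both patterns leave the three uk: units, but because the first unit is removed without its trailing comma, the result starts with a stray comma that would need trimming in a separate step.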

edited Aug 22, 2014 at 13:55

answered Aug 22, 2014 at 13:37

RevanProdigalKnight


2 Comments

BobbyP Over a year ago

It seems a big chunk of my original question has somehow been removed, which clearly has caused a bit of confusion. @RevanProdigalKnight, this is the closest answer. What is missing from the question is, I am using a custom language of my CMS, which only allows me to "remove" matched strings. Therefore, I actually need to match anything that DOES NOT begin with uk:, so I can remove it from the original. This will leave the lines that DO begin with uk:. In short, I need the opposite of this demo. I could probably use ((ro|me):\d+,\d+,\w,\w+,\w+), but in real life, there will be other values.

RevanProdigalKnight Over a year ago

@Typhoon101 I've added a regex that should handle only removing the cases that don't begin with uk:.

Community · Accepted Answer · 2020-06-20 09:12:55Z

0

I would suggest a simple regex like this:

(\buk:.+?,Page)(?:,|$)

and grab matched group #1

RegEx Demo
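A short Python sketch of this approach follows (Python is assumed here; the names are hypothetical). It reads capture group 1 from each match, which drops the trailing comma consumed by the non-capturing group. It relies on every unit ending in the literal text Page, as in the sample data.

```python
import re

# The one-line input, as posted in the question's comments.
data = ",".join([
    "ro:2581,1309531682152,A,Place,Page",
    "me:2642,1310989368864,A,Place,Page",
    "uk:2556,1309267095061,A,Place,Page",
    "me:2642,1310989380238,D,Place,Page",
    "me:2642,1334659643627,D,Place,Page",
    "ro:3562,1378721526696,A,Place,Page",
    "uk:1319,1309337246675,D,Place,Page",
    "ro:2581,1379500694666,D,Place,Page",
    "uk:1319,1309337246675,A,Place,Page",
])

# Pattern from the answer: lazily match from "uk:" up to the next ",Page"
# that is followed by a comma or the end of the string.
PATTERN = re.compile(r"(\buk:.+?,Page)(?:,|$)")

# Group 1 holds the unit without the trailing separator.
units = [m.group(1) for m in PATTERN.finditer(data)]
for unit in units:
    print(unit)
```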

edited Jun 20, 2020 at 9:12

CommunityBot


answered Aug 22, 2014 at 13:41

anubhava


3 Comments

anubhava Over a year ago

Hope you have seen the linked demo :)

hex494D49 Over a year ago

Of course I did :) The second match should stop on ro:2581...

anubhava Over a year ago

See it working here. That is because the Page the OP typed contains hidden characters; in hex it is: 0x6150 0x80e2 0xe28c 0x8b80 0x6567 0x000a
