1

I have a comma-separated list as shown below. The list is actually on one line, but I have split it up to demonstrate the syntax and to show that each unit contains 5 elements. There is no comma at the end of the list.

ro:2581,1309531682152,A,Place,Page,
me:2642,1310989368864,A,Place,Page,
uk:2556,1309267095061,A,Place,Page,
me:2642,1310989380238,D,Place,Page,
me:2642,1334659643627,D,Place,Page,
ro:3562,1378721526696,A,Place,Page,
uk:1319,1309337246675,D,Place,Page,
ro:2581,1379500694666,D,Place,Page,
uk:1319,1309337246675,A,Place,Page

What I am trying to do is remove any unit (full line) that does not begin with uk:. I.e., the results will be:

uk:2556,1309267095061,A,Place,Page,
uk:1319,1309337246675,D,Place,Page,
uk:1319,1309337246675,A,Place,Page

If the string were on separate lines as in my example, I could do this relatively easily, but because it is all on one line, I cannot get it to work. Can anyone point me in the right direction?

Thanks

10
  • Just to be clear, does your input look something like this?: ro:2581,1309531682152,A,Place,Pageme:2642,1310989368864,A,Place,Page (note: there is no comma between "page" and "me") Commented Aug 22, 2014 at 13:34
  • Why do you need a regex solution, and what tool/platform are you using for this? Commented Aug 22, 2014 at 13:35
  • Lots of confusing negations in your description: 'doesn't contain' and 'remove.. that does not begin with'. You just plainly want all "rows" that begin with uk, right? Commented Aug 22, 2014 at 13:37
  • @nafas. There IS a comma between "page" and "me". My actual string is ro:2581,1309531682152,A,Place,Page,me:2642,1310989368864,A,Place,Page,uk:2556,1309267095061,A,Place,Page,me:2642,1310989380238,D,Place,Page,me:2642,1334659643627,D,Place,Page,ro:3562,1378721526696,A,Place,Page,uk:1319,1309337246675,D,Place,Page,ro:2581,1379500694666,D,Place,Page,uk:1319,1309337246675,A,Place,Page Commented Aug 22, 2014 at 13:37
  • I'm not sure I understand your question correctly. But maybe you're looking for something like this: \b(?!uk)[a-z]+:\d+,\d+,[a-z]+,[a-z]+,[a-z]+,. See demo. Commented Aug 22, 2014 at 13:38

2 Answers

3

This should work:

(uk:\d+,\d+,\w,\w+,\w+)

Demo

It looks for uk: and then it's pretty much comma-counting from there on.
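For anyone who wants to try the pattern outside the demo, here is a minimal sketch in Python. Python is only an assumption for illustration (the OP's actual tool is a CMS-specific language), and the names `data`, `UK_UNIT`, `uk_units` and `result` are made up for this example. It collects every match and joins them back with commas.

```python
import re

# Sample input copied from the question (a single line, no trailing comma).
data = (
    "ro:2581,1309531682152,A,Place,Page,"
    "me:2642,1310989368864,A,Place,Page,"
    "uk:2556,1309267095061,A,Place,Page,"
    "me:2642,1310989380238,D,Place,Page,"
    "me:2642,1334659643627,D,Place,Page,"
    "ro:3562,1378721526696,A,Place,Page,"
    "uk:1319,1309337246675,D,Place,Page,"
    "ro:2581,1379500694666,D,Place,Page,"
    "uk:1319,1309337246675,A,Place,Page"
)

# Same pattern as in the answer: "uk:", two numeric fields, one
# single-character field, then two word fields.
UK_UNIT = re.compile(r"(uk:\d+,\d+,\w,\w+,\w+)")

# With a single capturing group, findall returns the captured strings.
uk_units = UK_UNIT.findall(data)

# Re-join the kept units with commas, without a trailing comma.
result = ",".join(uk_units)
print(result)
```

Note that this keeps the matches rather than removing the non-matches, so it only helps if your tool can extract matched text.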

EDIT:

Since OP has now clarified that what they're using can only remove strings:

,?[^u][^k]:\d+,\d+,\w,\w+,\w+

Demo 2

This looks for an optional comma, then one character that is not u followed by one character that is not k, then a colon (:); the rest of the regex is the same as before.
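One caveat with the character-class version: it also skips any prefix whose first letter is u or whose second letter is k, not just uk itself. The sketch below compares it with a negative-lookahead variant, similar to the one suggested in the comments on the question. It is written in Python purely for illustration; whether the OP's CMS language supports lookaheads is unknown, and the `us:` entry and all variable names are invented for this example.

```python
import re

# Shortened sample in the question's format. The "us:" unit is a made-up
# prefix added only to illustrate the edge case.
data = (
    "ro:2581,1309531682152,A,Place,Page,"
    "uk:2556,1309267095061,A,Place,Page,"
    "us:1111,1309531682152,D,Place,Page,"
    "uk:1319,1309337246675,A,Place,Page"
)

# The pattern from the answer: it does not match "us:" because the first
# letter is u, so that unit is left in place.
CHAR_CLASS = re.compile(r",?[^u][^k]:\d+,\d+,\w,\w+,\w+")

# Alternative: a negative lookahead rejects only the exact prefix "uk:".
# The trailing ",?" removes the separator along with the unit.
LOOKAHEAD = re.compile(r"\b(?!uk:)[a-z]+:\d+,\d+,\w,\w+,\w+,?")

# Remove the matches, then trim any comma left at either end.
with_char_class = CHAR_CLASS.sub("", data).strip(",")
with_lookahead = LOOKAHEAD.sub("", data).strip(",")

print(with_char_class)
print(with_lookahead)
```

In both cases a stray comma can be left at the start or end of the string after removal, which is why the sketch trims it afterwards.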


2 Comments

It seems a big chunk of my original question has somehow been removed, which clearly has caused a bit of confusion. @RevanProdigalKnight, this is the closest answer. What is missing from the question is, I am using a custom language of my CMS, which only allows me to "remove" matched strings. Therefore, I actually need to match anything that DOES NOT begin with uk:, so I can remove it from the original. This will leave the lines that DO begin with uk:. In short, I need the opposite of this demo. I could probably use ((ro|me):\d+,\d+,\w,\w+,\w+), but in real life, there will be other values.
@Typhoon101 I've added a regex that should handle only removing the cases that don't begin with uk:.
0

I would suggest a simple regex like this:

(\buk:.+?,Page)(?:,|$)

and grab matched group #1

RegEx Demo
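A rough sketch of how grabbing group #1 might look in Python (an assumption for illustration only; the variable names are invented). It relies on every unit ending in the literal text Page, as in the question's sample data.

```python
import re

# Shortened sample in the question's format (single line, no trailing comma).
data = (
    "ro:2581,1309531682152,A,Place,Page,"
    "uk:2556,1309267095061,A,Place,Page,"
    "me:2642,1310989380238,D,Place,Page,"
    "uk:1319,1309337246675,A,Place,Page"
)

# Group 1 lazily captures from "uk:" up to the first ",Page" that is
# followed by a comma or the end of the string.
UK_TO_PAGE = re.compile(r"(\buk:.+?,Page)(?:,|$)")

# Collect group 1 from each match and re-join with commas.
kept = [m.group(1) for m in UK_TO_PAGE.finditer(data)]
result = ",".join(kept)
print(result)
```

If the last field can be something other than Page, the comma-counting pattern from the other answer is probably the safer choice.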

3 Comments

Hope you have seen the linked demo :)
Of course I did :) The second match should stop on ro:2581...
See it working here. That is because the Page that OP typed was, in hex: 0x6150 0x80e2 0xe28c 0x8b80 0x6567 0x000a, which appears to include invisible characters between Pa and ge.
