I have this line, which is sometimes repeated in an HTML file, and I want to:

1- get a regex that finds only the files that contain that line more than once

2- get a regex that finds and removes the second instance in the file, leaving only the first

The duplicate lines are not adjacent; they are separated by lots of code and text.

The line is:

<script src="/resources/common.js" type="text/javascript"></script>

Or there could be other text before or after the line that needs to be removed, like:

<script src="/resources/common.js" type="text/javascript"></script><div id=something"...

I use Notepad++ to search and replace.

  • Which tool/language are you planning to use for this? Also, is it possible that there may be more than two copies of that line in the file (and if so, do you want to remove all but the first)? Commented Oct 18, 2012 at 12:36
  • What tool are you using for regular expressions? sed? grep? Java? An editor? Commented Oct 18, 2012 at 12:41
  • Notepad++. And yes, if there are more than two copies, it would be great to remove all but the first instance in that file. Thanks. Commented Oct 18, 2012 at 12:41
  • I was working in a solution using grep and sed, but I see you are using Windows. :( Commented Oct 18, 2012 at 12:46

2 Answers

If you were using EditPad Pro (or EditPad Lite, which is free), it would be easy:

Search for

(?s)(?<=<script src="/resources/common\.js" type="text/javascript"></script>.*)<script src="/resources/common\.js" type="text/javascript"></script>

and replace all with nothing.

A screenshot to clarify:

[Image: EPP screenshot]
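
For the two tasks in the question (detecting files with a duplicate, and keeping only the first copy), the same effect can also be sketched outside the editor. The Python below is a minimal sketch and not part of the original answer: the names `TAG`, `count_tag`, and `keep_first_only` are made up for illustration, and it assumes every copy of the tag is character-for-character identical.

```python
# The exact line from the question (assumed identical in every occurrence).
TAG = '<script src="/resources/common.js" type="text/javascript"></script>'


def count_tag(html: str) -> int:
    """Return how many times TAG occurs in the text."""
    return html.count(TAG)


def keep_first_only(html: str) -> str:
    """Return the text with every occurrence of TAG removed except the first."""
    # partition() splits around the first occurrence only.
    head, first, tail = html.partition(TAG)
    # If TAG is absent, 'first' and 'tail' are empty and the text is unchanged.
    return head + first + tail.replace(TAG, "")


if __name__ == "__main__":
    sample = "<head>" + TAG + '<div id="x"></div>' + TAG + "</head>"
    print(count_tag(sample) > 1)      # does this text contain a duplicate?
    print(keep_first_only(sample))
```

A file would count as having a duplicate when `count_tag(text) > 1`. Running this over a whole directory would need a file-walking loop around these two functions, which is left out here.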

With other editors, you will have to apply the following regex repeatedly (once for each duplication):

(?s)(?<=<script src="/resources/common\.js" type="text/javascript"></script>)(.*?)<script src="/resources/common\.js" type="text/javascript"></script>

but this time replace the match with \1.
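
The "apply repeatedly" procedure can be sketched in Python, whose `re` module accepts a fixed-width lookbehind like the one in this regex. This is only an illustration under stated assumptions: `TAG`, `PATTERN`, and `remove_duplicates` are hypothetical names, and `count=1` is used so that each loop iteration stands in for one manual replace step.

```python
import re

# The exact line from the question (assumed identical in every occurrence).
TAG = '<script src="/resources/common.js" type="text/javascript"></script>'
ESC = re.escape(TAG)

# Fixed-width lookbehind for the first tag, lazy gap, then the duplicate tag.
# re.DOTALL plays the role of (?s): the dot also matches newlines.
PATTERN = re.compile(r"(?<=" + ESC + r")(.*?)" + ESC, re.DOTALL)


def remove_duplicates(html: str) -> tuple[str, int]:
    """Apply the replacement until nothing matches.

    Returns the cleaned text and the number of replacements performed.
    """
    passes = 0
    while True:
        # Replace one match with group 1, i.e. keep the gap, drop the tag.
        html, n = PATTERN.subn(r"\1", html, count=1)
        if n == 0:
            return html, passes
        passes += 1


if __name__ == "__main__":
    sample = "a" + TAG + "\nmiddle\n" + TAG + "b" + TAG + "c"
    cleaned, passes = remove_duplicates(sample)
    print(passes)
    print(cleaned)
```

The loop ends as soon as a pass finds nothing to replace, so a file with no duplicate is returned unchanged.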

9 Comments

That's a better answer than mine. :)
Thanks to both of you for your time. I'll try EditPad and see; I need to do this for thousands of files, and NP++ has an option to search and replace across a whole directory. I'll see if EP is fast too.
@Mike RegexBuddy has a built-in module for applying these patterns using grep, give it a try too. regexbuddy.com
@deadlock: Right, but there is no free version of RegexBuddy (but it's worth every penny).
I tried the regex suggested in the answer using EditPad, but it found nothing, although the file I was testing with had a duplicate. Perhaps because there were some words after the line? I updated the question with another example.

You may consider using a positive lookbehind, which checks that some text precedes the current position without making it part of the match. You can use it to locate the first occurrence of your line and then match only the remaining occurrences.

Try this one. It will match all the occurrences of your line except the first one.

(?<=<script src=./resources/common.js..+?</script>.*?)(<script src=./resources/common.js..+?</script>)

Note: Positive lookbehinds may or may not work depending on the regex engine you are using, but it should work in most cases.


More info: Regular Expression Lookaround
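
To make the engine dependence concrete, here is a small sketch using Python's `re` module, which accepts a fixed-width lookbehind but rejects one that contains `.*?`. The helper name `compiles` and the two shortened test patterns are made up for this illustration; other engines may behave differently.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        # Raised, for example, when a lookbehind is not fixed-width.
        return False
    return True


# Lookbehind of known, fixed length: accepted by Python's re.
FIXED = r"(?<=</script>)<script"

# Lookbehind containing .*? (unbounded length): rejected by Python's re.
VARIABLE = r"(?<=</script>.*?)<script"

if __name__ == "__main__":
    print(compiles(FIXED))
    print(compiles(VARIABLE))
```

Checking a pattern this way in the target tool before relying on it avoids a search that silently finds nothing.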

4 Comments

This would only find the first duplicate; also it will only work in .NET languages or EditPadPro, definitely not in Notepad++.
@TimPietzcker Actually, by trying this pattern on RegexBuddy using JGSoft, it matches all but the first occurrence. And I do not know if this would work on Notepad++. Even if it doesn't, he could use anything else. I think Notepad++ is not the constraint here.
Yes, but RegexBuddy does use the JGSoft regex engine (like EditPadPro), so you can't generalize the result for other editors. Click on the Use tab and choose Perl to see what I mean.
@TimPietzcker yes I understand what you mean, that's why I left the note above.
