I am editing a couple of hundred HTML files and have to replace all the content manually, so I was wondering whether it could be done using regex. I don't think it is possible, but it might be, so please help me out.
Okay, so for example, I have many <p> tags in a file, each with a different class, e.g.:
<p class="class1">stuff here</p>
<p class="class2">more stuff here</p>
I want to replace "stuff here" and "more stuff here" with something else, for example:
<p class="class1">[content]</p>
<p class="class2">[content]</p>
I want to know if that is possible.
I'm using Notepad++.
P.S. I'm new to regex.

  • Are you replacing everything with the same content? Commented Jan 2, 2016 at 15:03
  • Please show us what you have attempted so far. Commented Jan 2, 2016 at 15:03
  • @christopherclark yes, replace everything within the tags with the same content. Commented Jan 2, 2016 at 15:06
  • I would suggest googling around a little bit, installing a regex helper, and attempting something. There are a lot of things you know that we don't. For example, is there a chance of "<" being in the inner HTML? Would any of your <p> tags have ids? Are you only looking for class="classX", or other <p> tags as well? Commented Jan 2, 2016 at 15:10
  • Honestly, I don't see a regex being the best option. I see PHP and a database updating the content dynamically as the best option for something like this, just my two cents. Commented Jan 3, 2016 at 5:21

1 Answer

I think Notepad++ is great for stuff like this. Open up Find/Replace, and select "Regular expression" in the dialog's Search Mode section.

In the "Find what" field, try this:

    <p class=(.*)>(.*)</p>

and in "Replace with":

    <p class=\1>[content]</p>

The \1 here takes whatever the first (.*) found between class= and the angle bracket > that ends the opening tag, and puts it back unchanged, so the class name is preserved without you having to specify it. The second (.*) catches the current content inside the paragraph tag, which is what you want to replace. So where I wrote [content] in the "Replace with" field, that's where you'd put your new content. This does limit you to content that you can paste into the Notepad++ Find/Replace dialog, but I think that limit is pretty large.
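
If you want to sanity-check the idea outside Notepad++, here is a minimal sketch using Python's `re` module. That is a different regex engine from the one Notepad++ uses, so treat it as an illustration of the capture-and-reuse idea rather than a guarantee of identical behavior. The sample lines are the ones from the question; the function name `replace_paragraph_content` is made up for this example.

```python
import re

# Same idea as the Find/Replace pair above: group 1 keeps the class
# attribute value, group 2 is the old content that gets discarded.
PATTERN = re.compile(r'<p class=(.*)>(.*)</p>')


def replace_paragraph_content(line, new_content="[content]"):
    """Replace the text inside a single-line <p class=...> element."""
    # Using a function as the replacement means backslashes in
    # new_content are inserted literally instead of being interpreted.
    return PATTERN.sub(
        lambda m: f"<p class={m.group(1)}>{new_content}</p>", line
    )


sample = [
    '<p class="class1">stuff here</p>',
    '<p class="class2">more stuff here</p>',
]
result = [replace_paragraph_content(line) for line in sample]
for line in result:
    print(line)
```

Each line is processed on its own here, which mirrors the one-match-per-line assumption of the pattern.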

If I'm remembering that text field's limitations incorrectly, another thing you could do is adjust my "Replace with" text to replace the old text with a couple of newlines:

    <p class=\1>\n\n</p>

This will delete the old text and leave a clear line where it once was, making it easy to paste whatever you want into the normal editor pane.

The first way is probably better if your new content fits in the "Replace with" field, because this regex matches at most once per line. You can click "Replace" a couple of times to check that it's working, and then "Replace All" will go through every <p> element in the file.

Note: this solution assumes that your <p> tags open and close within one line, as you typed them in your question. If they span multiple lines, you're going to want to enable ". matches newline" in the Replace dialog, and you'll need trickier (more precise) syntax than (.*) to catch your class name and content-to-be-replaced. Let me know if this is the case, and I'll fiddle with it and see if I can help more. The (.*) needs to change to (.*?) or something similar; the search needs to get less greedy, because if . matches newline, then .* matches any character any number of times, i.e., as much of the document as it can.
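
One possible shape for that "trickier" pattern, again sketched in Python rather than Notepad++: `re.DOTALL` stands in for the ". matches newline" checkbox, `[^>]*` is my assumption for a safer class capture, and `.*?` is the lazy form mentioned above. The multi-line sample text is invented for the illustration.

```python
import re

# re.DOTALL plays the role of the ". matches newline" checkbox.
# [^>]* stops the class capture at the end of the opening tag, and
# .*? (lazy) stops the content capture at the first </p> it reaches.
LAZY = re.compile(r'<p class=([^>]*)>(.*?)</p>', re.DOTALL)

html = (
    '<p class="class1">stuff\n'
    'here</p>\n'
    '<p class="class2">more\n'
    'stuff here</p>\n'
)

result = LAZY.sub(lambda m: f"<p class={m.group(1)}>[content]</p>", html)
print(result)

# For comparison: with "." matching newlines, the greedy version treats
# both paragraphs as one single match instead of two.
GREEDY = re.compile(r'<p class=(.*)>(.*)</p>', re.DOTALL)
greedy_matches = GREEDY.findall(html)
print(len(greedy_matches))
```

The lazy pattern finds each paragraph separately, while the greedy one runs from the first opening tag through to the last closing tag.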
