I am writing JavaScript and I have to select some text using a RegExp. Cheat sheets don't help me.

I have a text:

Some dummy text and nothing more.<address style='text-align: right;'><span style='color: #EA5528; font: 13px Arial !important;'>asd</span></address>

So I want to remove everything except the text, that is, the address tag and everything inside it. The expected result:

Some dummy text and nothing more.

Nothing complicated, but I am a novice with RegExps.

3 Answers

1

If you can have nested address tags, a regexp will be quite hard to build.

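If not, you could replace "<address .*?</address>" with "". Note that this pattern expects a space after "<address" (so the tag must have attributes) and that "." does not match line breaks.

JavaScript: .replace(/<address .*?<\/address>/g, "");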

Otherwise, use a parser ;)

Further reading: http://www.regular-expressions.info/repeat.html, section "Laziness Instead of Greediness".
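A minimal sketch of the same lazy-match idea, slightly loosened so that it also handles an address tag without attributes, uppercase tag names, and line breaks inside the tag. The helper name stripAddress is hypothetical, and the sketch still assumes address tags are not nested:

```javascript
// Hypothetical helper: removes every non-nested <address>...</address> block.
function stripAddress(html) {
  // \b      accepts "<address>" as well as "<address style=...>"
  // [\s\S]  matches any character, including line breaks (unlike ".")
  // *?      is lazy, so each match stops at the nearest closing tag
  // g, i    replace all occurrences, ignoring tag-name case
  return html.replace(/<address\b[\s\S]*?<\/address>/gi, "");
}

const input =
  "Some dummy text and nothing more." +
  "<address style='text-align: right;'>" +
  "<span style='color: #EA5528; font: 13px Arial !important;'>asd</span>" +
  "</address>";

console.log(stripAddress(input)); // prints "Some dummy text and nothing more."
```

If the markup can contain nested address tags or comes from an untrusted source, a DOM-based approach is the safer choice.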


1 Comment

.replace(/<address .*?<\/address>/g, "") is what I wanted. Works fine, thanks.
1

How about making an element from the HTML and selecting the first child? Let your browser do the heavy lifting:

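// createElement needs a tag name; the original post called it with no argument,
// which current browsers reject with a TypeError.
var elem = document.createElement("div");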
elem.innerHTML = "Some dummy text and nothing more.<address style='text-align: right;'><span style='color: #EA5528; font: 13px Arial !important;'>asd</span></address>";
console.log(elem.firstChild.nodeValue);


This creates an empty HTML element, then sets its inner HTML to your markup. Now your browser sees the whole thing as something like:

<Node>
    Some dummy text and nothing more.<address style='text-align: right;'><span style='color: #EA5528; font: 13px Arial !important;'>asd</span></address>
</Node>

The browser also breaks unwrapped text down into "text nodes". So the firstChild of the Node element that you created is the block of text (or pretty much anything that isn't wrapped in HTML tags):

Some dummy text and nothing more.

2 Comments

It works. But how? You create an element, then paste in text with HTML, and in the output I see only the text. Magic! What does document.createElement() without arguments mean?
Added explanation. I guess I shouldn't be able to create an element with an empty tagName (that's the first argument to createElement), but it works. You could easily replace it with a placeholder element such as: document.createElement("span")
0

Don't use regular expressions to parse HTML...

Get the node your text is in, loop over its childNodes, skip the address nodes, and gather the text of the remaining nodes.

Something like this might work:

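var element = document.getElementById('message'),
    result = '', i = 0;

for (i = 0; i < element.childNodes.length; i++)
{
    var e = element.childNodes.item(i);
    // "is" is not a JavaScript operator; instanceof does the type check
    if (e instanceof HTMLElement && e.localName.toUpperCase() == 'ADDRESS')
    {
        // skip these
    } else {
        // textContent exists on text nodes too; innerText is undefined there
        result += e.textContent;
    }
}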

Note that this is untested, typed in the SO textbox, and provided to illustrate an idea, not to solve the world's problems.
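The loop above only inspects direct children, so an address tag nested inside another element would still contribute its text. A rough sketch of a recursive variant follows; the function name textOutsideAddress is hypothetical, and it relies only on the standard nodeType, nodeName, nodeValue and childNodes properties:

```javascript
// Hypothetical helper: collects text from a node tree, skipping <address> subtrees.
function textOutsideAddress(node) {
  var TEXT_NODE = 3, ELEMENT_NODE = 1; // standard DOM nodeType values
  var result = "";
  var children = Array.from(node.childNodes);
  for (var i = 0; i < children.length; i++) {
    var child = children[i];
    if (child.nodeType === TEXT_NODE) {
      result += child.nodeValue;
    } else if (child.nodeType === ELEMENT_NODE &&
               child.nodeName.toUpperCase() !== "ADDRESS") {
      // Descend into every element except <address>
      result += textOutsideAddress(child);
    }
    // Comments and other node types are ignored
  }
  return result;
}

// Browser-only usage; the id "message" is taken from the snippet above.
if (typeof document !== "undefined") {
  var element = document.getElementById("message");
  if (element) {
    console.log(textOutsideAddress(element));
  }
}
```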
