I am coding a little bookmarklet to convert all the currencies in the current page to another one. I rely heavily on regexes, and I use jQuery to ease the work.

For now, I do it the brute-force way, replacing the whole body:

$("body").children().each(function(){
    var $this = $(this);
    var h = $this.html().replace(/eyes_hurting_regexp/g, "my_super_result");
    $this.html(h);
});

It works fine on static pages, but if JS events are involved, it's an apocalypse.

The only way I can think of is to go through all the nodes, check whether each one contains only text, then replace the text. On heavy HTML markup, I'm worried about performance.

Any ideas?

2 Answers

7

Unfortunately, going through each text node, step-by-step, is the only reliable way to do this. This has worked for me in the past: (demo)

function findAndReplace(searchText, replacement, searchNode) {
    if (!searchText || typeof replacement === 'undefined') {
        // Throw error here if you want...
        return;
    }
    var regex = typeof searchText === 'string' ?
                new RegExp(searchText, 'g') : searchText,
        childNodes = (searchNode || document.body).childNodes,
        cnLength = childNodes.length,
        excludes = 'html,head,style,title,link,meta,script,object,iframe';
    while (cnLength--) {
        var currentNode = childNodes[cnLength];
        if (currentNode.nodeType === 1 &&
            (excludes + ',').indexOf(currentNode.nodeName.toLowerCase() + ',') === -1) {
            // recurse by name; arguments.callee throws in strict mode
            findAndReplace(searchText, replacement, currentNode);
        }
        if (currentNode.nodeType !== 3 || !regex.test(currentNode.data) ) {
            continue;
        }
        var parent = currentNode.parentNode,
            frag = (function(){
                var html = currentNode.data.replace(regex, replacement),
                    wrap = document.createElement('div'),
                    frag = document.createDocumentFragment();
                wrap.innerHTML = html;
                while (wrap.firstChild) {
                    frag.appendChild(wrap.firstChild);
                }
                return frag;
            })();
        parent.insertBefore(frag, currentNode);
        parent.removeChild(currentNode);
    }
}
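
When the replacement is plain text rather than HTML, a lighter variant may be enough: assign to each text node's `data` and leave the elements (and any event handlers bound to them) untouched. The sketch below is only an illustration of that idea, not the answer's code; `replaceInTextNodes` and `SKIPPED_TAGS` are hypothetical names, and the skip list is an assumption.

```javascript
// Hypothetical skip list: elements whose text should never be rewritten.
var SKIPPED_TAGS = ['script', 'style', 'textarea'];

// Walk the subtree under `root` and rewrite text nodes in place.
// Returns the number of text nodes whose content changed.
function replaceInTextNodes(root, regex, replacement) {
    var changed = 0;
    var children = root.childNodes;
    for (var i = 0; i < children.length; i++) {
        var node = children[i];
        if (node.nodeType === 1) {
            // Element node: recurse unless its tag is in the skip list.
            if (SKIPPED_TAGS.indexOf(node.nodeName.toLowerCase()) === -1) {
                changed += replaceInTextNodes(node, regex, replacement);
            }
        } else if (node.nodeType === 3) {
            // Text node: only its data changes, no node is created or removed.
            var updated = node.data.replace(regex, replacement);
            if (updated !== node.data) {
                node.data = updated;
                changed++;
            }
        }
    }
    return changed;
}
```

In a page this would be called as `replaceInTextNodes(document.body, /USD/g, 'EUR')`. Because no nodes are inserted or removed, the replacement cannot introduce markup; use the fragment-based version above when it must.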

4 Comments

Wow, I wish I could vote twice for this one. Full algorithm and confirmation of the pitfall. Thank you so much. :-)
BTW, is that cross-browser, or do you advise me to rewrite it with jQuery?
I forgot to mention: make sure 'searchText' is a regular expression. If it's a string it will be "converted" to a regex anyway, but be aware that characters like '[', '{' and '?' are then treated as regex syntax, not as literal characters, unless they are escaped.
I've tested it in IE, FF and Chrome - seems to work fine. jQuery offers little in the way of text node traversing so I wouldn't bother rewriting it. :)
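
To illustrate the escaping point from the comment above: a small helper can escape regex metacharacters before a plain string is handed to `new RegExp`. This is a sketch, not part of the answer; `escapeRegExp` is a hypothetical name and the sample strings are made up.

```javascript
// Hypothetical helper: prefix every regex metacharacter with a backslash
// so the string is matched character for character.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Without escaping, '$', '.', '(' and ')' would be read as regex syntax.
var literal = new RegExp(escapeRegExp('$9.99 (approx.)'), 'g');
var result = 'Price: $9.99 (approx.)'.replace(literal, '8.50 EUR');
// result is now 'Price: 8.50 EUR'
```

The resulting regex can then be passed as `searchText` in place of the raw string.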
0

I modified the script for my own needs and am posting the new version here in case somebody needs the new features:

  • Can handle a replace callback function.
  • External node blacklist.
  • Some comments so the code won't hurt someone else's eyes :-)

    function findAndReplace(searchText, replacement, callback, searchNode, blacklist) {
    
    var regex = typeof searchText === 'string' ? new RegExp(searchText, 'g') : searchText,
        childNodes = (searchNode || document.body).childNodes,
        cnLength = childNodes.length,
        excludes = blacklist || {'html' : '',
                    'head' : '',
                    'style' : '',
                    'title' : '',
                    'link'  : '',
                    'meta' : '',
                    'script' : '',
                    'object' : '',
                    'iframe' : ''};
    
    while (cnLength--) 
    {
        var currentNode = childNodes[cnLength];
    
        // see http://www.sutekidane.net/memo/objet-node-nodetype.html for constant ref 
    
        // recursive call if the node is of type "ELEMENT" and not blacklisted
        if (currentNode.nodeType === Node.ELEMENT_NODE &&
            !(currentNode.nodeName.toLowerCase() in excludes)) {
            findAndReplace(searchText, replacement, callback, currentNode, excludes);
        }
    
        // skip to next iteration if the data is not a text node or a text that matches nothing
        if (currentNode.nodeType !== Node.TEXT_NODE || !regex.test(currentNode.data) ) {
            continue;
        }
    
        // generate the new value
        var parent = currentNode.parentNode;
        var new_node = (callback 
                       || (function(text_node, pattern, repl) {
                                text_node.data = text_node.data.replace(pattern, repl); 
                                return text_node;
                           }))
                      (currentNode, regex, replacement);
    
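        // swap in the new node; if the callback edited the text node in place
        // and returned it, there is nothing to swap (removing it would delete the text)
        if (new_node !== currentNode) {
            parent.replaceChild(new_node, currentNode);
        }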
    
    }
    }
    

Example of a callback function:

findAndReplace(/foo/gi, "bar", function(text_node, pattern, repl){

    var wrap = document.createElement('span');
    var txt = document.createTextNode(text_node.data.replace(pattern, repl));
    wrap.appendChild(txt);

    return wrap;

});
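
Since both versions hand `replacement` straight to `String.prototype.replace`, it can also be a function that computes a value for each match, which may suit the currency conversion in the question. A minimal sketch follows; the rate, the pattern, and the names `USD_TO_EUR`, `pricePattern` and `toEuros` are assumptions made up for illustration.

```javascript
// Assumed exchange rate, for illustration only.
var USD_TO_EUR = 0.9;

// Matches amounts such as "$10" or "$7.50" and captures the number.
var pricePattern = /\$(\d+(?:\.\d+)?)/g;

// Replacement function: receives the whole match and the captured amount.
function toEuros(match, amount) {
    return (parseFloat(amount) * USD_TO_EUR).toFixed(2) + ' EUR';
}

var converted = 'Was $10, now $7.50'.replace(pricePattern, toEuros);
// converted is now 'Was 9.00 EUR, now 6.75 EUR'
```

With the function above, the call would look like `findAndReplace(pricePattern, toEuros)`.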

Thanks again, J-P, for this very helpful piece of code.
