2

I'm trying to get only an element's own text content, ignoring the text of its descendants. For instance, given this HTML:

<p>hello <h1> World </H1> </p>

For the <p> element, the output should be ONLY "hello ".

I have checked the "element.textContent" property, but it returns the text content of a node and all of its descendants (in my example it would also include "World").

Thanks,

3
  • That would be the content of h1, not p. Commented Sep 20, 2013 at 10:04
  • Your markup is incorrect. <h1> element can't be inside <p>. Commented Sep 20, 2013 at 10:22
  • 1
    BTW your HTML code is broken, so most valid solutions posted below won't work. (you can't have block-level elements inside <p />). Commented Sep 20, 2013 at 10:22

7 Answers

3

Considering this HTML:

<div id="gettext">hello <p> not this </p> world?</div>

Do you want to extract "hello" AND "world"? If yes, then:

var div = document.getElementById('gettext'), // get a reference to the element
    children = [].slice.call(div.childNodes), // get all the child nodes
                                              // and convert them to a real array  
    text = children.filter(function(node){
        return node.nodeType === 3;           // filter-out non-text nodes
    })
    .map(function( t ){ 
        return t.nodeValue;                   // convert nodes to strings 
    });    

console.log( text.join('') );                 // text is an array of strings.

http://jsfiddle.net/U7dcw/
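The same idea can be wrapped in a reusable function. This is only a minimal sketch: the name `getOwnText` and the plain-object stand-in `fakeDiv` are assumptions made for illustration (they are not part of the answer), and the stand-in exists only so the snippet can also run outside a browser. In a page you would pass a real element, for example the result of `document.getElementById('gettext')`.

```javascript
// Hypothetical helper (name is an assumption): returns the text that belongs
// directly to `element`, skipping any text inside its child elements.
var TEXT_NODE = 3; // numeric value of Node.TEXT_NODE

function getOwnText(element) {
    return Array.prototype.slice.call(element.childNodes) // NodeList -> real array
        .filter(function (node) { return node.nodeType === TEXT_NODE; })
        .map(function (node) { return node.nodeValue; })
        .join('');
}

// Plain-object stand-in for <div>hello <p> not this </p> world?</div>.
// Only the two properties the helper reads are modelled here.
var fakeDiv = {
    childNodes: [
        { nodeType: 3, nodeValue: 'hello ' },   // text node
        { nodeType: 1, nodeValue: null },       // the <p> element
        { nodeType: 3, nodeValue: ' world?' }   // text node
    ]
};

console.log(getOwnText(fakeDiv)); // "hello  world?" (both spaces are kept)
```

Whitespace is preserved as it appears in the markup, so you may want to trim or collapse it afterwards depending on what you need.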


3 Comments

You should also add a notice about browser support of filter and map methods.
IE9+ and every other sane browser. For IE8 and below you need to polyfill Array.prototype.map and Array.prototype.filter.
+1. This is the only answer here that really deserves the upvote.
1

With jQuery you can do the following; each step is explained in the comments:

$("p").clone()      // clone the element
      .children()   // select all child elements of the clone
      .remove()     // remove those child elements
      .end()        // go back to the cloned parent
      .text();      // read the text that is left

1 Comment

The question isn't tagged jQuery so I wouldn't assume its presence ;)
1

My answer is the same as the one given in a couple of other answers. However, let me try to offer an explanation.

<p >hello<h1>World</h1> </p>

This line will be rendered as

hello

World

If you inspect the resulting DOM, it looks like this:

<p>hello</p>
<h1>World</h1> 
<p></p>

The closing </p> tag is optional when the paragraph is immediately followed by a block-level element such as <h1>, so the browser closes the <p> implicitly before the <h1>. Check this article

Now you can select the content of the first p tag simply by using the following code:

var p = document.getElementsByTagName('p');
console.log(p[0].textContent);

JS FIDDLE

12 Comments

@BOTH The markup is incorrect. <h1> element can't be inside <p>. The browser tries to repair it pushing <h1> out of <p>. Replace <p> with <div> and you'll see the difference.
Then say it, instead of just saying "Wrong". The spirit of SO is to learn stuff. If you just say "wrong", even when you are right, while the output is clearly the right string, people get confused.
I know, AFTER commenting on various answers saying just "Wrong, it will output everything"
@VisioN Are YOU ok? xD If you noticed, after you provided the explanation, I deleted my own answer as I realized it's wrong, it's you who can't accept some constructive criticism because of your ego, we are just saying that if you go around just saying "You are wrong" even when they are getting the correct output (in a wrong way) people get confused xD
@VisioN, your one downvote doesn't make a difference, buddy. Three other people have upvoted my answer, so I guess that spoils your party, doesn't it? Think about it: what did you really achieve after all this? Food for thought.
0

You can use the childNodes property, i.e.:

var p = document.querySelector('p');
p.childNodes[0].nodeValue; // the first child is a text node; its value is "hello "

jsFiddle

1 Comment

how about <p>hello <h1> World </H1> THIS? </p> :)
0

Change your HTML to:

<p id="id1">hello <h1> World </h1> </p>

Then use this script:

alert(document.getElementById("id1").firstChild.nodeValue);


0

Give an id to the element you want to work with.

Below is a working example; it shows "hello" as the output, as you expected.


<!DOCTYPE html>
<html>
<head>
<script type="text/javascript">
function showParagraph()
{
   alert(document.getElementById('test').innerHTML);

}
</script>
</head>

<body>
<p id="test">hello <h1> World </H1> </p>
<input type="button" onclick="showParagraph()" value="show paragraph" />
</body>

</html>


0

Plain text is represented in the DOM as nodes named #text. You can read the childNodes property of the p element, iterate over the items, check the nodeName property of each one, and select just the #text nodes.

The function below loops over the direct child nodes of document.body and collects just the #text items:

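function myFunction()
{
    var txt = "";
    var c = document.body.childNodes;      // direct children of <body>
    for (var i = 0; i < c.length; i++)     // declare i so it does not leak as a global
    {
        if (c[i].nodeName == "#text")
            txt = txt + c[i].nodeValue + "<br>";   // append the text itself, not the node name
    }
    return txt;
}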

EDIT:

As @VisioN said in the comments, using nodeType is safer (for browser compatibility) and is recommended.
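A loop that checks nodeType instead of nodeName could look roughly like this. It is only a sketch under stated assumptions: the function name `collectDirectText` and the plain-object stand-in `fakeParent` are made up for illustration, and the stand-in is there only so the snippet runs without a browser. The loop avoids filter and map, so it should not need polyfills in older browsers.

```javascript
// Hypothetical variant of the loop above (name is an assumption): collects
// the values of the direct text-node children of `parent` using nodeType.
function collectDirectText(parent) {
    var parts = [];
    var nodes = parent.childNodes;
    for (var i = 0; i < nodes.length; i++) {
        if (nodes[i].nodeType === 3) {          // 3 is the value of Node.TEXT_NODE
            parts.push(nodes[i].nodeValue);
        }
    }
    return parts;
}

// Plain-object stand-in for <div>hello <h1> World </h1> </div>.
// The text inside <h1> belongs to the <h1>, so it is not a direct child here.
var fakeParent = {
    childNodes: [
        { nodeType: 3, nodeValue: 'hello ' },   // text node
        { nodeType: 1, nodeValue: null },       // the <h1> element
        { nodeType: 3, nodeValue: ' ' }         // trailing whitespace text node
    ]
};

console.log(collectDirectText(fakeParent).join('')); // "hello  "
```

In a page you would pass a real element, for example `document.body`, instead of the stand-in object.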

2 Comments

Is #text browser consistent? I'd better go for .nodeType === 3.
Yes, you're right. It would be much better to use nodeType for this purpose.
