The sample code below currently gets an HTML page, and tries to read it into an array. The AJAX is working perfectly, and I can get a nodelist object successfully. Is it possible to somehow read this page into an array and not one singular object? Eventually I need to pull out every single member of this array individually as I am attempting in the for loop below:

$.ajax({
 url: "/thePageToScrape.html",
 dataType: 'text',
 success: function(data) {
      var elements = $("<div>").html(data)[0].getElementsByTagName("body");
      for(var i = 0; i < elements.length; i++) {
           var theText = elements[i].firstChild.nodeValue;
           // Do something here
      }
 }
});
  • An array of what? Are you trying to get DOM objects out? Commented Jul 26, 2013 at 15:07
  • elements is not an array, but a NodeList instance, there's a difference! Commented Jul 26, 2013 at 15:11
  • I am pulling data from a database with JavaScript and using document.write("Array of Numbers") to write it to the screen. This is the only thing on the screen. Commented Jul 26, 2013 at 15:11
  • I am sorry, I know elements is not an array but I was hoping to make it one somehow Commented Jul 26, 2013 at 15:12
  • @user2577829: Array.prototype.slice.apply(elements, [0]); turns it into an array Commented Jul 26, 2013 at 15:14

3 Answers

If all you want, like you stated in your comment, is to turn the NodeList into an array:

elements = Array.prototype.slice.apply(elements);

That's all, really.
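As a minimal sketch of that conversion, the snippet below uses a plain array-like object as a stand-in for the NodeList so that it can run outside a browser. The stand-in data and the variable names are assumptions for illustration, and Array.from is an ES2015 alternative that the answer itself does not mention.

```javascript
// A stand-in for a NodeList: indexed entries plus a length, but no array methods.
// (Hypothetical data; in the browser this would be the getElementsByTagName result.)
var arrayLike = { 0: "first", 1: "second", 2: "third", length: 3 };

// Classic approach, as in the answer: borrow Array.prototype.slice.
var viaSlice = Array.prototype.slice.call(arrayLike);

// ES2015 alternative (an addition, not part of the answer): Array.from.
var viaFrom = Array.from(arrayLike);

// Both results are real arrays, so array methods and indexed loops work.
var upper = viaSlice.map(function (item) {
  return item.toUpperCase();
});
```

Once the collection is a real array, each member can be read individually with elements[i] inside the loop.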

5 Comments

Yes, thank you. A follow-up question, though: what does the [0] in this line mean exactly, if it is not getting the array there? "var elements = $("<div>").html(data)[0].getElementsByTagName("body");"
It refers to the first result in the jQuery object. Even results with a single node are treated as a list.
@user2577829: jQuery objects are, in essence, array-like objects that wrap standard DOM elements. By applying either $().get(0) or $()[0], you get the DOM element itself. In this case, you're calling getElementsByTagName('body') on that DOM element, which returns a NodeList-style collection (empty if nothing matches), never a true array.
I changed this line var elements = $("<div>").html(data)[0].get("body"); to var elements = $().get(0) and when I try to console.log(elements.length); it comes up as elements is not defined
How about just $("<div>").html(data)[0]? jQuery returns the body by default. I just tried this in console jQuery('<html><head></head><body><p>Foobar</p></body></html>')

If you are using jQuery, you can get a list of each node immediately below the body with

var elements = $(data).children("body").children();

or every node with

var elements = $(data).children("body *");

You can then loop over them with

$.each(elements, function(index, value) {
  // Inside $.each, `this` is a plain DOM element, so wrap it in $() to call .text()
  var text = $(this).text();
  // ..do something with text
});

It looks like the $.parseHTML() method does exactly what you want:

Description: Parses a string into an array of DOM nodes.

var arrElements = $.parseHTML(data);
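To show how this might fit the question's case, here is a rough sketch that reads the text of each node in the returned array. The helper name collectText and the stand-in nodes are hypothetical; the commented lines assume jQuery 1.8 or later (where $.parseHTML was introduced) and would go inside the success callback.

```javascript
// Hypothetical helper: collect the non-empty text of each node in an array of DOM nodes.
function collectText(nodes) {
  var texts = [];
  nodes = nodes || []; // some jQuery versions return null for empty input
  for (var i = 0; i < nodes.length; i++) {
    // textContent is available on both element nodes and text nodes.
    var text = nodes[i].textContent;
    if (text && text.trim() !== "") {
      texts.push(text.trim());
    }
  }
  return texts;
}

// In the browser, inside the success callback (assumes jQuery 1.8+ is loaded):
//   var arrElements = $.parseHTML(data);
//   var texts = collectText(arrElements);

// Stand-in nodes so the sketch can run without a DOM.
var fakeNodes = [{ textContent: " 12 " }, { textContent: "\n" }, { textContent: "34" }];
var texts = collectText(fakeNodes);
```

Because $.parseHTML already returns a true array, no conversion step is needed before indexing into the result.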

1 Comment

Do you know if there is an example similar to mine on stack overflow? I do not quite understand how to implement it for my case based on the link you posted
