I'm new to Node.js and MongoDB. To test out some ideas, I'm building a web scraper that adds data to a MongoDB database. My Node.js script connects to MongoDB and inserts the scraped data dynamically. I run it from the console with 'node scrapeData.js'. The script runs without any errors, but when I open the mongo shell and run db.posts.find(), I get 0 records. I know the scrape itself works, because the script logs the array of data to the console. I'm not sure where I'm going wrong. Here's my script:

var MongoClient = require('mongodb').MongoClient;


//requiring the module so we can access them later on
var request = require("request");
var cheerio = require("cheerio");

MongoClient.connect('mongodb://127.0.0.1:27017/mydb', function (err, db) {
    if (err) {
        throw err;
    } 
    else {
        console.log("successfully connected to the database");

        //define url to download 
        var url = "http://www.nyxcosmetics.ca/en_CA/highlight-contour";

        var prodList = [];
        var priceList = [];
        var products = [];

        request(url, function(error, response, body) {
            if(!error) {


                //load page into cheerio
                var $ = cheerio.load(body);

                $(".product_tile_wrapper").each(function(i, elem) {
                    prodList[i] = $(this).find($(".product_name")).attr("title");
                    priceList[i] = $(this).find($(".product_price")).attr("data-pricevalue");
                });
                prodList.join(', ');

                for(var i = 0; i < prodList.length; i++) {
                    var prod = {
                        name: prodList[i], 
                        price: priceList[i]
                    };
                    products.push(prod);
                }

                console.log(products); //print the array of scraped data

                //insert the prods into the database
                //in the 'posts' collection
                var collection = db.collection('posts');
                collection.insert(products);
                console.log("products inserted into posts collection");

                //Locate all the entries using find
                collection.find().toArray(function(err, results) {
                    console.log(results);
                });

            } else {
                console.log("We've encountered an error!")
            }
        });
    }

    db.close();
});

1 Answer

Just a few hints:

  • Can you check which version of the MongoDB Node.js driver you are using? 1.x and 2.x are very different.
  • In the 2.x driver, db.collection.insert() is deprecated in favor of the more verbose .insertOne() and .insertMany().
  • You can debug what happened with your insert by providing a write callback, e.g.

collection.insert(products, function(error,result) { console.log(error); console.log(result); })
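For the 2.x driver, the same debugging idea can be sketched with .insertMany() instead of the deprecated .insert(). This is only an illustration: saveProducts is a hypothetical helper name, and collection is assumed to be the object returned by db.collection('posts') in the question's script.

```javascript
// Sketch only: `saveProducts` is a hypothetical helper, not part of the driver.
// `collection` is assumed to come from db.collection('posts') on a 2.x driver.
function saveProducts(collection, products, done) {
    collection.insertMany(products, function (err, result) {
        if (err) {
            // Surface the failure instead of silently dropping it
            console.log("insert failed:", err.message);
            return done(err);
        }
        // insertedCount is the number of documents the driver reports as written
        console.log("inserted " + result.insertedCount + " documents");
        done(null, result.insertedCount);
    });
}
```

Passing a callback is what makes a failed write visible; without one, the script in the question has no way to report that the insert never reached the server.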

1 Comment

Thanks for your response! It turns out the version wasn't the issue. I debugged with your callback code above and got an error like 'mongoError: Topology was destroyed'. It had to do with the db.close() statement: in my script it runs right after connecting, before the asynchronous request and insert have finished. I'm really used to SQL, so closing the database connection there seemed normal to me, but apparently closing it at that point in a Node.js script is what broke the insert. This post helped me as well: stackoverflow.com/questions/30425739/…
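One way to keep a db.close() call without hitting that error is to move it into the insert callback, so it runs only after the driver has reported the result of the write. The following is a minimal sketch under that assumption; insertThenClose is a hypothetical helper name, and db is assumed to be the object passed to the MongoClient.connect callback in the question.

```javascript
// Sketch only: `insertThenClose` is a hypothetical helper, not a driver API.
// `db` is assumed to be the connected database object from MongoClient.connect.
function insertThenClose(db, docs, done) {
    var collection = db.collection('posts');
    // collection.insert() mirrors the question's script
    collection.insert(docs, function (err, result) {
        // The write has finished (or failed) by the time this callback runs,
        // so closing here cannot cut the insert off.
        db.close();
        done(err, result);
    });
}
```

In the question's script this would be called from inside the request callback, in place of the bare collection.insert(products), with the db.close() at the end of the connect callback removed.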