Scrape a table using Google Apps Script

Question

I would love to scrape data from this website: https://finviz.com/screener.ashx?v=141&f=sh_avgvol_o500,sh_curvol_o2000,sh_price_u50&o=-volume

I want to scrape the whole table. I tried this:

function myFunction(start) {
    var url = "https://finviz.com/screener.ashx?v=141&f=sh_avgvol_o500,sh_curvol_o2000,sh_price_u50&o=-volume&r=" + start;
    var fromText = '<tbody>';
    var toText = '</tbody>';
    var content = UrlFetchApp.fetch(url).getContentText();
    var scraped = Parser
              .data(content)
              .from(fromText)
              .to(toText)
              .iterate();
}

I could scrape every element using XPath, but I think that would be quite slow.

Here is the html and the table:

Can I get the whole table? Thanks.

Parser isn't a built-in Google Apps Script class. Are you using the library mentioned in the accepted answer?
– Wicket, Commented Oct 27, 2018 at 23:28

Tanaike · Accepted Answer · 2018-10-29 00:02:20Z

4

How about the following modification? It writes the retrieved data to the active sheet.

NOTE: Parser is a GAS library. You can find the details at https://www.kutil.org/2016/01/easy-data-scrapping-with-google-apps.html

Modified script:

function myFunction(start) {
  var url = "https://finviz.com/screener.ashx?v=141&f=sh_avgvol_o500,sh_curvol_o2000,sh_price_u50&o=-volume&r="+ start;
  var content = UrlFetchApp.fetch(url).getContentText();
  var scraped = Parser.data(content).from('class=\"screener-body-table-nw\"').to('</td>').iterate();
  var res = [];

  // If you don't want column titles, please remove this part.
  var temp = [];
  var titles = Parser.data(content).from("style=\"cursor:pointer;\">").to("</td>").iterate();
  titles.forEach(function(e){
    if (!~e.indexOf('\">')) {
      temp.push(e);
    } else if (~e.indexOf('img')) {
      temp.push(e.replace(/<img.+>/g, ''));
    }
  });
  res.push(temp);
  // -----

  var temp = [];
  var oticker = "";
  scraped.forEach(function(e){
    var ticker = Parser.data(e).from("<a href=\"quote.ashx?t=").to("&").build();
    var data1 = Parser.data(e).from("screener-link\">").to("</a>").build();
    var data2 = Parser.data(data1).from(">").to("<").build();
    if (oticker == "") oticker = ticker;
    if (ticker != oticker) {
      temp.splice(1, 0, oticker);
      res.push(temp);
      temp = [];
      oticker = ticker;
      temp.push(data1);
    } else {
      if (!~(data2 || data1).indexOf('<')) temp.push(data2 || data1);
    }
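  });
  // The loop only pushes a row when the ticker changes, so push the last row here.
  if (temp.length > 0) {
    temp.splice(1, 0, oticker);
    res.push(temp);
  }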
  var ss = SpreadsheetApp.getActiveSheet();
  ss.getRange(ss.getLastRow() + 1, 1, res.length, res[0].length).setValues(res);
}
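
The loop in the script builds rows by collecting cells until the ticker changes. As a rough illustration of that pattern on its own, here is a minimal sketch using hypothetical data; `groupCellsByTicker` and the `sample` array are made up for this example and are not part of the answer or of the Parser library. Note that the last group has to be pushed after the loop ends, because no further ticker change follows it.

```javascript
// Hypothetical input: a flat list of cells, each tagged with the ticker of
// the row it belongs to (a stand-in for what the Parser calls extract).
function groupCellsByTicker(cells) {
  var rows = [];
  var current = [];
  var currentTicker = null;
  cells.forEach(function (cell) {
    if (currentTicker !== null && cell.ticker !== currentTicker) {
      rows.push(current); // ticker changed: the previous row is complete
      current = [];
    }
    currentTicker = cell.ticker;
    current.push(cell.value);
  });
  // No ticker change follows the last row, so push it explicitly.
  if (current.length > 0) rows.push(current);
  return rows;
}

// Made-up sample data: two tickers with two cells each.
var sample = [
  { ticker: "AAA", value: "1" },
  { ticker: "AAA", value: "10.5" },
  { ticker: "BBB", value: "2" },
  { ticker: "BBB", value: "7.25" }
];
var grouped = groupCellsByTicker(sample);
```

With this sample, `grouped` holds one array per ticker, including the final one.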

Result :

edited Oct 29, 2018 at 0:02

answered Sep 26, 2017 at 5:15

Tanaike


5 Comments

VincFort Over a year ago

Thanks a lot! I don't understand why it only gets the first 19 results when the page has 20.

Tanaike Over a year ago

@VincFort You're welcome, and thank you too. It may be due to how the site is built, but I don't know for sure. I'm sorry.

Md Khairul Islam Over a year ago

@Tanaike I'm getting this error: ReferenceError: "Parser" is not defined. (line 4, file "Code")

Tanaike Over a year ago

@Md. Khairul Islam I'm sorry. I forgot to explain "Parser" in my answer. It's a GAS library. You can find the details here: kutil.org/2016/01/easy-data-scrapping-with-google-apps.html

Bruno Over a year ago

Not safe: "lib:parser:8" wants access to "gdrive".
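
If granting a third-party library that access is a concern, the from/to extraction used in the answer can probably be approximated with plain string functions. The following is only a sketch: `extractAllBetween` and `extractBetween` are hypothetical helpers written for this example, they are not the Parser library's implementation, and edge-case behavior may differ from the library's `iterate()` and `build()`.

```javascript
// Returns every substring of text that lies between fromText and toText.
// Hypothetical stand-in for the from(...).to(...).iterate() pattern.
function extractAllBetween(text, fromText, toText) {
  var results = [];
  if (!fromText || !toText) return results; // avoid looping on empty markers
  var pos = 0;
  while (true) {
    var start = text.indexOf(fromText, pos);
    if (start === -1) break;
    start += fromText.length;
    var end = text.indexOf(toText, start);
    if (end === -1) break;
    results.push(text.substring(start, end));
    pos = end + toText.length; // continue searching after this match
  }
  return results;
}

// Returns only the first match, or "" if there is none.
// Hypothetical stand-in for the from(...).to(...).build() pattern.
function extractBetween(text, fromText, toText) {
  var all = extractAllBetween(text, fromText, toText);
  return all.length > 0 ? all[0] : "";
}

// Made-up HTML fragment for illustration.
var html = '<td class="x">A</td><td class="x">B</td>';
var cells = extractAllBetween(html, 'class="x">', '</td>');
```

In a script like the one in the answer, the `content` string from `UrlFetchApp.fetch(url).getContentText()` would take the place of `html`.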
