I am trying to practice web scraping on a betting site for UFC fights. I am using JavaScript with the request-promise and cheerio packages.
Site: https://www.oddsshark.com/ufc/odds
I want to scrape the names of the fighters and their respective betting lines from each betting company.
My goal is to end up with something like an array of objects that I can later seed a PostgreSQL database with.
Example of my desired output (doesn't have to be exactly like that but similar):
[
{ fighter 1: 'Khabib Nurmagomedov', openingBetLine: -333, bovadaBetLine: -365, etc. },
{ fighter 2: 'Dustin Poirier', openingBetLine: 225, bovadaBetLine: 275, etc. },
{ fighter 3: etc.},
{ fighter 4: etc.}
]
Below is the code I have so far. I am a noob at this:
const rp = require("request-promise");
// cheerio to parse HTML
const $ = require("cheerio");
const url = "https://www.oddsshark.com/ufc/odds";
rp(url)
  .then(function(html) {
    // it worked :)
    // console.log("MMA page:", html);
    // console.log($("big > a", html).length);
    // console.log($("big > a", html));
    console.log($(".op-matchup-team-text", html).length);
    console.log($(".op-matchup-team-text", html));
  })
  // why isn't catch working?
  .catch(function(error) {
    // an empty handler swallows the error silently, so log it
    console.error("Request or parsing failed:", error);
  });
My code above returns indexes as keys with nested objects as values. Below is just one of them as an example.
{ '0':
{ type: 'tag',
name: 'span',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object: null prototype] { class: 'op-matchup-team-text' },
'x-attribsNamespace': [Object: null prototype] { class: undefined },
'x-attribsPrefix': [Object: null prototype] { class: undefined },
children: [ [Object] ],
parent:
{ type: 'tag',
name: 'div',
namespace: 'http://www.w3.org/1999/xhtml',
attribs: [Object],
'x-attribsNamespace': [Object],
'x-attribsPrefix': [Object],
children: [Array],
parent: [Object],
prev: [Object],
next: [Object] },
prev: null,
next: null },
I don't know what to do from here. Am I calling the right class (op-matchup-team-text)? If so, how do I extract the fighter names and betting line tag elements from the website?
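The objects logged above are the raw parsed nodes, not their text. A minimal sketch of pulling the text out with cheerio's `.text()` follows, assuming the `op-matchup-team-text` spans really do hold the fighter names. The helper names `normalizeName` and `extractFighterNames` are made up for illustration.

```javascript
// Collapse runs of whitespace and trim, since scraped text often carries
// newlines and indentation from the page markup.
function normalizeName(rawText) {
  return rawText.replace(/\s+/g, " ").trim();
}

// Hypothetical helper: returns the text of every element with the class
// used in the question. cheerio is required lazily so that normalizeName
// can be used on its own.
function extractFighterNames(html) {
  const cheerio = require("cheerio");
  const $ = cheerio.load(html);
  return $(".op-matchup-team-text")
    .map((index, element) => normalizeName($(element).text()))
    .get(); // .get() turns the cheerio collection into a plain array
}
```

It could be called from the existing chain as `rp(url).then(html => console.log(extractFighterNames(html)))`.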
////////////////////////////////////////////////////////////////////////// UPDATE 1 ON ORIGINAL POST //////////////////////////
Update: Using Henk's suggestion, I'm able to scrape the fighter names. Using the same code as a template, I was able to scrape the fighters' betting lines as well.
BUT I don't know how to get both into one object. For example, how do I associate each betting line with the fighter it belongs to?
Below is my code for scraping the OPENING company's betting line:
const rp = require("request-promise");
const cheerio = require("cheerio");
const url = "https://www.oddsshark.com/ufc/odds";
rp(url)
  .then(function(html) {
    const $ = cheerio.load(html);
    const openingBettingLine = [];
    // each matching div carries one opening moneyline in a data attribute
    $("div.op-item.op-spread.op-opening").each((index, currentDiv) => {
      const openingBet = {
        opening: JSON.parse(currentDiv.attribs["data-op-moneyline"]).fullgame
      };
      openingBettingLine.push(openingBet);
    });
    console.log("openingBettingLine array test 2:", openingBettingLine);
  })
  // why isn't catch working?
  .catch(function(error) {
    // an empty handler swallows the error silently, so log it
    console.error("Request or parsing failed:", error);
  });
It logs the following to the console:
openingBettingLine array test 2: [ { opening: '-200' },
{ opening: '+170' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '' },
{ opening: '+105' },
{ opening: '-135' },
{ opening: '-165' },
{ opening: '+135' },
{ opening: '-120' },
{ opening: '-110' },
{ opening: '-135' },
{ opening: '+105' },
{ opening: '-165' },
{ opening: '+135' },
{ opening: '-115' },
{ opening: '-115' },
{ opening: '-145' },
{ opening: '+115' },
{ opening: '+208' },
{ opening: '-263' },
etc.
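The values above are strings, and many of them are empty, while the desired output uses numbers. A minimal sketch of a conversion step follows, assuming the `data-op-moneyline` attribute always holds JSON with a `fullgame` field as in the code above. The function names are hypothetical.

```javascript
// Convert a moneyline such as "-200" or "+170" to a number.
// Empty, missing, or non-numeric values become null.
function toMoneylineNumber(text) {
  if (text === null || text === undefined) return null;
  const trimmed = String(text).trim();
  if (trimmed === "") return null;
  const value = Number(trimmed);
  return Number.isFinite(value) ? value : null;
}

// Parse the raw attribute string; a missing attribute or malformed JSON
// yields null instead of throwing inside the .each() callback.
function parseFullgameMoneyline(attributeValue) {
  if (!attributeValue) return null;
  try {
    const parsed = JSON.parse(attributeValue);
    return toMoneylineNumber(parsed.fullgame);
  } catch (error) {
    return null;
  }
}

console.log(parseFullgameMoneyline('{"fullgame":"+170"}'));
console.log(parseFullgameMoneyline('{"fullgame":""}'));
```

Returning null for empty lines keeps every fighter in the array, which matters if the values are later matched to names by position.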
My desired output is still the same (example below). So how would I get the openingBettingLine values into the objects associated with each fighter?
[
{ fighter 1: 'Khabib Nurmagomedov', openingBetLine: -333, bovadaBetLine: -365, etc. },
{ fighter 2: 'Dustin Poirier', openingBetLine: 225, bovadaBetLine: 275, etc. },
{ fighter 3: etc.},
{ fighter 4: etc.}
]
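One way to get that shape is to scrape each column into its own array and then join the arrays by position. The sketch below assumes the name array and every line array come out in the same page order with one entry per fighter, which should be checked against the real page (for instance by comparing lengths). `combineByIndex` is a made-up helper, and the sample values are the ones from the desired output.

```javascript
// Join a list of names with per-company line arrays by array position.
// linesByBook maps an output key (e.g. "openingBetLine") to an array of
// values in the same order as names. Missing entries become null.
function combineByIndex(names, linesByBook) {
  return names.map((name, index) => {
    const row = { fighter: name };
    for (const [book, lines] of Object.entries(linesByBook)) {
      row[book] = index < lines.length ? lines[index] : null;
    }
    return row;
  });
}

const rows = combineByIndex(
  ["Khabib Nurmagomedov", "Dustin Poirier"],
  { openingBetLine: [-333, 225], bovadaBetLine: [-365, 275] }
);
console.log(rows);
```

If the arrays turn out not to line up, an alternative is to loop over whatever element wraps a single fighter's row and read the name and lines from inside it, so the association comes from the markup rather than from the index.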
////////////////////////////////////////////////////////////////////////// UPDATE 2 ON ORIGINAL POST //////////////////////////
I can't get the BOVADA company's betting line to scrape. I isolated the code to just this company below.
// BOVADA betting line array --> not working
const rp = require("request-promise");
const cheerio = require("cheerio");
const url = "https://www.oddsshark.com/ufc/odds";
rp(url)
  .then(function(html) {
    const $ = cheerio.load(html);
    const bovadaBettingLine = [];
    // each matching div should carry one Bovada moneyline in a data attribute
    $("div.op-item.op-spread.border-bottom.op-bovada.lv").each(
      (index, currentDiv) => {
        const bovadaBet = {
          BOVADA: JSON.parse(currentDiv.attribs["data-op-moneyline"]).fullgame
        };
        bovadaBettingLine.push(bovadaBet);
      }
    );
    console.log("bovadaBettingLine:", bovadaBettingLine);
  })
  // why isn't catch working?
  .catch(function(error) {
    // an empty handler swallows the error silently, so log it
    console.error("Request or parsing failed:", error);
  });
It logs bovadaBettingLine: [], an empty array.
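An empty result means no element in the downloaded HTML carries all five classes at once. cheerio does not run the page's scripts, so a class that the browser adds after load would be missing from what request-promise fetched. Whether that is what happens here is an assumption. The sketch below is a small diagnostic that drops one class at a time to see which one prevents the match; `leaveOneOutSelectors` and `countMatches` are hypothetical names.

```javascript
// For each class, build the selector that omits just that class.
function leaveOneOutSelectors(tagName, classNames) {
  return classNames.map((omitted) => {
    const kept = classNames.filter((name) => name !== omitted);
    const selector = kept.length ? tagName + "." + kept.join(".") : tagName;
    return { omitted, selector };
  });
}

// Count how many elements each candidate selector matches in the HTML.
// cheerio is required lazily so the selector builder works without it.
function countMatches(html, candidates) {
  const cheerio = require("cheerio");
  const $ = cheerio.load(html);
  return candidates.map((candidate) => ({
    omitted: candidate.omitted,
    selector: candidate.selector,
    count: $(candidate.selector).length
  }));
}

const candidates = leaveOneOutSelectors("div", [
  "op-item", "op-spread", "border-bottom", "op-bovada", "lv"
]);
console.log(candidates.map((candidate) => candidate.selector));
```

Inside the existing `.then`, `console.log(countMatches(html, candidates))` would show which omitted class makes the count jump above zero.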
Below is the HTML code for that part of the website.

