I am trying to extract this line from a page:
$ 55 326
I wrote this regex to match the numbers:
player_info['salary'] = re.compile(r'\$ \d{0,3} \d{1,3}')
I get the text with bs4, and it is of type 'unicode':
for a in soup_ntr.find_all('div', id='playerbox'):
    player_box_text = a.get_text()
    print(type(player_box_text))
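For reference, here is a minimal sketch of how the compiled pattern is meant to be applied to that text. The sample string is a hypothetical stand-in for the real output of get_text(), and it assumes the gaps are ordinary ASCII spaces, which may not be true for the actual page.

```python
import re

# Same pattern as in the question, compiled once.
salary_re = re.compile(r'\$ \d{0,3} \d{1,3}')

# Hypothetical stand-in for a.get_text(); the real page text may differ.
player_box_text = u'Name: Some Player\nSalary: $ 55 326\nAge: 24'

# re.compile() only builds the pattern; search() is what looks for a match.
match = salary_re.search(player_box_text)
salary = match.group(0) if match else None
print(repr(salary))
```

With this made-up input the pattern matches, so if the same call returns None on the real text, the text itself probably differs from what is printed on screen.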
I can't seem to get a match. I have also tried these variants of the regex:
player_info['salary'] = re.compile(ur'\$ \d{0,3} \d{1,3}')
player_info['salary'] = re.compile(ur'\$ \d{0,3} \d{1,3}', re.UNICODE)
But I still can't figure out how to get the data. The page I am reading has this header:
Content-Type: text/html; charset=utf-8
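Since the page is UTF-8, one possibility (an assumption, not something confirmed for this page) is that the gaps between the digits are non-breaking spaces (U+00A0) rather than ASCII spaces. The sketch below uses a hypothetical sample string to show how repr() would reveal that, and how a pattern that accepts either kind of space could look.

```python
import re

# Hypothetical sample: the gaps here are non-breaking spaces (U+00A0),
# which look like ordinary spaces when printed.
sample_text = u'Salary: $\xa055\xa0326'

# Pattern from the question: it only accepts ASCII spaces.
plain_re = re.compile(r'\$ \d{0,3} \d{1,3}')

# Variant that accepts either an ASCII space or U+00A0 in each gap.
tolerant_re = re.compile(r'\$[ \xa0]\d{0,3}[ \xa0]\d{1,3}')

# repr() makes hidden characters such as \xa0 visible.
print(repr(sample_text))

plain_match = plain_re.search(sample_text)
tolerant_match = tolerant_re.search(sample_text)

# Normalize the matched text back to ordinary spaces for later use.
normalized = tolerant_match.group(0).replace(u'\xa0', u' ')
print(repr(normalized))
```

If repr(player_box_text) shows plain spaces instead, this is not the cause and the problem lies elsewhere.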
I would appreciate any help figuring this out.