I am trying to scrape http://emojipedia.org/emoji/ , but I am not sure what is the most efficient way to do so. What I would like to scrape is found inside the table class ="emoji_list". I would like to save the stuff inside each "td" in separate columns. The output will be like the following where each line represent an emoji:
Col1_Link Col2_emoji Col3_Comment Col4_UTF
"/emoji/%F0%9F%98%80/" 😀 Grinning Face U+1F600
I have written the following code so far, but I am not sure what is the best way to do that.
import requests
from bs4 import BeautifulSoup
import urllib
import re
url = "http://emojipedia.org/emoji/"
html = urllib.urlopen(url)
soup = BeautifulSoup(html)
soup.findAll('tr', limit=2)
Many thanks in advance for your help.