I am using the code below to try to extract the data from the table at this URL. However, I get the following error message:

Error: `AttributeError: 'NoneType' object has no attribute 'find'` in the line `data = iter(soup.find("table", {"class": "tablestats"}).find("th", {"class": "header"}).find_all_next("tr"))`

My code is as follows:

    from bs4 import BeautifulSoup
    import requests

    r = requests.get(
        "http://www.federalreserve.gov/econresdata/researchdata/feds200628_1.html")
    soup = BeautifulSoup(r.content)

    data = iter(soup.find("table", {"class": "tablestats"}).find("th", {"class": "header"}).find_all_next("tr"))

    headers = (next(data).text, next(data).text)
    table_items = [(a.text, b.text) for ele in data for a, b in [ele.find_all("td")]]

    for a, b in table_items:
        print(u"Date={}, Maturity={}".format(a, b if b.strip() else "null"))

Thank You

2 Answers

    from bs4 import BeautifulSoup
    import requests

    r = requests.get(
        "http://www.federalreserve.gov/econresdata/researchdata/feds200628_1.html")
    soup = BeautifulSoup(r.content)

    # column headers
    h = soup.find_all("th", scope="col")
    # get all the tr tags after the last column header; each row starts
    # with a th holding the date, followed by the td value cells
    final = [[t.th.text] + [ele.text for ele in t.find_all("td")] for t in h[-1].find_all_next("tr")]
    headers = [th.text for th in h]

The `final` list holds every row as its own list:

[['2015-06-05', '4.82039691', '-4.66420959', '-4.18904598', 
'-3.94541434', '1.1477', '2.9361', '3.3588', '0.6943', '1.5881',
 '2.3034', '2.7677', '3.0363', '3.1801', '3.2537', '3.2930', '3.3190', 
'3.3431', '3.3707', '3.4038', '3.4428', '3.4871', '3.5357', '3.5876',
 '3.6419', '3.6975', '3.7538', '3.8100', '3.8656', '3.9202', '3.9734',
 '4.0250', '4.0748', '4.1225', '4.1682', '4.2117', '4.2530', '4.2921',
 '0.3489', '0.7464', '1.1502', '1.4949', '1.7700', '1.9841', '2.1500', 
 '2.2800', '2.3837', '2.4685', '2.5396', '2.6006', '2.6544', '2.7027', 
 '2.7469', '2.7878', '2.8260', '2.8621', '2.8964', '2.9291', '2.9603',
 '2.9901', '3.0187', '3.0461', '3.0724', '3.0976', '3.1217', '3.1448',
 '3.1669', '3.1881', '0.3487', '0.7469', '1.1536', '1.5039', '1.7862',      
 '2.0078', '2.1811', '2.3179', '2.4277', '2.5181', '2.5943', '2.6603', 
 '2.7190', '2.7722', '2.8215', '2.8677', '2.9117', '2.9538', '2.9944', 
 '3.0338', '3.0721', '3.1094', '3.1458', '3.1814', '3.2161', '3.2501',
 '3.2832', '3.3156', '3.3472', '3.3781', '1.40431658', '9.48795888'], 
 ['2015-06-04', '4.64953424', '-4.52780982', '-3.98051369', 
 ......................................

The headers:

['BETA0', 'BETA1', 'BETA2', 'BETA3', 'SVEN1F01', 'SVEN1F04', 'SVEN1F09', 'SVENF01', 'SVENF02', 'SVENF03', 'SVENF04', 'SVENF05', 'SVENF06', 'SVENF07', 'SVENF08', 'SVENF09', 'SVENF10', 'SVENF11', 'SVENF12', 'SVENF13', 'SVENF14', 'SVENF15', 'SVENF16', 'SVENF17', 'SVENF18', 'SVENF19', 'SVENF20', 'SVENF21', 'SVENF22', 'SVENF23', 'SVENF24', 'SVENF25', 'SVENF26', 'SVENF27', 'SVENF28', 'SVENF29', 'SVENF30', 'SVENPY01', 'SVENPY02', 'SVENPY03', 'SVENPY04', 'SVENPY05', 'SVENPY06', 'SVENPY07', 'SVENPY08', 'SVENPY09', 'SVENPY10', 'SVENPY11', 'SVENPY12', 'SVENPY13', 'SVENPY14', 'SVENPY15', 'SVENPY16', 'SVENPY17', 'SVENPY18', 'SVENPY19', 'SVENPY20', 'SVENPY21', 'SVENPY22', 'SVENPY23', 'SVENPY24', 'SVENPY25', 'SVENPY26', 'SVENPY27', 'SVENPY28', 'SVENPY29', 'SVENPY30', 'SVENY01', 'SVENY02', 'SVENY03', 'SVENY04', 'SVENY05', 'SVENY06', 'SVENY07', 'SVENY08', 'SVENY09', 'SVENY10', 'SVENY11', 'SVENY12', 'SVENY13', 'SVENY14', 'SVENY15', 'SVENY16', 'SVENY17', 'SVENY18', 'SVENY19', 'SVENY20', 'SVENY21', 'SVENY22', 'SVENY23', 'SVENY24', 'SVENY25', 'SVENY26', 'SVENY27', 'SVENY28', 'SVENY29', 'SVENY30', 'TAU1', 'TAU2']

10 Comments

@PadraicCunnigham Thank you. However, with the code you provided I get the error `AttributeError: 'NoneType' object has no attribute 'text'`.
It gets all the table data for me, with each row in its own list. Which version of bs4 are you using?
@PadraicCunnigham I have the 3.2 version. Should I install a newer version?
Yes, the latest is > 4.3. Also, what parser are you using?
Actually, you will have one problem: make sure to give the dates a header, `headers.insert(0, "dates"); df = pd.DataFrame(final, columns=headers)`. If you want to save it to a CSV, use `df.to_csv("test.csv", sep="\t")`.
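
The header fix from the last comment can be sketched without pandas, using only the standard `csv` module as a stand-in for `DataFrame.to_csv`. The two-column `headers` and `final` lists below are truncated samples of the output shown in the answer, and `with_date_header` and `to_tsv` are hypothetical helper names. The point is only that each row carries a leading date that the header list lacks.

```python
import csv
import io

# Truncated stand-ins for the scraped `headers` and `final` lists.
headers = ["BETA0", "BETA1"]
final = [
    ["2015-06-05", "4.82039691", "-4.66420959"],
    ["2015-06-04", "4.64953424", "-4.52780982"],
]


def with_date_header(headers, rows, name="dates"):
    """Return a copy of headers with a label prepended for the date column."""
    fixed = [name] + list(headers)
    # After the fix, every row should have one cell per header.
    for row in rows:
        if len(row) != len(fixed):
            raise ValueError(
                "row has %d cells, expected %d" % (len(row), len(fixed)))
    return fixed


def to_tsv(headers, rows):
    """Serialise the header line and the rows as tab-separated text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buf.getvalue()


fixed_headers = with_date_header(headers, final)
tsv_text = to_tsv(fixed_headers, final)
print(tsv_text)
```

With pandas installed, the one-liner from the comment does the same job; the length check here just makes the off-by-one between headers and rows explicit.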

There are several issues in your code.

  1. There is no table with class 'tablestats', so `soup.find(...)` returns `None`, which is what raises the `AttributeError`.
  2. There are no 'th' fields with class 'header'.
  3. The following line:

    table_items = [(a.text, b.text) for ele in data for a, b in [ele.find_all("td")]]

does not yield exactly two values per row, so the result cannot be unpacked into a, b.
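
The third point can be reproduced without the web page. This is a minimal sketch, assuming each row yields one list of cell texts. The sample `rows` are invented for illustration, and `strict_pairs` and `tolerant_rows` are hypothetical names. Unpacking into exactly two names fails as soon as a row has any other number of cells, while star-unpacking accepts any non-empty row.

```python
# Invented cell texts, standing in for [td.text for td in row.find_all("td")].
rows = [
    ["2015-06-05", "4.8204"],             # two cells: unpacks fine
    ["2015-06-04", "4.6495", "-4.5278"],  # three cells: breaks `a, b = ...`
]


def strict_pairs(rows):
    """Mirror the question's comprehension: exactly two cells per row."""
    return [(a, b) for cells in rows for a, b in [cells]]


def tolerant_rows(rows):
    """Keep the first cell and collect the rest, skipping empty rows."""
    out = []
    for cells in rows:
        if not cells:
            continue
        first, *rest = cells
        out.append((first, rest))
    return out


try:
    strict_pairs(rows)
    error = None
except ValueError as exc:
    # Raised by the three-cell row: it cannot be unpacked into a, b.
    error = exc

print(error)
print(tolerant_rows(rows))
```

The same defensive habit applies to the first two points: `soup.find(...)` returns `None` when nothing matches, so it is worth checking the result before chaining another `.find` onto it.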
