Python add data to an empty pd.Dataframe

Question

I'm quite new to Python. What I'm trying to do is get data from a website and add part of the webpage to a pandas DataFrame.

This is the code I have so far, but I get an error when adding data to the DataFrame.

The code I have:

url = 'https://oldschool.runescape.wiki/w/Module:Exchange/Anglerfish/Data'
r = requests.get(url)

soup = BeautifulSoup(r.content, 'html.parser')

price_data = soup.find_all('span', class_='s1')
df = pd.DataFrame()

for data in price_data:
  a = pd.DataFrame(data.text.split(":")[0],data.text.split(":")[1])
  df.append(a)

print(df)

The Error I'm Getting:

ValueError                                Traceback (most recent call last)
<ipython-input-33-963d51917cf2> in <module>()
 10 
 11 for data in price_data:
---> 12   a = pd.DataFrame(data.text.split(":")[0],data.text.split(":")[1])
 13   df.append(a)
 14 

/usr/local/lib/python3.6/dist-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
507                 )
508             else:
--> 509                 raise ValueError("DataFrame constructor not properly called!")
510 
511         NDFrame.__init__(self, mgr, fastpath=True)

ValueError: DataFrame constructor not properly called!

Hey, I'm a big rs fan!

– Celius Stingher, Commented Jul 9, 2020 at 0:36

Celius Stingher · Accepted Answer · 2020-07-09 00:25:05Z

1

It seems that the values you get from data.text.split(":")[0], data.text.split(":")[1] are not in the form that pd.DataFrame() expects. First take a look at the function's documentation to understand what it expects and how to pass data to it. You can either pass a dictionary mapping column names to values (arrays must be of equal length, or an index must be specified), or lists/arrays as YOBEN_S proposed. For example:

a = pd.DataFrame({'Column_1': [data.text.split(":")[0]], 'Column_2': [data.text.split(":")[1]]})
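
To make the difference concrete, here is a minimal sketch of the constructor forms discussed above. The string `raw` is a made-up stand-in for `data.text`, and the column names are arbitrary; this is an illustration under those assumptions, not the only way to call the constructor.

```python
import pandas as pd

# Hypothetical stand-in for data.text from the scraped page.
raw = "1594252800:1711"
timestamp, price = raw.split(":")

# The call from the question passes one bare string as `data` and another
# as `index`, which the constructor rejects with a ValueError.
try:
    pd.DataFrame(timestamp, price)
    failed = False
except ValueError:
    failed = True

# Dictionary of column name -> list of values (one-element lists give one row).
from_dict = pd.DataFrame({"Column_1": [timestamp], "Column_2": [price]})

# Nested list: the outer list holds rows, the inner list holds one row's values.
from_rows = pd.DataFrame([[timestamp, price]], columns=["Column_1", "Column_2"])

print(from_dict)
```

Both working forms produce the same one-row frame; wrapping the scalars in lists is what lets pandas infer the number of rows.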

Since you are dealing with HTML data, you could also try a different approach using pandas.read_html(); see its documentation for more information.


BENY · Accepted Answer · 2020-07-09 00:23:29Z

0

Fix your code by wrapping the two values in a nested list, so that they form a single row:

pd.DataFrame([[data.text.split(":")[0],data.text.split(":")[1]]])
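
The one-row frames still have to be collected somewhere, because `df.append(a)` in the question returns a new frame instead of modifying `df`. A minimal sketch of one way to combine them with `pd.concat`, assuming made-up sample strings in place of the scraped `data.text` values:

```python
import pandas as pd

# Hypothetical sample strings standing in for the scraped span texts.
samples = ["1594166400:1705", "1594252800:1711"]

frames = []
for text in samples:
    parts = text.split(":")
    # One-row frame built from a nested list, as in the fix above.
    frames.append(pd.DataFrame([[parts[0], parts[1]]]))

# Stack the one-row frames into a single frame with a fresh 0..n-1 index.
df = pd.concat(frames, ignore_index=True)
print(df)
```

Concatenating once after the loop avoids copying the growing frame on every iteration.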


Piet Jetse · Accepted Answer · 2020-07-09 09:48:27Z

0

I did some more research, the best way for me to do it was:

import pandas as pd
import requests
from bs4 import BeautifulSoup

# get data from the Old School RuneScape wiki
# (DataFrame.append, used below, requires pandas < 2.0)
url = 'https://oldschool.runescape.wiki/w/Module:Exchange/Anglerfish/Data'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
price_data = soup.find_all('span', class_='s1')
df = pd.DataFrame(columns=['timestamp', 'price'])

for data in price_data:
  # each span text has the form "timestamp:price"
  df = df.append({'timestamp': data.text.split(":")[0], 'price': data.text.split(":")[1]}, ignore_index=True)

print(df)
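
On pandas 2.0 and later, `DataFrame.append` is no longer available, so the loop needs a small change there. Below is a sketch of the same idea that collects plain dictionaries and builds the frame once at the end. The `samples` list is a made-up stand-in for the `data.text` values, and the integer conversion is an optional extra that is not part of the original answer.

```python
import pandas as pd

# Hypothetical sample strings standing in for the scraped span texts.
samples = ["1594166400:1705", "1594252800:1711"]

rows = []
for text in samples:
    # Split "timestamp:price" into its two parts.
    timestamp, price = text.split(":", 1)
    rows.append({"timestamp": timestamp, "price": price})

# Build the frame in one step from the list of row dictionaries.
df = pd.DataFrame(rows, columns=["timestamp", "price"])

# Optional: convert the string columns to integers for later arithmetic.
df = df.astype({"timestamp": "int64", "price": "int64"})
print(df)
```

With the real page, `samples` would be replaced by `[data.text for data in price_data]`; whether the span texts need extra cleaning before the integer conversion depends on the page markup.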
