
The page https://metals-api.com/currencies contains an HTML table with two columns. I would like to extract every value from the first column into a list/array. How do I go about this?

import requests
from bs4 import BeautifulSoup

URL = "https://metals-api.com/currencies"
page = requests.get(URL)

soup = BeautifulSoup(page.content, "html.parser")

with open('outpu2t.txt', 'w', encoding='utf-8') as f:
    f.write(soup.text)

To clarify, I am not looking to run price-fetching commands against these tickers. I'm trying to compile a list of tickers so I can add them to a dropdown menu in my app.

  • The site you're trying to access goes out of its way to block most requests like this, because it has a paid API for this information. Bypassing this would require something far more complicated than just the basic requests library. Commented Apr 14, 2022 at 19:09
  • The site blocks easy scraping attempts against the HTML, so you would need to pretend to be a full browser. Consider using Selenium to that end. Commented Apr 14, 2022 at 19:11
  • I can get the HTML by just viewing the source in the browser. I'm not actually trying to see the prices for those tickers; I want to make a dropdown menu in my app with those tickers as select options. Commented Apr 14, 2022 at 19:18
  • Turns out I'm mostly wrong, see my answer below :') Commented Apr 14, 2022 at 19:44

2 Answers


If I understand the question correctly, you can try the following example:

import requests
from bs4 import BeautifulSoup
import pandas as pd

URL = "https://metals-api.com/currencies"
page = requests.get(URL)

soup = BeautifulSoup(page.content, "html.parser")

# Collect the text of the first cell in every body row of the table
data = []
for cell in soup.select('.table tbody tr td:nth-child(1)'):
    data.append(cell.text)

df = pd.DataFrame(data, columns=['code'])
# df.to_csv('code.csv', index=False)  # uncomment to save the codes to a CSV file
print(df)

Output:

     code
0     XAU
1     XAG
2     XPT
3     XPD
4     XCU
..    ...
209  LINK
210   XLM
211   ADA
212   BCH
213   LTC

[214 rows x 1 columns]
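
Since the question asks for a plain list rather than a DataFrame, the same selector can feed a list directly. Below is a minimal sketch; `SAMPLE_HTML` is a made-up stand-in for the page (not its real markup) so the snippet runs offline, and it also shows what the selector matches.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page's table, NOT the real markup.
SAMPLE_HTML = """
<table class="table">
  <thead><tr><th>Code</th><th>Name</th></tr></thead>
  <tbody>
    <tr><td>XAU</td><td>Gold</td></tr>
    <tr><td>XAG</td><td>Silver</td></tr>
    <tr><td>XPT</td><td>Platinum</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The selector matches every <td> that is the first child of its <tr>,
# where that row sits in a <tbody> inside an element with class "table".
first_cells = soup.select(".table tbody tr td:nth-child(1)")

# Build a plain Python list of the cell texts (whitespace stripped).
codes = [td.get_text(strip=True) for td in first_cells]
print(codes)
```

Against the live page, `SAMPLE_HTML` would be replaced by `page.content` from the `requests.get` call, assuming the table still carries the `table` class.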

3 Comments

You are in fact understanding the question correctly. I will now attempt to figure out what you did lol, much love!
For sure it's accepted. Could you please add some comments explaining what .table tbody tr td:nth-child(1) and df.to_csv('code.csv',index=False) do? I'm very new to Python and would appreciate it greatly <3
.table tbody tr td:nth-child(1) is a CSS selector used with bs4; see the docs: crummy.com/software/BeautifulSoup/bs4/doc. df.to_csv('code.csv',index=False) saves the data to a CSV file on your PC; just uncomment that line and run the script, and you will find the CSV file. Thanks

I sit corrected. I initially just tried pd.read_html("https://metals-api.com/currencies"), which normally works but did not work here. With a very slight workaround (fetching the page with requests first and handing the downloaded HTML to pd.read_html), it works just fine.

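import io

import pandas as pd
import requests

URL = "https://metals-api.com/currencies"
page = requests.get(URL)
# read_html returns a list of DataFrames, one per table; take the first.
# Wrapping the HTML in StringIO avoids the deprecation of passing literal HTML in newer pandas.
df = pd.read_html(io.StringIO(page.text))[0]
print(df)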

Output:

     Code                                               Name
0     XAU  1 Ounce of 24K Gold. Use Carat endpoint to dis...
1     XAG                                             Silver
2     XPT                                           Platinum
3     XPD                                          Palladium
4     XCU                                             Copper
..    ...                                                ...
209  LINK                                          Chainlink
210   XLM                                            Stellar
211   ADA                                            Cardano
212   BCH                                       Bitcoin Cash
213   LTC                                           Litecoin

[214 rows x 2 columns]
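
For the dropdown use case, the Code column can then be turned into a plain Python list. This is a minimal sketch; the small hand-built DataFrame below only imitates the first rows of the output above (with shortened names) so that it runs without network access.

```python
import pandas as pd

# Hand-built stand-in for the DataFrame returned by pd.read_html(...)[0].
df = pd.DataFrame(
    {
        "Code": ["XAU", "XAG", "XPT"],
        "Name": ["Gold", "Silver", "Platinum"],
    }
)

# Select the first column by name and convert it to a regular list.
tickers = df["Code"].tolist()
print(tickers)
```

If the column name ever differs, `df.iloc[:, 0].tolist()` selects the first column by position instead.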
