I have generated a DF from the below code:

import requests
import pandas as pd
from bs4 import BeautifulSoup

url='https://www.rootsandrain.com/event4493/2017-aug-26-uci-world-cup-dh-7-val-di-sole/results/'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'lxml')
table = soup.find('table', {'class':'list'})
headers = [heading.text for heading in table.find_all('th')]

# reuse the response fetched above; read_html returns a list of tables
dfs = pd.read_html(response.text)[0]

#rename headers
dfs.rename(columns = {'Pos⇧' : 'Race_Pos'}, inplace = True)

df_sf = dfs.iloc[:,[1,3,5,14]].copy()

#df_sf['Race_rank'] = df_sf['Race_Pos'].rank()
#df_sf['Race_Pos'] = df_sf['Race_Pos'].astype('str')
#df_sf['Race_Pos_Num'] = df_sf['Race_Pos'].str[:-2]

df_sf['Race_Pos']=df_sf.index

print(df_sf)

print(df_sf.dtypes)

I then also extracted the title (as yet uncleaned) using this code:

print(soup.h1)

However, I want to add this value to each row of the table. I can add a fixed value, for example by assigning a new column with a value of 'X', but when I try to assign the title in place of 'X' I get an error.

How can I do this?

  • Are you trying to add a numeric value to all rows of a column in your dataframe? Commented Apr 27, 2022 at 0:25
  • It is unclear what your expected output looks like, please clarify. Commented Apr 27, 2022 at 2:47

1 Answer


Assuming you would like to attach the race series name, you are already close to your goal:

df_sf['Series'] = soup.h1.text

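This creates a new column and assigns the value of soup.h1.text to each of its rows. Note that soup.h1 itself is a BeautifulSoup Tag object, while soup.h1.text is a plain string, which pandas broadcasts to every row.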

Example
import requests
import pandas as pd
from bs4 import BeautifulSoup

url='https://www.rootsandrain.com/event4493/2017-aug-26-uci-world-cup-dh-7-val-di-sole/results/'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'lxml')

# first table on the page
dfs = pd.read_html(response.text)[0]

dfs.rename(columns = {'Pos⇧' : 'Race_Pos'}, inplace = True)
df_sf = dfs.iloc[:,[1,3,5,14]].copy()
df_sf['Race_Pos'] = df_sf.index

# a single string is repeated for every row of the new column
df_sf['Series'] = soup.h1.text

df_sf
Output
   Race_Pos  Name              Licence      Qualifier   Series
0  0         Aaron GWIN        10006516663  3:37.8281   2017 UCI World Cup DH round 7 at Val di Sole
1  1         Amaury PIERRON    10008827283  3:41.7866   2017 UCI World Cup DH round 7 at Val di Sole
2  2         Loïc BRUNI        10007544358  3:38.8623   2017 UCI World Cup DH round 7 at Val di Sole
3  3         Loris VERGIER     10008723112  3:40.2095   2017 UCI World Cup DH round 7 at Val di Sole
4  4         Troy BROSNAN      10007307417  3:39.8674   2017 UCI World Cup DH round 7 at Val di Sole
5  5         Laurie GREENLAND  10009404738  3:48.38614  2017 UCI World Cup DH round 7 at Val di Sole

...
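
The same idea can be tried without hitting the website. The following is only a minimal sketch: the HTML snippet and the rider names are made up for illustration, and html.parser is used in place of lxml purely to avoid an extra dependency. It shows that soup.h1 is a Tag rather than a string, one possible way to clean the title text, and that assigning the resulting string fills every row.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, so the sketch runs offline.
html = """
<html><body>
  <h1>
    2017 UCI World Cup DH round 7
    <span>at Val di Sole</span>
  </h1>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# soup.h1 is a bs4 Tag object, not a string.
# get_text() returns its text; split() + join collapses newlines and extra spaces.
title = " ".join(soup.h1.get_text(" ").split())

# Small made-up frame standing in for df_sf.
df_sf = pd.DataFrame({"Name": ["Rider A", "Rider B", "Rider C"]})

# Assigning a single string repeats it for every row of the new column.
df_sf["Series"] = title

print(df_sf)
```

If the error in the question came from assigning soup.h1 directly, converting it to a string first (soup.h1.text, or the cleaned title above) is probably the simplest fix.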


1 Comment

Great, how can I use those links in the original code (basically replace and iterate over the scraped URL links to form the huge data set)?
