0

I am trying to create a new variable "spg", or shots per game using Pandas on Python. This new variable is simply s/gp or shots/games played. I would like to add this new variable to my current dataframe so I can print it to excel.

I would like my final columns to look like this: columns = ("name", "team", "gp", "g", "s", "spg")

# import modules

from urllib.request import urlopen
from lxml.html import fromstring

import pandas as pd

# connect to url

url = "https://www.hockey-reference.com/leagues/NHL_2020_skaters.html"

# remove HTML comment markup

content = str(urlopen(url).read())

comment = content.replace("-->", "").replace("<!--", "")
tree = fromstring(comment)

# setting up excel columns

columns = ("name", "team", "gp", "g", "s")

df = pd.DataFrame(columns=columns)

# parsing data while using loop

for idx, row in enumerate(tree.xpath('//table[@id="stats"]/tbody/tr[not(@class="thead")]')):
    name = row.xpath('.//td[@data-stat="player"]')[0].text_content()
    team = row.xpath('.//td[@data-stat="team_id"]')[0].text_content()
    gp = row.xpath('.//td[@data-stat="games_played"]')[0].text_content()
    goals = row.xpath('.//td[@data-stat="goals"]')[0].text_content()
    s = row.xpath('.//td[@data-stat="shots"]')[0].text_content()


    # create pandas dataframe to export data to excel.

    df.loc[idx] = (name, team, gp, goals, s) # I would like to add "spg"
    print(df)


# write data to excel
writer = pd.ExcelWriter('NHL skater20.xlsx')
df.to_excel(writer, 'Sheet1')
writer.save()

Any recommendations on the best way to do this? I greatly appreciate any help or feedback I can get. Thanks in advance!

2 Answers 2

1

Simple: after the loop, add this line:

df['spg'] = df['s'] / df['gp']

UPD: (tested and working)

# import modules

from urllib.request import urlopen
from lxml.html import fromstring

import pandas as pd

# connect to url

url = "https://www.hockey-reference.com/leagues/NHL_2020_skaters.html"

# remove HTML comment markup

content = str(urlopen(url).read())

comment = content.replace("-->", "").replace("<!--", "")
tree = fromstring(comment)

# setting up excel columns

columns = ("name", "team", "gp", "g", "s")

df = pd.DataFrame(columns=columns)

# parsing data while using loop

for idx, row in enumerate(tree.xpath('//table[@id="stats"]/tbody/tr[not(@class="thead")]')):
    name = row.xpath('.//td[@data-stat="player"]')[0].text_content()
    team = row.xpath('.//td[@data-stat="team_id"]')[0].text_content()
    gp = row.xpath('.//td[@data-stat="games_played"]')[0].text_content()
    goals = row.xpath('.//td[@data-stat="goals"]')[0].text_content()
    s = row.xpath('.//td[@data-stat="shots"]')[0].text_content()


    # create pandas dataframe to export data to excel.
    # this is fine for a small dataset, however
    # for a bigger one writing a generator is much more efficient
    df.loc[idx] = (name, team, gp, goals, s) # I would like to add "spg"
    # print(df)  # commented out as unnecessary

# at this point these columns are .text_content(), converting to numbers
df = df.astype({'gp': int, 'g': int, 's': int})
# main part: adding a new column
df['spg'] = df['s'] / df['gp']

# write data to excel
writer = pd.ExcelWriter('NHL skater20.xlsx')
df.to_excel(writer, 'Sheet1')
writer.save()
Sign up to request clarification or add additional context in comments.

3 Comments

Thanks for you reply @Marat. Can you please show me in the code? I have tried adding the code but I am getting a NameError: name 'spg' is not defined. I am still new so any advice would be very helpful. =)
@AbleArcher added full code. The only change is two lines before creating writer
This is precisely what I was looking for! Thank you @Marat for sharing some knowledge! Much appreciated good sir! =)
1

To send it to excel you can use either of these:

DataFrame.to_excel('filename.xlsx')
DataFrame.to_csv('filename.csv')

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.