2

Problem


I have a data set with played football games and I'm trying to look more closely into the games played by each team at home. Therefore, I am trying to create a column for every team with a counter that tracks the consecutive wins. The following logic should apply:

  • If win: increase counter += 1
  • If tie: keep counter
  • If loss: decrease counter -= 1
df[['Date', 'Home', 'Home result']].head()

|      | Date     | Home           | Home result |
| ---- | -------- | -------------- | ----------- |
| 0    | 1/1/2018 | Stoke City     | Defeat      |
| 1    | 1/1/2018 | Brighton       | Tie         |
| 2    | 1/1/2018 | Burnley        | Defeat      |
| 3    | 1/1/2018 | Leicester City | Victory     |

I created a function that outputs a dictionary that can be mapped to the DataFrame (sorted by Date) to create the new counter column.


    def result_counter(df):
        d = {}
        for index, row in df.iterrows():
            counter = 0
            for second_index, second_row in df.iloc[:index+1].iterrows(): 
                if second_row['Home result'] == 'Victory':
                    counter += 1
                elif second_row['Home result'] == 'Tie':
                    counter += 0
                else:
                    counter -= 1
            if second_row['Home'] not in d:
                d[second_row['Home']] = [counter]
            else:
                d[second_row['Home']].append(counter)
                
        return d

Expected Result


A new column should be created that indicates the consecutive score of the home team. The expected result for a subset should look like this:

|      | Date       | Home      | Home result | Counter |
| ---- | ---------- | --------- | ----------- | ------- |
| 0    | 2017-08-19 | Liverpool | Victory     | 1       |
| 1    | 2017-08-27 | Liverpool | Victory     | 2       |
| 2    | 2017-09-16 | Liverpool | Tie         | 2       |
| 3    | 2017-10-14 | Liverpool | Defeat      | 1       |

What I tried


I tried to map the dict to the new DataFrame. When doing this I either get a map object displayed or the whole list that corresponds to the key in the dictionary.

# with subset and map 
sorted_liverpool['Counter'] = map(sorted_liverpool, d)
sorted_liverpool[['Date', 'Home', 'Home result', 'Counter']].head()

|      | Date       | Home      | Home result | Counter                        |
| ---- | ---------- | --------- | ----------- | ------------------------------ |
| 0    | 2017-08-19 | Liverpool | Victory     | <map object at 0x7fc0650fe550> |
| 1    | 2017-08-27 | Liverpool | Victory     | <map object at 0x7fc0650fe550> |
| 2    | 2017-09-16 | Liverpool | Tie         | <map object at 0x7fc0650fe550> |
| 3    | 2017-10-14 | Liverpool | Defeat      | <map object at 0x7fc0650fe550> |
# with subset and map directly to "home" column
sorted_liverpool['Counter'] = sorted_liverpool['Home'].map(d)
sorted_liverpool[['Date', 'Home', 'Home result', 'Counter']].head()

|      | Date       | Home      | Home result | Counter                                           |
| ---- | ---------- | --------- | ----------- | ------------------------------------------------- |
| 0    | 2017-08-19 | Liverpool | Victory     | [1, 2, 2, 2, 3, 4, 4, 4, 4, 5, 6, 7, 7, 8, 9, ... |
| 1    | 2017-08-27 | Liverpool | Victory     | [1, 2, 2, 2, 3, 4, 4, 4, 4, 5, 6, 7, 7, 8, 9, ... |
| 2    | 2017-09-16 | Liverpool | Tie         | [1, 2, 2, 2, 3, 4, 4, 4, 4, 5, 6, 7, 7, 8, 9, ... |
| 3    | 2017-10-14 | Liverpool | Defeat      | [1, 2, 2, 2, 3, 4, 4, 4, 4, 5, 6, 7, 7, 8, 9, ... |

How can I map the created dictionary to the DataFrame so it displays only the corresponding value in the list and not the whole list?

0

3 Answers 3

1

Avoid loop, especially iterrows when you can.

You need to map the Home result to respective value, then cumsum:

df['Counter'] = df['Home result'].map({
   'Victory': 1,
   'Tie': 0,
   'Defeat': -1
}).cumsum()

Also, you might need to groupby on Home:

df['Counter'] = df['Home result'].map({
       'Victory': 1,
       'Tie': 0,
       'Defeat': -1
    }).groupby(df['Home']).cumsum()

Output:

         Date       Home Home result  Counter
0  2017-08-19  Liverpool     Victory        1
1  2017-08-27  Liverpool     Victory        2
2  2017-09-16  Liverpool         Tie        2
3  2017-10-14  Liverpool      Defeat        1
Sign up to request clarification or add additional context in comments.

Comments

1
print(df)

        Date        Home Home result
0 2017-08-19   Liverpool     Victory
1 2017-08-27   Liverpool     Victory
2 2017-09-16   Liverpool         Tie
3 2017-10-14   Liverpool      Defeat
4 2017-08-20  Stoke City      Defeat
5 2017-08-28  Stoke City         Tie
6 2017-09-17  Stoke City      Defeat
7 2017-10-15  Stoke City     Victory

    df['new column'] = df['Home result'].map({'Victory': 1, 'Tie': 0, 'Defeat': -1})
    df['new column'] = df.groupby('Home')['new column'].transform(pd.Series.cumsum)
    print(df)

        Date        Home Home result  new column
0 2017-08-19   Liverpool     Victory           1
1 2017-08-27   Liverpool     Victory           2
2 2017-09-16   Liverpool         Tie           2
3 2017-10-14   Liverpool      Defeat           1
4 2017-08-20  Stoke City      Defeat          -1
5 2017-08-28  Stoke City         Tie          -1
6 2017-09-17  Stoke City      Defeat          -2
7 2017-10-15  Stoke City     Victory          -1

Comments

0

try group by home and accumulative sum cumsum the results of an apply logic

txt="""Date,Home,Home Result
1/1/2018,Stoke City,Defeat
1/1/2018,Brighton,Tie
1/1/2018,Burnley,Defeat
1/1/2018,Leicester City,Victory
2017-08-19,Liverpool,Victory
2017-08-27,Liverpool,Victory
2017-09-16,Liverpool,Tie
2017-10-14,Liverpool,Defeat"""

df = pd.read_table(StringIO(txt), sep=',')    
df['Date']=pd.to_datetime(df['Date'])
df['Outcome']=df['Home Result'].apply(lambda x: -1 if x=='Defeat' else 0 if x=='Tie' else 1 if x=='Victory' else 0)
df=df.sort_values(by=['Home','Date'],ascending=True)

df['Date_Total'] = df.groupby(['Home'])['Outcome'].cumsum()
print(df)

output

 Date            Home Home Result  Outcome  Date_Total
1 2018-01-01        Brighton         Tie        0           0
2 2018-01-01         Burnley      Defeat       -1          -1
3 2018-01-01  Leicester City     Victory        1           1
4 2017-08-19       Liverpool     Victory        1           1
5 2017-08-27       Liverpool     Victory        1           2
6 2017-09-16       Liverpool         Tie        0           2
7 2017-10-14       Liverpool      Defeat       -1           1
0 2018-01-01      Stoke City      Defeat       -1          -1

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.