I have a pandas DataFrame, built from the output of an SQL query via MySQL.Connector, that looks like this:

     Group       Sales  Period
0        0   136471.06  2015-1
1        0   645949.37  2015-2
2        0  1414552.66  2015-3
3        0   684672.48  2015-4
4        0    71529.99  2016-1
...    ...         ...     ...
303    119    18641.06  2018-1
304    119    18514.82  2018-2
305    119    16042.67  2018-3
306    119    15043.29  2019-3
307    119        0.00  2020-2

Each customer belongs to a specific group, and for these groups I have quarterly sales figures (the Period column).

How can I plot the development of each group's sales over the periods in a line chart? So far I have only managed to do it manually, like this:

import matplotlib.pyplot as plt

# df_4 is the DataFrame shown above
plt.rcParams["figure.figsize"] = (20,10)
group_0 = df_4[df_4.Group == 0]
group_100 = df_4[df_4.Group == 100]
group_101 = df_4[df_4.Group == 101]
plt.plot(group_0.Period, group_0.Sales)
plt.plot(group_100.Period, group_100.Sales)
plt.plot(group_101.Period, group_101.Sales)
plt.legend(['0', '100', '101'])
plt.title("Sales per Group per Quarter")
plt.xlabel("Quarter")
plt.ylabel("Sales in Million")
plt.show()

This gives me the output I need, but I assume there must be a better way. My other attempts at plotting the whole DataFrame at once only produced strange-looking plots. The attached image shows the result of the manual approach, which looks right but is tedious to write. So I'm looking for a more efficient way to get the same result. Any help is welcome.

  • You want a line for every group? Commented May 23, 2021 at 23:48
  • Yes, that would be the goal. There are only about 15 different groups (group 0, 100...115), which isn't clear from the DataFrame shown. Otherwise it would be a complete mess. Commented May 24, 2021 at 0:36

1 Answer

Try groupby + plot:

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Generate some sample data
np.random.seed(5)
gs = 4
ng = 3
df = pd.DataFrame({
    'Group': np.concatenate([np.full(gs, i) for i in range(ng)]),
    'Sales': np.random.random(gs * ng) * 1_000_000,
    'Period': pd.to_datetime(
        np.tile(pd.date_range('2015-01', freq='Q', periods=gs).to_numpy(), ng)
    )
})

fig, ax = plt.subplots()
for label, group in df.groupby('Group'):
    group.plot(kind='line', x='Period', y='Sales', ax=ax, label=label)

plt.title("Sales per Group per Quarter")
plt.xlabel("Quarter")
plt.ylabel("Sales in Million")
plt.tight_layout()
plt.show()

Sample df:

    Group          Sales     Period
0       0  221993.171090 2015-03-31
1       0  870732.306177 2015-06-30
2       0  206719.155339 2015-09-30
3       0  918610.907938 2015-12-31
4       1  488411.188795 2015-03-31
5       1  611743.862903 2015-06-30
6       1  765907.856480 2015-09-30
7       1  518417.987873 2015-12-31
8       2  296800.501576 2015-03-31
9       2  187721.228661 2015-06-30
10      2   80741.268765 2015-09-30
11      2  738440.296199 2015-12-31

Sample figure: [image: line plot of Sales over Period with one line per group]
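The sample data above uses real timestamps, while the question's Period column holds strings such as 2015-1. The following is a minimal sketch, not part of the original answer, of one way to handle that: it assumes the strings always follow a year-quarter pattern, converts them to quarterly periods, and then uses pivot instead of the groupby loop so that each group becomes one column. The data values and the name wide are made up for illustration; df_4 is the DataFrame name from the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop this for interactive use
import pandas as pd
from matplotlib import pyplot as plt

# Hypothetical data in the question's layout: Period is a 'YYYY-Q' string
df_4 = pd.DataFrame({
    "Group": [0, 0, 0, 100, 100, 101, 101, 101],
    "Sales": [136471.06, 645949.37, 1414552.66, 18641.06,
              18514.82, 16042.67, 15043.29, 0.00],
    "Period": ["2015-1", "2015-2", "2015-3", "2015-1",
               "2015-3", "2015-1", "2015-2", "2015-3"],
})

# '2015-1' -> '2015Q1', which pandas can parse as a quarterly period
quarters = pd.PeriodIndex(
    df_4["Period"].str.replace("-", "Q", regex=False), freq="Q"
)

# One row per quarter, one column per group.
# Quarters a group has no row for become NaN.
wide = (
    df_4.assign(Period=quarters)
        .pivot(index="Period", columns="Group", values="Sales")
        .sort_index()
)

# DataFrame.plot draws one line per column, so no explicit loop is needed
fig, ax = plt.subplots(figsize=(20, 10))
wide.plot(ax=ax, marker="o")
ax.set_title("Sales per Group per Quarter")
ax.set_xlabel("Quarter")
ax.set_ylabel("Sales")
fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to view the result. Note that pivot raises an error if a group has two rows for the same quarter, and a quarter missing for a group should show up as a gap in that group's line. If neither behaviour is wanted, the groupby loop from the answer works on the converted Period column as well.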


1 Comment

Thank you very much, @Henry. I really appreciate it.
