Given a dataframe as follows:

          date    gdp  tertiary_industry  gdp_growth  tertiary_industry_growth
0    2015/3/31   3768               2508        10.3                      11.3
1    2015/6/30   8285               5483        10.9                      12.0
2    2015/9/30  12983               8586        11.5                      12.7
3   2015/12/31  18100              12086        10.5                      13.2
4    2016/3/31   4118               2813        13.5                      14.6
5    2016/6/30   8844               6020        13.3                      14.3
6    2016/9/30  14038               9513        14.4                      13.9
7   2016/12/31  19547              13557        16.3                      13.3
8    2017/3/31   4692               3285        13.3                      12.4
9    2017/6/30   9891               6881        12.9                      12.5
10   2017/9/30  15509              10689        12.7                      12.3
11  2017/12/31  21503              15254        14.8                      12.7
12   2018/3/31   4954               3499        12.4                      11.3
13   2018/6/30  10653               7520        12.9                      12.4
14   2018/9/30  16708              11697        13.5                      13.0
15  2018/12/31  22859              16402        14.0                      13.2
16   2019/3/31   5508               3983        13.5                      13.9
17   2019/6/30  11756               8556        10.2                      13.4
18   2019/9/30  17869              12765        10.2                      14.8
19  2019/12/31  23629              16923        11.6                      15.2
20   2020/3/31   5229               3968        11.9                      14.9

I used the following code to draw a bar plot of gdp and tertiary_industry.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import matplotlib.ticker as ticker
import matplotlib.style as style
style.available
style.use('fivethirtyeight')
from pylab import rcParams
plt.rcParams["figure.figsize"] = (20, 10)
plt.rcParams['font.sans-serif']=['SimHei'] 
plt.rcParams['axes.unicode_minus']=False 
import matplotlib
matplotlib.matplotlib_fname() 
plt.rcParams.update({'font.size': 25})

colors = ['#c23531','#2f4554', '#61a0a8', '#d48265', '#91c7ae','#749f83', '#ca8622', '#bda29a', '#6e7074', '#546570', '#c4ccd3']

df = df.sort_values(by = 'date')
df['date'] = pd.to_datetime(df['date']).dt.to_period('M')

df = df.set_index('date')
df.columns
cols = ['gdp', 'tertiary_industry']
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')
color_dict = dict(zip(cols, colors))
plt.figure(figsize=(20, 10))
df[cols].plot(color=[color_dict.get(x, '#333333') for x in df.columns], kind='bar', width=0.8)
plt.xticks(rotation=45)
plt.xlabel("")
plt.ylabel("million dollar")
fig = plt.gcf()
plt.show()
plt.draw()
fig.savefig("./gdp.png", dpi=100, bbox_inches = 'tight')
plt.clf()

The output from the code above:

[Image: bar plot of gdp and tertiary_industry]

Now I want to draw gdp_growth and tertiary_industry_growth, which are percentage values, as lines against a right-hand axis on the same plot.

Please note that I want to use colors from the customized color list in the code instead of the default ones.

How could I do that based on code above? Thanks a lot for your kind help.

1 Answer

This is what I would do:

import pandas as pd
import matplotlib.pyplot as plt

# convert to datetime, then to monthly periods
df['date'] = pd.to_datetime(df['date']).dt.to_period('M')

cols = ['gdp', 'tertiary_industry']
line_cols = ['gdp_growth', 'tertiary_industry_growth']
colors = ['#c23531','#2f4554', '#61a0a8', '#d48265', '#91c7ae','#749f83', '#ca8622', '#bda29a', '#6e7074', '#546570', '#c4ccd3']

df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

# map every plotted column (bars and lines) to a color from the custom list
color_dict = dict(zip(cols + line_cols, colors))


# initialize an axis instance
fig, ax = plt.subplots(figsize=(10,6))

# plot the bars on that axis
df.plot.bar(y=cols, ax=ax,
            color=[color_dict.get(x, '#333333') for x in cols])

# create a twinx axis (shares x, has its own y-axis on the right)
ax1 = ax.twinx()

# plot the two growth columns as lines on the right axis
df.plot.line(y=line_cols, ax=ax1,
             color=[color_dict.get(x, '#333333') for x in line_cols])
ax.set_xticklabels(df['date'])

# set y-axes labels:
ax.set_ylabel('Million Dollar')
ax1.set_ylabel('%')

# set x-axis label
ax.set_xlabel('Quarter')

plt.show()

Output:

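[Image: bars for gdp and tertiary_industry on the left axis, lines for gdp_growth and tertiary_industry_growth on the right axis]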

If you replace both color=[...] arguments in the code above with your original color=[color_dict.get(x, '#333333') for x in df.columns], you would get:

[Image: resulting plot]
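
The color lookup itself can be checked without plotting anything. The following is only a sketch: it assumes the mapping should cover the two growth columns as well as the bar columns, and the names bar_cols, line_cols and fallback are chosen here for illustration.

```python
# Palette from the question.
colors = ['#c23531', '#2f4554', '#61a0a8', '#d48265', '#91c7ae', '#749f83',
          '#ca8622', '#bda29a', '#6e7074', '#546570', '#c4ccd3']

bar_cols = ['gdp', 'tertiary_industry']
line_cols = ['gdp_growth', 'tertiary_industry_growth']  # illustrative name

# zip stops at the shorter input, so only the first four palette entries are used.
color_dict = dict(zip(bar_cols + line_cols, colors))

# One color per plotted column, in the same order as the columns.
bar_colors = [color_dict.get(c, '#333333') for c in bar_cols]
line_colors = [color_dict.get(c, '#333333') for c in line_cols]

# A column that is not in the mapping (for example 'date') falls back to grey.
fallback = color_dict.get('date', '#333333')

print(bar_colors)
print(line_colors)
print(fallback)
```

Building the list from the columns actually being plotted, rather than from df.columns, keeps each color lined up with its series even when the dataframe carries extra columns.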

4 Comments

Great, thanks a lot. By the way, can we use colors from colors = ['#c23531','#2f4554', '#61a0a8', '#d48265', '#91c7ae','#749f83', '#ca8622', '#bda29a', '#6e7074', '#546570', '#c4ccd3'] for the line plot? As I mentioned in the question, I don't want to use the default colors.
Could you modify the code, please? Also, I would like to add a label on the right-side axis, similar to plt.ylabel("million dollar"), such as plt.ylabel("%"), if possible.
Sorry, another problem: as you can see, the legend at the top right sits on top of the plot. Is it possible to move it to a blank area?
Sure, add these before plt.show(): ax1.legend(loc=(1.1,0.4)); ax.legend(loc=(1.1,0.6)). Play with the numbers to your liking. P/S: This ought to be the last :D
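
As an alternative to placing two separate legends, the handles from both axes can be merged into a single legend drawn outside the plot area. This is a sketch under assumptions, not the answerer's code: it uses only the last three rows of the question's data, a non-interactive backend, a hypothetical output file name gdp_legend.png, and an anchor position of (1.1, 0.5) that may need adjusting.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Last three rows of the data from the question.
df = pd.DataFrame({
    "date": ["2019/9/30", "2019/12/31", "2020/3/31"],
    "gdp": [17869, 23629, 5229],
    "tertiary_industry": [12765, 16923, 3968],
    "gdp_growth": [10.2, 11.6, 11.9],
    "tertiary_industry_growth": [14.8, 15.2, 14.9],
})
bar_cols = ["gdp", "tertiary_industry"]
line_cols = ["gdp_growth", "tertiary_industry_growth"]
colors = ["#c23531", "#2f4554", "#61a0a8", "#d48265"]

fig, ax = plt.subplots(figsize=(10, 6))
df.plot.bar(y=bar_cols, ax=ax, color=colors[:2], legend=False)
ax1 = ax.twinx()
df.plot.line(y=line_cols, ax=ax1, color=colors[2:], marker="o", legend=False)

# Collect handles from both axes and draw one legend to the right of the plot.
bar_handles, bar_labels = ax.get_legend_handles_labels()
line_handles, line_labels = ax1.get_legend_handles_labels()
legend = ax1.legend(bar_handles + line_handles, bar_labels + line_labels,
                    loc="center left", bbox_to_anchor=(1.1, 0.5))

# Fix the tick positions first so the number of labels always matches.
ax.set_xticks(range(len(df)))
ax.set_xticklabels(df["date"], rotation=45)
ax.set_ylabel("Million Dollar")
ax1.set_ylabel("%")

# bbox_inches="tight" keeps the outside legend inside the saved image.
fig.savefig("gdp_legend.png", dpi=100, bbox_inches="tight")
```

The x offset of 1.1 leaves room for the right-hand tick labels and the "%" axis label; a smaller value would overlap them.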
