Python: How can I draw a bar graph in python using matplotlib?

Question

My goal is to create a bar graph with my .csv data to see the relationship between work year (x) and wage (y) grouped by gender (separate bars).

First off, I want to group the variable 'workyear' into three groups: (1) more than 10 years, (2) exactly 10 years, and (3) less than 10 years. Then I would like to create the bar graph with gender (1=female, 0=male).

Part of my data looks like this:

...    workyear gender wage 
513         12    0  15.00
514         16    0  12.67
515         14    1   7.38
516         16    0  15.56
517         12    1   7.45
518         14    1   6.25
519         16    1   6.25
520         17    0   9.37
....

To do this, I tried to replace the variable's values with three group labels and then plot them with matplotlib.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

#load data 
df=pd.DataFrame.from_csv('data.csv', index_col=None)
print(df)
df.sort_values("workyear", ascending=True, inplace=True)

#parameters
bar_width = 0.2

#replacing Education year -> Education level grouped by given criteria.
#But I got an error.
df.loc[df.workyear<10, 'workyear'] = 'G1'
df.loc[df.workyear==10, 'workyear'] = 'G2'
df.loc[df.workyear>10, 'workyear']='G3'

#plotting
plt.bar(x, df.education[df.gender==1], bar_width, yerr=df.wage,color='y', label='female')
plt.bar(x+bar_width, df.education[df.gender==0], bar_width, yerr=df.wage, color='c', label='male')

I want to see the bar graph like this (please consider '+' as a bar):

y=wage|                 + +
      | +        +      + +
      | +      + +      + +
      | + +    + +      + +
      |_______________________ x=work year (3-group)
        >10     10       10<

But this is what I actually got... (yes. all errors)

Traceback (most recent call last):
File "data.py", line 21, in <module>
df.loc[df.workyear>10, 'workyear']='G3'
in wrapper
res = na_op(values, other)
in na_op
result = _comp_method_OBJECT_ARRAY(op, x, y)
in _comp_method_OBJECT_ARRAY
result = lib.scalar_compare(x, y, op)
File "pandas\_libs\lib.pyx", line 769, in pandas._libs.lib.scalar_compare (pandas\_libs\lib.c:13717) 
TypeError: unorderable types: str() > int()

Could you please advise me?

Try this df.apply(pd.to_numeric) right before the plotting.
– Tom Wojcik, Commented Dec 5, 2017 at 10:42
@Goyo: did you mean to convert df.workyear like this? -> df.loc[df.workyear>10, 'workyear']='3'? Since I am a beginner in Python, I am not sure how to solve this at all.
– user, Commented Dec 5, 2017 at 10:42
@Tom Wojcik Thank you. But I got an error with df.loc[df.workyear>10, 'workyear']='3'.
– user, Commented Dec 5, 2017 at 10:43
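
A note on the traceback: the first assignment writes the string 'G1' into workyear, so the column then holds a mix of strings and integers, and the later comparison df.workyear>10 ends up comparing a str with an int. One way around this, sketched below, is to leave workyear numeric and write the labels to a separate column. The column name "group", the "unknown" default, and the two extra rows (workyear 8 and 10) are assumptions made for illustration; they are not in the original data.

```python
import numpy as np
import pandas as pd

# Sample rows from the question, plus two made-up rows (workyear 8 and 10)
# so that all three groups are represented.
df = pd.DataFrame({
    "workyear": [12, 16, 14, 16, 12, 14, 16, 17, 8, 10],
    "gender":   [0, 0, 1, 0, 1, 1, 1, 0, 1, 0],
    "wage":     [15.00, 12.67, 7.38, 15.56, 7.45, 6.25, 6.25, 9.37, 5.00, 8.00],
})

# Reproduce the failure on a copy: once a string label sits in the column,
# an ordering comparison against an integer raises TypeError.
broken = df.copy()
broken["workyear"] = broken["workyear"].astype(object)
broken.loc[broken.workyear < 10, "workyear"] = "G1"
try:
    broken.workyear > 10
    raised = False
except TypeError:
    raised = True

# Avoid it: evaluate all conditions on the untouched numeric column and
# store the labels in a new column ("group" is a hypothetical name).
conditions = [df["workyear"] < 10, df["workyear"] == 10, df["workyear"] > 10]
df["group"] = np.select(conditions, ["G1", "G2", "G3"], default="unknown")
print(df[["workyear", "group"]])
```

Because workyear itself is never overwritten, the three conditions can be evaluated in any order.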

B. M. · Accepted Answer · 2017-12-05 13:21:10Z

1

A more direct way:

df['Age']=pd.cut(df.workyear,[1,13,14,100])
df['Gender']=df.gender.map({0:'male',1:'female'})
df.pivot_table(values='wage',index='Age',columns='Gender').plot.bar()
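
To make the three lines above easier to try out, here is a self-contained sketch that runs them on the eight sample rows from the question. The DataFrame is typed in by hand instead of being read from data.csv, and the helper plot_mean_wage is a hypothetical name added for illustration. Note that the column is called 'Age' in the answer although it holds the binned work years, and that pivot_table averages wage within each cell by default.

```python
import pandas as pd

# The eight sample rows shown in the question.
df = pd.DataFrame({
    "workyear": [12, 16, 14, 16, 12, 14, 16, 17],
    "gender":   [0, 0, 1, 0, 1, 1, 1, 0],
    "wage":     [15.00, 12.67, 7.38, 15.56, 7.45, 6.25, 6.25, 9.37],
})

# Bin edges from the answer; the default pd.cut intervals are
# (1, 13], (13, 14] and (14, 100].
df["Age"] = pd.cut(df.workyear, [1, 13, 14, 100])
df["Gender"] = df.gender.map({0: "male", 1: "female"})

# One row per bin, one column per gender, mean wage in each cell.
table = df.pivot_table(values="wage", index="Age", columns="Gender")
print(table)


def plot_mean_wage(table):
    """Draw grouped bars, one group per bin and one bar per gender."""
    import matplotlib.pyplot as plt  # imported here so the table can be built without it
    ax = table.plot.bar()
    ax.set_ylabel("mean wage")
    plt.show()
```

Calling plot_mean_wage(table) draws the chart. In this small sample the middle bin contains no male rows, so that bar is simply missing.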

edited Dec 5, 2017 at 13:21

answered Dec 5, 2017 at 11:04


9 Comments

B. M. Over a year ago

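There was no 10 in your sample data :). For you it is [min,10,11,max]. Note that pd.cut closes intervals on the right by default, so ( is excluded and ] is included; pass right=False to get [ included and ) excluded.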

user Over a year ago

I accidentally deleted my previous comment. Thank you so much again @B. M. I actually have more data (n=534). I changed the code using [min,10,11,max], but I still have an error.

B. M. Over a year ago

I edited because I realized that mean is not necessary since ('G','gender') is a unique key. Is it better?

user Over a year ago

Or, are there any other ways to cut the variable into three groups? Let's say the first group includes 0 to 9.9, the second group has 10 only, and the last group has 10.1 or more.

B. M. Over a year ago

[min,9.9,10.1, max] ?
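
Since the choice of bin edges caused some confusion in these comments, here is a small sketch of how pd.cut treats them, assuming integer work years. The series values and the labels "<10", "10" and ">10" are made up for illustration. By default each bin is open on the left and closed on the right; with right=False it is the other way round, and the last edge then has to lie above the maximum.

```python
import pandas as pd

# Made-up work years covering all three cases.
workyear = pd.Series([3, 9, 10, 11, 17])
labels = ["<10", "10", ">10"]

# Default (right=True): bins are (min-1, 9], (9, 10], (10, max].
# The first edge sits below the minimum so the smallest value is kept.
edges = [workyear.min() - 1, 9, 10, workyear.max()]
groups = pd.cut(workyear, bins=edges, labels=labels)

# right=False: bins are [min, 10), [10, 11), [11, max+1).
# The last edge sits above the maximum so the largest value is kept.
edges_left = [workyear.min(), 10, 11, workyear.max() + 1]
groups_left = pd.cut(workyear, bins=edges_left, labels=labels, right=False)

print(pd.DataFrame({"workyear": workyear,
                    "right_closed": groups,
                    "left_closed": groups_left}))
```

Both variants put 10 in a group of its own. For fractional work years, edges such as [min,9.9,10.1,max] from the comment above serve the same purpose, as long as the first edge is below the minimum or include_lowest=True is passed.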
