I'm stuck plotting a DataFrame. This might be simple, but I can't figure it out.

I have a pandas DataFrame with records like this:

    Year  occurrence   Count
0   2011           0     306
1   2011           1    1838
2   2012           0     422
3   2012           1    1816
4   2013           0     423
5   2013           1    3471
6   2014           0     537
7   2014           1    3239
8   2015           0     993
9   2015           1    7668
10  2016           0     415
11  2016           1    2052
12  2017           0     511
13  2017           1    4750
14  2018           0     705
15  2018           1    2125

I want to plot this DataFrame as a bar chart with Year on the x-axis and Count on the y-axis.

  1. I want to plot Count split by the occurrence value. For example, for 2011 one bar should show count=306 and a second bar count=1838, and likewise for the remaining years.
  2. If possible, I also need a stacked bar chart of the same data.
  3. How can I plot a line chart with two lines in it?

Does anyone have a solution for this?

I have created a sample df based on my result:
# Pass the rows as a list (a set does not guarantee row order)
df = spark.createDataFrame([
(2011,  0,   306),
(2011,  1,  1838),
(2012,  0,   422),
(2012,  1,  1816),
(2013,  0,   423),
(2013,  1,  3471),
(2014,  0,   537),
(2014,  1,  3239),
(2015,  0,   993),
(2015,  1,  7668),
(2016,  0,   415),
(2016,  1,  2052),
(2017,  0,   511),
(2017,  1,  4750),
(2018,  0,   705),
(2018,  1,  2125),
], ['Year', 'occurrence', 'Count'])

pdf_1 = df.toPandas()

I tried this:
pdf_1.plot(x='Year', y=['Count'], kind='bar')

but it does not give me exactly what I wanted.

2 Answers

You can use pivot to reshape:

pdf_1.pivot(index='Year', columns='occurrence', values='Count').plot.bar(stacked=True)

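To see why the pivot helps, here is a minimal sketch that rebuilds the question's sample data directly in pandas (skipping Spark, which is an assumption made only to keep the example self-contained) and prints the reshaped frame. The name `wide` is illustrative and does not appear in the original post.

```python
import pandas as pd

# Sample data from the question, built directly in pandas (no Spark needed).
years = [2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018]
counts_0 = [306, 422, 423, 537, 993, 415, 511, 705]
counts_1 = [1838, 1816, 3471, 3239, 7668, 2052, 4750, 2125]

rows = []
for year, c0, c1 in zip(years, counts_0, counts_1):
    rows.append((year, 0, c0))
    rows.append((year, 1, c1))
pdf_1 = pd.DataFrame(rows, columns=['Year', 'occurrence', 'Count'])

# Long -> wide: one row per Year, one column per occurrence value.
# Keyword arguments are used because recent pandas versions require them.
wide = pdf_1.pivot(index='Year', columns='occurrence', values='Count')
print(wide)
```

After the pivot, each occurrence value is its own column, so pandas draws one series per column instead of one bar per row.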

1 Comment

Thanks @BigBen for the quick answer! Also, do you know how I can plot a line chart and a bar chart (2 bars per year) under the same condition?

Following @BigBen's answer, I figured out all three:

# For Question 1: grouped bars (two bars per year)
pdf_1.pivot(index='Year', columns='occurrence', values='Count').plot.bar()

# For Question 2: stacked bars
pdf_1.pivot(index='Year', columns='occurrence', values='Count').plot.bar(stacked=True)

# For Question 3: line chart with one line per occurrence value
pdf_1.pivot(index='Year', columns='occurrence', values='Count').plot.line()
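As a rough, self-contained sketch of the three charts side by side, the following uses a shortened copy of the sample data and pivots once, then reuses the result. The `Agg` backend, the subplot layout, the titles, and the name `wide` are assumptions added for illustration and are not part of the original answer.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Shortened copy of the question's data (first three years only).
pdf_1 = pd.DataFrame({
    'Year': [2011, 2011, 2012, 2012, 2013, 2013],
    'occurrence': [0, 1, 0, 1, 0, 1],
    'Count': [306, 1838, 422, 1816, 423, 3471],
})

# Pivot once and reuse the wide frame for every chart.
wide = pdf_1.pivot(index='Year', columns='occurrence', values='Count')

fig, axes = plt.subplots(1, 3, figsize=(15, 4))
wide.plot.bar(ax=axes[0], title='Grouped bars')               # Question 1
wide.plot.bar(stacked=True, ax=axes[1], title='Stacked bars') # Question 2
wide.plot.line(ax=axes[2], marker='o', title='Lines')         # Question 3

axes[2].set_xticks(wide.index)  # show whole years on the line chart
for ax in axes:
    ax.set_ylabel('Count')
fig.tight_layout()
```

Call `fig.savefig(...)` with a path of your choice, or `plt.show()` under an interactive backend, to view the result.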
