1

Hello, I have this dataset:

import pandas as pd 

# initialise data as a dict of lists.
data = {'Year':['2017', '2018', '2018', '2019'],'Month':['1', '1', '2', '3'],'Outcome':['dead', 'alive', 'alive', 'empty'], 'outcome_count':[20, 21, 19, 18]} 

# Create DataFrame 
dfy = pd.DataFrame(data) 

# Print the output. 
print(dfy)

I want to plot Outcome against a period made up of month and year. Month and year are in different columns, so how can I combine them to get a graph of the outcome counts against month and year? The legend should show the outcome names.

2
  • This may be helpful. Commented Feb 5, 2020 at 6:24
  • @LivingstoneM How did my suggestion work out for you? Commented Feb 5, 2020 at 11:11

2 Answers 2

2

You can create a new column of datetimes with to_datetime by passing it a 3-column DataFrame with Year, Month and Day columns, and then convert those to monthly periods with Series.dt.to_period:

dfy['dates'] = pd.to_datetime(dfy[['Year','Month']].assign(Day=1))
dfy['per'] = dfy['dates'].dt.to_period('M')
print(dfy)
   Year Month Outcome  outcome_count      dates      per
0  2017     1    dead             20 2017-01-01  2017-01
1  2018     1   alive             21 2018-01-01  2018-01
2  2018     2   alive             19 2018-02-01  2018-02
3  2019     3   empty             18 2019-03-01  2019-03
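Since Year and Month are stored as strings here, another way to build the same columns is to join them into a single 'YYYY-MM' string first. This is only a sketch of an alternative, not part of the approach above; the intermediate name ym is made up for illustration, and the month is zero-padded so that it matches the format string exactly.

```python
import pandas as pd

data = {'Year': ['2017', '2018', '2018', '2019'],
        'Month': ['1', '1', '2', '3'],
        'Outcome': ['dead', 'alive', 'alive', 'empty'],
        'outcome_count': [20, 21, 19, 18]}
dfy = pd.DataFrame(data)

# 'ym' is a hypothetical helper: year and zero-padded month joined as 'YYYY-MM'
ym = dfy['Year'] + '-' + dfy['Month'].str.zfill(2)

# parse the joined strings; each date falls on the first day of the month
dfy['dates'] = pd.to_datetime(ym, format='%Y-%m')

# monthly periods derived from the datetimes
dfy['per'] = dfy['dates'].dt.to_period('M')
```

Either route gives the same dates and per columns, so the choice is mostly a matter of taste.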

Then it is possible to plot with either the periods or the datetimes:

dfy.plot(x='per', y='outcome_count')
dfy.plot(x='dates', y='outcome_count')
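To get one line per outcome, with the outcome names in the legend, one option is to reshape the data so that each outcome becomes its own column. The following is a minimal sketch under that assumption; the names wide and plot_outcomes are made up for illustration. With this small dataset most cells are NaN after the reshape, so markers are switched on to keep single points visible.

```python
import pandas as pd

data = {'Year': ['2017', '2018', '2018', '2019'],
        'Month': ['1', '1', '2', '3'],
        'Outcome': ['dead', 'alive', 'alive', 'empty'],
        'outcome_count': [20, 21, 19, 18]}
dfy = pd.DataFrame(data)

dfy['dates'] = pd.to_datetime(dfy[['Year', 'Month']].assign(Day=1))
dfy['per'] = dfy['dates'].dt.to_period('M')

# one row per period, one column per outcome; missing combinations are NaN
wide = dfy.pivot(index='per', columns='Outcome', values='outcome_count')


def plot_outcomes(table):
    """Hypothetical helper: draw one line per column, labelled in the legend."""
    # marker='o' keeps outcomes with a single observation visible
    return table.plot(marker='o')
```

Calling plot_outcomes(wide) draws the chart with matplotlib. Each column of wide becomes one legend entry, so the legend shows alive, dead and empty.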

3 Comments

How can I plot a line for each outcome on the same graph, i.e. one each for dead, alive and empty?
@LivingstoneM - Do you mean a pivot, like df = dfy.pivot(index='per', columns='Outcome', values='outcome_count') and then df.plot() ?
Or something like df = dfy.pivot(index='per', columns='Outcome', values='outcome_count').ffill() ?
1

Your dataset is very limited. Building on the approach from jezrael I'm able to produce this:

[Image: plotly figure with one trace per outcome, plotted against period]

If this is in fact what you're looking for, I can explain the details. If not, then I'm sure we'll find another approach.

Here's the code so far:

import pandas as pd 
import plotly.graph_objects as go

# initialise data as a dict of lists.
data = {'Year':['2017', '2018', '2018', '2019'],'Month':['1', '1', '2', '3'],'Outcome':['dead', 'alive', 'alive', 'empty'], 'outcome_count':[20, 21, 19, 18]} 

# Create DataFrame 
dfy = pd.DataFrame(data) 

# approach from jezrael
dfy['dates'] = pd.to_datetime(dfy[['Year','Month']].assign(Day=1))
dfy['per'] = dfy['dates'].dt.to_period('M')

# periods as string
dfy['period'] = dfy['dates'].dt.strftime('%Y-%m')

# unique outcomes
outcomes = dfy['Outcome'].unique()

# plotly setup
fig = go.Figure()

# one trace per outcome
for outcome in outcomes:
    df_plot = dfy[dfy['Outcome']==outcome]
    fig.add_trace(go.Scatter(x=df_plot['period'], y=df_plot['outcome_count'],
                             name=outcome))

fig.show()

4 Comments

@LivingstoneM I'm glad to hear that! Since jezrael has rightfully earned the acceptance mark, may I be so blunt as to guide you towards the up-vote button in my case if my contribution was helpful to you in any way?
Done so sir. Has been very useful
@LivingstoneM Thanks! Asking for upvotes is a total taboo, but I think it was in order this one time since the solution to your challenge was a clear team-effort.
@LivingstoneM Forgive me for asking, but would you mind me inviting you to chat for a brief minute?
