I would like to create a bar graph to show the number of sick people in a place each week using Python. The end date of the week is to be input by the user. The bar graph should show exactly 7 days before and including the end date even when the date is missing in the dataset.

The location is always the same and should be used as the title of the bar graph.

Below is my dataset:

import pandas as pd

end_date = '2022-10-18'

data = {'Date': ['2022-10-14', '2022-10-14', '2022-10-14', '2022-10-15', '2022-10-16', '2022-10-16', '2022-10-17'],
        'Location': ['Lion House', 'Lion House', 'Lion House', 'Lion House', 'Lion House', 'Lion House', 'Lion House']}

df = pd.DataFrame(data)

My first objective is to transform df with all the dates from 2022-10-12 to 2022-10-18 with the relevant cases thus producing a dataframe as below.

data1 = {'Date': ['2022-10-12', '2022-10-13', '2022-10-14', '2022-10-15', '2022-10-16', '2022-10-17', '2022-10-18'],
        'Count': [0, 0, 3, 1, 2, 1, 0]}

df_transform = pd.DataFrame(data1)

I know I can sum up the count using groupby and sum but I do not know how to insert the missing dates to create exactly one week and finally plot the graph.

Below is a graph I plotted using Excel.

Any help is much appreciated as I am new to Python.

Thank you.

1 Answer

You can reindex after aggregation:

start_date = '2022-10-12'

idx = pd.date_range(start_date, end_date, freq='D').astype(str)

(pd.crosstab(df['Date'], df['Location'])
   .reindex(idx, fill_value=0)
  # .plot.bar() # uncomment to see the plot
 )

Output:

Location    Lion House
2022-10-12           0
2022-10-13           0
2022-10-14           3
2022-10-15           1
2022-10-16           2
2022-10-17           1
2022-10-18           0

Graph:

bar plot
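Since the end date comes from the user, the start date does not have to be hard-coded. Here is a minimal sketch, assuming the dates are strings or Timestamps that pd.to_datetime can parse. The helper name weekly_counts and its date_col parameter are made up for illustration, and value_counts is swapped in for crosstab because only one location is present:

```python
import pandas as pd

def weekly_counts(df, end_date, date_col="Date"):
    """Count rows per day for the 7 days ending on end_date (inclusive)."""
    # Seven daily periods ending at end_date, e.g. 2022-10-12 .. 2022-10-18
    idx = pd.date_range(end=end_date, periods=7, freq="D")
    # Convert to datetimes so string and Timestamp input behave the same
    days = pd.to_datetime(df[date_col]).dt.normalize()
    # value_counts tallies rows per day; reindex adds the missing days as 0
    return days.value_counts().reindex(idx, fill_value=0).rename("Count")

# Sample data from the question, with the dates written as strings
df = pd.DataFrame({
    "Date": ["2022-10-14", "2022-10-14", "2022-10-14", "2022-10-15",
             "2022-10-16", "2022-10-16", "2022-10-17"],
    "Location": ["Lion House"] * 7,
})

counts = weekly_counts(df, "2022-10-18")
print(counts)
```

Because the index is built from end_date, rows that fall outside the week are dropped by the reindex, and days with no cases show up as 0.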

updated example

import pandas as pd
from pandas import Timestamp

d = {'Date_of_Consult': [Timestamp('2022-10-12 00:00:00'), Timestamp('2022-10-12 00:00:00'), Timestamp('2022-10-12 00:00:00'), Timestamp('2022-10-13 00:00:00'), Timestamp('2022-10-13 00:00:00')],
     'Dorm_Address': ['Shaw Lodge Dormitory', 'Shaw Lodge Dormitory', 'Shaw Lodge Dormitory', 'Shaw Lodge Dormitory', 'Shaw Lodge Dormitory']}
df = pd.DataFrame(d)

start_date = '2022-10-12'
end_date = '2022-10-18'
# The column already holds Timestamps, so keep a DatetimeIndex (no .astype(str))
idx = pd.date_range(start_date, end_date, freq='D')

(pd.crosstab(df['Date_of_Consult'], df['Dorm_Address'])
   .reindex(idx, fill_value=0)
  # .plot.bar() # uncomment to see the plot
 )
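
To get the location as the title, no legend, and labelled axes, one option is to select the single location column before plotting and set the labels on the returned Axes. This is only a sketch under the assumption that there is exactly one location in the data. The function name plot_week and its parameters are hypothetical:

```python
import pandas as pd
import matplotlib.pyplot as plt

def plot_week(df, end_date, date_col="Date_of_Consult", loc_col="Dorm_Address"):
    """Bar chart of daily counts for the 7 days ending on end_date."""
    idx = pd.date_range(end=end_date, periods=7, freq="D")
    table = pd.crosstab(df[date_col], df[loc_col]).reindex(idx, fill_value=0)
    location = table.columns[0]          # assumes a single location
    ax = table[location].plot.bar()      # plotting a Series draws no legend
    ax.set_title(location)
    ax.set_xlabel("Date")
    ax.set_ylabel("Count")
    # Show plain dates instead of full timestamps on the x-axis
    ax.set_xticklabels(idx.strftime("%Y-%m-%d"), rotation=45, ha="right")
    ax.figure.tight_layout()
    return ax

df = pd.DataFrame({
    "Date_of_Consult": pd.to_datetime(["2022-10-12"] * 3 + ["2022-10-13"] * 2),
    "Dorm_Address": ["Shaw Lodge Dormitory"] * 5,
})

ax = plot_week(df, "2022-10-18")
# plt.show()  # uncomment when running as a script outside interactive mode
```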

10 Comments

Thank you for your help. Apologies but my graph shows no output. I also needed the title to be lion house, no legend. And is it possible to input x-axis as date and y-axis as count?
Then slice just before plotting: ['Lion House'].plot.bar()
If the plot is not shown you might not be in interactive mode. Try import matplotlib.pyplot as plt in the preamble and plt.show() after your plot.
Well, I don't know your setup, make sure to first follow the matplotlib documentation or a tutorial and ensure you can display a plot.
Can you provide a reproducible input (yours is invalid): df.head().to_dict('list')