I have a simple dataframe with two columns, 'date' and 'amount'. I want to plot the amount using date as the x-axis. The first lines of the data are:

22/05/2018,52068.67
21/05/2018,52159.19
15/05/2018,52744.03
08/05/2018,54666.21
08/05/2018,54677.51
01/05/2018,53890.59
30/04/2018,54812.25
27/04/2018,52258.23
26/04/2018,52351.47
23/04/2018,49777.04
23/04/2018,49952.44
23/04/2018,49992.44
05/04/2018,53238.59
03/04/2018,53631.09
03/04/2018,53839.64
28/03/2018,50836.78
26/03/2018,51206.67
26/03/2018,51372.02
14/03/2018,51110.17
12/03/2018,51411.31
06/03/2018,51169.91
05/03/2018,51374.57
27/02/2018,48728.85
27/02/2018,48730.5
16/02/2018,44988.25
14/02/2018,41948.03
12/02/2018,43776.31
12/02/2018,43800.31
12/02/2018,43840.11
05/02/2018,29358.96
26/01/2018,39491.0
24/01/2018,36470.03
23/01/2018,36562.76
23/01/2018,36616.61
22/01/2018,36582.46
22/01/2018,36665.71
22/01/2018,36743.31
17/01/2018,36965.3
16/01/2018,37044.6
09/01/2018,42083.65
08/01/2018,42183.39
05/01/2018,42285.41
03/01/2018,41537.51
03/01/2018,41579.51
02/01/2018,41945.32
27/12/2017,43003.33
27/12/2017,43217.29
18/12/2017,38208.63
15/12/2017,38315.53

However, the plot gives me points that don't appear in the data. For example, in May 2018 there is no value near 30000.

[Image: the resulting plot]

My code is:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("test.csv", header=None, names =['date', 'amount'])
df['time'] = pd.to_datetime(df['date'])
df.set_index(['time'],inplace=True)
df['amount'].plot()
plt.show()

What am I doing wrong?

  • You need to sort your data by date. Commented May 23, 2018 at 19:19
  • @A.Kot The x-axis in the plot is sorted by date (see the figure in the question) so hasn't that already been done by pandas? Commented May 23, 2018 at 19:20
  • @A.Kot How can I sort by date? Commented May 23, 2018 at 19:25
  • Shame that pd.to_datetime("05/02/2018") returns Timestamp('2018-05-02 00:00:00'). Commented May 23, 2018 at 19:38
  • You can also use the "dayfirst" option: df['time'] = pd.to_datetime(df['date'], dayfirst=True) Commented May 23, 2018 at 19:41
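
The two suggestions in the comments above (the "dayfirst" option and sorting by date) can be combined roughly as follows. This is only a sketch: the four rows are copied from the question and fed in through io.StringIO as a stand-in for test.csv, so the snippet runs on its own.

```python
import io
import pandas as pd

# A few rows copied from the question; the CSV has no header row.
# io.StringIO stands in for "test.csv" so the example is self-contained.
csv_text = """22/05/2018,52068.67
08/05/2018,54666.21
05/02/2018,29358.96
03/01/2018,41537.51
"""

df = pd.read_csv(io.StringIO(csv_text), header=None, names=["date", "amount"])

# dayfirst=True tells pandas to read 05/02/2018 as 5 February, not 2 May.
df["time"] = pd.to_datetime(df["date"], dayfirst=True)

# Use the parsed dates as the index and sort chronologically before plotting.
df = df.set_index("time").sort_index()

print(df["amount"])
```

With the dates parsed day-first, the 29358.96 value lands in February rather than May, and df['amount'].plot() can then be called as in the question.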

1 Answer

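You need to convert the dates to datetime using the correct format, and then use the pandas plot method. Without an explicit format, an ambiguous date such as 05/02/2018 is read month-first (as 2 May rather than 5 February), which is why a value from February shows up in May.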

df['date'] = pd.to_datetime(df['date'], format = '%d/%m/%Y')
df.plot('date', 'amount')
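
To see the effect of the format argument in isolation, here is a small check on the single ambiguous date from the question's data. It is only an illustration; the variable names are made up for the example.

```python
import pandas as pd

ambiguous = "05/02/2018"  # 5 February 2018 in the question's data

# Default parsing reads the string month-first.
guessed = pd.to_datetime(ambiguous)

# An explicit day/month/year format removes the ambiguity.
explicit = pd.to_datetime(ambiguous, format="%d/%m/%Y")

print(guessed.date())   # 2018-05-02
print(explicit.date())  # 2018-02-05
```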

[Image: the corrected plot]

3 Comments

Nice work. That explains it: row 29, 05/02/2018, 29358.96. I never understood why the US decided to reverse day/month in dates when all other countries use dd/mm/yyyy....
@Bill, yeah, the date format does throw you off in the beginning, but as with any topic in programming, your mind gets trained to look for those problem areas.
@Parfait You don't even speak English! (colour, behaviour, programme, ...). But we're getting off topic.