I have a data frame that I'm trying to plot as a time series with seaborn's tsplot. The first issue I'm running into is that, after converting my date column from string to datetime, a day has been automatically added.

The original data frame looks like this:

zillowvisualtop5.head()
Out[161]: 
        City_State     Year     Price
0     New York, NY  1996-04  169300.0
1  Los Angeles, CA  1996-04  157700.0
2      Houston, TX  1996-04   86500.0
3      Chicago, IL  1996-04  114000.0
6      Phoenix, AZ  1996-04   88100.0

(Note that the date is in year-month format.) After I convert it to a datetime object so that I can plot it with seaborn, a day is added after the month.

zillowvisualtop5['Year'] = pd.to_datetime(zillowvisualtop5['Year'], format= '%Y-%m')
zillowvisualtop5.head()
Out[165]: 
        City_State       Year     Price
0     New York, NY 1996-04-01  169300.0
1  Los Angeles, CA 1996-04-01  157700.0
2      Houston, TX 1996-04-01   86500.0
3      Chicago, IL 1996-04-01  114000.0
6      Phoenix, AZ 1996-04-01   88100.0

The solutions I've found suggest converting with strftime, but that produces strings, and I need the column to stay in datetime format so I can plot it with seaborn.

  • I also tried converting it to period which looked like it solved the problem, however then I am unable to plot it. Commented Dec 20, 2019 at 4:40
  • You can create a new column for stringformatted date, too. Commented Dec 20, 2019 at 7:48

1 Answer

The problem you are having is that a datetime object always includes every component of a date and a time. There is no way to have a datetime object that holds only year and month information (source: here), so when the day is missing from the input, pandas fills in the first of the month.
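To make that concrete, here is a minimal sketch (the variable names are illustrative and not taken from the question). It shows that parsing a year-month string fills in day 1, and that a monthly Period is one way to display the value without the day. As the comments on the question note, a Period column may not plot directly, so treat it as a display aid only.

```python
import pandas as pd

# Illustrative year-month strings in the same format as the question's data
months = pd.Series(["1996-04", "1996-05", "1996-06"])

# Parsing fills the missing day with 1; a datetime cannot omit it
as_datetime = pd.to_datetime(months, format="%Y-%m")
print(as_datetime.dt.day.tolist())

# A monthly Period keeps only year and month, which is handy for display
as_period = as_datetime.dt.to_period("M")
print(as_period.astype(str).tolist())
```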

However, you can use the matplotlib converters like this:

import pandas as pd
from pandas.plotting import register_matplotlib_converters
import seaborn as sns

cols = ['City', 'Price', 'Month']
data = [['New York', 125, '1996-04'],
        ['New York', 79, '1996-05'],
        ['New York', 85, '1996-06'],
        ['Houston', 90, '1996-04'],
        ['Houston', 95, '1996-05'],
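        ['Houston', 127, '1996-06']]

df = pd.DataFrame(data, columns=cols)
# Parse the year-month strings into real datetimes (the day defaults to 1)
df['Month'] = pd.to_datetime(df['Month'], format='%Y-%m')
print(df)

# Register the pandas datetime converters so matplotlib can plot the column
register_matplotlib_converters()

chart = sns.lineplot(x='Month', y='Price', hue='City', data=df)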

Does that get the results you were looking for?

4 Comments

Sure, I'm trying to make a time series plot with time on the x-axis and price on the y-axis. There should be one line for each of the cities. I have done this already when my dataframe only has year. But when it's in the year-month format, I get this error while plotting: ValueError: could not convert string to float: '1996-04'
@Lee Thanks for clarifying, I hope the answer above works for you?
@Roberto Moctezuma Thank you very much for your help! I guess I was approaching the problem wrong. I am using tsplot, by the way. I was able to get a plot in the beginning without removing the automatically added day, and the plot looked really smooth; the only problem was that my x-axis label was wrong. But now that I've removed the day using the method you've described, the graph looks very step-like, which is weird, and the label is off.
Thank you very much! Your last edit works! I switched over to using lineplot instead of tsplot.
