How to disable date interpolation in matplotlib?

Question

Despite trying some solutions available on SO and at Matplotlib's documentation, I'm still unable to disable Matplotlib's creation of weekend dates on the x-axis.

As you can see see below, it adds dates to the x-axis that are not in the original Pandas column.

I'm plotting my data using (commented lines are unsuccessful in achieving my goal):

fig, ax1 = plt.subplots()

x_axis = df.index.values
ax1.plot(x_axis, df['MP'], color='k')
ax2 = ax1.twinx()
ax2.plot(x_axis, df['R'], color='r')

# plt.xticks(np.arange(len(x_axis)), x_axis)
# fig.autofmt_xdate()
# ax1.fmt_xdata = mdates.DateFormatter('%Y-%m-%d')

fig.tight_layout()
plt.show()

An example of my Pandas dataframe is below, with dates as index:

2019-01-09  1.007042  2585.898714  4.052480e+09  19.980000  12.07     1
2019-01-10  1.007465  2581.828491  3.704500e+09  19.500000  19.74     1
2019-01-11  1.007154  2588.605258  3.434490e+09  18.190001  18.68     1
2019-01-14  1.008560  2582.151225  3.664450e+09  19.070000  14.27     1

Some suggestions I've found include a custom ticker here and here however although I don't get errors the plot is missing my second series.

Any suggestions on how to disable date interpolation in matplotlib?

I suspect you frame the problem in the wrong way. Plotting tools don't interpolate, but rather show a linear axis by default. Any day is hence part of the axis, similar to how the number 2 is necessarily part of an axis ranging from 1 to 4. A solution on how to plot an axis with dates left out is shown in the matplotlib FAQ. — ImportanceOfBeingErnest
– ImportanceOfBeingErnest, Commented Jan 20, 2019 at 21:10

James Dellinger · Accepted Answer · 2019-01-24 22:24:01Z

The matplotlib site recommends creating a custom formatter class. This class will contain logic that tells the axis label not to display anything if the date is a weekend. Here's an example using a dataframe I constructed from the 2018 data that was in the image you'd attached:

df = pd.DataFrame(
data = {
 "Col 1" : [1.000325, 1.000807, 1.001207, 1.000355, 1.001512, 1.003237, 1.000979],
 "MP": [2743.002071, 2754.011543, 2746.121450, 2760.169848, 2780.756857, 2793.953050, 2792.675162],
 "Col 3": [3.242650e+09, 3.453480e+09, 3.576350e+09, 3.641320e+09, 3.573970e+09, 3.573970e+09, 4.325970e+09], 
 "Col 4": [9.520000, 10.080000, 9.820000, 9.880000, 10.160000, 10.160000, 11.660000],
 "Col 5": [5.04, 5.62, 5.29, 6.58, 8.32, 9.57, 9.53],
 "R": [0,0,0,0,0,1,1]
}, 
index=['2018-01-08', '2018-01-09', '2018-01-10', '2018-01-11',
       '2018-01-12', '2018-01-15', '2018-01-16'])

Move the dates from the index to their own column:

df = df.reset_index().rename({'index': 'Date'}, axis=1, copy=False)
df['Date'] = pd.to_datetime(df['Date'])

Create the custom formatter class:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import Formatter
%config InlineBackend.figure_format = 'retina' # Get nicer looking graphs for retina displays

class CustomFormatter(Formatter):
    def __init__(self, dates, fmt='%Y-%m-%d'):
        self.dates = dates
        self.fmt = fmt

    def __call__(self, x, pos=0):
        'Return the label for time x at position pos'
        ind = int(np.round(x))
        if ind >= len(self.dates) or ind < 0:
            return ''

        return self.dates[ind].strftime(self.fmt)

Now let's plot the MP and R series. Pay attention to the line where we call the custom formatter:

formatter = CustomFormatter(df['Date'])

fig, ax1 = plt.subplots()
ax1.xaxis.set_major_formatter(formatter)
ax1.plot(np.arange(len(df)), df['MP'], color='k')
ax2 = ax1.twinx()
ax2.plot(np.arange(len(df)), df['R'], color='r')
fig.autofmt_xdate()
fig.tight_layout()
plt.show()

The above code outputs this graph:

Now, no weekend dates, such as 2018-01-13, are displayed on the x-axis.

yep, this did the trick, thanks so much... I guess moving the 'Date' out of index and using your formatter were the crucial steps I needed

Him · Accepted Answer · 2019-01-25 17:54:13Z

2

If you would like to simply not show the weekends, but for the graph to still scale correctly matplotlib has a built-in functionality for this in matplotlib.mdates. Specifically, the WeekdayLocator pretty much solves this problem singlehandedly. It's a one line solution (the rest just fabricates data for testing). Note that this works whether or not the data includes weekends:

import matplotlib.pyplot as plt
import datetime
import numpy as np
import matplotlib.dates as mdates
from matplotlib.dates import MO, TU, WE, TH, FR, SA, SU

DT_FORMAT="%Y-%m-%d"

if __name__ == "__main__":
    N = 14
    #Fake data
    x =  list(zip([2018]*N, [5]*N, list(range(1,N+1))))
    x = [datetime.datetime(*y) for y in x]
    x = [y for y in x if y.weekday() < 5]
    random_walk_steps = 2 * np.random.randint(0, 6, len(x)) - 3
    random_walk = np.cumsum(random_walk_steps)
    y = np.arange(len(x)) + random_walk

    # Make a figure and plot everything
    fig, ax = plt.subplots()
    ax.plot(x, y)

    ### HERE IS THE BIT THAT ANSWERS THE QUESTION
    ax.xaxis.set_major_locator(mdates.WeekdayLocator(byweekday=(MO, TU, WE, TH, FR)))
    ax.xaxis.set_major_formatter(mdates.DateFormatter(DT_FORMAT))

    # plot stuff
    fig.autofmt_xdate()
    plt.tight_layout()
    plt.show()

edited Jan 25, 2019 at 17:54

answered Jan 25, 2019 at 17:43

Him

5,6024 gold badges45 silver badges98 bronze badges

1 Comment

pepe Over a year ago

thanks for chiming in @scott, I need the chart to be continuous, see solution below/above

Kyle · Accepted Answer · 2019-01-24 23:31:59Z

If you are trying to avoid the fact that matplotlib is interpolating between each point of your dataset, you can exploit the fact that matplotlib will plot a new line segment each time a np.NaN is encountered. Pandas makes it easy to insert np.NaN for the days that are not in your dataset with pd.Dataframe.asfreq():

df = pd.DataFrame(data = {
    "Col 1" : [1.000325, 1.000807, 1.001207, 1.000355, 1.001512, 1.003237, 1.000979],
    "MP": [2743.002071, 2754.011543, 2746.121450, 2760.169848, 2780.756857, 2793.953050, 2792.675162],
    "Col 3": [3.242650e+09, 3.453480e+09, 3.576350e+09, 3.641320e+09, 3.573970e+09, 3.573970e+09, 4.325970e+09],
    "Col 4": [9.520000, 10.080000, 9.820000, 9.880000, 10.160000, 10.160000, 11.660000],
    "Col 5": [5.04, 5.62, 5.29, 6.58, 8.32, 9.57, 9.53],
    "R": [0,0,0,0,0,1,1]
    },
    index=['2018-01-08', '2018-01-09', '2018-01-10', '2018-01-11', '2018-01-12', '2018-01-15', '2018-01-16'])

df.index = pd.to_datetime(df.index)

#rescale R so I don't need to worry about twinax
df.loc[df.R==0, 'R'] = df.loc[df.R==0, 'R'] + df.MP.min()
df.loc[df.R==1, 'R'] = df.loc[df.R==1, 'R'] * df.MP.max()

df = df.asfreq('D')

df
               Col 1           MP         Col 3  Col 4  Col 5            R
2018-01-08  1.000325  2743.002071  3.242650e+09   9.52   5.04  2743.002071
2018-01-09  1.000807  2754.011543  3.453480e+09  10.08   5.62  2743.002071
2018-01-10  1.001207  2746.121450  3.576350e+09   9.82   5.29  2743.002071
2018-01-11  1.000355  2760.169848  3.641320e+09   9.88   6.58  2743.002071
2018-01-12  1.001512  2780.756857  3.573970e+09  10.16   8.32  2743.002071
2018-01-13       NaN          NaN           NaN    NaN    NaN          NaN
2018-01-14       NaN          NaN           NaN    NaN    NaN          NaN
2018-01-15  1.003237  2793.953050  3.573970e+09  10.16   9.57  2793.953050
2018-01-16  1.000979  2792.675162  4.325970e+09  11.66   9.53  2793.953050

df[['MP', 'R']].plot(); plt.show()

thanks for chiming in @kyle, I need the chart to be continuous, see solution below/above

Collectives™ on Stack Overflow

How to disable date interpolation in matplotlib?

3 Answers 3

1 Comment

1 Comment

1 Comment

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

1 Comment

1 Comment

1 Comment

Your Answer

Sign up or log in

Post as a guest

Linked

Related