In the two snippets below, the only difference seems to be the data source type (pd.Series vs pd.DataFrame). Why does plt.figure(num=None, figsize=(12, 3), dpi=80) take effect in one case but not in the other when plotting with pandas?


Snippet 1 - Adjusting plot size when data is a pandas Series

# Imports
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# data
np.random.seed(123)
df = pd.Series(np.random.randn(10000),index=pd.date_range('1/1/2000', periods=10000)).cumsum()
print(type(df))

# plot
plt.figure(num=None, figsize=(12, 3), dpi=80)
ax = df.plot()
plt.show()

Output 1: (image)

Snippet 2 - Now the data source is a pandas DataFrame

# imports
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# data
np.random.seed(123)
dfx = pd.Series(np.random.randn(100),index=pd.date_range('1/1/2000', periods=100)).cumsum()
dfy = pd.Series(np.random.randn(100),index=pd.date_range('1/1/2000', periods=100)).cumsum()
df = pd.concat([dfx, dfy], axis = 1)
print(type(df))

# plot
plt.figure(num=None, figsize=(12, 3), dpi=80)
ax = df.plot()
plt.show()

Output 2: (image)

The only difference here seems to be the type of the data source. Why would that affect the matplotlib output?

1 Answer

It seems that pd.DataFrame.plot() works a bit differently from pd.Series.plot(). Since a DataFrame can have any number of columns, which might require subplots, different axes, etc., pandas defaults to creating a new figure. The way around this is to pass the arguments directly to the plot call, i.e., df.plot(figsize=(12, 3)) (dpi isn't accepted as a keyword argument, unfortunately). You can read more about it in this great answer:

In the first case, you create a matplotlib figure via fig = plt.figure(figsize=(10,4)) and then plot a single-column DataFrame. Now the internal logic of the pandas plot function is to check whether a figure is already present in the matplotlib state machine and, if so, to use its current axes to plot the column's values. This works as expected.
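To make the first case concrete, here is a minimal sketch using a Series like the one in Snippet 1 of the question. It assumes a pandas/matplotlib combination in which Series.plot() picks up the current axes when a figure already exists, which is the behaviour described above. The names s, fig, reused, width and height are only illustrative, and the Agg backend line is there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(123)
s = pd.Series(np.random.randn(100),
              index=pd.date_range("1/1/2000", periods=100)).cumsum()

# Create the figure first, then let pandas find it via the pyplot state machine.
fig = plt.figure(num=None, figsize=(12, 3), dpi=80)
ax = s.plot()

# The axes returned by pandas should belong to the figure created above,
# so the figsize set in plt.figure() is the one that applies.
reused = ax.figure is fig
width, height = fig.get_size_inches()
```

If reused is True, the Series was drawn on the pre-existing figure, which is why the size set in plt.figure() sticks in Snippet 1.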

However, in the second case the data consists of two columns. There are several ways to handle such a plot, including using different subplots with shared or non-shared axes, etc. In order for pandas to be able to apply any of those possible requirements, it will by default create a new figure to which it can add the axes to plot to. The new figure will not know about the already existing figure and its size; it will have the default size unless you specify the figsize argument.
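The second case, and two possible ways around it, can be sketched as follows. This is an illustration under the assumption that DataFrame.plot() without an ax argument opens its own figure, as described above. The variable names are made up for the example, and the Agg backend line only keeps the sketch runnable without a display. Passing ax= is an alternative to the figsize keyword that also keeps the dpi set on the figure.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(123)
idx = pd.date_range("1/1/2000", periods=100)
dfx = pd.Series(np.random.randn(100), index=idx).cumsum()
dfy = pd.Series(np.random.randn(100), index=idx).cumsum()
df = pd.concat([dfx, dfy], axis=1)

# The situation from Snippet 2: a figure exists, but DataFrame.plot() ignores it.
pre = plt.figure(num=None, figsize=(12, 3), dpi=80)
ax_default = df.plot()
made_new_figure = ax_default.figure is not pre

# Workaround 1: pass figsize straight to the pandas plot call.
ax_sized = df.plot(figsize=(12, 3))
sized = tuple(float(v) for v in ax_sized.figure.get_size_inches())

# Workaround 2: create figure and axes yourself (size and dpi), then hand the axes to pandas.
fig, ax = plt.subplots(figsize=(12, 3), dpi=80)
ax_passed = df.plot(ax=ax)
drawn_on_own_figure = ax_passed.figure is fig
```

With the second workaround, pandas draws on the axes it is given, so both figsize and dpi come from the plt.subplots() call.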

2 Comments

Thank you for answering! Your suggestion works beautifully! I'm a bit baffled that I didn't find the link you provided myself, because I sure tried. So I guess it's time to bring out the dupe hammer, although my question perhaps is a tiny bit more specific.
Glad I could help! I'm not sure it's exactly a dupe - I'll leave it to the mods to decide.