28

I'm plotting a Pandas DataFrame with a few lines, each in a specific color (specified by rgb value). I'm looking for a way to make my code more readable by assigning the plot line colors directly to DataFrame column names instead of listing them in sequence.

I know I can do this:

import pandas as pd

df = pd.DataFrame(columns=['red zero line', 'blue one line'], data=[[0, 1], [0, 1]])
df.plot(colors = ['#BB0000', '#0000BB']) # red and blue

but with a lot more than two lines, I'd really like to be able to specify the colors by column header, to make the code easy to maintain. Such as this:

df.plot(colors = {'red zero line': '#FF0000', 'blue one line': '#0000FF'})

The colors keyword can't actually be a dictionary though. (Technically it's type-converted to list, which yields a list of the column labels.)
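
As a minimal illustration of that conversion (plain Python only, not the pandas internals, which I have not traced): calling list() on a dict yields its keys, so the color values are lost and only the column labels remain.

```python
# Hypothetical mapping, mirroring the dict from the call above.
color_by_column = {'red zero line': '#FF0000', 'blue one line': '#0000FF'}

# Converting a dict to a list keeps the keys (column labels), not the colors.
as_list = list(color_by_column)
print(as_list)
```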

I understand that pd.DataFrame.plot inherits from matplotlib.pyplot.plot, but I can't find any documentation for the colors keyword. Neither method's documentation lists such a keyword.

2
  • Couldn't you just put it in a dictionary initially and then pull out the values? Commented Nov 3, 2017 at 23:39
  • I'm not sure why, but for me only the color argument to df.plot() works, not colors. Commented May 8, 2022 at 15:30

2 Answers

38

If you create a dictionary mapping the column names to colors, you can build the color list on the fly using a list comprehension where you just get the color from the column name. This also allows you to specify a default color in case you missed a column.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame([[0, 1, 2], [0, 1, 2]], 
                  columns=['red zero line', 'blue one line', 'extra'])

color_dict = {'red zero line': '#FF0000', 'blue one line': '#0000FF'}

# use get to specify dark gray as the default color.
df.plot(color=[color_dict.get(x, '#333333') for x in df.columns])
plt.show()

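
Depending on your pandas version, the dictionary may also be accepted directly. As far as I know, the color argument of DataFrame.plot takes a {column name: color} dict from pandas 1.1 onward; treat the exact version as an assumption and check the documentation for your install. The sketch below gives every column an entry, since I have not verified how a missing key is handled. The Agg backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for on-screen plots
import pandas as pd

df = pd.DataFrame([[0, 1], [0, 1]],
                  columns=['red zero line', 'blue one line'])

color_dict = {'red zero line': '#FF0000', 'blue one line': '#0000FF'}

# Assumes a pandas version whose plot() accepts a {column name: color} dict.
ax = df.plot(color=color_dict)

# Read the colors back from the plotted lines to confirm the pairing.
line_colors = {line.get_label(): line.get_color() for line in ax.get_lines()}
print(line_colors)
```

If this raises an error on an older pandas, the list comprehension above remains the portable option.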

12

You can specify the order of the columns before plotting with df[cols]:

import pandas as pd

cols = ['red zero line', 'blue one line', 'green two line']
colors = ['#BB0000', '#0000BB', 'green']
df = pd.DataFrame(columns=cols, data=[[0, 1, 2], [0, 1, 2], [0, 1, 3]])

df[cols].plot(color=colors)  # recent pandas versions expect color, not colors

[Image: example plot]

If you want to be sure columns and colors are strictly paired, you can always just zip ahead of time:

# list() keeps the pairs reusable; a bare zip object is exhausted after one pass
columns_and_colors = list(zip(cols, colors))
df[cols].plot(color=[cc[1] for cc in columns_and_colors])
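
One caveat with zip is that it silently stops at the shorter input, so a missing color would go unnoticed. A minimal sketch of a stricter pairing, assuming you want a mismatch to fail loudly (the names color_by_column and ordered_colors are illustrative, not part of pandas):

```python
cols = ['red zero line', 'blue one line', 'green two line']
colors = ['#BB0000', '#0000BB', 'green']

# zip() truncates to the shorter input, so check the lengths explicitly.
if len(cols) != len(colors):
    raise ValueError("each column needs exactly one color")

# Pair each column with its color, then rebuild the list in column order.
color_by_column = dict(zip(cols, colors))
ordered_colors = [color_by_column[c] for c in cols]
print(ordered_colors)
```

With the DataFrame from above, this would be used as df[cols].plot(color=ordered_colors).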

2 Comments

Note: as of 2020, pandas warns that colors is deprecated in plot ("please use color").
As of 2021, it should be color instead of colors, i.e. df.plot(color=['blue', 'red', 'yellow']).
