I have the following data frame my_df:

        my_1     my_2     my_3
--------------------------------
0         5       7        4
1         3       5       13
2         1       2        8
3        12       9        9
4         6       1        2

I want to make a plot where the x-axis shows the categorical values my_1, my_2, and my_3, and the y-axis is integer-valued. For each column in my_df, I want to plot all 5 of its values at x = my_i. What kind of plot should I use in matplotlib? Thanks!

  • Asking for charts can be the worst because you know what you want in your head but you don't have a good way to communicate that to us. However, you can draw a picture on a piece of paper and take a photo with your phone and load that up to the question. Unless of course, unutbu has already answered your question. (-: Commented Aug 31, 2017 at 19:28

1 Answer

You could make a bar chart:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'my_1': [5, 3, 1, 12, 6], 'my_2': [7, 5, 2, 9, 1], 'my_3': [4, 13, 8, 9, 2]})

df.T.plot(kind='bar')
plt.show()

or a scatter plot:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'my_1': [5, 3, 1, 12, 6], 'my_2': [7, 5, 2, 9, 1], 'my_3': [4, 13, 8, 9, 2]})

fig, ax = plt.subplots()
cols = np.arange(len(df.columns))
x = np.repeat(cols, len(df))
y = df.values.ravel(order='F')
color = np.tile(np.arange(len(df)), len(df.columns))
scatter = ax.scatter(x, y, s=150, c=color)
ax.set_xticks(cols)
ax.set_xticklabels(df.columns)
cbar = plt.colorbar(scatter)
cbar.set_ticks(np.arange(len(df)))
plt.show()
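
To see why the scatter code above puts each value over the right column, it may help to look at the intermediate arrays on their own. This is only an illustrative sketch that reuses the same df and variable names as the answer; nothing is plotted here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'my_1': [5, 3, 1, 12, 6],
                   'my_2': [7, 5, 2, 9, 1],
                   'my_3': [4, 13, 8, 9, 2]})

# One integer position per column: 0, 1, 2
cols = np.arange(len(df.columns))

# Repeat each column position once per row, so x has 15 entries:
# five 0s, then five 1s, then five 2s
x = np.repeat(cols, len(df))

# order='F' flattens column by column, so y lists all of my_1,
# then all of my_2, then all of my_3, lining up with x
y = df.values.ravel(order='F')

# Row number of each point (0..4, repeated for every column);
# this is what drives the colour of each dot
color = np.tile(np.arange(len(df)), len(df.columns))

for xi, yi, ci in zip(x, y, color):
    print(xi, yi, ci)
```

With the default row-major ravel() the values would be interleaved across columns and would no longer match x, which is why order='F' matters here.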

Just for fun, here is how to make the same scatter plot using Pandas' df.plot:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'my_1': [5, 3, 1, 12, 6], 'my_2': [7, 5, 2, 9, 1], 'my_3': [4, 13, 8, 9, 2]})

columns = df.columns
index = df.index
df = df.stack()
df.index.names = ['color', 'column']
df = df.rename('y').reset_index()
df['x'] = pd.Categorical(df['column']).codes
ax = df.plot(kind='scatter', x='x', y='y', c='color', colorbar=True,
             cmap='viridis', s=150)
ax.set_xticks(np.arange(len(columns)))
ax.set_xticklabels(columns)
cbar = ax.collections[-1].colorbar
cbar.set_ticks(index)
plt.show()

Unfortunately, this requires quite a bit of DataFrame manipulation just to call df.plot, and some extra matplotlib calls are still needed to set the tick marks on the scatter plot and colorbar. Since Pandas is not saving effort here, I would go with the first scatter-plot approach (NumPy/matplotlib) shown above.
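
One more variant, not part of the original answer: reasonably recent matplotlib versions can take strings as x values and treat them as categories, which would remove the manual set_xticks/set_xticklabels step. The sketch below assumes that categorical support is available in your matplotlib; scatter_by_column is a hypothetical helper name chosen for this example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'my_1': [5, 3, 1, 12, 6],
                   'my_2': [7, 5, 2, 9, 1],
                   'my_3': [4, 13, 8, 9, 2]})


def scatter_by_column(df, ax=None):
    """Hypothetical helper: one dot per value, placed above its column name."""
    if ax is None:
        _, ax = plt.subplots()

    # Long format: a 'column' name and a 'y' value for every cell,
    # listed column by column
    long = df.melt(var_name='column', value_name='y')

    # Row number of each point, used only for colouring
    row = np.tile(np.arange(len(df)), len(df.columns))

    # Strings as x: matplotlib maps each distinct name to its own tick
    # (assumes a matplotlib version with categorical axis support)
    points = ax.scatter(long['column'].tolist(), long['y'].to_numpy(),
                        s=150, c=row)

    cbar = ax.figure.colorbar(points, ax=ax)
    cbar.set_ticks(np.arange(len(df)))
    return ax
```

Calling scatter_by_column(df) followed by plt.show() should give the same kind of picture as the NumPy/matplotlib scatter plot, with the column names as tick labels.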

1 Comment

I would prefer each number to be a dot instead of a bar. Is that possible?
