
I have a dataset with about 9800 entries. One column contains user names (about 60 individual user names). I want to generate a scatter plot in matplotlib and assign different colors to different users.

This is basically what I do:

import matplotlib.pyplot as plt
import pandas as pd

x = [5, 10, 20, 30, 5, 10, 20, 30, 5, 10, 20, 30]
y = [100, 100, 200, 200, 300, 300, 400, 400, 500, 500, 600, 600]
users =['mark', 'mark', 'mark', 'rachel', 'rachel', 'rachel', 'jeff', 'jeff', 'jeff', 'lauren', 'lauren', 'lauren']

# this is basically what the dataframe looks like
df = pd.DataFrame(dict(x=x, y=y, users=users))

# I then add a colors column to the df manually
# I'll just do it the easy, albeit slow, way here

colors =['red', 'red', 'red', 'green', 'green', 'green', 'blue', 'blue', 'blue', 'yellow', 'yellow', 'yellow']

# this is the dataframe I use for plotting
df1 = pd.DataFrame(dict(x=x, y=y, users=users, colors=colors))

plt.scatter(df1.x, df1.y, c=df1.colors, alpha=0.5)
plt.show()

However, I don't want to assign colors to the users manually. I have to do this many times in the coming weeks and the users are going to be different every time.

I have two questions:

(1) Is there a way to assign colors automatically to the individual users? (2) If so, is there a way to assign a color scheme or palette?

  • Possible duplicate of Scatter plots in Pandas/Pyplot: How to plot by category Commented Jan 4, 2017 at 14:44
  • @tom I don't think so. I need a way to assign a color column to the data frame dynamically. The question you suggest relates to grouped plots and not the color. Commented Jan 4, 2017 at 14:47

1 Answer

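user_colors = {}
unique_users = sorted(set(users))  # sorted so the mapping is the same on every run
step_size = (256**3) // len(unique_users)
for i, user in enumerate(unique_users):
    # zero-pad to six hex digits so every value is a valid '#rrggbb' color
    user_colors[user] = '#{:06x}'.format(step_size * i)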

This gives you a dictionary (user_colors) that maps each user to one unique color.

colors = [user_colors[user] for user in users]

Now you have a list with one color per row, in the same order as users, and a distinct color for each user. You can pass it to plt.scatter as the c argument.
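For the second part of the question (a color scheme or palette), one possible approach is to sample a matplotlib colormap at evenly spaced positions instead of stepping through raw hex values. The following is only a sketch: the choice of 'viridis' and the name palette_colors are assumptions for illustration, not something from the question.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

users = ['mark', 'mark', 'mark', 'rachel', 'rachel', 'rachel',
         'jeff', 'jeff', 'jeff', 'lauren', 'lauren', 'lauren']

unique_users = sorted(set(users))

# 'viridis' is an arbitrary choice; any registered colormap name should work
cmap = plt.get_cmap('viridis')

# sample the colormap at evenly spaced positions between 0 and 1
denominator = max(len(unique_users) - 1, 1)  # avoid dividing by zero for a single user
palette_colors = {
    user: to_hex(cmap(i / denominator))
    for i, user in enumerate(unique_users)
}

# one color per data point, in the same order as users
colors = [palette_colors[user] for user in users]
```

The resulting colors list can be passed to plt.scatter through the c argument, exactly like the manually built list in the question. Swapping the colormap name changes the whole palette in one place.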


1 Comment

Thank you! I think I understand what you do. However, can I apply it to a pandas data frame as well? How would that work?
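One way this might carry over to a pandas data frame is to build the dictionary from the users column and then map that column onto it. This is a minimal sketch assuming the data frame from the question with a column named users. The zero-padded format string is an adjustment so that every value is a valid six-digit hex color.

```python
import pandas as pd

df = pd.DataFrame(dict(
    x=[5, 10, 20, 30, 5, 10, 20, 30, 5, 10, 20, 30],
    y=[100, 100, 200, 200, 300, 300, 400, 400, 500, 500, 600, 600],
    users=['mark', 'mark', 'mark', 'rachel', 'rachel', 'rachel',
           'jeff', 'jeff', 'jeff', 'lauren', 'lauren', 'lauren'],
))

# distinct user names taken straight from the column, sorted for a stable mapping
unique_users = sorted(df['users'].unique())
step_size = (256 ** 3) // len(unique_users)

# one '#rrggbb' string per user
user_colors = {
    user: '#{:06x}'.format(step_size * i)
    for i, user in enumerate(unique_users)
}

# look up each row's user in the dictionary and store the result as a new column
df['colors'] = df['users'].map(user_colors)
```

After this, plt.scatter(df.x, df.y, c=df.colors, alpha=0.5) should work the same way as with the manually filled colors column.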
