I am using Python and Plotly to draw a bar plot of the mean rating for certain categories in my data set. The bar chart is nearly how I want it; however, I would like to set the color of each individual bar, and I can't find a clear explanation online of how to do this.

Data Set

import pandas as pd
from pandas import Timestamp
df = pd.DataFrame({'id': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5},
 'overall_rating': {0: 5, 1: 4, 2: 5, 3: 5, 4: 4},
 'user_name': {0: 'member1365952',
  1: 'member465943',
  2: 'member665924',
  3: 'member865886',
  4: 'member1065873'},
 'date': {0: Timestamp('2022-02-03 00:00:00'),
  1: Timestamp('2022-02-03 00:00:00'),
  2: Timestamp('2022-02-02 00:00:00'),
  3: Timestamp('2022-02-01 00:00:00'),
  4: Timestamp('2022-02-01 00:00:00')},
 'comments': {0: 'Great campus. Library is always helpful. Sport course has been brill despite Civid challenges.',
  1: 'Average facilities and student Union. Great careers support.',
  2: 'Brilliant university, very social place with great unions.',
  3: 'Overall it was very good and the tables and chairs for discussion sessions worked very well',
  4: 'Uni is nice and most of the staff are amazing. Facilities (particularly the library) could be better'},
 'campus_facilities_rating': {0: 5, 1: 3, 2: 5, 3: 4, 4: 4},
 'clubs_societies_rating': {0: 5, 1: 3, 2: 4, 3: 4, 4: 4},
 'students_union_rating': {0: 4, 1: 3, 2: 5, 3: 5, 4: 5},
 'careers_service_rating': {0: 5, 1: 5, 2: 5, 3: 5, 4: 3},
 'wifi_rating': {0: 5, 1: 5, 2: 5, 3: 5, 4: 3}})

Code Used

import plotly.express as px

# Plot the mean rating for each category
fig = px.bar(df, y=[df.campus_facilities_rating.mean(), df.clubs_societies_rating.mean(),
                    df.students_union_rating.mean(), df.careers_service_rating.mean(), df.wifi_rating.mean()],
                x=['Campus Facilities', 'Clubs & Societies', 'Students Union', 'Careers & Services', 'Wifi'],
                labels={
                    "y": "Mean Rating (1-5)",
                    "x": "Category"},
                title="Mean Rating For Different Student Categories")

fig.show()

UPDATED ATTEMPT

# Plot to find mean rating for different categories
fig = px.bar(df, y=[df.campus_facilities_rating.mean(), df.clubs_societies_rating.mean(),
                    df.students_union_rating.mean(), df.careers_service_rating.mean(), df.wifi_rating.mean()],
                x=['Campus Facilities', 'Clubs & Societies', 'Students Union', 'Careers & Services', 'Wifi'],
                labels={
                    "y": "Mean Rating (1-5)",
                    "x": "Category"},
                title="Mean Rating For Different Student Categories At The University of Lincoln",
                color_discrete_map = {
                    'Campus Facilities' : 'red',
                    'Clubs & Societies' : 'blue',
                    'Students Union' : 'pink',
                    'Careers & Services' : 'grey',
                    'Wifi' : 'orange'})

fig.update_layout(barmode = 'group')

fig.show()

The output still shows all bars in blue.

  • Can you specify how you intend to map colors to bars? Do you have a list that contains the color for each bar in this plot? Do you have cutoff bar values for different colors? Should the bar value be represented by a color? Should groups be represented by a color? Here are several of these possibilities already covered. Commented Feb 5, 2022 at 18:48
  • @Mr.T I basically want to choose which colour I assign to each x value, e.g. set Campus Facilities to red, set Students Union to blue, etc. How would I do this? Commented Feb 5, 2022 at 19:06
  • Please don't post images of code/data/error messages. Post the text directly here on SO. Nobody wants to type text from an image. Commented Feb 5, 2022 at 20:10

1 Answer

In general, you can use color_discrete_map in px.bar() to specify the color of each bar if you've defined a category such as color="medal" like this:

color_discrete_map={'gold':'yellow', 'silver':'grey', 'bronze':'brown'}


Complete code for general approach with data sample:

import plotly.express as px

long_df = px.data.medals_long()

fig = px.bar(long_df, x="nation", y="count", color="medal", title="color_discrete_map={'gold':'yellow', 'silver':'grey', 'bronze':'brown'}",
            color_discrete_map={'gold':'yellow', 'silver':'grey', 'bronze':'brown'})

fig.update_layout(barmode = 'group')

fig.show()

Edit after OP provided data sample

With your particular dataset and structure, you can't directly apply color='category', since the different categories are spread across several columns: campus_facilities_rating, clubs_societies_rating, students_union_rating, careers_service_rating and wifi_rating.

One way to reach your goal is to use go.Figure() and fig.add_traces(), but since you seem most interested in px.bar(), we'll stick to plotly.express. In short, go.Figure() requires no particular data wrangling to get what you want, but setting up the figure is a bit messier. With plotly.express and px.bar(), the exact opposite is true: once we've made some changes to your dataset, all you need to build the figure is the following snippet:

fig = px.bar(dfg, x = 'category', y = 'value',
             color = 'category',
             category_orders = {'category':['Campus Facilities','Clubs & Societies','Students Union','Careers & Services','Wifi']},
             color_discrete_map = {'Campus Facilities' : 'red',
                                    'Clubs & Societies' : 'blue',
                                    'Students Union' : 'pink',
                                    'Careers & Services' : 'grey',
                                    'Wifi' : 'orange'})


Complete code with all data wrangling steps:

from pandas import Timestamp
import plotly.express as px
import pandas as pd
df = pd.DataFrame({'id': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5},
              
 'overall_rating': {0: 5, 1: 4, 2: 5, 3: 5, 4: 4},
 'user_name': {0: 'member1365952',
  1: 'member465943',
  2: 'member665924',
  3: 'member865886',
  4: 'member1065873'},
 'date': {0: Timestamp('2022-02-03 00:00:00'),
  1: Timestamp('2022-02-03 00:00:00'),
  2: Timestamp('2022-02-02 00:00:00'),
  3: Timestamp('2022-02-01 00:00:00'),
  4: Timestamp('2022-02-01 00:00:00')},
 'comments': {0: 'Great campus. Library is always helpful. Sport course has been brill despite Civid challenges.',
  1: 'Average facilities and student Union. Great careers support.',
  2: 'Brilliant university, very social place with great unions.',
  3: 'Overall it was very good and the tables and chairs for discussion sessions worked very well',
  4: 'Uni is nice and most of the staff are amazing. Facilities (particularly the library) could be better'},
 'campus_facilities_rating': {0: 5, 1: 3, 2: 5, 3: 4, 4: 4},
 'clubs_societies_rating': {0: 5, 1: 3, 2: 4, 3: 4, 4: 4},
 'students_union_rating': {0: 4, 1: 3, 2: 5, 3: 5, 4: 5},
 'careers_service_rating': {0: 5, 1: 5, 2: 5, 3: 5, 4: 3},
 'wifi_rating': {0: 5, 1: 5, 2: 5, 3: 5, 4: 3}})

df.columns = ['id', 'overall_rating', 'user_name', 'date', 'comments', 'Campus Facilities',
              'Clubs & Societies','Students Union','Careers & Services','Wifi']

dfm = pd.melt(df, id_vars=['id', 'overall_rating', 'user_name', 'date', 'comments'],
              value_vars=list(df.columns[5:]),
              var_name ='category')

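# Average only the 'value' column; newer pandas versions raise an error
# when mean() is applied to text columns such as user_name and comments
dfg = dfm.groupby(['category'])['value'].mean().reset_index()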

fig = px.bar(dfg, x = 'category', y = 'value', color = 'category',
             category_orders = {'category':['Campus Facilities','Clubs & Societies','Students Union','Careers & Services','Wifi']},
             color_discrete_map = {
                    'Campus Facilities' : 'red',
                    'Clubs & Societies' : 'blue',
                    'Students Union' : 'pink',
                    'Careers & Services' : 'grey',
                    'Wifi' : 'orange'})

fig.update_yaxes(title = 'Mean rating (1-5)')
fig.show()

Appendix: Why dfm and dfg?

px.bar(color = 'category') assigns one color to each unique value in a series or pandas column named 'category'. But the categories we're interested in are spread across several columns of your dataframe. So what

dfm = pd.melt(df, id_vars=['id', 'overall_rating', 'user_name', 'date', 'comments'],
              value_vars=list(df.columns[5:]),
              var_name ='category')

does is take the five rating columns ('Campus Facilities', 'Clubs & Societies', 'Students Union', 'Careers & Services' and 'Wifi') and stack their names into a single column named category, with the ratings themselves in a column named value.

That is still the raw data, though, and what you want is the mean rating of each category in that column. And that is what

dfm.groupby(['category'])['value'].mean().reset_index()

gives us: one row per category, with its mean rating in the value column.

Take a look at pd.melt() and df.groupby() for further details.
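To make the two steps concrete, here is a small sketch on a made-up three-row frame with just two rating columns. The names wide, long_df and means are illustrative and the numbers are not the question's full data.

```python
import pandas as pd

# Tiny wide-format frame: one column per category
wide = pd.DataFrame({
    'id': [1, 2, 3],
    'Wifi': [5, 5, 3],
    'Students Union': [4, 3, 5],
})

# melt stacks the two rating columns: their names go into 'category',
# their numbers into the default 'value' column
long_df = pd.melt(wide, id_vars=['id'],
                  value_vars=['Wifi', 'Students Union'],
                  var_name='category')

# One row per category with the mean of its ratings
means = long_df.groupby(['category'])['value'].mean().reset_index()

print(long_df)
print(means)
```

long_df ends up with six rows (three per category) and the columns id, category and value, while means has one row for each of the two categories.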


13 Comments

Can you link to color_discrete_map in the docs? When looking for something like a ListedColorMap or a keyword that takes a list, I always ended up here.
@Mr.T Sure! You'll find a little info on color_discrete_map on that exact page just a little bit down under Directly Mapping Colors to Data Values
@Mr.T The Plotly docs have got a bunch of little goodies about color spread around all over the place. I tried to gather the most important of them in the post Plotly: How to define colors in a figure using plotly.graph_objects and plotly.express?
Darn it. I was the entire time on this page, and now that I was looking for the link, I accidentally found the right one but didn't read it. Thanks.
@vestland I have updated the dataset could you take a look now please
