
I have defined a function that computes value_counts for each non-numeric column and reports the Count and Percentage % of each value:

import pandas as pd
import seaborn as sns
import numpy as np

from IPython.display import display


df = sns.load_dataset("diamonds")

def valueCountDF(df):
    
    object_cols = list(df.select_dtypes(exclude=np.number).columns)
    numeric_cols = list(df.select_dtypes(include=np.number).columns)

    c = df[object_cols].apply(lambda x: x.value_counts(dropna=False)).T.stack().astype(int)

    p = (df[object_cols].apply(lambda x: x.value_counts(normalize=True,
                                                       dropna=False)).T.stack() * 100).round(2)

    cp = pd.concat([c,p], axis=1, keys=["Count", "Percentage %"])
    # Return the table so it can be displayed and styled outside the function
    return cp

cp = valueCountDF(df)
display(cp)

This code outputs:

                   Count  Percentage %
cut     Fair        1610          2.98
        Good        4906          9.10
        Ideal      21551         39.95
        Premium    13791         25.57
        Very Good  12082         22.40
color   D           6775         12.56
        E           9797         18.16
        F           9542         17.69
        G          11292         20.93
        H           8304         15.39
        I           5422         10.05
        J           2808          5.21
clarity I1           741          1.37
        IF          1790          3.32
        SI1        13065         24.22
        SI2         9194         17.04
        VS1         8171         15.15
        VS2        12258         22.73
        VVS1        3655          6.78
        VVS2        5066          9.39

For large datasets, this output is hard to read in a Jupyter Notebook with a white background.

So I want to use the pandas DataFrame Styler to give each row-index group its own background color.

# Uses the full color range
display(cp.style.background_gradient(cmap='viridis'))

This applies the background gradient to the data cells only, not to the index. I need to color each level-0 index label (cut, color, clarity) together with the values in its group.

Precisely, I want each group to have its own color: cut and its values in one color, color and its values in another, and so on. Is there a way to do this?

Update:

Thanks to @r-beginners, I am now using the CSS table styles below:

table_css = [
    {
        "selector":"th.row_heading.level0",
        "props":[
            ("background-color", "darkseagreen"),
            ("color", "white")
        ]
    }
]
def valueCountDF(df):
    
    object_cols = list(df.select_dtypes(exclude=np.number).columns)
    numeric_cols = list(df.select_dtypes(include=np.number).columns)

    c = df[object_cols].apply(lambda x: x.value_counts(dropna=False)).T.stack().astype(int)

    p = (df[object_cols].apply(lambda x: x.value_counts(normalize=True,
                                                   dropna=False)).T.stack() * 100).round(2)

    cp = pd.concat([c,p], axis=1, keys=["Count", "Percentage %"])
    #cp.index.names = ['C3','grade']
    #print(cp.style.render())
    style = cp.style.background_gradient(cmap='viridis')
    style = style.set_table_styles(table_css)
    return style

valueCountDF(df)

With this I am able to color the level-0 index, but only with a single color for all groups.
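One possible way to extend the single-selector approach is to generate one selector per row, so that each level-0 group gets its own color. The sketch below is a minimal illustration, not a definitive solution: the helper name group_header_styles and the palette are hypothetical, and it assumes the Styler emits header cells with classes of the form row_heading, level<k> and row<m>, which can be checked by printing cp.style.to_html() (or render() on older pandas versions).

```python
from itertools import cycle


def group_header_styles(index_tuples,
                        palette=("darkseagreen", "steelblue", "indianred")):
    """Build set_table_styles entries that color each level-0 group differently.

    index_tuples: (level0, level1) labels in display order, e.g. list(cp.index).
    palette:      hypothetical color choices; cycled if there are more groups.
    """
    styles = []
    color_cycle = cycle(palette)
    current_group, color = object(), None  # sentinel: no group seen yet
    for row, (group, _label) in enumerate(index_tuples):
        if group != current_group:
            # First row of a new group: pick the next color and style the
            # level-0 header cell, which carries this row's class.
            current_group, color = group, next(color_cycle)
            styles.append({
                "selector": f"th.row_heading.level0.row{row}",
                "props": [("background-color", color), ("color", "white")],
            })
        # Every row has its own level-1 header cell.
        styles.append({
            "selector": f"th.row_heading.level1.row{row}",
            "props": [("background-color", color), ("color", "white")],
        })
    return styles
```

Assuming cp still has its two-level index, the result could be passed in place of table_css, for example cp.style.background_gradient(cmap='viridis').set_table_styles(group_header_styles(list(cp.index))).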

Comments

  • Background gradients work on numbers and cannot be applied to strings. I was able to add a color to the background of the index, as you asked. I'm not very good with HTML, so this is my limit. I've given the code to Colab for reference. Commented Aug 24, 2021 at 14:01
  • @r-beginners, this is what I need, but I want to differentiate the groups by color: cut and its values in one color, color and its values in another, and so on. Commented Aug 24, 2021 at 14:44
  • Is that possible? Commented Aug 24, 2021 at 14:44
  • Hopefully you will get the answers you want. Comments with links to collaborations will be deleted. Commented Aug 26, 2021 at 8:23
  • Please try running this code to see the class names set in the CSS: print(cp.style.render()) Commented Aug 26, 2021 at 8:27

1 Answer

Load Data

import pandas as pd
import seaborn as sns
import numpy as np
from matplotlib import colors

df = sns.load_dataset("diamonds")

Update the function valueCountDF to name the index levels and reset the index:

def valueCountDF(df):
    
    object_cols = list(df.select_dtypes(exclude=np.number).columns)
    numeric_cols = list(df.select_dtypes(include=np.number).columns)

    c = df[object_cols].apply(lambda x: x.value_counts(dropna=False)).T.stack().astype(int)

    p = (df[object_cols].apply(lambda x: x.value_counts(normalize=True,
                                                       dropna=False)).T.stack() * 100).round(2)

    cp = pd.concat([c,p], axis=1, keys=["Count", "Percentage %"])

    # Reset index and name the axis
    cp = cp.rename_axis(['Variable','Class']).reset_index()
    cp['Variable'] = np.where(cp['Variable'].duplicated(),'',cp['Variable'])

    return cp

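Build the per-group styles using np.broadcast_to and radd():

def colr(x):
    # Group number: increases by one at each non-empty 'Variable' cell
    k = x['Variable'].ne("").cumsum()
    # Map each group number to a named matplotlib color
    d = dict(enumerate(colors.cnames))
    # One CSS string per row, broadcast across all columns
    css = k.map(d).radd('background-color:').to_numpy()[:, None]
    return pd.DataFrame(np.broadcast_to(css, x.shape),
                        index=x.index, columns=x.columns)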
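To see how the group number k is derived, here is a small sketch on a toy Series standing in for the 'Variable' column after its duplicates have been blanked out. The sample values are made up for illustration only.

```python
import pandas as pd

# Toy stand-in for the 'Variable' column: only the first row of each
# group keeps its label, the rest are empty strings.
variable = pd.Series(["cut", "", "color", "", ""])

# ne("") is True only on the first row of each group; cumsum() turns those
# flags into a running group number shared by every row in the group.
group_id = variable.ne("").cumsum()
print(group_id.tolist())  # [1, 1, 2, 2, 2]
```

Each group number is then looked up in the color dictionary, so all rows of one group receive the same background color.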

Apply the style

val_count = valueCountDF(df)
val_count.style.apply(colr,axis=None).format({'Percentage %': '{:.1f}'})
