Conditional formatting of pandas DataFrame columns based on column header string

Question

I want to highlight cells if they are greater than a value dependent on the column header.

I want to 'read' each column header and, if it is a key in the dictionary (CEPA_FW), look up the corresponding value. Any cells in that column that are greater than this value should then be filled dark orange. My attempt is below, but I am getting an error (ValueError: Length mismatch: Expected axis has 1 elements, new values have 4 elements).

df=pd.DataFrame(({'As':['0.001', 0, '0.001','0.06'], 'Zn': ['6','4','6','8'], 'Pb': ['0.006','0','0.005','0.005'], 'Yt': [1,0,0.002,6]}))
cols=df.columns

CEPA_FW=  {'Ag':0.05,'As' :0.05 ,'Ba':1.0,'B':1.0,'Cd' :0.01 ,'Cr' :0.05 ,'Co':0.001,'Cu' :1.0 ,'K':5.0,'Pb' :0.005 ,'Hg' :0.0002 ,'Mn':0.5,'Ni' :1.0 ,'Se':0.01,'Sn':0.5,'SO4':400.0,'Zn' :5.0}



def fill_exceedances(val):
    for header in cols:
        if header in CEPA_FW:
            for c in df[header]:
                fill = 'darkorange' if c> CEPA_FW[header] else ''
                return ['backgroundcolor: %s' % fill]

df.style.apply(fill_exceedances, axis = 1).to_excel('styled.xlsx', engine='openpyxl')

"Appropriate limit is returned": what do you mean by this?

– Rahul Agarwal, Commented Dec 19, 2018 at 9:53
@RahulAgarwal, I have edited to be clearer.

– flashliquid, Commented Dec 19, 2018 at 10:02

jezrael · Accepted Answer · 2018-12-19 10:00:57Z

Use a custom function that builds a DataFrame of style strings based on the condition:

import pandas as pd

def fill_exceedances(x):
    color = 'orange'
    # columns whose names are keys of the dict
    c = x.columns.intersection(list(CEPA_FW))
    # select those columns and rename each one to its limit
    df2 = x[c].rename(columns=CEPA_FW)
    # boolean mask: True where a cell exceeds the limit of its column
    mask = df2.astype(float).values > df2.columns.values[None, :]
    # new DataFrame filled with empty styles
    df1 = pd.DataFrame('', index=x.index, columns=c)
    # set the color where mask is True (DataFrame.mask; DataFrame.where would
    # instead style the cells that do not exceed), then add the unmatched
    # columns back with reindex
    df1 = (df1.mask(mask, 'background-color: {}'.format(color))
              .reindex(columns=x.columns, fill_value=''))

    return df1

df.style.apply(fill_exceedances, axis=None).to_excel('styled.xlsx', engine='openpyxl')
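
For comparison, here is a column-wise sketch that stays closer to the attempt in the question. It is not part of the original answer. With axis=0 or axis=1, Styler.apply calls the function once per column or row and expects one style string per cell, so a function that returns a one-element list is the likely cause of the length-mismatch error. The helper names highlight_column and style_exceedances are hypothetical, and only a subset of CEPA_FW is repeated to keep the example short.

```python
import pandas as pd

# Subset of the limits from the question, repeated so the sketch is self-contained.
CEPA_FW = {'As': 0.05, 'Pb': 0.005, 'Zn': 5.0}

df = pd.DataFrame({'As': ['0.001', 0, '0.001', '0.06'],
                   'Zn': ['6', '4', '6', '8'],
                   'Pb': ['0.006', '0', '0.005', '0.005'],
                   'Yt': [1, 0, 0.002, 6]})


def highlight_column(s, limits=CEPA_FW, color='darkorange'):
    """Return one CSS string per cell of the column s (hypothetical helper)."""
    if s.name not in limits:
        # No limit is known for this column: leave every cell unstyled.
        return [''] * len(s)
    # Convert strings such as '0.06' to numbers before comparing with the limit.
    exceeds = pd.to_numeric(s, errors='coerce') > limits[s.name]
    style = 'background-color: {}'.format(color)
    return [style if flag else '' for flag in exceeds]


def style_exceedances(frame):
    """Apply highlight_column to every column (hypothetical helper)."""
    # axis=0 passes one column at a time, so s.name is the column header.
    return frame.style.apply(highlight_column, axis=0)
```

Under these assumptions, style_exceedances(df).to_excel('styled.xlsx', engine='openpyxl') would write the styled sheet. The column-wise form is easier to read, while the axis=None version above does the whole comparison in a single vectorized step.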

edited Dec 19, 2018 at 10:00

answered Dec 19, 2018 at 9:54


13 Comments

flashliquid Over a year ago

Thanks, when I replace '>' with '<' this works perfectly. It is quite an 'involved' solution though. Is it necessary to create a separate DataFrame of just the columns that are dictionary keys? Anyway, I don't want to sound ungrateful; this is great, thank you.

jezrael Over a year ago

@flashliquid - Yes, there may be a better solution, but the advantage here is that it works with the DataFrame in the usual way. That is especially helpful for more complicated cases.

flashliquid Over a year ago

I stumbled on this answer you gave to a similar question a while back: stackoverflow.com/a/44942410/6108107 could this approach also work in my case? What would the function be?

jezrael Over a year ago

@flashliquid - Yes, it is possible to change the comparison in mask to mask = df2.astype(float).apply(lambda x: x > x.name, axis=0).values, but performance is worse on a large DataFrame.

jezrael Over a year ago

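@flashliquid - You can also change mask to mask = df2.apply(pd.to_numeric, errors='coerce').fillna(np.inf).values > df2.columns.values[None, :], which coerces non-numeric values to NaN and then replaces them with infinity, so that they count as exceedances. This needs import numpy as np.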
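
The mask variants discussed in these comments can be compared side by side. The sketch below is only an illustration: the small frame, the 'n/a' entry and the variable names are invented for the example, and only two limits from CEPA_FW are used.

```python
import numpy as np
import pandas as pd

limits = {'As': 0.05, 'Zn': 5.0}  # two of the limits from CEPA_FW

# 'n/a' is an invented non-numeric entry, used to show the coercion step.
raw = pd.DataFrame({'As': ['0.001', 'n/a', '0.06'],
                    'Zn': ['6', '4', '8']})

# Rename each column to its limit, as the answer does.
df2 = raw.rename(columns=limits)
thresholds = df2.columns.to_numpy()  # one limit per column

# Coerce text to numbers; unparseable cells become NaN, then infinity.
numeric = df2.apply(pd.to_numeric, errors='coerce').fillna(np.inf)

# Variant 1: NumPy broadcasting compares every row against the limits at once.
mask_broadcast = numeric.to_numpy() > thresholds

# Variant 2: column-wise apply, comparing each column with its renamed label.
mask_apply = numeric.apply(lambda col: col > col.name, axis=0).to_numpy()

print(mask_broadcast)
print((mask_broadcast == mask_apply).all())
```

Both variants produce the same boolean array here. The broadcast form avoids a Python-level call per column, which is consistent with the remark above that it performs better on a large DataFrame.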
