6

I am trying to conditionally format table data using Great Tables but not sure how to do it.

To highlight the color of all those cells (sort of heatmap) whose values is higher than Upper Range column.

Data:

import polars as pl

gt_sample_df = pl.DataFrame({'Test': ['Test A','Test B','Test C','Test D','Test Z','Test E','Test F','Test X',
'Test G','Test H','Test I','Test J'],
 'Lower Range': [35.3,2.5,85.0,0.0,None,3.0,200.0,None,3.0,400.0,None,7.0],
 'Upper Range': [79.5,3.5,150.0,160.0,160.0,5.0,None,200.0,5.0,1000.0,150.0,30.0],
 '2024-11-10': [43.0,3.14,135.82,162.7,None,None,206.0,None,4.76,519.52,134.4,26.88],
 '2024-08-03': [36.0,4.31,152.98,None,175.5,5.94,None,211.0,None,512.08,112.6,22.52],
 '2024-06-17': [47.0,3.38,158.94,None,182.0,4.87,None,229.0,None,550.24,115.3,23.06],
 '2024-02-01': [44.0,3.12,136.84,None,154.1,4.51,None,198.0,None,465.04,86.3,17.26],
 '2023-10-16': [45.0,3.11,140.14,None,162.0,4.6,None,207.0,None,501.44,109.3,21.86],
 '2023-05-15': [42.0,3.8,159.58,None,192.0,5.57,None,234.0,None,597.68,162.1,32.42]})

gt_sample_df

enter image description here

The various date columns in this dataframe gt_sample_df contain the results and I want to compare with the Upper Range and highlight those whose values are higher than Upper Range column. There can be n number of date columns with any date so I can't use static names for columns.

I have tried:

from great_tables import GT, md, style, loc, google_font

(GT(gt_sample_df)
 .tab_style(style=[style.text(color="Navy"), style.fill(color="PaleTurquoise")],
        locations=loc.body(columns=pl.exclude(["Test",'Lower Range','Upper Range']), 
                           rows=pl.col(lambda x:  x) > pl.col('Upper Range')))
        )
from great_tables import GT, md, style, loc, google_font

(GT(gt_sample_df)
 .tab_style(style=[style.text(color="Navy"), style.fill(color="red")],
        locations=loc.body(columns=[3:], 
                           rows=pl.col(lambda x:  x) > pl.col('Upper Range')))
        )

As I only want to highlight values greater in the date columns so I was trying to exclude first 3 columns in the column selection but it didn't work and I am not sure how to automatically compare values of all other Date columns to Upper Range column.

Update: Column selection I am able to do it but not able to select proper rows

columns_required = gt_sample_df.select(pl.exclude(["Test",'Lower Range','Upper Range'])).columns

(GT(gt_sample_df)
 .tab_style(style=[style.text(color="Navy"), style.fill(color="PaleTurquoise")],
        
        locations=loc.body(columns=columns_required, rows=pl.col(lambda x:  x) > pl.col('Upper Range')))
)
import polars.selectors as cs

(GT(gt_sample_df)
 .tab_style(style=[style.text(color="Navy"), style.fill(color="PaleTurquoise")],
        
        locations=loc.body(columns=cs.starts_with("20"), rows=pl.col(lambda x:  x) > pl.col('Upper Range')))
)
columns_required = gt_sample_df.select(pl.exclude(["Test",'Lower Range','Upper Range'])).columns

(GT(gt_sample_df)
 .tab_style(style=[style.text(color="Navy"), style.fill(color="PaleTurquoise")],
        for col_name in columns_required:
        
            locations=loc.body(columns=[col_name], 
                            rows=pl.col(col_name) > pl.col('Upper Range')))
)

this also didn't work.

Desired Output probably something like this:

enter image description here

5
  • Is this similar to stackoverflow.com/questions/79279190 Commented Dec 14, 2024 at 10:56
  • 1
    I think you have the wrong link Commented Dec 14, 2024 at 11:00
  • 1
    @jqurious I think you have posted the link of this post only. Commented Dec 14, 2024 at 11:07
  • 2
    Thanks @HenryHarbeck - it should have been stackoverflow.com/questions/79112091 Commented Dec 14, 2024 at 11:26
  • 1
    That does look pretty much the same indeed! Have linked it in my answer now Commented Dec 14, 2024 at 11:33

1 Answer 1

3

Edit: As of the last Great Tables release, this is now more natively supported with a mask argument to loc.body See the PR for further detail.

import polars as pl
import polars.selectors as cs
from great_tables import GT, loc, style

# define `gt_sample_df` as per example snippet

required_columns = gt_sample_df.drop("Test", "Lower Range", "Upper Range").columns

(
    GT(gt_sample_df)
    # `gt_sample_df.style` works here too
    .tab_style(
        style=[style.text(color="Navy"), style.fill(color="PaleTurquoise")],
        locations=loc.body(mask=pl.col(required_columns) > pl.col("Upper Range"))
        # or with a selector instead (not needing the required_columns variable)
        # locations=loc.body(
            # mask=cs.exclude("Test", "Lower Range", "Upper Range") > pl.col("Upper Range")
        # )
    )
)

Original answer: You were pretty close in your last snippet! For the locations argument to accept a list, it needs to be done in a list comprehension, or ahead of time, outside the tab_style method call.

Link to an example of this in the docs can be found here

import polars as pl
from great_tables import GT, loc, style

# define `gt_sample_df` as per example snippet

required_columns = gt_sample_df.drop("Test", "Lower Range", "Upper Range").columns

(
    GT(gt_sample_df)
    # `gt_sample_df.style` works here too
    .tab_style(
        style=[style.text(color="Navy"), style.fill(color="PaleTurquoise")],
        locations=[
            loc.body(columns=col_name, rows=pl.col(col_name) > pl.col("Upper Range"))
            for col_name in required_columns
        ],
    )
)

great table output

Edit: See this answer and this discussion on github for related solutions/discussions.

Disclaimer: I am no great tables expert, so it may be possible there is a better way to do this

Sign up to request clarification or add additional context in comments.

1 Comment

No worries, I am glad to help! @jqurious has also linked to a similar question / answer that provides more examples

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.