Is there a way to highlight rows in a pandas DataFrame based on the values in a specific column? For example:

[Image: example dataframe in which Col_4 holds different values]

As shown above, Col_4 contains different values. Is it possible to highlight each row according to the distinct value it holds in that column? Or, to make it more complex, to highlight rows based on distinct values across multiple columns?

  • You can select rows with simple or complex selections, but a dataframe is just an in-memory structure; there is no "highlighting". If you're asking how to output a CSV with preformatted cells, that is not a feature of CSVs. Other than that, I'm not sure what you are asking. Commented Sep 26, 2022 at 18:40
  • Have a look at openpyxl.styles. Regards. Commented Sep 26, 2022 at 18:47

1 Answer

With the following toy dataframe:

import pandas as pd

df = pd.DataFrame(
    {
        "col1": ["A", "F", "A", "F", "A", "A", "A", "A"],
        "col2": ["B", "B", "B", "B", "B", "G", "G", "B"],
        "col3": ["C", "H", "C", "H", "C", "I", "C", "I"],
        "col4": ["D", "E", "D", "E", "E", "D", "D", "E"],
    }
)

[Image: the toy dataframe rendered as a table]

Here is one way to do it:

# Sort values and add a unique identifier to identical rows
df = df.sort_values(["col1", "col2", "col3", "col4"]).reset_index(drop=True)
# Hash the row as a tuple so that ("AB", "C") and ("A", "BC") get different identifiers
df["hash"] = df.apply(lambda x: hash(tuple(x)), axis=1)

# Assign a randomly generated color to each identifier
import random

colors = [
    f"#{random.randint(0,255):02X}{random.randint(0,255):02X}{random.randint(0,255):02X}"
    for _ in range(df.shape[0])
]

# One color per unique identifier (there are at least as many colors as rows)
color_mapping = {}
for value in df["hash"].unique():
    color_mapping[value] = colors.pop(0)

# Color rows (run in a Jupyter notebook)
df.style.apply(
    lambda v: [f"background-color: {color_mapping.get(v['hash'], '')}"] * df.shape[1],
    axis=1,
).hide(["hash"], axis="columns")  # Styler.hide replaces hide_columns (removed in pandas 2.0)

[Image: the resulting dataframe with colored rows]
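If the goal is to color rows by the distinct values of one or several chosen columns (as in the question) rather than by fully identical rows, a variant along these lines might work. This is only a minimal sketch: `PALETTE`, `row_styles` and `highlight_rows` are names made up for illustration, and it assumes the key columns contain no missing values and that the index and column labels are unique.

```python
import pandas as pd

# Hypothetical fixed palette of light colors; any list of CSS colors would do.
PALETTE = ["#FFF2CC", "#D9EAD3", "#CFE2F3", "#F4CCCC", "#D9D2E9", "#FCE5CD"]


def row_styles(df, by):
    """Return a DataFrame of CSS strings, one background color per group of rows."""
    # ngroup() numbers each distinct combination of values in `by` as 0, 1, 2, ...
    # (in order of first appearance, because sort=False).
    group_ids = df.groupby(by, sort=False).ngroup()
    # Cycle through the palette if there are more groups than colors.
    css = group_ids.map(lambda i: f"background-color: {PALETTE[i % len(PALETTE)]}")
    # Repeat each row's CSS string across every column of that row.
    return pd.DataFrame({col: css for col in df.columns}, index=df.index)


def highlight_rows(df, by):
    """Build a Styler that colors rows by group (render it in a Jupyter notebook)."""
    return df.style.apply(row_styles, by=by, axis=None)


df = pd.DataFrame(
    {
        "col1": ["A", "F", "A", "F", "A", "A", "A", "A"],
        "col2": ["B", "B", "B", "B", "B", "G", "G", "B"],
        "col3": ["C", "H", "C", "H", "C", "I", "C", "I"],
        "col4": ["D", "E", "D", "E", "E", "D", "D", "E"],
    }
)

# CSS table for rows grouped by the value in col4 only
styles = row_styles(df, ["col4"])
```

In a notebook, `highlight_rows(df, ["col4"])` would render the colored table, and `highlight_rows(df, ["col1", "col4"])` would group on the combination of both columns. With more groups than palette entries the colors repeat, so the palette would need to be extended for larger data.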

2 Comments

Thank you. There is a small issue. It seems that if the number of unique hashes is greater than the number of provided colors, the dataframe stops being colored.
Right. I've updated my answer to deal with this issue. Cheers.
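On the same point, randomly drawn colors can come out dark or nearly identical. One possible alternative is to generate exactly as many colors as there are identifiers by spacing hues evenly. The following is a sketch only: `distinct_colors` is a hypothetical helper (not part of pandas), the lightness and saturation defaults are arbitrary choices, and for very large counts rounding could still produce duplicate colors.

```python
import colorsys


def distinct_colors(n, lightness=0.85, saturation=0.6):
    """Return n hex colors with evenly spaced hues, light enough for dark text."""
    colors = []
    for i in range(n):
        # colorsys works with floats in [0, 1]; the hue advances by 1/n each step.
        r, g, b = colorsys.hls_to_rgb(i / n, lightness, saturation)
        colors.append(f"#{round(r * 255):02X}{round(g * 255):02X}{round(b * 255):02X}")
    return colors


# Hypothetical identifiers standing in for the values of df["hash"].unique()
identifiers = ["a", "b", "c", "d", "e"]
color_mapping = dict(zip(identifiers, distinct_colors(len(identifiers))))
```

Because the number of colors is derived from the number of identifiers, the mapping can never run short.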