
I have a dataframe of the following form:

import pandas as pd
df = pd.DataFrame({'t': [0, 1, 2, 3, 4, 5, 6],
                   'l': [["c", "d"], ["a", "b"], ["c", "d"], ["a", "b"], ["c", "d"], ["c", "d"], ["c", "d"]]})

The column l consists of lists whose entries are drawn from the set {a,b,c,d}. I want to plot the contents of l for each value of t in the following manner, which shows which of the four possible values {a,b,c,d} are activated at time t:

enter image description here

To create the above plot, I built the following dataframe from df above (-1 means not activated; any non-negative value means activated and gives the row in which the point is drawn):

df_plot = pd.DataFrame({'t': [0, 1, 2, 3, 4,5,6],
                   'a': [-1, 0, -1, 0, -1,-1,-1],
                   'b': [-1, 1, -1, 1, -1,-1,-1],
                   'c': [2, -1, 2, -1, 2,2,2],
                   'd': [3, -1, 3, -1, 3,3,3]})

import numpy as np
ax = df_plot.plot(x="t", y=["a","b","c","d"],style='.', ylim=[-0.5,3.5], yticks=np.arange(0,3.1,1),legend=False)
labels = ["a","b","c","d"]
ax.set_yticklabels(labels)
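
The df_plot frame above does not have to be typed by hand. A minimal sketch, assuming the label order a, b, c, d is fixed in advance, that derives the same frame from df (the loop variable names are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'t': [0, 1, 2, 3, 4, 5, 6],
                   'l': [["c", "d"], ["a", "b"], ["c", "d"], ["a", "b"],
                         ["c", "d"], ["c", "d"], ["c", "d"]]})

# Assumption: the full set of possible labels and their order is known.
labels = ["a", "b", "c", "d"]

# For each label, store its row position (0..3) where it is active
# at time t, and -1 where it is not.
df_plot = pd.DataFrame({'t': df['t']})
for pos, label in enumerate(labels):
    df_plot[label] = [pos if label in items else -1 for items in df['l']]
```

The resulting df_plot can be passed to the same df_plot.plot(...) call as above.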

This technically gives me what I want. However, I suspect there is an easier and cleaner way to plot this. Is there a smarter way using one of Python's libraries?

  • So you want to know out of all combinations, which have been activated at the same time at some point, is that right? So {a,b},{c,d}. Or you need it for each point t? Commented Jan 30, 2019 at 9:37
  • @yatu For each point t I just want to mark which one of a,b,c or d have been activated. All possible combinations are possible, it is merely due to my laziness that the example above only has {a,b} and {c,d} Commented Jan 30, 2019 at 9:43
  • Well you will end up having some discrete representation in any case, your current solution seems fine to me. Commented Jan 30, 2019 at 9:51
  • @yatu Thanks - surprised there is no immediate way to do this automatically in any of Python's plotting libraries Commented Jan 30, 2019 at 9:52
  • @N08 If you are looking for something based just on Pandas check out my answer. Commented Jan 30, 2019 at 10:26

1 Answer

How about something like this:

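# Reshape dataframe: one row per (t, list element) pair
dff = (
    df.l.apply(pd.Series)                            # spread each list over columns 0, 1, ...
      .merge(df, right_index=True, left_index=True)  # bring t back in via the index
      .drop(["l"], axis=1)                           # the original list column is no longer needed
      .melt(id_vars=['t'], value_name="l")           # wide to long: one element per row
      .drop("variable", axis=1)                      # drop the helper column created by melt
)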

# Plot dataframe
import matplotlib.pyplot as plt
plt.scatter(dff['t'], dff['l'])
# plt.grid(True)

enter image description here

More details about what is going on in the code I wrote can be found at this link: https://mikulskibartosz.name/how-to-split-a-list-inside-a-dataframe-cell-into-rows-in-pandas-9849d8ff2401

Note: it should work no matter how many items you have in the lists in column l.
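
As an alternative sketch, not part of the original answer: pandas 0.25 and later provide DataFrame.explode, which does the same list-to-rows reshaping in one call. The name long_df is only illustrative, and the sort step is an assumption about how the plot should be ordered: as far as I know, matplotlib places string categories on the axis in order of first appearance, so sorting by label first should give the order a, b, c, d.

```python
import pandas as pd

df = pd.DataFrame({'t': [0, 1, 2, 3, 4, 5, 6],
                   'l': [["c", "d"], ["a", "b"], ["c", "d"], ["a", "b"],
                         ["c", "d"], ["c", "d"], ["c", "d"]]})

# explode gives every list element its own row and repeats
# the matching value of t for each of them.
long_df = df.explode('l')

# Sort by label, then by time, so the labels first appear in
# alphabetical order; reset the index that explode duplicated.
long_df = long_df.sort_values(['l', 't']).reset_index(drop=True)
```

Calling plt.scatter(long_df['t'], long_df['l']) on this frame draws the same kind of plot as above.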
