
I have a dataframe in which one column holds a 16-element vector (stored as a list) in each row. In the past, I have found seaborn's lineplot highly useful for regression analysis on a scalar column, but the vector column has me in a bind.

Consider a seaborn sample program:

import seaborn as sns
sns.set_theme(style="darkgrid")

# Load an example dataset with long-form data
fmri = sns.load_dataset("fmri")

# Plot the responses for different events and regions
sns.lineplot(x="timepoint", y="signal",
             hue="region", style="event",
             data=fmri)

This yields a line plot of signal against timepoint, with hue set by region and line style set by event.

If I add another column signal2 to fmri:

fmri['signal2'] = '[1,2,3,4,5,6]' 

(this is for representational purposes only)

In the dataset I have, there is a list of 16 floats in a column of the dataset. What I want to do is look at lineplot for:

sns.lineplot(x="<length of vector>", y="signal2",
             hue="region", style="event",
             data=fmri)

Basically, look at variations in the vector for different subsections of the dataset.

  • Could you add some example data and expected output/clearer explanation of what you want to achieve? See: minimal reproducible example and stackoverflow.com/questions/20109391/… Commented Mar 18, 2021 at 5:46
  • Thanks @Shaido. Apologies for not adding them earlier. I have added details to the problem now. Commented Mar 19, 2021 at 3:55
  • Maybe you could explode the list and then plot the result using lineplot? Or do you have some specific way you want to handle the different values in the lists? Commented Mar 19, 2021 at 5:55
  • Thanks for your reply @Shaido. By exploding a list, do you mean having each element as a separate column? Commented Mar 19, 2021 at 6:03
  • I was thinking of having each element as a separate row (if you have another column for the x-values). But depending on the final result that you want to achieve, separate columns could be the way to go. For your example list ([1,2,3,4,5,6]), how should that be plotted (and would other rows have the same list)? If each index will be a separate line, then expanding to different columns would be best. Commented Mar 19, 2021 at 6:10

1 Answer

I assume that you want one line for each index in the list, i.e., the values at the 0th index across all rows form a single line, the values at the 1st index form another, and so on. To do this, we first need to explode the lists while keeping track of each value's index within its list.

First, I create an example column signal2 with a list of length 6 for each row:

import numpy as np

# One array of 6 random floats per row
fmri['signal2'] = list(np.random.random((len(fmri), 6)))

Note that if the list is actually a string (as in the question), we need to convert it to an actual list first. If the column already contains lists, this is not needed.

fmri['signal2'] = fmri['signal2'].str[1:-1].str.split(',')
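As an alternative sketch for the string case, the standard library's ast.literal_eval can parse each string into a real list, so the elements come out as numbers rather than strings. The toy frame and its values below are made up for illustration and are not part of the fmri data:

```python
import ast

import pandas as pd

# Hypothetical stand-in for fmri with a string-encoded list column.
df = pd.DataFrame({
    "region": ["parietal", "frontal"],
    "signal2": ["[1,2,3,4,5,6]", "[6,5,4,3,2,1]"],
})

# literal_eval safely evaluates each string as a Python literal,
# turning "[1,2,3,4,5,6]" into the list [1, 2, 3, 4, 5, 6].
df["signal2"] = df["signal2"].apply(ast.literal_eval)

first = df.loc[0, "signal2"]
```

With this approach the elements are already numeric, whereas the slicing and splitting approach leaves them as strings until they are cast.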

Then, we explode the list and add the list indices using cumcount:

fmri = fmri.explode('signal2')
fmri['signal2'] = fmri['signal2'].astype(float)  # Needed if the elements are strings.
fmri['x'] = fmri.groupby(fmri.index).cumcount()
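To see what these steps produce, here is a small self-contained sketch on a made-up two-row frame (the frame and its values are hypothetical, not the fmri data):

```python
import pandas as pd

# Hypothetical frame: each row carries a list of three floats.
df = pd.DataFrame({
    "timepoint": [0, 1],
    "signal2": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
})

# explode gives each list element its own row; the original index
# label is repeated for every element that came from the same row.
long_df = df.explode("signal2")
long_df["signal2"] = long_df["signal2"].astype(float)

# Grouping by the repeated index and counting within each group
# recovers the position of every element inside its original list.
long_df["x"] = long_df.groupby(long_df.index).cumcount()
```

Each original row becomes three rows, and x runs 0, 1, 2 within each of them.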

To plot the data as separate lines, set the hue parameter to be the list index column while keeping the x-axis as timepoint:

sns.lineplot(x="timepoint", y="signal2", hue="x", data=fmri)

The resulting plot has one line per list index, each drawn against timepoint.
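If you would rather put the position within the vector on the x-axis, as the lineplot call in the question suggests, the same exploded data can be plotted that way. The following is only a sketch under that assumption; the helper names to_element_long_form and plot_by_element and the column name element are hypothetical and chosen for illustration:

```python
import pandas as pd


def to_element_long_form(df, column):
    """Explode a list column and record each element's position in 'element'."""
    long_df = df.explode(column)
    long_df[column] = long_df[column].astype(float)
    # Position of each value inside its original list.
    long_df["element"] = long_df.groupby(long_df.index).cumcount()
    # Drop the repeated index labels left behind by explode.
    return long_df.reset_index(drop=True)


def plot_by_element(df, column="signal2"):
    """Plot the vector values against their position, split by region and event."""
    # Imported here so the reshaping helper works without seaborn installed.
    import seaborn as sns

    long_df = to_element_long_form(df, column)
    return sns.lineplot(x="element", y=column,
                        hue="region", style="event",
                        data=long_df)
```

Calling plot_by_element(fmri) on a frame whose signal2 column holds real lists would then draw the vector values against their position, with lineplot aggregating over the rows in each region and event group.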
