I have two groups of points that overlap, so I need to add jitter when I plot them as a scatterplot. I also want to connect the matching points across the two groups (every point has a pair).

Many existing questions suggest something like this:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

data = [['abc', 'pre', 10], ['abc', 'post', 5], ['bce', 'pre', 10], ['bce', 'post', 5], ['cef', 'pre', 8], ['cef', 'post', 5]]
df = pd.DataFrame(data, columns=['ID', 'time', 'value'])
grouped = df.groupby('ID')

# Draw one pair of markers and one connecting line per ID
for name, group in grouped:
    sns.scatterplot(x='time', y='value', data=group, color='#3C74BC')
    sns.lineplot(x='time', y='value', data=group, color='#3C74BC')
plt.show()

This works, but it has no jitter. If I add jitter via sns.stripplot(), the lines no longer connect the dots; they start and end at seemingly arbitrary places.

  • That was a mistake in the question. I don't have it in my actual code. I've added a test dataset. It's a bit problematic to add scatter manually because the x-axis is categorical at the moment, but I'll try to change that. Commented Feb 2, 2023 at 16:18

1 Answer

The approach below makes the following changes:

  • Convert the time to numeric (0 for 'pre' and 1 for 'post') via (df['time'] != 'pre').astype(float)
  • Add random jitter to these values: + np.random.uniform(-0.1, 0.1, len(df)). Depending on how many values you have, you might change 0.1 to a larger value.
  • Use sns.lineplot with a marker, so a separate scatterplot is not needed.
  • Use hue='ID' to draw everything in one go.
  • Because color= is ignored when hue is used, pass palette= with as many colors as there are distinct hue values.
  • Suppress the legend, as all hue values have the same color.
  • Assign the tick labels 'pre' and 'post' to the positions 0 and 1.
  • Set xlim so that both tick labels are at the same distance from their nearest border.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

data = [['abc', 'pre', 10], ['abc', 'post', 5], ['bce', 'pre', 10], ['bce', 'post', 5], ['cef', 'pre', 8], ['cef', 'post', 5]]
df = pd.DataFrame(data, columns=['ID', 'time', 'value'])

# 'pre' -> 0.0, 'post' -> 1.0, plus a small random horizontal offset per point
df['time'] = (df['time'] != 'pre').astype(float) + np.random.uniform(-0.1, 0.1, len(df))

# One line per ID; the marker draws the dots, so no separate scatterplot is needed
ax = sns.lineplot(x='time', y='value', data=df, hue='ID', marker='o',
                  palette=['#3C74BC'] * df['ID'].nunique(), legend=False)
# Passing the labels directly to set_xticks needs matplotlib 3.5 or newer
ax.set_xticks([0, 1], ['pre', 'post'])
ax.set_xlim(-0.2, 1.2)
plt.show()

[Figure: sns.lineplot with jitter]
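
If you would rather not depend on how seaborn handles hue, the same steps can be sketched with plain matplotlib. This is only an illustration, not part of the approach above: add_jittered_x is a hypothetical helper name, and the order, width and seed parameters are assumptions. The jitter is written to a new 'x' column once, so the markers and the connecting lines use the same coordinates, and the seeded generator makes the plot reproducible.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def add_jittered_x(df, order=("pre", "post"), width=0.1, seed=0):
    """Return a copy of df with a numeric, jittered 'x' column (hypothetical helper)."""
    rng = np.random.default_rng(seed)  # seeded so the jitter is reproducible
    out = df.copy()
    # Map each category to its position on the axis: 'pre' -> 0, 'post' -> 1
    positions = {label: i for i, label in enumerate(order)}
    base = out["time"].map(positions).astype(float)
    out["x"] = base + rng.uniform(-width, width, len(out))
    return out


data = [['abc', 'pre', 10], ['abc', 'post', 5], ['bce', 'pre', 10],
        ['bce', 'post', 5], ['cef', 'pre', 8], ['cef', 'post', 5]]
df = pd.DataFrame(data, columns=['ID', 'time', 'value'])
jittered = add_jittered_x(df)

fig, ax = plt.subplots()
# One line with markers per ID; both endpoints come from the stored 'x' column
for _, pair in jittered.groupby("ID"):
    pair = pair.sort_values("x")
    ax.plot(pair["x"], pair["value"], marker="o", color="#3C74BC")
ax.set_xticks([0, 1])
ax.set_xticklabels(["pre", "post"])
ax.set_xlim(-0.2, 1.2)
```

Call plt.show() or fig.savefig(...) afterwards to render the figure. Keeping the original 'time' column untouched also means the categorical labels stay available for later grouping.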
