0

I want to draw 4 different scatter subplots in one figure. The data come from a grouped dataframe that is read from a .csv file. The initial dataframe looks like this:

df.to_csv("File.csv", index=False)

df:

     Category1  Category2    X    Y
0    A          x            4    5.1
1    B          x            3    4.2
2    A          y            2    7.1
3    A          z            9    6.1
...  ...        ...          ...  ...
97   A          z            4    5.1
98   A          w            3    4.2
99   B          y            2    7.1
100  B          z            9    6.1

As you can see, Category1 has only two distinct values (A, B), while Category2 has four (x, y, z, w). The X and Y values are random and for display purposes only.

The grouped df was created using the following command:

dfGrouped = df.groupby(["Category1", "Category2"])

dfGrouped:

X Y
A x 4 5.1
A 7 9.1
y 3 4.2
3 4.2
3 4.2
z 2 7.1
w 9 6.1
... ... ... ...
B x 4 5.1
y 3 4.2
z 2 7.1
2 7.1
w 9 6.1

I tried to plot them individually, but it didn't work:

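import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 6))
ax.margins(0.05)
# each name is a (Category1, Category2) tuple; every group is drawn on the same axes
for name, group in dfGrouped:
    ax.plot(group.X, group.Y, marker='o', linestyle='', ms=2, label=name)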

I also tried to select individual groups using get_group, but without success:

dfGrouped = dfGrouped.get_group(("A", "x"))
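
For reference, here is a minimal sketch of how get_group behaves when grouping by two columns. The small frame and the names grouped, group_names and subset_ax are made up for illustration; only the column names come from the post. It suggests that the group keys are tuples, and that get_group returns an ordinary DataFrame, so assigning the result back to dfGrouped would replace the GroupBy object.

```python
import pandas as pd

# Small hypothetical frame with the same columns as in the question
df = pd.DataFrame({
    "Category1": ["A", "B", "A", "A", "B"],
    "Category2": ["x", "x", "y", "x", "y"],
    "X": [4, 3, 2, 9, 2],
    "Y": [5.1, 4.2, 7.1, 6.1, 7.1],
})

grouped = df.groupby(["Category1", "Category2"])

# With two grouping keys, each group name is a tuple such as ("A", "x")
group_names = [name for name, _ in grouped]

# get_group returns a plain DataFrame; keep it under a new name so the
# GroupBy object itself is not overwritten
subset_ax = grouped.get_group(("A", "x"))
```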

Is there any way to plot 4 different scatter subplots (one per "Category2" value: x, y, z, w) in one figure, so that each subplot contains 2 sets of values in 2 different colors (based on "Category1": A, B)?

2
  • What variables are you interested in plotting for each group? X and Y? Commented Aug 30, 2021 at 18:35
  • @liorr that is correct! Commented Sep 3, 2021 at 19:54

2 Answers

1

You could use seaborn.relplot:

import numpy as np
import pandas as pd
import seaborn as sns
# dummy data
df = pd.DataFrame({'Category1': np.random.choice(['A','B'], size=100),
                   'Category2': np.random.choice(['w','x', 'y', 'z'], size=100),
                   'x': np.random.random(size=100),
                   'y': np.random.random(size=100),
                   })
# plot
sns.relplot(data=df, x='x', y='y', col='Category2', col_wrap=2, hue='Category1')

Output: seaborn relplot
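
If seaborn is not available, a similar layout can probably be built with plain matplotlib. The following is only a sketch under assumptions: the dummy data, the seed, the color mapping and the names rng, colors, panel and points are illustrative choices, not part of the answer above. It draws one panel per Category2 value and one scatter call per Category1 value inside each panel.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Dummy data using the column names from the question (values are random)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Category1": rng.choice(["A", "B"], size=100),
    "Category2": rng.choice(["w", "x", "y", "z"], size=100),
    "X": rng.random(100),
    "Y": rng.random(100),
})

# Arbitrary color per Category1 value (an assumption, pick any colors)
colors = {"A": "tab:blue", "B": "tab:orange"}

fig, axes = plt.subplots(2, 2, figsize=(8, 6), sharex=True, sharey=True)

# One panel per Category2 value; groupby yields the keys in sorted order
for ax, (cat2, panel) in zip(axes.flat, df.groupby("Category2")):
    # Within a panel, one scatter call per Category1 value
    for cat1, points in panel.groupby("Category1"):
        ax.scatter(points["X"], points["Y"], s=10,
                   color=colors[cat1], label=cat1)
    ax.set_title(f"Category2 = {cat2}")
    ax.legend(title="Category1")

fig.tight_layout()
```

In an interactive session, calling plt.show() afterwards displays the figure. The 2x2 grid assumes exactly four Category2 values; with more values, zip would silently drop the extra groups.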


0

Here is an alternative that uses seaborn's scatterplot in a loop over the groups:

DataFrame

       col1     val1    val2    col2
  0     A      1000     5000     w
  1     A      3000     4000     w
  2     A      7000     5000     w
  3     A      3000     4000     w
  4     A      5000     6000     x
  5     A      5000     4000     x
  6     A      5000     9000     x
  7     A      6000     10000    x
  8     B      5000     6000     y
  9     B      5000     4000     y
  10    B      5000     9000     y
  11    A      6000     10000    y
  12    A      5000     6000     z
  13    B      5000     4000     z
  14    B      5000     9000     z
  15    A      6000     10000    z

Function

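import matplotlib.pyplot as plt
import seaborn as sns

def plot_grouped_data(data):
  fig = plt.figure()
  fig.subplots_adjust(hspace=0.4, wspace=0.4)
  # one subplot per col2 value, placed on a 2x2 grid
  for i, (label, group) in enumerate(data.groupby('col2'), start=1):
    ax = fig.add_subplot(2, 2, i)
    # color the points by col1 within each subplot
    sns.scatterplot(data=group, x='val1', y='val2', hue='col1', ax=ax)
    ax.set_title(f'Title={label}')
    ax.legend(loc="upper right")
  plt.show()

# `new` is the DataFrame shown above
plot_grouped_data(new)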

Output

plot

2 Comments

It works, thanks, but isn't it CPU-intensive?
It might be, since a for loop is involved. It's just another answer!
