How to visualize a pandas DataFrame with a line graph?

Question

I have a pandas DataFrame; df.info() prints the following:

<class 'pandas.core.frame.DataFrame'>
Int64Index: 6661 entries, 0 to 6660
Data columns (total 3 columns):
value      6661 non-null float64
country    6477 non-null object
outlier    6661 non-null int64
dtypes: float64(1), int64(1), object(1)
memory usage: 208.2+ KB
None

df.columns.values prints the following:

[u'value' 'country' 'outlier']

df prints the following:

       value country  outlier
0     118.66   CHINA        0
1     120.83   CHINA        0
2      86.83     USA        0
3     112.15   CHINA        0
4     113.60   CHINA        0
5     114.32   CHINA        1
6     111.43   CHINA        0
7     117.22   CHINA        1
8     111.43   CHINA        0

- - - - - - - - - - - - - - -

6652  420.00     USA        0
6653  420.00     USA        0
6654  500.00     USA        0
6655  500.00     USA        0
6656  390.00     USA        1
6657  450.00     USA        0
6658  420.00     USA        0
6659  420.00     USA        1
6660  450.00     USA        0

A value of 1 in the outlier column marks the row as an outlier, and I would like to visualize the values for the respective countries without the outliers. I should mention that the DataFrame's index is not to be used; each country needs its own index starting from zero. To clarify, the row at DataFrame index 2 (2 86.83 USA 0) is the first row for the USA, so it becomes index 0 for the USA. Likewise, the row (3 112.15 CHINA 0) becomes index 2 for China, and so on.

I tried the following code snippet, but it didn't work as expected.

import matplotlib.pyplot as plt
df.plot.bar()
df.plot()
plt.show(block=True)

How can I do this properly?

What type of plot are you looking for? There are many ways to "visualize the value for respective countries". You must be more specific.
– DYZ, Commented Mar 10, 2017 at 4:32
Please have a look at the question. I would like a simple line graph with the values on the Y-axis and the per-country indexes on the X-axis.
– Arefe, Commented Mar 10, 2017 at 4:33

jezrael · Accepted Answer · 2017-03-10 06:42:35Z


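I think you can first filter the rows by the outlier flag and then reshape the DataFrame with pivot. The code below keeps the rows where outlier is 1; to plot the values without the outliers, as the question asks, filter on df.outlier == 0 instead: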

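# .copy() avoids a SettingWithCopyWarning when the new column is added
df = df[df.outlier == 1].copy()
# number the rows within each country: 0, 1, 2, ...
df['g'] = df.groupby('country').cumcount()

# one column per country, indexed by the per-country position
df = df.pivot(index='g', columns='country', values='value')
print(df)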
country   CHINA    USA
g                     
0        114.32  390.0
1        117.22  420.0

df.plot()
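
Since the question asks for the values without the outliers, the filter would presumably be outlier == 0 rather than 1. Here is a minimal, self-contained sketch under that assumption. The sample rows and the names sample, clean and wide are illustrative and not from the original post:

```python
import pandas as pd

# Hypothetical sample in the same shape as the DataFrame in the question.
sample = pd.DataFrame({
    "value": [118.66, 120.83, 86.83, 112.15, 114.32, 390.00, 450.00],
    "country": ["CHINA", "CHINA", "USA", "CHINA", "CHINA", "USA", "USA"],
    "outlier": [0, 0, 0, 0, 1, 1, 0],
})

# Keep only the non-outlier rows; .copy() avoids SettingWithCopyWarning below.
clean = sample[sample["outlier"] == 0].copy()

# Number the rows within each country (0, 1, 2, ...) in their original order.
clean["g"] = clean.groupby("country").cumcount()

# One column per country, indexed by the per-country position.
# Countries with fewer rows are padded with NaN.
wide = clean.pivot(index="g", columns="country", values="value")
print(wide)
```

Calling wide.plot() followed by plt.show() should then draw one line per country, with the per-country index on the X-axis.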

Another solution is groupby with unstack:

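import pandas as pd

# start again from the original df, not the pivoted one above
df = df[df.outlier == 1]
# rebuild each country's values with a fresh 0..n-1 index, then move country to the columns
df = df.groupby('country')['value'].apply(lambda x: pd.Series(x.values)).unstack(0)

print(df)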
country   CHINA    USA
0        114.32  390.0
1        117.22  420.0

df.plot()
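
As a variation that is not part of the answer above, the lines can also be drawn directly with matplotlib, without reshaping the DataFrame first. This is a rough sketch; the helper name plot_by_country is hypothetical, and it assumes the non-outlier rows (outlier == 0) are the ones to plot. Rows with a missing country (df.info() in the question shows some) are skipped, because groupby drops NaN keys by default:

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_by_country(df):
    """Draw one line per country, skipping rows flagged as outliers."""
    fig, ax = plt.subplots()
    clean = df[df["outlier"] == 0]
    # groupby ignores rows whose country is NaN by default.
    for country, group in clean.groupby("country"):
        # reset_index(drop=True) gives each country its own 0, 1, 2, ... x-axis.
        values = group["value"].reset_index(drop=True)
        ax.plot(values.index, values.to_numpy(), label=country)
    ax.set_xlabel("index within country")
    ax.set_ylabel("value")
    ax.legend()
    return ax
```

Usage would be ax = plot_by_country(df) followed by plt.show().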


Arefe Over a year ago

Can we talk a little in chat? I need to ask you something. I partially solved the issue, but it still needs some modification.
