1

I have a dataframe with 600 rows in the following shape:

rpttime         metric          value
25/4/2018 15:45 UTIL_CPU        1.5
25/4/2018 15:44 IDLE            74
25/4/2018 15:41 REC_BYTES_S     0
25/4/2018 15:47 ENT_CPU         100
25/4/2018 15:44 ENT_CORE        1
25/4/2018 15:48 TRANS_BYTES     92
25/4/2018 15:43 PINNED          5425
25/4/2018 15:48 PAGING_PAG      0
25/4/2018 15:48 IOPS_IN         NULL
25/4/2018 15:47 TRANS_BYTES_S   23484
25/4/2018 15:43 PAGE_OUT        0
25/4/2018 15:42 IOPS_OUT        10

I want to draw a line plot of the "value" column (y-axis) against "rpttime" (x-axis) as a time series for each individual item in the "metric" column. There are around 20 distinct items in "metric". The "value" column contains some NULL values; every row with a NULL value should be omitted. Each item in "metric" has at least one NULL value. The line plots for all 20 items should be drawn in one graph. How should I approach this?

5
  • Can you post a clean data frame showing which rows have value==Null? It's ambiguous from the posted data frame because both NA and numerical values are present. Commented Jun 19, 2018 at 5:47
  • Updated the dataframe. A NULL value appears at least once for each individual item of the "metric" column. Commented Jun 19, 2018 at 5:56
  • So do you want to ignore the rpttime where value==Null while plotting the series? Or fill it with some other value? Commented Jun 19, 2018 at 6:00
  • Ignore the whole row wherever value==Null. Commented Jun 19, 2018 at 6:01
  • rpttime contains strings like 25/4/2018 15:48, and the dataframe has an index column starting from 0, which isn't shown here. Commented Jun 19, 2018 at 8:30

2 Answers

1

If you want a separate line plotted for each metric, you should first group by metric:

grpd = df.groupby('metric')

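Then you can iterate over the resulting groups, plotting each group's values against rpttime:

import matplotlib.pyplot as plt

# groupby yields (metric name, sub-frame) pairs: draw one line per metric
for name, data in grpd:
    plt.plot(data.rpttime.values, data.value.values, 'x-', label=name)
plt.legend()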
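
For reference, here is a minimal end-to-end sketch of the groupby approach that also drops the NULL rows, as the question asks. The sample rows and the CSV layout are invented for illustration (they are not the asker's data), and the explicit date format is an assumption based on the strings shown in the question:

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample in the question's layout (values invented for illustration).
raw = """rpttime,metric,value
25/4/2018 15:41,UTIL_CPU,1.5
25/4/2018 15:42,UTIL_CPU,NULL
25/4/2018 15:43,UTIL_CPU,2.5
25/4/2018 15:41,IDLE,74
25/4/2018 15:42,IDLE,70
25/4/2018 15:43,IDLE,NULL
"""

# pandas treats the string "NULL" as a missing value (NaN) by default.
df = pd.read_csv(io.StringIO(raw))
# Assumed format: day/month/year hour:minute, as in the question.
df["rpttime"] = pd.to_datetime(df["rpttime"], format="%d/%m/%Y %H:%M")

# Drop the rows whose value is missing, then order the rows by time.
clean = df.dropna(subset=["value"]).sort_values("rpttime")

# One line per metric, all on the same axes.
fig, ax = plt.subplots()
for name, data in clean.groupby("metric"):
    ax.plot(data["rpttime"], data["value"], "x-", label=name)
ax.set_xlabel("rpttime")
ax.set_ylabel("value")
ax.legend()
fig.autofmt_xdate()  # tilt the time labels so they do not overlap
```

Call plt.show() (or fig.savefig with a file name of your choice) afterwards to see the figure. With around 20 metrics on very different scales, a logarithmic y-axis or separate subplots may be easier to read, but that is a presentation choice rather than part of the approach above.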

Note: the sample data you provided is not enough for a really impressive result, though, as no metric occurs more than once, so each line consists of a single point.

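PS: I imported your data with:

import pandas as pd

df = pd.read_fwf('wherever/file/may/roam', colspecs=[(None, 15), (16, 31), (32, None)])
# The dates are day-first (25/4/2018), so state that explicitly
df.rpttime = pd.to_datetime(df.rpttime, dayfirst=True)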
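
A small sketch of why the day-first order matters when parsing rpttime. The first string below is a made-up example with a day of 12 or less, where month and day can be confused; the format string is an assumption based on the dates shown in the question:

```python
import pandas as pd

# "5/4/2018" is a hypothetical ambiguous date; "25/4/2018" is from the question.
ambiguous = pd.Series(["5/4/2018 15:45", "25/4/2018 15:44"])

# Without a hint, a lone ambiguous string is read month-first (May 4).
default = pd.to_datetime("5/4/2018 15:45")

# With an explicit day/month/year format, both strings are read as April dates.
parsed = pd.to_datetime(ambiguous, format="%d/%m/%Y %H:%M")

print(default)
print(parsed)
```

Passing an explicit format also makes pandas raise an error on strings that do not match, which tends to surface bad rows early.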

PPS: NA is handled automatically, i.e. the NULL values are read as NaN and are simply not plotted (see IOPS_IN).
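
One detail worth knowing here: matplotlib does not draw a NaN point, and it also interrupts the line at that point. If you would rather have the neighbouring points joined, drop the NaN rows before plotting. A minimal sketch with a made-up three-point series:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical series with a missing value in the middle.
s = pd.Series(
    [10.0, np.nan, 12.0],
    index=pd.to_datetime(
        ["2018-04-25 15:41", "2018-04-25 15:42", "2018-04-25 15:43"]
    ),
)

fig, (ax_gap, ax_joined) = plt.subplots(1, 2, sharey=True)

# NaN kept: the two valid points are drawn, but no segment connects them.
ax_gap.plot(s.index, s.values, "x-")
ax_gap.set_title("NaN kept")

# NaN dropped: the remaining two points are joined by one segment.
kept = s.dropna()
ax_joined.plot(kept.index, kept.values, "x-")
ax_joined.set_title("NaN dropped")
```

Which of the two is preferable depends on whether a gap in the line is meaningful for your data.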

0

You can use df.dropna() directly to drop the Null values that you don't want plotted:

import pandas as pd

# Original data frame
>>> df_origin

[image: original dataframe with Null values]

# Drop rows containing NaN values using dropna
>>> df_to_be_plotted = df_origin.dropna()
>>> df_to_be_plotted
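
Note that dropna() without arguments removes a row if any column is missing. If only the "value" column should decide, the subset parameter can be used. A short sketch, where the "note" column is a hypothetical extra column added only to show the difference:

```python
import numpy as np
import pandas as pd

df_origin = pd.DataFrame({
    "rpttime": ["25/4/2018 15:48", "25/4/2018 15:48", "25/4/2018 15:47"],
    "metric": ["PAGING_PAG", "IOPS_IN", "TRANS_BYTES_S"],
    "value": [0.0, np.nan, 23484.0],
    # Hypothetical column, not in the question's data.
    "note": [None, None, "hypothetical extra column"],
})

# Drops every row that has a missing entry in any column.
dropped_any = df_origin.dropna()

# Drops only the rows whose "value" is missing.
dropped_value = df_origin.dropna(subset=["value"])

print(len(dropped_any), len(dropped_value))
```

For a dataframe with only the three columns from the question, both calls give the same result.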

2 Comments

Dropping NULL values is OK, but is there a solution for plotting multiple time series of the "value" column, one for each item of the "metric" column?
I guess you can take a view of the original data frame for each unique item in the metric column and plot a time series of the corresponding values in the value column. If you can elaborate on the plotting part of the question, that would be helpful.