pandas plot conditional values seaborn

Question

My dataframe is:

      created_at            text
2017-03-01 00:00:01        power blah blah
2017-03-01 00:00:11        foo blah blah
2017-03-01 00:01:01        bar blah blah
2017-03-02 00:00:01        foobar blah blah
2017-03-02 00:10:01        hello world
2017-03-02 01:00:01        power blah blah

created_at is my index, and its dtype is datetime64, so I can slice it day by day easily. What I want to plot is the total number of entries per day. Right now I split the dataframe by category and plot the pieces in one graph, but I think there is a better way that does not need multiple dataframes:

import matplotlib.pyplot as plt

# one filtered dataframe per keyword
a = df[df["text"].str.contains("power")]
b = df[df["text"].str.contains("foo")]
c = df[df["text"].str.contains("bar")]

fig = plt.figure()
ax = fig.add_subplot(111)

# created_at is the index, so group on the dates of the index
df.groupby(df.index.date).size().plot(kind="bar", position=0, ax=ax)
a.groupby(a.index.date).size().plot(kind="bar", position=0, ax=ax)
b.groupby(b.index.date).size().plot(kind="bar", position=0, ax=ax)
c.groupby(c.index.date).size().plot(kind="bar", position=0, ax=ax)

plt.show()

I am learning Seaborn, so a Seaborn-based solution would be nice, but the answer does not have to use it. Thanks in advance!

In case your categories are mutually exclusive, just add a "category" column and iterate over df.groupby('category'). Otherwise, the best you can do to clean up your code is use a for loop.
– Gustavo Bezerra, Commented Mar 16, 2018 at 6:35
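
The for-loop variant suggested in the comment could look roughly like the sketch below. It is only an illustration: the sample data is rebuilt from the question, the `keywords` list and the names `daily` and `counts` are assumptions, and a loop is used rather than a "category" column because the keywords overlap ("foobar" contains both "foo" and "bar").

```python
import pandas as pd

# Sample data rebuilt from the question; created_at is the index.
df = pd.DataFrame(
    {"text": ["power blah blah", "foo blah blah", "bar blah blah",
              "foobar blah blah", "hello world", "power blah blah"]},
    index=pd.to_datetime([
        "2017-03-01 00:00:01", "2017-03-01 00:00:11", "2017-03-01 00:01:01",
        "2017-03-02 00:00:01", "2017-03-02 00:10:01", "2017-03-02 01:00:01",
    ]),
)
df.index.name = "created_at"

keywords = ["power", "foo", "bar"]  # assumed keyword list

# Collect one daily count per keyword in a loop instead of
# keeping a separate named dataframe for each keyword.
daily = {"total": df.groupby(df.index.date).size()}
for kw in keywords:
    sub = df[df["text"].str.contains(kw, regex=False)]
    daily[kw] = sub.groupby(sub.index.date).size()

# Days on which a keyword never appears become 0 instead of NaN.
counts = pd.DataFrame(daily).fillna(0).astype(int)
```

Each column of `counts` is one series of daily totals, so a single `counts.plot(kind="bar")` call would draw all of them side by side on one axis.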

jeschwar · Accepted Answer · 2018-03-16 14:52:03Z

Since you want to group by day, consider converting df.index to a pd.DatetimeIndex so that you can use df.resample(), as shown below:

import pandas as pd

# your original dataframe; the index holds epoch timestamps in milliseconds
df = pd.DataFrame({"text": {
    1488326401000: "power blah blah",
    1488326411000: "foo blah blah",
    1488326461000: "bar blah blah",
    1488412801000: "foobar blah blah",
    1488413401000: "hello world",
    1488416401000: "power blah blah",
}})

# convert index to DatetimeIndex
df.index = pd.to_datetime(df.index, unit="ms")

# create function to do your calculations; not sure if this is exactly what you want
def func(df_):
    texts = ['power', 'foo', 'bar']
    d = dict()

    for text in texts:
        d[text] = df_['text'].str.contains(text).sum()

    return pd.Series(d)

# create your dataframe for plotting by resampling your data by each day and then applying the `func`
df_plot = df.resample('D').apply(func)

# do the plotting
df_plot.plot(kind='bar')
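
As a possible alternative to `apply`, the same daily counts can be built from one boolean column per keyword and then summed per day. This is a sketch under assumptions, not part of the original answer: the sample data is rebuilt from the question, and the names `flags`, `counts`, and `long_counts` are hypothetical. The long-form table at the end is included because Seaborn generally expects one row per observation.

```python
import pandas as pd

# Sample data rebuilt from the question; created_at is the index.
df = pd.DataFrame(
    {"text": ["power blah blah", "foo blah blah", "bar blah blah",
              "foobar blah blah", "hello world", "power blah blah"]},
    index=pd.to_datetime([
        "2017-03-01 00:00:01", "2017-03-01 00:00:11", "2017-03-01 00:01:01",
        "2017-03-02 00:00:01", "2017-03-02 00:10:01", "2017-03-02 01:00:01",
    ]),
)

keywords = ["power", "foo", "bar"]  # assumed keyword list

# One boolean column per keyword: True where the text contains it.
flags = pd.DataFrame(
    {kw: df["text"].str.contains(kw, regex=False) for kw in keywords}
)

# Summing booleans per calendar day counts the matching rows.
counts = flags.resample("D").sum()

# Long form: one row per (day, keyword) pair.
long_counts = (
    counts.rename_axis("day")
    .reset_index()
    .melt(id_vars="day", var_name="keyword", value_name="count")
)
```

`counts.plot(kind="bar")` should give the same kind of grouped bar chart as above. For a Seaborn version, the long-form table could be passed as `sns.barplot(data=long_counts, x="day", y="count", hue="keyword")`, assuming `seaborn` is imported as `sns`.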
