
I have a long list of status codes by month, something like:

import pandas as pd

stats = pd.DataFrame(
    [
         ['2016-01', 200, 'xxx.com'],
         ['2016-01', 400, 'xxx.com'],
         ['2016-01', 200, 'xxx.com'],
         ['2016-02', 200, 'xxx.com']
    ],
    columns=['day', 'status_code', 'url']
)

Ultimately, I want to plot a few line charts with one line per status code. I have already found that this table holds the right information:

pivot = stats.pivot_table(index=['day', 'status_code'], aggfunc=len)

Looks like:

                        url
day     status_code     
2016-01 200            2
        400            1
2016-02 200            1

So it's somewhat the information I need.

However:

1.) I can't even work out how to access that information. What is, for example, the syntax for getting the number of URLs with status code 200 for 2016-01?

2.) How would I plot that? I want to draw multiple lines, with the month on the x-axis and the count per status code on the y-axis.

3.) Why is the rightmost column named 'url' anyway? I didn't include the url in my pivot table.

  • One problem per question, please; this is too broad. 1. Pass a tuple to access the multi-index and call sum: pivot.loc[('2016-02',200)].sum(). 2. You'd have to either convert the index to a datetime and access the month using .month, or strip the month out and plot. 3. You called pivot_table with an aggfunc, so it applied that function to the remaining columns and reused their names. Commented Mar 21, 2016 at 11:25
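A minimal sketch of points 1 and 3 from the comment, using the sample frame from the question. The variable names count_200_jan, counts and count_400_jan are only illustrative. It assumes a pandas version where pivot_table with aggfunc=len aggregates the leftover 'url' column, as the output in the question shows.

```python
import pandas as pd

stats = pd.DataFrame(
    [
        ['2016-01', 200, 'xxx.com'],
        ['2016-01', 400, 'xxx.com'],
        ['2016-01', 200, 'xxx.com'],
        ['2016-02', 200, 'xxx.com'],
    ],
    columns=['day', 'status_code', 'url'],
)

pivot = stats.pivot_table(index=['day', 'status_code'], aggfunc=len)

# Row label is the (day, status_code) tuple; 'url' is the only column left,
# because len was applied to every column not used in the index.
count_200_jan = pivot.loc[('2016-01', 200), 'url']

# groupby(...).size() counts rows directly and returns a Series,
# so no leftover column name is involved.
counts = stats.groupby(['day', 'status_code']).size()
count_400_jan = counts.loc[('2016-01', 400)]

print(count_200_jan, count_400_jan)
```

Selecting the 'url' column together with the row tuple returns a single number, which avoids the extra .sum() call.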

1 Answer

You can use crosstab():

import pandas as pd

stats = pd.DataFrame(
    [
         ['2016-01', 200, 'xxx.com'],
         ['2016-01', 400, 'xxx.com'],
         ['2016-01', 200, 'xxx.com'],
         ['2016-02', 200, 'xxx.com']
    ],
    columns=['day', 'status_code', 'url']
)

df = pd.crosstab(stats.day, stats.status_code)

df.plot()
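
As a hedged follow-up to the plot call, here is one way the x-axis could be turned into real dates and the axes labelled. The datetime conversion, the marker and the label texts are my assumptions, not part of the answer. The Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import pandas as pd

stats = pd.DataFrame(
    [
        ['2016-01', 200, 'xxx.com'],
        ['2016-01', 400, 'xxx.com'],
        ['2016-01', 200, 'xxx.com'],
        ['2016-02', 200, 'xxx.com'],
    ],
    columns=['day', 'status_code', 'url'],
)

# One row per month, one column per status code, cells hold the counts.
df = pd.crosstab(stats['day'], stats['status_code'])

# Parse the 'YYYY-MM' strings so the x-axis is a real date axis.
df.index = pd.to_datetime(df.index, format='%Y-%m')

# DataFrame.plot draws one line per column, i.e. one per status code.
ax = df.plot(marker='o')
ax.set_xlabel('month')
ax.set_ylabel('number of requests')
```

In an interactive session, calling matplotlib.pyplot.show() afterwards would display the figure.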

1 Comment

That's super awesome. It looks like crosstab is doing essentially the same as pivot = stats.pivot_table(index='day', columns='status_code', values='url', aggfunc=len)
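
A small sketch to check that observation on the sample data. Note that the sample frame's column is called 'day', so that is the name used here. The fill_value=0 argument is my addition: without it, pivot_table leaves NaN where a month has no rows for a status code, whereas crosstab reports 0.

```python
import pandas as pd

stats = pd.DataFrame(
    [
        ['2016-01', 200, 'xxx.com'],
        ['2016-01', 400, 'xxx.com'],
        ['2016-01', 200, 'xxx.com'],
        ['2016-02', 200, 'xxx.com'],
    ],
    columns=['day', 'status_code', 'url'],
)

via_crosstab = pd.crosstab(stats['day'], stats['status_code'])

# Same shape via pivot_table: months as rows, status codes as columns.
via_pivot = stats.pivot_table(
    index='day',
    columns='status_code',
    values='url',
    aggfunc=len,
    fill_value=0,  # assumption: fill missing month/code pairs with 0
)

# Compare the raw values; dtypes may differ between pandas versions.
same_counts = (via_crosstab.to_numpy() == via_pivot.to_numpy()).all()
print(same_counts)
```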
