I am starting with a Pandas data frame that looks like this:

   Type Date Number
1   A    x     y
2   B    x     y
3   A    x     y
4   B    x     y
5   A    x     y

I want to create separate time series for the Type A data and the Type B data. What is the most efficient way of doing this?

I am considering creating two different data frames from this one, where each data frame has data from only one type, and then converting each of the separate data frames to a series. However, I don't know how to do this either.

Extended question: Is there a way to do this if you don't even know how many different types there are?

So far I have tried checking whether the type is the one I want by using df["Type"] == "A", but this doesn't give me a full data frame back, just a Series of True/False values saying whether each row matched.

Additional information:

My goal is to create separate pandas time series using the date and number data for type A and type B separately.

I tried the following:

df.groupby("Type").apply(lambda x: x.Date)

The above function works but only returns one column.

df.groupby("Type").apply(lambda x: (x.Date, x.Number))

The above function doesn't work and returns something that is not what I want at all.

Expected Output:

Type    Date Number
 A   1   x     y
     3   x     y
     5   x     y
 B   2   x     y
     4   x     y
  • Please add an expected output. Commented Oct 17, 2015 at 8:36

1 Answer

If you want to group dates by type and put them into a separate Series, you can do the following.

Group by type: grouped = df.groupby('Type')

Get the date from every group: dates = grouped.apply(lambda x:x.Date)

dates now looks like this:

Type   
A     1    x
      3    x
      5    x
B     2    x
      4    x

You can access each type's Series by name: dates["A"], dates["B"], etc. (attribute access such as dates.A also works).
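To keep both Date and Number, one option is to avoid apply altogether. The following is only a sketch under assumptions: the dates and numbers are made-up sample values (the question shows only the placeholders x and y), and the names by_type and series_by_type are illustrative. It first builds the layout from the expected output, then one date-indexed Series per type, which also works when the number of types is not known in advance.

```python
import pandas as pd

# Hypothetical sample data; the question only shows placeholders x and y.
df = pd.DataFrame(
    {
        "Type": ["A", "B", "A", "B", "A"],
        "Date": pd.to_datetime(
            ["2015-10-01", "2015-10-02", "2015-10-03", "2015-10-04", "2015-10-05"]
        ),
        "Number": [10, 20, 30, 40, 50],
    },
    index=[1, 2, 3, 4, 5],
)

# Layout from the expected output: Type as the outer index level,
# the original row label as the inner level, Date and Number as columns.
by_type = df.set_index(["Type", df.index]).sort_index()
print(by_type)

# One time series per type: Number values indexed by Date.
# Iterating over the groupby handles any number of distinct types.
series_by_type = {
    name: group.set_index("Date")["Number"]
    for name, group in df.groupby("Type")
}
print(series_by_type["A"])
```

by_type.loc["A"] returns the rows for type A as a data frame, while series_by_type["A"] is a Series with a DatetimeIndex.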

So far I have tried checking whether the type is the one I want by using df["Type"] == "A", but this doesn't give me a full data frame back, just a Series of True/False values saying whether each row matched.

df["Type"] == "A" gives you a boolean mask, which you can plug back into the data frame to select the matching rows: df[df["Type"] == "A"]. This is a very basic pandas operation; take a look at the official tutorial, which has a lot of examples: http://pandas.pydata.org/pandas-docs/stable/tutorials.html
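A minimal sketch of the mask approach, again with made-up sample values standing in for the x and y placeholders (the names mask and type_a are illustrative):

```python
import pandas as pd

# Hypothetical sample data; the question only shows placeholders x and y.
df = pd.DataFrame(
    {
        "Type": ["A", "B", "A", "B", "A"],
        "Date": pd.to_datetime(
            ["2015-10-01", "2015-10-02", "2015-10-03", "2015-10-04", "2015-10-05"]
        ),
        "Number": [10, 20, 30, 40, 50],
    },
    index=[1, 2, 3, 4, 5],
)

# Comparing a column to a value yields a boolean Series (the mask),
# with one True/False entry per row.
mask = df["Type"] == "A"

# Indexing the data frame with the mask keeps only the rows where it is True.
type_a = df[mask]
print(type_a)
```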

6 Comments

This is helpful! How do I get dates to have 3 columns including one for the x and one for the y?? I want to create a time series for the dates which are in x and the numbers which are in y.
I tried this: grouped.apply(lambda x: [x.Date, x.Number]) and it didn't work.
Can you post an example of how the output should look?
I posted some additional information above! Basically my end goal is to have 2 time series with date and the number information for each group separately. I'm not actually totally sure what a time series looks like.
I added what I wanted the output to look like.