Converting a dataframe of 2 columns to a series of 2 columns in Python

Question

I am working with some time series data and am quite new to pandas DataFrames. I have a dataframe with two columns as below:

+---+---------------------+-------+
|   |          0          |   1   |
+---+---------------------+-------+
| 1 | 2018-08-02 23:00:00 | 456.8 |
| 2 | 2018-08-02 23:01:00 | 457.9 |
+---+---------------------+-------+

I am trying to convert it into a series with two columns, as it is in the dataframe. How can this be done? pd.Series converts the dataframe into a series with one column.

Could you explain what you imagine a "series with two columns" would look like?
– Jon Clements, Commented Aug 16, 2018 at 8:43

Like below (the data type needs to be a Series):

Index                  0
2018-08-02 23:00:00    456.8
2018-08-02 23:01:00    457.9

Sorry, I am not able to format the comment correctly.
– Bineeta Saikia, Commented Aug 16, 2018 at 8:49

Rob · Accepted Answer · 2018-08-16 08:48:59Z

10

There is no such thing as a pandas Series with two columns. My guess is that you want to generate a Series with column 0 as the index and column 1 as the values. You can get that by setting the index and extracting the column of interest (assuming your DataFrame is in df):

df.set_index(0)[1]
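A minimal sketch of this approach, assuming a small frame rebuilt from the two rows shown in the question (integer column labels 0 and 1; the timestamps are parsed with pd.to_datetime here, which the question does not confirm):

```python
import pandas as pd

# Sample frame mirroring the question: integer column labels 0 and 1.
df = pd.DataFrame(
    {
        0: pd.to_datetime(["2018-08-02 23:00:00", "2018-08-02 23:01:00"]),
        1: [456.8, 457.9],
    },
    index=[1, 2],
)

# Column 0 becomes the index; selecting column 1 then yields a Series.
series = df.set_index(0)[1]

print(type(series).__name__)
print(series)
```

The result is an ordinary one-dimensional Series: the timestamps are its index labels and the numbers are its values, which is probably what "two columns" meant in the question.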

answered Aug 16, 2018 at 8:48

Rob


3 Comments

Jon Clements Over a year ago

Or pd.Series(df[0], df[1]) (maybe even pd.to_datetime(df[1]) if needed)

kxr Over a year ago

Using pd.Series(df.col1, df.col2) produces a Series with NaNs here. `pd.Series(df.col1.values, df.col2)` works; it just loses the Name attribute. Though pd.Series(df.col1) produces no NaNs ... Seems buggy (pandas version 1.0.4, Python 3.7)

Stefan_EOX Over a year ago

It is not buggy, but intended. Current dev docs clarify the behaviour: If data is dict-like and index is None, then the keys in the data are used as the index. If the index is not None, the resulting Series is reindexed with the index values. pandas.pydata.org/docs/dev/reference/api/…

Stefan_EOX · Accepted Answer · 2021-10-29 08:15:41Z

1

As stated in the comments, using pd.Series(df.col1, df.col2) "produces a Series with NaNs". The reason is that the Series is reindexed with the object passed as the index argument. The current dev docs clarify:

If data is dict-like and index is None, then the keys in the data are used as the index. If the index is not None, the resulting Series is reindexed with the index values.

To avoid the reindexing, pass the underlying values instead:

pd.Series(df[0].values, index=df[1])

Since df[0].values is a NumPy array rather than a dict-like pd.Series, nothing is reindexed and df[1] is set as the index as-is.
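The difference can be seen in a small sketch. It assumes a frame rebuilt from the question's two rows and, unlike the snippet above, uses column 0 as the index and column 1 as the values, as in the question; the reindexing behaviour is the same for either orientation.

```python
import pandas as pd

# Sample frame mirroring the question; row labels are 1 and 2.
df = pd.DataFrame(
    {
        0: pd.to_datetime(["2018-08-02 23:00:00", "2018-08-02 23:01:00"]),
        1: [456.8, 457.9],
    },
    index=[1, 2],
)

# Data passed as a Series: it is reindexed by the new index labels.
# No timestamp matches the existing row labels (1, 2), so every value is NaN.
reindexed = pd.Series(df[1], index=df[0])

# Data passed as a plain array: values are assigned by position, no alignment.
positional = pd.Series(df[1].values, index=df[0])

print(reindexed)
print(positional)
```

As noted in the comments, the array-based variant drops the original Series name; it can be restored with the name argument of pd.Series if needed.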

answered Oct 29, 2021 at 8:15

Stefan_EOX


L Tyrone · Accepted Answer · 2024-07-21 03:39:20Z

0

Assuming google.csv has two columns, Date and Amount:

pd.read_csv("google.csv", index_col="Date").squeeze("columns")
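A self-contained sketch of the same idea, using an in-memory CSV in place of the file. The column names Date and Amount and the sample rows are assumptions for illustration, and parse_dates=True is added here so that the index is parsed as timestamps:

```python
import io

import pandas as pd

# Hypothetical stand-in for google.csv; names and values are illustrative only.
csv_text = (
    "Date,Amount\n"
    "2018-08-02 23:00:00,456.8\n"
    "2018-08-02 23:01:00,457.9\n"
)

# Read with Date as the index, leaving a one-column DataFrame.
frame = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

# squeeze("columns") collapses the single remaining column into a Series.
series = frame.squeeze("columns")

print(series)
```

Here squeeze("columns") only collapses the frame because exactly one data column remains after Date becomes the index; with more columns it would return the DataFrame unchanged.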

edited Jul 21, 2024 at 3:39

L Tyrone


answered Jul 19, 2024 at 13:28

Swapneel Sawant
