1

I am working with some time series data and am quite new to pandas DataFrames. I have a DataFrame with two columns, as below:

+---+---------------------+-------+
|   |          0          |   1   |
+---+---------------------+-------+
| 1 | 2018-08-02 23:00:00 | 456.8 |
| 2 | 2018-08-02 23:01:00 | 457.9 |
+---+---------------------+-------+

I am trying to convert it into a Series with two columns, as it is in the DataFrame. How can this be done? pd.Series converts the DataFrame into a Series with only one column.

2
  • 3
    Could you explain what you imagine a "series with two columns" would look like? Commented Aug 16, 2018 at 8:43
  • Like below. The data type needs to be a Series, with "Index" and "0" as headers and the rows "2018-08-02 23:00:00 456.8" and "2018-08-02 23:01:00 457.9". Sorry, I am not able to format this comment correctly. Commented Aug 16, 2018 at 8:49

3 Answers

10

There is no such thing as a pandas Series with two columns. My guess is that you want to generate a Series with column 0 as the index and column 1 as the values. You can get that by setting the index and extracting the column of interest (assuming your DataFrame is in df):

df.set_index(0)[1]
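
For illustration, here is a minimal sketch of that one-liner on a frame rebuilt from the question's table. The construction of df is an assumption (the question does not show how the frame was created); only set_index(0)[1] comes from the answer.

```python
import pandas as pd

# Assumed reconstruction of the question's frame: integer column labels 0 and 1,
# row labels 1 and 2 as shown in the table.
df = pd.DataFrame(
    {
        0: pd.to_datetime(["2018-08-02 23:00:00", "2018-08-02 23:01:00"]),
        1: [456.8, 457.9],
    },
    index=[1, 2],
)

# Column 0 becomes the index; selecting column 1 then returns a Series.
s = df.set_index(0)[1]

print(type(s).__name__)  # Series
print(s)
```

The result keeps the column label 1 as the Series name and the label 0 as the index name.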

3 Comments

Or pd.Series(df[0], df[1]) (maybe even pd.to_datetime(df[1]) if needed)
Using pd.Series(df.col1, df.col2) produces a Series of NaNs here. `pd.Series(df.col1.values, df.col2)` works; it just loses the name attribute. pd.Series(df.col1) on its own produces no NaNs, though... Seems buggy (pandas version 1.0.4, Python 3.7)
It is not buggy, but intended. Current dev docs clarify the behaviour: If data is dict-like and index is None, then the keys in the data are used as the index. If the index is not None, the resulting Series is reindexed with the index values. pandas.pydata.org/docs/dev/reference/api/…
1

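As stated in the comments, using pd.Series(df.col1, df.col2) "produces a Series with NaNs". The reason is that a Series passed as data is reindexed with the object passed as the index argument: its existing labels are looked up in the new index, and every label that is not found yields NaN. Current dev docs clarify: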

If data is dict-like and index is None, then the keys in the data are used as the index. If the index is not None, the resulting Series is reindexed with the index values.

To avoid the reindexing, pass the underlying values instead of the Series:

pd.Series(df[1].values, index=df[0])

Since df[1].values is a NumPy array rather than a dict-like pd.Series, nothing is reindexed and df[0] is used as the index as-is.
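
A small sketch of the difference, under the assumption that the frame looks like the one in the question (timestamps in column 0, readings in column 1, row labels 1 and 2), so the values are taken from column 1 and the index from column 0. The variable names are made up for illustration.

```python
import pandas as pd

# Assumed reconstruction of the question's frame.
df = pd.DataFrame(
    {
        0: pd.to_datetime(["2018-08-02 23:00:00", "2018-08-02 23:01:00"]),
        1: [456.8, 457.9],
    },
    index=[1, 2],
)

# Series as data: its labels (1, 2) are looked up in the new datetime index,
# nothing matches, so every value comes back as NaN.
reindexed = pd.Series(df[1], index=df[0])

# Plain array as data: values are paired with the index by position.
positional = pd.Series(df[1].values, index=df[0])

print(reindexed.isna().all())  # True
print(positional.tolist())     # [456.8, 457.9]
```

If the name matters, it can presumably be passed back in with the name argument of pd.Series, since a bare array carries no name.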

0

Assuming google.csv has two columns, Date and Amount:

pd.read_csv("google.csv", index_col = "Date").squeeze("columns")
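
To try this without a file on disk, here is a hedged sketch that feeds the same call an in-memory CSV. The CSV text, the Amount column name, and parse_dates=True are assumptions added for illustration; they are not part of the answer.

```python
import io

import pandas as pd

# Hypothetical stand-in for google.csv; rows and the "Amount" header are made up.
csv_text = (
    "Date,Amount\n"
    "2018-08-02 23:00:00,456.8\n"
    "2018-08-02 23:01:00,457.9\n"
)

# index_col picks the index column; squeeze("columns") collapses the single
# remaining column into a Series. parse_dates=True (an addition here) asks
# pandas to parse the index as datetimes.
s = pd.read_csv(
    io.StringIO(csv_text), index_col="Date", parse_dates=True
).squeeze("columns")

print(type(s).__name__)  # Series
print(s)
```

Note that squeeze("columns") only returns a Series when exactly one data column is left after the index is taken out; with more columns the result stays a DataFrame.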
