4

I have a dataframe that has a timezone-aware index, and when I try to insert a new row into this dataframe, it doesn't work, instead changing the type of the index to 'object' (ie, it can't add in the row into the current type).

MRE below:

import pandas as pd

df = pd.DataFrame({"time": ["2021/06/06 12:00:00"], "col1": [2]})
df.index = pd.to_datetime(df['time'])
df = df.drop('time', axis=1)
df.index = df.index.tz_localize('UTC')

# Doesn't work, index is now 'object' as this is considered a string
# row = pd.Series(name='2021/06/05 12:00:00')
# df = df.append(row)


# Also doesn't work, as timezones differ
row = pd.Series(name=pd.Timestamp('2021/06/05 12:00:00'))
df = df.append(row)

print(df.index)

I understand I can do the following:

tz = df.index[0].tz
row = pd.Series(name=pd.Timestamp('2021/06/05 12:00:00', tz=tz))

However I'm sceptical of whether this would work when the units differ, or maybe even some other property of a pandas Timestamp I don't know of, so would ideally like to completely copy over the Timestamp configuration of the index to the new timestamp I'm inserting.

If anyone would happen to know how to insert a new row into this dataframe whilst keeping the index type intact that would be greatly appreciated.

1
  • 1
    Why not assign tzin Timestamp: pd.Series(name=pd.Timestamp('2021/06/05 12:00:00', tz = "UTC"))? Commented Nov 26, 2021 at 5:16

3 Answers 3

1
+50

Since DatetimeTZ only supports a single timezone, the key is to make sure the new Timestamp always matches your existing df.index.tz.

When inserting a new Timestamp:

  • Use tz=df.index.tz if it's from the same time zone
  • Use astimezone(df.index.tz) if it's from a different time zone

Examples

  • Given your MRE df:

    df
    #                            col1
    # time                           
    # 2021-06-06 12:00:00+00:00     2
    
    df.index.dtype
    # datetime64[ns, UTC]
    
  • If you know your new Timestamp is in the same tz, just set it in the constructor like you mentioned:

    ts1 = pd.Timestamp('2021/06/05 12:00:00', tz=df.index.tz)
    df.loc[ts1] = 4
    #                            col1
    # time                           
    # 2021-06-06 12:00:00+00:00     2
    # 2021-06-05 12:00:00+00:00     4
    
    df.index.dtype
    # datetime64[ns, UTC]
    
  • If your new Timestamp is in a different tz (which was your main concern), convert it using astimezone (e.g., from Eastern US back to UTC):

    ts2 = pd.Timestamp('2021/06/05 12:00:00', tz='US/Eastern').astimezone(df.index.tz)
    df.loc[ts2] = 6
    #                            col1
    # time                           
    # 2021-06-06 12:00:00+00:00     2
    # 2021-06-05 12:00:00+00:00     4
    # 2021-06-05 16:00:00+00:00     6
    
    df.index.dtype
    # datetime64[ns, UTC]
    
  • Note that I'm using loc because it's faster and simpler, but append still works:

    ts3 = pd.Timestamp('2021/06/05 12:00:00', tz='US/Central').astimezone(df.index.tz)
    row = pd.Series([8], name=ts3, index=['col1'])
    df = df.append(row)
    #                            col1
    # time                           
    # 2021-06-06 12:00:00+00:00     2
    # 2021-06-05 12:00:00+00:00     4
    # 2021-06-05 16:00:00+00:00     6
    # 2021-06-05 17:00:00+00:00     8
    
    df.index.dtype
    # datetime64[ns, UTC]
    
Sign up to request clarification or add additional context in comments.

Comments

0

Timestamp objects are certainly of type 'object', and it is possible you cannot escape the index to take this type if you use Timestamps as index.

By "timestamp configuration" I understand "timezone", and that you want to insert new times in the same timezone.

  • Either you know the timezone from the start and data of all rows come from a source known to be from the same timezone. In that case you can specify the same timezone when you instantiate the Timestamps.

  • Or you do not know the timezone of the data in the first place, and you have new data coming from another source, so potentially from a different timezone, that you certainly know about. Mixing timestamps from different timezones is not a problem, since timezones should be taken into account when you compare them with one another. You can still localize them all later to a timezone you prefer.

Comments

0

You could also do something like this. DateTimeIndexes are objects.

for time_zone in ['MST', 'EST', 'GMT']:
        row = pd.Timestamp('2021/06/05 12:00:00', tz=time_zone)
        df.loc[row] = 1

# to whichever timezone you want
df.index = pd.to_datetime(df.index, utc=True).tz_convert('EST')

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.