Format numpy array of timestamps into a concatenated string

Question

I have an array of unix timestamps:

import numpy as np
import pandas as pd

d = {'timestamp': [1551675611, 1551676489, 1551676511, 1551676533, 1551676554]}
df = pd.DataFrame(data=d)
timestamps = df[['timestamp']].values

That I would like to format into a concatenated string, like so:

'1551675611;1551676489;1551676511;1551676533;1551676554'

So far I have prepared this:

def format_timestamps(timestamps: np.array) -> str:
    timestamps = ";".join([f"{timestamp:f}" for timestamp in timestamps])
    return timestamps

Running:

format_timestamps(timestamps)

Gives the following error:

TypeError: unsupported format string passed to numpy.ndarray.__format__

Since I'm new to Python, I'm having trouble understanding how to fix the error.

Replace "{timestamp:f}" with "{timestamp[0]}". Does it work?
– cs95, Commented Dec 16, 2020 at 11:30

juanpa.arrivillaga · Accepted Answer · 2020-12-16 11:30:49Z

2

It's because in your list comprehension, each timestamp is a one-element numpy.ndarray object rather than a scalar. Just flatten first and convert to string:

>>> ";".join(timestamps.flatten().astype(str))
'1551675611;1551676489;1551676511;1551676533;1551676554'

answered Dec 16, 2020 at 11:30

juanpa.arrivillaga


cs95 · Accepted Answer · 2020-12-16 11:34:06Z

2

Since you have pandas, why not consider a pandaic solution with str.cat:

df['timestamp'].astype(str).str.cat(sep=';')
# '1551675611;1551676489;1551676511;1551676533;1551676554'

If NaNs or invalid data are a possibility, you can handle them with pd.to_numeric:

(pd.to_numeric(df['timestamp'], errors='coerce')
   .dropna()
   .astype(int)
   .astype(str)
   .str.cat(sep=';'))
# '1551675611;1551676489;1551676511;1551676533;1551676554'
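To see what the pd.to_numeric chain does when the column really is dirty, here is a small sketch. The `raw` Series and its bad entries are made up for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical column with a missing value and a non-numeric string.
raw = pd.Series([1551675611, None, "oops", 1551676489], name="timestamp")

cleaned = (
    pd.to_numeric(raw, errors="coerce")  # unparseable entries become NaN
    .dropna()                            # drop the NaN rows
    .astype("int64")                     # back to integers so no ".0" suffix appears
    .astype(str)
)
result = cleaned.str.cat(sep=";")
print(result)  # 1551675611;1551676489
```

The cast back to an integer type matters because coercion produces a float column, and converting floats straight to strings would leave a trailing ".0" on every value.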

Another idea is to iterate over the list of timestamps and join:

';'.join([f'{t}' for t in df['timestamp'].tolist()])
# '1551675611;1551676489;1551676511;1551676533;1551676554'

edited Dec 16, 2020 at 11:34

answered Dec 16, 2020 at 11:32

cs95


juanpa.arrivillaga Over a year ago

.str.cat ah, always forget about that guy.

JPI93 · Accepted Answer · 2020-12-16 12:10:16Z

Why the error?

You're getting this error because of how you extract the 'timestamp' column values with the following line:

timestamps = df[['timestamp']].values

Selecting columns with a list of column names returns a DataFrame, and its .values is a two-dimensional ndarray of shape (rows, columns): each row is itself an ndarray holding one value per listed column. This approach is generally only useful when selecting multiple columns by name.

The error is thrown by your function because each timestamp here:

";".join([f"{timestamp:f}" for timestamp in timestamps])

is an ndarray containing a single value when timestamps is defined as in your original post, where a scalar value is expected.
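A short sketch that reproduces the difference, assuming a trimmed-down three-row version of the DataFrame from the question (the variable names `two_d`, `one_d` and `raised` are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({"timestamp": [1551675611, 1551676489, 1551676511]})

two_d = df[["timestamp"]].values  # list selection -> DataFrame -> 2-D array
one_d = df["timestamp"].values    # string selection -> Series -> 1-D array

print(two_d.shape)  # (3, 1)
print(one_d.shape)  # (3,)

# Each element of the 2-D array is itself a one-element ndarray,
# which rejects a non-empty format spec such as "f".
try:
    f"{two_d[0]:f}"
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```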

Accounting for the error

To remedy this error in your code, simply replace:

timestamps = df[['timestamp']].values

With:

timestamps = df['timestamp'].values

By passing a single str to select a single column, timestamps becomes a one-dimensional ndarray of the 'timestamp' values, one per row, which passes through your original format_timestamps without error.

`format_timestamps`

Running format_timestamps(timestamps) using the above approach and your original implementation of format_timestamps will return:

'1551675611.000000;1551676489.000000;1551676511.000000;1551676533.000000;1551676554.000000'

This is better (no errors at least) but still not quite what you want. The root of this issue is that you are passing f as the format specifier when joining timestamp values, which formats each value as a float, when you actually want each value formatted as an int (format specifier d).

You can either change the format specifier from f to d in your function definition:

def format_timestamps(timestamps: np.ndarray) -> str:
    timestamps = ";".join([f"{timestamp:d}" for timestamp in timestamps])
    return timestamps

Or simply pass no format specifier, as the timestamps values are already of type numpy.int64:

def format_timestamps(timestamps: np.ndarray) -> str:
    timestamps = ";".join([f"{timestamp}" for timestamp in timestamps])
    return timestamps

Running format_timestamps(timestamps) using either definition above will return what you're after:

'1551675611;1551676489;1551676511;1551676533;1551676554'
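If you would rather not depend on how the caller selected the column, one option is to flatten inside the function. This is only a sketch: the name `format_timestamps_any_shape` is hypothetical, and it assumes the values are whole numbers, since int() would silently truncate any fractional part.

```python
import numpy as np
import pandas as pd

def format_timestamps_any_shape(timestamps) -> str:
    """Join integer timestamps with ';' regardless of the input's shape."""
    # np.asarray accepts lists, Series values, or ndarrays;
    # ravel() collapses any shape to one dimension.
    flat = np.asarray(timestamps).ravel()
    # int() turns each NumPy scalar into a plain Python int before formatting.
    return ";".join(f"{int(t):d}" for t in flat)

df = pd.DataFrame({"timestamp": [1551675611, 1551676489, 1551676511]})

from_2d = format_timestamps_any_shape(df[["timestamp"]].values)
from_1d = format_timestamps_any_shape(df["timestamp"].values)
print(from_2d)             # 1551675611;1551676489;1551676511
print(from_2d == from_1d)  # True
```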

Valentin Macé · Accepted Answer · 2020-12-16 11:30:30Z

1

A quick fix to your code would be:

def format_timestamps(timestamps: np.ndarray) -> str:
    timestamps = ";".join([f"{timestamp[0]}" for timestamp in timestamps])
    return timestamps

Here I only replaced timestamp:f with timestamp[0], so you get each timestamp as a scalar instead of an array.

answered Dec 16, 2020 at 11:30

Valentin Macé
