Python strip \n tabs from string

Question

I have the following string, which contains dates and times along with \n characters and row numbers. I want only the date-time values. Input:

str1 = '1    2016-04-30 00:30:00\n2    2016-04-30 02:00:00\n3    2016-04-30 02:00:00\n4    2016-04-30 03:16:00\n5    2016-04-30 08:27:18\n6    2016-04-30 10:10:00\n7    2016-04-30 10:27:00\n8    2016-04-30 13:00:00\n9    2016-04-30 14:00:00\n10   2016-04-30 16:00:00\n11   2016-04-30 16:30:00\n12   2016-04-30 16:30:00\n13   2016-04-30 17:18:00\n14   2016-04-30 19:00:00\n15   2016-04-30 19:30:00\n16   2016-04-30 22:00:00\n17   2016-04-30 23:12:00\n18   2016-04-30 23:30:00\n19   2016-04-30 23:50:00\n20   2016-04-30 23:50:00\n21   2016-04-30 23:50:00\nName: CrimeDate, dtype: datetime64[ns]'

Desired output:

'2016-04-30 00:30:00,2016-04-30 02:00:00,2016-04-30 02:00:00,2016-04-30 03:16:00,2016-04-30 08:27:18,2016-04-30 10:10:00,2016-04-30 10:27:00,2016-04-30 13:00:00,2016-04-30 14:00:00,2016-04-30 16:00:00,2016-04-30 16:30:00,2016-04-30 16:30:00,2016-04-30 17:18:00,2016-04-30 19:00:00,2016-04-30 19:30:00,2016-04-30 22:00:00,2016-04-30 23:12:00,2016-04-30 23:30:00,2016-04-30 23:50:00,2016-04-30 23:50:00,2016-04-30 23:50:00'

I have tried the following ways to fix the problem:

str1 = str1.split(',')[0]
ini_string=' '.join(str1.split())[0:-16]
res = ini_string.replace(' ', ',')

but this is not working. Is there a better way to get the desired result? I am using Python 3.

You appear to have a pandas.DataFrame, so you should try to use pandas operations to perform the conversion. See my answer for an example.
– mhawke, Commented Mar 5, 2021 at 6:00

Tim Biegeleisen · Accepted Answer · 2021-03-05 05:16:13Z

4

I would keep it simple here and just use re.findall:

import re

str1 = '1    2016-04-30 00:30:00\n2    2016-04-30 02:00:00\n3    2016-04-30 02:00:00\n4    2016-04-30 03:16:00\n5    2016-04-30 08:27:18\n6    2016-04-30 10:10:00\n7    2016-04-30 10:27:00\n8    2016-04-30 13:00:00\n9    2016-04-30 14:00:00\n10   2016-04-30 16:00:00\n11   2016-04-30 16:30:00\n12   2016-04-30 16:30:00\n13   2016-04-30 17:18:00\n14   2016-04-30 19:00:00\n15   2016-04-30 19:30:00\n16   2016-04-30 22:00:00\n17   2016-04-30 23:12:00\n18   2016-04-30 23:30:00\n19   2016-04-30 23:50:00\n20   2016-04-30 23:50:00\n21   2016-04-30 23:50:00\nName: CrimeDate, dtype: datetime64[ns]'
matches = re.findall(r'\b\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\b', str1)
print(matches)

This prints:

['2016-04-30 00:30:00', '2016-04-30 02:00:00', '2016-04-30 02:00:00', '2016-04-30 03:16:00',
 '2016-04-30 08:27:18', '2016-04-30 10:10:00', '2016-04-30 10:27:00', '2016-04-30 13:00:00',
 '2016-04-30 14:00:00', '2016-04-30 16:00:00', '2016-04-30 16:30:00', '2016-04-30 16:30:00',
 '2016-04-30 17:18:00', '2016-04-30 19:00:00', '2016-04-30 19:30:00', '2016-04-30 22:00:00',
 '2016-04-30 23:12:00', '2016-04-30 23:30:00', '2016-04-30 23:50:00', '2016-04-30 23:50:00',
 '2016-04-30 23:50:00']
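
The question asks for a single comma-separated string rather than a list, so one more step is needed. A minimal sketch of that step, assuming the same pattern as above; the shortened `sample` string and the name `result` are only for illustration and are not from the original post:

```python
import re

# Shortened stand-in for str1 from the question (illustrative only).
sample = (
    '1    2016-04-30 00:30:00\n'
    '2    2016-04-30 02:00:00\n'
    '3    2016-04-30 02:00:00\n'
    'Name: CrimeDate, dtype: datetime64[ns]'
)

# Pull out every "YYYY-MM-DD HH:MM:SS" timestamp, skipping the row
# numbers and the trailing "Name: ..." line.
matches = re.findall(r'\b\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\b', sample)

# Join the list into one comma-separated string.
result = ','.join(matches)
print(result)
```

Because the pattern only matches the timestamps, the summary line at the end needs no special handling.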

answered Mar 5, 2021 at 5:16


3 Comments

Tim Roberts Over a year ago

Good answer; I was going to use re.sub, but this is better.

Tim Biegeleisen Over a year ago

@TimRoberts re.split might also be viable, but there is some junk at the end of the input which would probably interfere.

Umer Over a year ago

Thanks, I was trying re.split, and the junk at the end was obviously causing issues. re.findall is a good approach. Thank you.

mhawke · Answer · 2021-03-05 05:58:31Z

Your data appears to be a pandas Series (a DataFrame column) that has been converted to a string. For example, you might have done this:

>>> str(df['CrimeDate'])
'0    2016-04-30 00:30:00\n1    2016-04-30 02:00:00\n2    2016-04-30 02:00:00\n3    2016-04-30 03:16:00\n4    2016-04-30 08:27:18\n5    2016-04-30 10:10:00\n6    2016-04-30 10:27:00\n7    2016-04-30 13:00:00\n8    2016-04-30 14:00:00\n9    2016-04-30 16:00:00\n10   2016-04-30 16:30:00\n11   2016-04-30 16:30:00\n12   2016-04-30 17:18:00\n13   2016-04-30 19:00:00\n14   2016-04-30 19:30:00\n15   2016-04-30 22:00:00\n16   2016-04-30 23:12:00\n17   2016-04-30 23:30:00\n18   2016-04-30 23:50:00\n19   2016-04-30 23:50:00\n20   2016-04-30 23:50:00\nName: CrimeDate, dtype: datetime64[ns]'

Assuming that you have access to the DataFrame, you could convert the column to a comma-separated string like this:

>>> df['CrimeDate'].to_csv(header=False, index=False, line_terminator=',')[:-1]
'2016-04-30 00:30:00,2016-04-30 02:00:00,2016-04-30 02:00:00,2016-04-30 03:16:00,2016-04-30 08:27:18,2016-04-30 10:10:00,2016-04-30 10:27:00,2016-04-30 13:00:00,2016-04-30 14:00:00,2016-04-30 16:00:00,2016-04-30 16:30:00,2016-04-30 16:30:00,2016-04-30 17:18:00,2016-04-30 19:00:00,2016-04-30 19:30:00,2016-04-30 22:00:00,2016-04-30 23:12:00,2016-04-30 23:30:00,2016-04-30 23:50:00,2016-04-30 23:50:00,2016-04-30 23:50:00'

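The [:-1] removes the trailing comma added by to_csv(). Note that pandas 1.5 renamed this parameter to lineterminator, and the line_terminator spelling no longer works in pandas 2.0 and later.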

Another way would be to use str.join():

>>> ','.join(str(dt) for dt in df['CrimeDate'])

but the first method avoids iteration over the DataFrame column, keeping the processing in pandas.
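
A third variant, offered here as a sketch rather than as part of the answer, formats the column with `Series.dt.strftime` before joining, so the output format is spelled out explicitly. The small `df` below is made-up illustrative data, since the real DataFrame is not shown in the question:

```python
import pandas as pd

# Small illustrative column; the real df and its data are assumptions.
df = pd.DataFrame({'CrimeDate': pd.to_datetime([
    '2016-04-30 00:30:00',
    '2016-04-30 02:00:00',
    '2016-04-30 08:27:18',
])})

# Format each timestamp as text inside pandas, then join the strings.
formatted = df['CrimeDate'].dt.strftime('%Y-%m-%d %H:%M:%S')
result = ','.join(formatted)
print(result)
```

Changing the format string is then the only edit needed if a different layout (for example, dates only) is wanted.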

rpb · Answer · 2021-03-05 05:34:53Z

0

without_strip = str1.replace("\n", "")

2016-04-30 00:30:002 2016-04-30 02:00:003 2016-04-30 02:00:004
2016-04-30 03:16:005 2016-04-30 08:27:186 2016-04-30 10:10:007
2016-04-30 10:27:008 2016-04-30 13:00:009 2016-04-30 14:00:0010 2016-04-30 16:00:0011 2016-04-30 16:30:0012 2016-04-30 16:30:0013 2016-04-30 17:18:0014 2016-04-30 19:00:0015 2016-04-30 19:30:0016 2016-04-30 22:00:0017 2016-04-30 23:12:0018 2016-04-30 23:30:0019 2016-04-30 23:50:0020 2016-04-30 23:50:0021 2016-04-30 23:50:00Name: CrimeDate, dtype: datetime64[ns]

Or, to remove the 1 at the beginning:

listToStr = ' '.join(map(str, without_strip.split()[1:]))
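
Note that replacing "\n" with an empty string fuses each row number onto the preceding time (for example 00:30:002), so the row numbers are still in the result. A line-based alternative is sketched below, assuming that every data line has the form "row number, date, time" and that the final line is the "Name: ..." summary. The `sample` string and the variable names are illustrative only:

```python
# Shortened stand-in for str1 from the question (illustrative only).
sample = (
    '1    2016-04-30 00:30:00\n'
    '2    2016-04-30 02:00:00\n'
    '10   2016-04-30 16:00:00\n'
    'Name: CrimeDate, dtype: datetime64[ns]'
)

# Drop the trailing "Name: ..." summary line.
data_lines = sample.splitlines()[:-1]

# Split each line once on whitespace: [row_number, 'date time'],
# and keep only the second part.
timestamps = [line.split(None, 1)[1] for line in data_lines]

result = ','.join(timestamps)
print(result)
```

This relies on the layout of the string staying fixed; if the last line or the column layout can vary, a pattern-based approach such as re.findall is more forgiving.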

edited Mar 5, 2021 at 5:34

answered Mar 5, 2021 at 5:29

