There is one column that stores two variables, namely a date and a company name, with no delimiter between them. My goal is to separate these two variables into two columns.
date_time/full_company_name
- 2020-05-19Lopez-Wallace
- 2020-05-12Smith-Simon
- 2020-10-02Jenkins Inc
- 2020-07-06Moore-Weiss
My approach so far was:
# Fixed-width split: the first 10 characters are the date, the rest is the company name.
df['date_time'] = df['date_time/full_company_name'].str[:10]
df['full_company_name'] = df['date_time/full_company_name'].str[10:]
df = df.drop(columns='date_time/full_company_name')
The code above works well for clean rows. However, the data set contains a number of botched entries where the date is missing or replaced by something else, such as:
- 0Lopez, Barton and Jones
- NaNBrown, Singleton and Harrell
- 84635Ball-Thomas
I have thought about some possible solutions, such as a loop with a series of if statements to handle the exceptions, or inserting some kind of delimiter into the string and then using string.split('_'). But these workarounds are fairly cumbersome.
I can't help but wonder whether there is a more generic function or method for this.
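The closest thing I have found so far is a regular expression with named groups passed to Series.str.extract. The following is only a rough sketch, not something I have verified on the full data set. It assumes that every entry is an optional YYYY-MM-DD date, followed by an optional junk prefix (the literal text "NaN" or a run of digits), followed by the company name. The pattern and the small sample frame are my own illustration built from the rows shown above.

```python
import pandas as pd

# Sample rows copied from the question, both clean and botched.
df = pd.DataFrame({
    "date_time/full_company_name": [
        "2020-05-19Lopez-Wallace",
        "2020-10-02Jenkins Inc",
        "0Lopez, Barton and Jones",
        "NaNBrown, Singleton and Harrell",
        "84635Ball-Thomas",
    ]
})

# Assumed layout: optional YYYY-MM-DD date, then an optional junk prefix
# (the text "NaN" or a run of digits), then the company name.
pattern = (
    r"^(?P<date_time>\d{4}-\d{2}-\d{2})?"
    r"(?:NaN|\d+)?"
    r"(?P<full_company_name>.*)$"
)

# Named groups become column names; an unmatched optional group becomes NaN.
parts = df["date_time/full_company_name"].str.extract(pattern)

# errors="coerce" turns anything that is not a valid date into NaT.
parts["date_time"] = pd.to_datetime(
    parts["date_time"], format="%Y-%m-%d", errors="coerce"
)

# Replace the combined column with the two extracted columns.
df = df.drop(columns="date_time/full_company_name").join(parts)
```

One caveat with this assumption: a company name that itself starts with a digit would lose its leading digits to the junk-prefix group, so the pattern would need adjusting if such names exist in the data.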