I have a few hundred files that I need to run a Python script on. All of the files are Excel CSVs. Everything works as it should except for dates: when I load a CSV into a DataFrame with pd.read_csv, the original dates come back in a different format.
When I read the CSVs, I identify date values by their slashes, i.e. only values containing '/' are treated as dates. Because the data is being converted, though, the slashes either become dashes ('-') or the whole date format changes to something else, so my check no longer matches.
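To make the slash check concrete, it amounts to roughly the following. This is only a minimal sketch, not my actual code: looks_like_date is a hypothetical name, the sample cells are taken from the table in this post, and it assumes Python 3 (on Python 2, basestring would take the place of str).

```python
def looks_like_date(value):
    """Treat a cell as a date only if its text contains a slash (hypothetical helper)."""
    return isinstance(value, str) and "/" in value


# Sample cells: two in the original slash format, one already converted, one non-date.
cells = ["6/2/2017", "2017-06-02", "1/1/1900 0:00", "abc"]
flags = [looks_like_date(c) for c in cells]
print(flags)
```

Once a value has been rewritten with dashes, as in the second cell, the check misses it.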
This is how I read each file:
csv_df = pd.read_csv(csv, keep_default_na=False, index_col=False, parse_dates=False, dtype=basestring)
And here is some sample data (the original value next to what I get back from pandas):
Original         Pandas Conversion      Original 2    Pandas Conversion 2
1/1/1900 0:00    1900/01/01 00:00:00    6/2/2017      2017-06-02
1/1/1900 0:00    1900/01/01 00:00:00    6/2/2017      2017-06-02
1/1/1900 0:00    1900/01/01 00:00:00    12/13/2016    2016-12-13
1/1/1900 0:00    1900/01/01 00:00:00    12/13/2016    2016-12-13
1/1/1900 0:00    1900/01/01 00:00:00    5/24/2017     2017-05-24
1/1/1900 0:00    1900/01/01 00:00:00    5/24/2017     2017-05-24
I tried changing the dtype to object, but that did not fix it. parse_dates is False, so that shouldn't be the cause either. I believe it is Excel that is automatically changing the dates, but I'm not sure what to do about it. I also can't specify the date columns explicitly, because every CSV has different data in it. Any advice or help with this issue would be greatly appreciated.
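One way I could narrow down whether the file itself or pandas is responsible is to look at the raw cell text without pandas involved at all. The sketch below is an assumption about how that check might look, using only the standard csv module; raw_cells is a hypothetical helper, and the in-memory sample stands in for one of the real files (Python 3 assumed).

```python
import csv
import io


def raw_cells(text_stream, max_rows=5):
    """Return the first few rows exactly as stored in the file, with no type inference."""
    reader = csv.reader(text_stream)
    return [row for _, row in zip(range(max_rows), reader)]


# Stand-in for open(path, newline="") on a real file; the contents here are made up.
sample = io.StringIO("Original,Original 2\n1/1/1900 0:00,6/2/2017\n")
rows = raw_cells(sample)
print(rows)
```

If the rows printed for a real file already show dashes or zero-padded dates, the values were changed before pandas ever read them, which would point at how the file was saved rather than at read_csv.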