I have a large amount of time-series data and I am having trouble converting the timestamps to a single convention.

The data uses several different date-time formats (listed below), and I am trying to bring them all into one format. I could not find any guidance on how to do this. It is a data pre-processing/cleaning step, so that the subsequent processing with Python and pandas goes smoothly. Changing the values manually is practically impossible, so I need a Python script.

The input files are in CSV format and come in two types.

The first type has three columns and multiple rows, where col[0] is always the date-time and the rest are other data. The column headers are not constant (every input file uses different names), so I cannot rely on headers.

09/30/2015 12:00 PM,abcsd,434235
09/30/2015 12:30 PM,taer,45824
09/30/2015 13:00 PM,hshfe,4894

The second type has multiple columns and multiple rows, with the date-times in the first row:

no.,30-09-2015 12:00 PM,30-09-2015 13:00 PM
1111,2345,2342

Types

1. 09/30/2015 12:00:00 
2. 30/09/2015 12:00
3. 09/30/2015 12:00 PM
4. 30/09/2015 12:00 PM
5. 30-09-2015 12:00:00
6. 30-09-2015 12:00 PM

These are the types I encounter, and I want to bring them all into one format, either:

1. 30/09/2015 12:00

or 

2. 09/30/2015 12:00

I could not find proper guidance in the documentation either, so I have not been able to try any code so far.

Thanks for the valuable suggestions

  • Is it already a pandas column? What is "Types"? Commented Sep 30, 2015 at 6:44
  • dd-mm vs mm-dd will be ambiguous if the day is less than 13. How do you expect to handle that? Commented Sep 30, 2015 at 6:46
  • @tzaman I made a few edits to improve the explanation. Yes, I expect to check against the current time so that the ambiguous cases can be handled. Commented Sep 30, 2015 at 6:55

1 Answer

You need to parse them all into a common datetime object, then print each one back out from that object using a single output format.
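As a minimal sketch of that round trip for a single value, assuming the input is known to be type 3 from the question (month/day/year with an AM/PM time) and the target is the day-first format the question asks for. The variable names are only illustrative:

```python
from datetime import datetime

raw = "09/30/2015 12:30 PM"

# Parse the string into a datetime object using the format it is written in.
parsed = datetime.strptime(raw, "%m/%d/%Y %I:%M %p")

# Write it back out in the single target format (day first, 24-hour clock).
normalized = parsed.strftime("%d/%m/%Y %H:%M")
print(normalized)
```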

Unfortunately, the best way to read in multiple formats is to keep a list of possible formats and simply try each one until one of them parses.

For example:

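from datetime import datetime

# One strptime pattern per input format you expect; add the others here.
POSSIBLE_FORMATS = ['%m/%d/%Y %H:%M:%S']  # ...

# The raw date strings read from your files.
dates = ['09/30/2015 12:00:00']

for date in dates:
    for fmt in POSSIBLE_FORMATS:
        try:
            parsed = datetime.strptime(date, fmt)
        except ValueError:
            continue  # wrong format, keep trying formats
        print(parsed.strftime('%d/%m/%Y %H:%M'))  # will be the same format every time
        break  # found it, stop trying formats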
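A somewhat fuller sketch of the same idea, with the format list filled in from the six types listed in the question. The names `CANDIDATE_FORMATS`, `OUTPUT_FORMAT`, `matching_datetimes` and `normalize` are made up for illustration. Because dd/mm and mm/dd cannot be told apart when the day is below 13, this version collects every format that parses and raises an error when they disagree, rather than silently taking the first match. That is one possible policy, not the only one.

```python
from datetime import datetime

# Assumed from the six types listed in the question.
CANDIDATE_FORMATS = [
    "%m/%d/%Y %H:%M:%S",   # 09/30/2015 12:00:00
    "%d/%m/%Y %H:%M",      # 30/09/2015 12:00
    "%m/%d/%Y %I:%M %p",   # 09/30/2015 12:00 PM
    "%d/%m/%Y %I:%M %p",   # 30/09/2015 12:00 PM
    "%d-%m-%Y %H:%M:%S",   # 30-09-2015 12:00:00
    "%d-%m-%Y %I:%M %p",   # 30-09-2015 12:00 PM
]
OUTPUT_FORMAT = "%d/%m/%Y %H:%M"  # first target format from the question


def matching_datetimes(text):
    """Return every distinct datetime that `text` parses to."""
    results = []
    for fmt in CANDIDATE_FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # this format does not fit, try the next one
        if parsed not in results:
            results.append(parsed)
    return results


def normalize(text):
    """Convert `text` to OUTPUT_FORMAT.

    Raises ValueError if no format fits or if the formats that fit
    disagree about which day and month are meant.
    """
    candidates = matching_datetimes(text)
    if not candidates:
        raise ValueError("no known format matches %r" % text)
    if len(candidates) > 1:
        raise ValueError("ambiguous day/month order in %r" % text)
    return candidates[0].strftime(OUTPUT_FORMAT)


print(normalize("09/30/2015 12:00 PM"))
print(normalize("30-09-2015 12:00:00"))
```

For the three-column files, `normalize` would be applied to the first field of each row; for the wide files, to every header cell after the first. Values that mix a 24-hour time with AM/PM, such as the `13:00 PM` entries in the question's samples, match none of these formats and raise `ValueError`, so they need to be handled separately.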