I want to separate the timestamps from the rest of the content in a text file. For example:

a.txt

2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart

"2019-07-17T07:11:14.894Z" "mgremove datestring"    asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z"     "mgremove datestring"     asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z"     "mgremove datestring"     asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z"      "mgremove datestring"     asfasnfs: remove datepart

17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart
17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart
17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart
"mgremove datestring"     asfasnfs: remove datepart check the value
                         "mgremove datestring"     asfasnfs: remove datepart check the value

My solution below takes a fixed number of leading tokens, so it only works for lines in one particular format and is not generic. I want to make it generic, so that it detects the timestamp automatically at the start of each line.

with open("a.txt") as f:
    for line in f:
        date_string = " ".join(line.strip().split()[:4])
        print(date_string, line)
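
To illustrate why a fixed token count cannot be generic, here is a small sketch that applies the same `[:4]` slice to one sample line of each format. The helper name `first_four_tokens` and the `samples` list are only for illustration.

```python
# One sample line per timestamp format found in a.txt.
samples = [
    '2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart',
    '"2019-07-17T07:11:14.894Z" "mgremove datestring"    asfasnfs: remove datepart',
    '17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart',
]

def first_four_tokens(line):
    # Same logic as the loop body above: keep the first four
    # whitespace-separated tokens and join them with single spaces.
    return " ".join(line.strip().split()[:4])

for sample in samples:
    print(repr(first_four_tokens(sample)))
```

Only the third format spans exactly four tokens. For the other two, the timestamp is a single token, so the slice also picks up three tokens that are not part of the date.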

Expected output:

date_string = 2019/01/31-11:56:23.288258 line = 2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
date_string = 2019/01/31-11:56:23.288258 line = 2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
date_string = 2019/01/31-11:56:23.288258 line = 2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
date_string = 2019/01/31-11:56:23.288258 line = 2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
date_string = "2019-07-17T07:11:14.894Z" line = "2019-07-17T07:11:14.894Z"      "mgremove datestring"     asfasnfs: remove datepart
date_string = "2019-07-17T07:11:14.894Z" line = "2019-07-17T07:11:14.894Z"      "mgremove datestring"     asfasnfs: remove datepart
date_string = "2019-07-17T07:11:14.894Z" line = "2019-07-17T07:11:14.894Z"      "mgremove datestring"     asfasnfs: remove datepart
date_string = "2019-07-17T07:11:14.894Z" line = "2019-07-17T07:11:14.894Z"      "mgremove datestring"     asfasnfs: remove datepart
date_string = 17 Jul 2019 07:01:10 line = 17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart
date_string = 17 Jul 2019 07:01:10 line = 17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart
date_string = 17 Jul 2019 07:01:10 line = 17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart
date_string = 17 Jul 2019 07:01:10 line =  asfasnfs: remove datepart
date_string = 17 Jul 2019 07:01:10 line =  asfasnfs: remove datepart

The text file might include other timestamp patterns as well. Is there any way to detect a timestamp at the start of a line and extract it? If no date is present at the start of a line, the date from the previous line should be used.

1 Answer

With the following contents of a.txt:

2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart

"2019-07-17T07:11:14.894Z" "mgremove datestring"    asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z"     "mgremove datestring"     asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z"     "mgremove datestring"     asfasnfs: remove datepart
"2019-07-17T07:11:14.894Z"      "mgremove datestring"     asfasnfs: remove datepart

17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart
17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart
17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart
asfasnfs: remove datepart
                               asfasnfs: remove datepart

This script:

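def get_date_string(line):
    # Collect leading words until the collected text (including the
    # trailing space) is longer than 18 characters, the assumed minimum
    # length of a timestamp.
    rv = ''
    words = line.split()
    while words:
        rv += words.pop(0) + ' '
        if len(rv) > 18:
            break
    return rv.strip()

with open('a.txt', 'r') as f_in:
    # Timestamp of the most recent line that started with one.
    last_date_string = ''

    for line in f_in:
        line = line.strip()
        if not line:
            continue  # skip blank lines

        date_part = get_date_string(line)
        if date_part == line:
            # The whole line was consumed: treat it as having no date
            # and reuse the previous one.
            print('date string={: <30} line={}'.format(last_date_string, line))
        else:
            print('date string={: <30} line={}'.format(date_part, line))
            last_date_string = date_part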

Prints:

date string=2019/01/31-11:56:23.288258     line=2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
date string=2019/01/31-11:56:23.288258     line=2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
date string=2019/01/31-11:56:23.288258     line=2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
date string=2019/01/31-11:56:23.288258     line=2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart
date string="2019-07-17T07:11:14.894Z"     line="2019-07-17T07:11:14.894Z" "mgremove datestring"    asfasnfs: remove datepart
date string="2019-07-17T07:11:14.894Z"     line="2019-07-17T07:11:14.894Z"     "mgremove datestring"     asfasnfs: remove datepart
date string="2019-07-17T07:11:14.894Z"     line="2019-07-17T07:11:14.894Z"     "mgremove datestring"     asfasnfs: remove datepart
date string="2019-07-17T07:11:14.894Z"     line="2019-07-17T07:11:14.894Z"      "mgremove datestring"     asfasnfs: remove datepart
date string=17 Jul 2019 07:01:10           line=17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart
date string=17 Jul 2019 07:01:10           line=17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart
date string=17 Jul 2019 07:01:10           line=17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart
date string=17 Jul 2019 07:01:10           line=asfasnfs: remove datepart
date string=17 Jul 2019 07:01:10           line=asfasnfs: remove datepart
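
The length heuristic above treats any long enough leading text as a date, so a line such as `"mgremove datestring" asfasnfs: ...` would be misread as starting with a timestamp. A possible alternative, sketched here under the assumption that the set of formats is known in advance, is to match explicit regular expressions at the start of the line. The names `TIMESTAMP_PATTERNS`, `LEADING_TIMESTAMP` and `split_timestamps` are illustrative, and the patterns cover only the three formats shown in a.txt; other formats would need their own entries.

```python
import re

# Assumed patterns, one per timestamp format seen in a.txt.
TIMESTAMP_PATTERNS = [
    r'\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}(?:\.\d+)?',      # 2019/01/31-11:56:23.288258
    r'"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?"',  # "2019-07-17T07:11:14.894Z"
    r'\d{1,2} [A-Za-z]{3} \d{4} \d{2}:\d{2}:\d{2}',        # 17 Jul 2019 07:01:10
]
LEADING_TIMESTAMP = re.compile('(?:' + '|'.join(TIMESTAMP_PATTERNS) + ')')

def split_timestamps(lines):
    """Yield (date_string, line) pairs. Lines that do not start with a
    known timestamp reuse the most recently seen one."""
    last_date_string = ''
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = LEADING_TIMESTAMP.match(line)  # match() anchors at the start
        if match:
            last_date_string = match.group(0)
        yield last_date_string, line

sample = [
    '2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart',
    '',
    '"2019-07-17T07:11:14.894Z" "mgremove datestring"    asfasnfs: remove datepart',
    '17 Jul 2019 07:01:10      "mgremove datestring"     asfasnfs: remove datepart',
    '"mgremove datestring"     asfasnfs: remove datepart check the value',
    '      "mgremove datestring"     asfasnfs: remove datepart check the value',
]

for date_string, line in split_timestamps(sample):
    print('date string={: <30} line={}'.format(date_string, line))
```

To process the file itself, the open file object can be passed in place of `sample`, since the function only iterates over lines. The trade-off is that every new timestamp format has to be added to the pattern list by hand.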

7 Comments

Thank you so much for the solution, but this condition fails if the timestamp is "17 Jul 2019 07:01:10". I have updated the question.
@user15051990 I have updated my answer. As I don't know every date string format, I presume they must be longer than 18 characters.
Thanks a lot. I will check other timestamps and try to tweak the code. Thanks again.
One more question: many lines are blank or contain some string that is not a date string. In those cases I want the date entry from the last line. I have updated the question, please have a look.
Actually, in some cases the line contains only a string, so in those cases the date_part == line check fails. Is there some other way to do this? I have updated the question. Thank you so much!