I have parsed a file and need help splitting the data in it. Here is my data:

Block of data:

blank space
20/06/25 12:19:33 ERROR datasources
20/06/25 21:12:23 ERROR  sadasdfsd
blank space
blank space    
20/06/25 12:19:33 WARN  asda
20/06/25 21:12:23 ERROR asdasdfsd
20/06/25 12:20:33 WARN  asda
blank space

I wrote 'blank space' only to make the layout clear. In my actual data those lines are empty.

The code I tried:

import re

def parse_log_contents(text, full_text_lines, filter_content_types=None):
    # text is one block of the data shown above
    # Raw string so that \d reaches the regex engine unchanged
    messages = re.compile(r'^(?=\d+/)', flags=re.MULTILINE).split(text)
    print(messages)

The output I got:

['']
['20/06/25 12:19:33 ERROR datasources\n20/06/25 21:12:23 ERROR  sadasdfsd']
['']
['']
['20/06/25 12:19:33 WARN  asda\n20/06/25 21:12:23 ERROR asdasdfsd\n20/06/25 12:20:33 WARN  asda']
['']

Expected Output:

['']
['', '20/06/25 12:19:33 ERROR datasources\n', '20/06/25 21:12:23 ERROR  sadasdfsd']
['']
['']
['', '20/06/25 12:19:33 WARN  asda\n','20/06/25 21:12:23 ERROR asdasdfsd\n','20/06/25 12:20:33 WARN  asda']
['']

I use Python 2.7 in a Linux environment.

As the output shows, each block comes back as a single string: the individual log lines are not split into separate list elements.

I also need an empty '' at the front of each list of messages, which I will use later for other processing.

Please help me sort out this issue. Thanks a lot!

  • How are you calling your function? Commented Jul 14, 2020 at 18:15
  • I call it from another function. I can add my complete code if that would help. Commented Jul 14, 2020 at 18:18
  • When I test your code it is already splitting the lines closer to what you want: repl.it/repls/DrabWetPreprocessor#main.py Commented Jul 14, 2020 at 18:20
  • Yeah, I also get the right answer when I run it on my PC, but not when I work in another environment. I think this kind of pattern is not handled correctly there. Is there an alternative way to do this? Commented Jul 14, 2020 at 18:24
  • Make sure you are using raw string starting with r. Then you could try splitting with r'(^|\n)(?=\d+/)' instead. Commented Jul 14, 2020 at 18:31
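
A possible explanation for the difference between environments, offered as a guess rather than a confirmed diagnosis: before Python 3.7, re.split does not split on zero-width matches, and the pattern ^(?=\d+/) only ever matches zero-width, so on Python 2.7 the text comes back unsplit. The sketch below avoids split altogether by collecting the match positions with finditer and slicing the text by hand, which should behave the same on Python 2.7 and Python 3. The function name split_log_block is hypothetical and not part of the original code.

```python
import re

# Zero-width match at the start of every line that begins with digits and a slash
LINE_START = re.compile(r'^(?=\d+/)', flags=re.MULTILINE)

def split_log_block(text):
    """Split one block into [leading_text, entry1, entry2, ...].

    Hypothetical helper: it slices the text at each match position
    instead of relying on re.split handling empty matches.
    """
    starts = [m.start() for m in LINE_START.finditer(text)]
    if not starts:
        # No log entries in this block (for example an empty line)
        return [text]
    # Whatever precedes the first entry; '' when the block starts with an entry
    pieces = [text[:starts[0]]]
    bounds = starts + [len(text)]
    for begin, end in zip(bounds, bounds[1:]):
        pieces.append(text[begin:end])
    return pieces

if __name__ == '__main__':
    block = ('20/06/25 12:19:33 ERROR datasources\n'
             '20/06/25 21:12:23 ERROR  sadasdfsd')
    print(split_log_block(block))
    print(split_log_block(''))
```

For the first non-empty block in the question this yields an empty string followed by one element per log line, and an empty block yields [''], which matches the expected output listed above.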
