I have log output, excerpted below. I need to parse the Final Input entry, which spans multiple lines, but I cannot find a regular expression that works.

04/10/2019 02:52:59 PM INFO: Model Details:
04/10/2019 02:53:12 PM INFO: Final Input: [  220.12134       3.7499998    75.00001     111.44428      22.500004
    37.5          73.361534  1000.709    ]
04/10/2019 02:53:12 PM INFO: Difference: [ 11.974823 647.91406 ]
04/10/2019 02:53:12 PM INFO: Number: 169
04/10/2019 02:53:12 PM INFO: Time: 13.554227686000004 seconds

I'd like a numpy array output:

[220.12134, 3.7499998, 75.00001, 111.44428, 22.500004, 37.5, 73.361534, 1000.709]

Using the following code, I can get this to work for single lines:

import re

log_file_path = 'some_log.log'
#regex = '\[(.*?)\]'
regex2 = r'(Final Input: \[)(.*?)(\]|\n)'

# The with block closes the file on exit, so no explicit close() is needed.
with open(log_file_path, 'r') as file:
    all_log_file = file.read()
    a = re.findall(regex2, all_log_file)
    print(a)

#x = list(map(float, a.split()))

I get the following output, which is missing the Final Input values on the next line (I can parse the output below into a numpy array):

[('Final Input: [', '  220.12134       3.7499998    75.00001     111.44428      22.500004', '\n')]

1 Answer

Use a non-greedy quantifier together with re.DOTALL, which makes . match \n as well:

import re

# text holds the full contents of the log file (all_log_file in the question)
regex2 = r'(Final Input: \[.+?\])'

a = re.findall(regex2, text, re.DOTALL)
print(a)

Output:

['Final Input: [  220.12134       3.7499998    75.00001     111.44428      22.500004\n    37.5          73.361534  1000.709    ]']
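
To get from that matched string to the numbers the question asks for, one option is to capture only the text between the brackets and split it on whitespace. The following is a minimal sketch, not a definitive implementation: the names log_text, pattern and parse_final_inputs are made up for illustration, and the sample text is copied from the log in the question.

```python
import re

# Sample log text copied from the question (hypothetical variable name).
log_text = """04/10/2019 02:52:59 PM INFO: Model Details:
04/10/2019 02:53:12 PM INFO: Final Input: [  220.12134       3.7499998    75.00001     111.44428      22.500004
    37.5          73.361534  1000.709    ]
04/10/2019 02:53:12 PM INFO: Difference: [ 11.974823 647.91406 ]
"""

# Capture only the text between the brackets.
# re.DOTALL lets "." match the newline inside the entry.
pattern = re.compile(r'Final Input: \[(.+?)\]', re.DOTALL)


def parse_final_inputs(text):
    """Return one list of floats per 'Final Input' entry found in text."""
    # str.split() with no argument splits on any whitespace, including "\n".
    return [[float(token) for token in body.split()]
            for body in pattern.findall(text)]


final_inputs = parse_final_inputs(log_text)
print(final_inputs)
```

Each inner list can then be passed to numpy.array to get the array the question asks for. Because the group is non-greedy, the match stops at the first closing bracket, so the Difference line that follows is not swallowed.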