
I've got a log file like below:

sw2 switch_has sw2_p3.
sw1 transmits sw2_p2
/* BUG: axiom too complex: SubClassOf(ObjectOneOf([NamedIndividual(#t_air_sens2)]),DataHasValue(DataProperty(#qos_type),^^(latency,http://www.xcx.org/1900/02/22-rdf-syntax-ns#PlainLiteral))) */
/* BUG: axiom too complex: SubClassOf(ObjectOneOf([NamedIndividual(#t_air_sens2)]),DataHasValue(DataProperty(#topic_type),^^(periodic,http://www.xcx.org/1901/11/22-rdf-syntax-ns#PlainLiteral))) */
...

What I'm interested in is extracting specific words from the /* BUG... lines and writing them into a separate file, something like below:

t_air_sens2 qos_type latency
t_air_sens2 topic_type periodic
...

I can do this with the help of awk and regex in shell like below:

awk -F'#|\\^\\^\\(' '{for (i=2; i<NF; i++) printf "%s%s", gensub(/[^[:alnum:]_].*/,"",1,$i), (i<(NF-1) ? OFS : ORS) }' output.txt > ./LogErrors/Properties.txt

How can I extract them using Python? (shall I use regex again, or..?)

2 Answers

You can of course use regex. I would read the file line by line, grab the lines that start with '/* BUG:', then parse those as needed.

import re

target = '/* BUG:'
# Capture the three words with parenthesized groups; literal parentheses,
# brackets, and carets are escaped with '\'. The '#' sits outside the groups
# so it is not written to the output.
pattern = re.compile(r'NamedIndividual\(#(\w+)\)\]\),DataHasValue\(DataProperty\(#(\w+)\),\^\^\((\w+),')

with open('logfile.txt', 'r') as infile, open('output.txt', 'w') as outfile:
    # loop through logfile
    for line in infile:
        if line.startswith(target):
            match = pattern.search(line)
            # if the line has the expected shape, write the groups space-separated
            if match:
                outfile.write(' '.join(match.groups()) + '\n')

Python has a pretty powerful regex module built-in: re module

You can search for a given pattern, then print out the matched groups as needed.

Note: raw strings (r'xxxx') keep backslashes literal, so regex escapes such as \( and \w can be written without doubling the backslash.
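If the exact shape of the axiom text varies, a looser pattern closer in spirit to the awk one-liner in the question may be easier to maintain. The following is only a sketch: the helper name extract_fields and the TOKEN pattern are made up for illustration, and it assumes the words of interest always directly follow "(#" or "^^(" as they do in the two sample lines.

```python
import re

# Hypothetical pattern (an assumption based on the sample lines only):
# a word right after '(#' or right after '^^('. Requiring the '(' before '#'
# skips the '#PlainLiteral' fragment at the end of the URL.
TOKEN = re.compile(r'\(#(\w+)|\^\^\((\w+)')

def extract_fields(line):
    """Return the words of interest from a '/* BUG:' line, else an empty list."""
    if not line.startswith('/* BUG:'):
        return []
    # findall yields one (group1, group2) tuple per match; exactly one is non-empty
    return [first or second for first, second in TOKEN.findall(line)]
```

Each non-empty result could then be written out with something like outfile.write(' '.join(fields) + '\n') inside the same file loop as above.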


I tried the following approach to pull the specific lines out of the log file.

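target = ['BUG']  # substrings to look for; the log lines contain 'BUG:', not 'BUGS'

with open('demo.log', 'r') as infile, open('output.txt', 'w') as outfile:
    for line in infile:
        # write each matching line once, even if several phrases occur in it
        if any(phrase in line for phrase in target):
            outfile.write(line)  # line already ends with a newline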

This writes every line that contains one of the target words to output.txt. Note that it copies whole lines; picking out the individual words from each line would still need a regex.
