I have a file containing many different events from some service. I want to break those events onto separate lines and remove some "words & elements". Example of the log file:

"Event1":{"Time":"2022-12-16 16:04:16","Username":"[email protected]","IP_Address":"1.1.1.1","Action":"Action1","Data":"Datahere"},"Event2":{"Time":"2022-12-16 16:03:59","Username":"[email protected]","IP_Address":"1.1.1.1","Action":"Action2","Data":"Datahere"},"Event3":{"Time":"2022-12-16 15:54:56","Username":"[email protected]","IP_Address":"1.1.1.1","Action":"Action3","Data":"Datahere"},

As you can see, they all start with "EventX". At the end I want to see:

{"Time":"2022-12-16 16:04:16","Username":"[email protected]","IP_Address":"1.1.1.1","Action":"Action1","Data":"Datahere"}
{"Time":"2022-12-16 16:03:59","Username":"[email protected]","IP_Address":"1.1.1.1","Action":"Action2","Data":"Datahere"}
{"Time":"2022-12-16 15:54:56","Username":"[email protected]","IP_Address":"1.1.1.1","Action":"Action3","Data":"Datahere"}

As you can see, "EventX": and "," are removed, and each event is now on its own line in the file.

I'm just a beginner with Python and cannot figure this one out.

Thanks

I tried combining re.search and re.findall without luck. I also tried to find a way to copy only the text between {} and add it back later, but again no luck.

  • Is your desired output "just the text" as you show, or are you hoping to make a list of dictionaries that hold the data elements shown, i.e. a data-structure output? Commented Dec 16, 2022 at 23:04
  • Hey, just as text. Everything else is done by the service that receives this data. Commented Dec 16, 2022 at 23:57

1 Answer

The code below builds a list of dictionaries from your data. You could condense some of this syntax with list or dictionary comprehensions, but that isn't needed.

If you are having trouble testing regular expressions, this site is invaluable.

Code

import re  # the standard-library module is enough here; the third-party "regex" package is not required

data = '''"Event1":{"Time":"2022-12-16 16:04:16","Username":"[email protected]","IP_Address":"1.1.1.1","Action":"Action1","Data":"Datahere"},"Event2":{"Time":"2022-12-16 16:03:59","Username":"[email protected]","IP_Address":"1.1.1.1","Action":"Action2","Data":"Datahere"},"Event3":{"Time":"2022-12-16 15:54:56","Username":"[email protected]","IP_Address":"1.1.1.1","Action":"Action3","Data":"Datahere"},'''

splitter = r'"Event\d+":{(.*?)}'  # a search pattern to capture the stuff in braces

# tokenize the data source...
tokens = re.findall(splitter, data)

#print(tokens)


# now we can operate on the tokens, split them up into key-value pairs and put them into a list
# capture two groups of things in quotes on opposite sides of a colon
# (a regex is needed here because the timestamps have colons too)
pair_pattern = r'"(.*)"\s*:\s*"(.*)"'
result = []
for token in tokens:
    # make an empty dictionary to hold the row elements
    line_dict = {}
    # we can split the line (token) by comma to get the key-value pairs
    # (this assumes no value contains a comma itself)
    pairs = token.split(',')
    for pair in pairs:
        parts = re.search(pair_pattern, pair)
        key, value = parts.group(1), parts.group(2)
        line_dict[key] = value
    # add the dictionary of line elements to the result
    result.append(line_dict)

for d in result:
    print(d)

Output:

{'Time': '2022-12-16 16:04:16', 'Username': '[email protected]', 'IP_Address': '1.1.1.1', 'Action': 'Action1', 'Data': 'Datahere'}
{'Time': '2022-12-16 16:03:59', 'Username': '[email protected]', 'IP_Address': '1.1.1.1', 'Action': 'Action2', 'Data': 'Datahere'}
{'Time': '2022-12-16 15:54:56', 'Username': '[email protected]', 'IP_Address': '1.1.1.1', 'Action': 'Action3', 'Data': 'Datahere'}
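
Since the comments under the question say plain text is what's wanted, the same findall idea can keep the braces and skip the dictionaries entirely. What follows is only a sketch: the shortened `data` string and the name `event_pattern` are made up for illustration, and it assumes no event body contains nested braces.

```python
import re

# Shortened sample in the same shape as the log (illustrative data only)
data = ('"Event1":{"Time":"2022-12-16 16:04:16","Action":"Action1"},'
        '"Event2":{"Time":"2022-12-16 16:03:59","Action":"Action2"},')

# Match the "EventN": label but capture only the brace-delimited body,
# braces included. The non-greedy .*? stops at the first closing brace.
event_pattern = re.compile(r'"Event\d+":(\{.*?\})')

lines = event_pattern.findall(data)

# One event per line; the label and the separating commas are dropped
text = "\n".join(lines)
print(text)
```

Because the separating commas sit outside the captured group, they disappear without any extra cleanup step.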

=========

Edit:

If you are having trouble getting the data out of the file, try something like this (and experiment; it isn't clear exactly how your file is formatted, where the line breaks are, etc.).

f_name = 'logfile.txt'

# use a context manager (look it up)
with open(f_name, 'r') as src:
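    # read() returns the whole file as one string, which is what re.findall expects;
    # readlines() would return a list of lines instead
    data = src.read()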

# check it!
print(data)

3 Comments

This seems like the way to go, but when I changed the data to come from a file with open("myfile.txt", "w"), I got the error: Exception has occurred: TypeError expected string or bytes-like object, got '_io.TextIOWrapper' File "/log/testlines", line 16, in <module> tokens = re.findall(splitter, data)
You should be opening the file in read mode, not write mode. Use 'r'. And you will need to read the data.... See my edit.
Getting another error now: expected string or bytes-like object, got 'list'. Regarding the file, it's just a long text file with the log inside and nothing else. Again, thanks for helping.
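
The "got 'list'" error in the last comment is what re.findall raises when it is handed the list returned by readlines() rather than a single string. Below is a minimal sketch of the whole round trip, assuming the log is one long run of text as described. The function name `convert_log`, the file names, and the tiny sample content are hypothetical, and a temporary directory is used only so the example can run on its own.

```python
import re
import tempfile
from pathlib import Path

EVENT_PATTERN = re.compile(r'"Event\d+":(\{.*?\})')

def convert_log(src_path, dst_path):
    """Read the whole log as one string and write one event per line."""
    with open(src_path, "r", encoding="utf-8") as src:
        data = src.read()  # a single str, not a list of lines
    events = EVENT_PATTERN.findall(data)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(events) + "\n")
    return events

# Small demonstration with throwaway files (illustrative content only)
with tempfile.TemporaryDirectory() as tmp:
    src_file = Path(tmp) / "logfile.txt"
    dst_file = Path(tmp) / "events.txt"
    src_file.write_text(
        '"Event1":{"Action":"Action1"},"Event2":{"Action":"Action2"},',
        encoding="utf-8",
    )
    events = convert_log(src_file, dst_file)
    output_text = dst_file.read_text(encoding="utf-8")

print(output_text)
```

If an event could ever span a line break inside the file, the pattern would need the re.DOTALL flag; that case is not covered here.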
