I am trying to write a simple Python program that reads a log file and extracts specific values. This is the kind of log line I want to look for:

2022-12-02 13:13:10.539 [metrics-writer-1] [INFO ] metrics - type=GAUGE, name=Topic.myTopic1.TotalIncomingBytes.Count, value=20725269

I have many topics, such as myTopic2, myTopic3, etc.

I want to detect all lines that show the total incoming bytes for the various topics and extract the value. Is there an easy and efficient way to do so? Basically, I want to detect the following pattern:

2022-12-02 13:13:10.539 [metrics-writer-1] [INFO ] metrics - type=GAUGE, name=Topic.${}.TotalIncomingBytes.Count, value=${}

Ignoring the timestamp, of course.

1
  • If it were me, I'd look for lines where '[INFO ] metrics' in line, then split on ' - ' (space dash space), then split the second half on ", ", and split those parts on = to get name/value pairs. Now you can store them in a dictionary. Commented Dec 2, 2022 at 5:31
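The steps in the comment above could be sketched roughly like this. It is only an illustration: the function name `parse_metrics_line` is made up here, and it assumes every field after ' - ' has the form key=value separated by ", ", as in the sample line from the question.

```python
def parse_metrics_line(line):
    """Return the key/value pairs of a metrics line as a dict, or None.

    Hypothetical helper: assumes fields look like 'key=value' joined by ', '.
    """
    if '[INFO ] metrics' not in line:
        return None
    # Everything after the first ' - ' is the comma-separated field list.
    _, sep, fields = line.partition(' - ')
    if not sep:
        return None
    pairs = {}
    for part in fields.strip().split(', '):
        key, eq, val = part.partition('=')
        if eq:  # ignore fragments that have no '='
            pairs[key] = val
    return pairs


sample = ("2022-12-02 13:13:10.539 [metrics-writer-1] [INFO ] metrics - "
          "type=GAUGE, name=Topic.myTopic1.TotalIncomingBytes.Count, value=20725269")
print(parse_metrics_line(sample))
```

The values come back as strings, so `value` still needs `int(...)` before doing any arithmetic with it.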

2 Answers

1

Maybe something like this:

resultLines = []
resultSums = {}
with open('recent.logs') as f:
    for idx, line in enumerate(f):
        pieces = line.rsplit('.TotalIncomingBytes.Count, value=', 1)
        if len(pieces) != 2: continue

        value = pieces[1]

        pieces = pieces[0].rsplit(' [metrics-writer-1] [INFO ] metrics - type=GAUGE, name=Topic.', 1)
        if len(pieces) != 2: continue

        topic = pieces[1]
        value = int(value)

        resultLines.append({
            'idx': idx,
            'line': line,
            'topic': topic,
            'value': value,
        })

        if topic not in resultSums:
            resultSums[topic] = 0
        resultSums[topic] = resultSums[topic] + value

for topic, value in resultSums.items():  # .iteritems() exists only in Python 2
    print(topic, value)

0

Here's the way I would do it. This could also be done with a regular expression.

data = """\
2022-12-02 13:13:10.539 [metrics-writer-1] [INFO ] metrics - type=GAUGE, name=Topic.myTopic1.TotalIncomingBytes.Count, value=20725269
2022-12-02 13:13:10.539 [metrics-writer-1] [INFO ] metrics - type=GAUGE, name=Topic.myTopic1.TotalIncomingBytes.Count, value=20725269
2022-12-02 13:13:10.539 [metrics-writer-1] [INFO ] metrics - type=GAUGE, name=Topic.myTopic1.TotalIncomingBytes.Count, value=20725269
"""

counts = {}

for line in data.splitlines():
    if '[INFO ] metrics' in line:
        parts = line.split(' - ')
        parts = parts[1].split(', ')
        dct = {}
        for part in parts:
            key, val = part.split('=', 1)  # split on the first '=' only
            dct[key] = val
        if dct['name'] not in counts:
            counts[dct['name']] = int(dct['value'])
        else:
            counts[dct['name']] += int(dct['value'])

print(counts)

Output:

{'Topic.myTopic1.TotalIncomingBytes.Count': 62175807}

Here's a regex version:


import re

pattern = re.compile(r".* - type=([^,]*), name=([^,]*), value=([^,]*)")
counts = {}

for line in data.splitlines():
    if '[INFO ] metrics' in line:
        parts = pattern.match(line)
        if parts is None:
            continue  # line lacks the expected type/name/value fields
        if parts[2] not in counts:
            counts[parts[2]] = int(parts[3])
        else:
            counts[parts[2]] += int(parts[3])

print(counts)

2 Comments

I want to detect the pattern [metrics-writer-1] [INFO ] metrics - type=GAUGE, name=Topic.${}.TotalIncomingBytes.Count, value=${}, so type=GAUGE and TotalIncomingBytes will stay constant.
I assume you can figure out how to modify my code to do that.
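For what it's worth, one possible way to pin the constant parts is to put them directly into the regex and capture only the topic and the value. This is a sketch, not the answerer's code: `LINE_RE` and `extract_incoming_bytes` are names invented here, the sample lines other than the first are made up for illustration, and the pattern deliberately does not match on `[metrics-writer-1]` since it is not clear from the question whether the writer number is always the same.

```python
import re

# Hypothetical pattern: type=GAUGE and TotalIncomingBytes.Count are fixed,
# the topic name and the numeric value are captured.
LINE_RE = re.compile(
    r"\[INFO \] metrics - type=GAUGE, "
    r"name=Topic\.(?P<topic>.+?)\.TotalIncomingBytes\.Count, "
    r"value=(?P<value>\d+)"
)


def extract_incoming_bytes(lines):
    """Yield (topic, value) for every line that matches LINE_RE."""
    for line in lines:
        m = LINE_RE.search(line)  # search, so the timestamp prefix is ignored
        if m:
            yield m.group("topic"), int(m.group("value"))


# Illustrative input; only the first line comes from the question.
sample_lines = [
    "2022-12-02 13:13:10.539 [metrics-writer-1] [INFO ] metrics - type=GAUGE, name=Topic.myTopic1.TotalIncomingBytes.Count, value=20725269",
    "2022-12-02 13:13:10.540 [metrics-writer-1] [INFO ] metrics - type=GAUGE, name=Topic.myTopic2.TotalIncomingBytes.Count, value=512",
    "2022-12-02 13:13:10.541 [metrics-writer-1] [INFO ] metrics - type=GAUGE, name=Topic.myTopic1.SomethingElse.Count, value=7",
]

results = list(extract_incoming_bytes(sample_lines))
print(results)
```

Since the function accepts any iterable of lines, an open file object can be passed in directly, so the log does not have to be loaded into memory at once.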
