I have a message like the one below, obtained by calling .ParseFromString() on the received bytes. The messages are transferred via ZMQ in protobuf format.

summaries {
  key: "node_A"
  value {
    value_A {
      data_1: 29994
      data_2: 0.07402841001749039
      data_3: -6.621330976486206e-05
    }
    some_activity {
      sys_activity {
        key: "arch_prctl"
        value: 174
      }
      sys_activity {
        key: "execve"
        value: 174
      }
      sys_activity {
        key: "fork"
        value: 261
      }
      some_events_A: 174
      some_events_B: 261
    }
    new_activity {
      sys_new_activity {
        key: "close"
        value: 232
      }
      sys_new_activity {
        key: "open"
        value: 116
      }
      some_new_events: 116
    }
    more_activity {
    }
    error_activity {
    }
    some_alerts {
    }
  }
}

I need to return the output as

some_activity: ["arch_prctl","execve","fork"],
some_events_A: 174
some_events_B: 261

I am able to read the value_A fields directly, for example value_A.data_1.

But I am finding it hard to return the remaining nested fields. I tried json.dumps on the message, but it fails with an "is not JSON serializable" error.

The number of sys_activity entries in some_activity varies; it is not always 3 values as in the sample above.

Let me know if the question is unclear. Assume that the service sending this message cannot be changed, so the only option is to read the message and produce the required output on the client side. Thanks in advance.

  • This is not valid JSON. Commented Feb 19, 2021 at 17:53
  • Yeah, that's what is making it difficult for me. I am receiving these messages via ZMQ like this. Commented Feb 19, 2021 at 17:58
  • Maybe they provide a parser library. Commented Feb 19, 2021 at 18:03
  • This is not JSON, but you could write a parser with pyparsing. Commented Feb 19, 2021 at 18:09
  • You could try to replace with regex. Yes, this is not JSON, but it is close enough to convert. Commented Feb 19, 2021 at 18:14

2 Answers

You can try this, and please don't hit me for such an ugly solution. It rewrites the text dump line by line into a JSON string and then loads it with json.loads:

import json

# Convert the protobuf text dump into a JSON string, line by line.
jstring = ""
with open("brokenjson.txt", "r") as f:
    for x in f:
        a = x.strip()
        if not a:
            continue  # skip blank lines (a[-1] would raise IndexError)
        if a[-1] != "{":
            a += ","
        else:
            a = a.replace("{", ": {")  # add `:` to make a key: value pair

        if ":" in a:
            b = a.split(":", 1)  # split on the first colon only
            if "\"" not in b[0]:
                a = "\"" + b[0].strip() + "\": " + b[1]  # the key lacks double quotes
        jstring += a

jstring = jstring.replace(",}", "}")  # remove trailing commas inside objects

if jstring[-1] == ",":
    jstring = jstring[:-1]  # remove a trailing comma at the very end

if jstring[0] != "{" and jstring[0] != "[":
    jstring = "{" + jstring + "}"  # wrap everything in one top-level object

result = json.loads(jstring)

# Collect the syscall names per *_activity block and the *_events counters.
distActivity = {}
distEvent = {}
for key, val in result["summaries"]["value"].items():
    if "activity" in key:
        if key not in distActivity:
            distActivity[key] = []
        for k, v in val.items():
            if "activity" in k:
                distActivity[key].append(v["key"])
            if "event" in k:
                distEvent[k] = v

print(distActivity)
print(distEvent)

Because your sample has duplicate keys, and json.loads keeps only the last value for a repeated key, I only got this result:

{'some_activity': ['fork'], 'new_activity': ['open'], 'more_activity': [], 'error_activity': []}
{'some_events_A': 174, 'some_events_B': 261, 'some_new_events': 116}
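The lost entries come from json.loads overwriting repeated keys. One possible workaround, shown here only as a minimal sketch, is the standard object_pairs_hook parameter of json.loads, which receives every (key, value) pair before a dict is built. The helper name collect_duplicates and the shortened sample string are made up for illustration; the sample assumes the conversion above has already produced JSON text with repeated "sys_activity" keys.

```python
import json


def collect_duplicates(pairs):
    """Build a dict from (key, value) pairs, gathering repeated keys into lists."""
    merged = {}
    repeated = set()  # keys that have already been turned into lists
    for key, value in pairs:
        if key not in merged:
            merged[key] = value
        elif key in repeated:
            merged[key].append(value)
        else:
            merged[key] = [merged[key], value]
            repeated.add(key)
    return merged


# Hypothetical, shortened version of the converted string with duplicate keys.
sample = (
    '{"some_activity": {'
    '"sys_activity": {"key": "arch_prctl", "value": 174},'
    '"sys_activity": {"key": "execve", "value": 174},'
    '"sys_activity": {"key": "fork", "value": 261},'
    '"some_events_A": 174, "some_events_B": 261}}'
)

parsed = json.loads(sample, object_pairs_hook=collect_duplicates)
names = [entry["key"] for entry in parsed["some_activity"]["sys_activity"]]
print(names)
```

Note that a key appearing only once stays a plain dict rather than a one-element list, so code consuming the result would need to handle both cases.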

  • Hi, thanks a lot for this. Yes, I was able to use MessageToJson() to get the right format. Thanks.

Calling MessageToJson() (from google.protobuf.json_format) on the message parsed with ParseFromString(), instead of working with its printed text form, solved the issue.
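Once the message is available as JSON or as a dict, the requested output can be assembled with ordinary dict handling. The sketch below is an illustration under assumptions, not the poster's actual code: the .proto file is not shown, so it assumes that summaries and sys_activity are map fields (which protobuf's JSON mapping renders as plain objects) and that the dict came from json_format.MessageToDict with preserving_proto_field_name=True. The literal as_dict and the helper summarize_section are hypothetical stand-ins so the example runs without the generated message classes.

```python
import json

# Assumed shape only. In a real client this dict would come from something like:
#   from google.protobuf.json_format import MessageToDict
#   as_dict = MessageToDict(msg, preserving_proto_field_name=True)
as_dict = {
    "summaries": {
        "node_A": {
            "some_activity": {
                "sys_activity": {"arch_prctl": 174, "execve": 174, "fork": 261},
                "some_events_A": 174,
                "some_events_B": 261,
            },
            "more_activity": {},
        }
    }
}


def summarize_section(node_summary, section):
    """Return the map keys as a list under `section`, plus the scalar counters."""
    out = {}
    for name, value in node_summary.get(section, {}).items():
        if isinstance(value, dict):
            out[section] = list(value)  # keep only the syscall names
        else:
            out[name] = value
    return out


result = summarize_section(as_dict["summaries"]["node_A"], "some_activity")
print(json.dumps(result))
```

If sys_activity is instead a repeated message with key and value fields, the converted value would be a list of dicts and the helper would need a list branch. Also, protobuf's JSON mapping renders 64-bit integer fields as strings, so the counters may need an int() conversion depending on the field types.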
