I'm trying to send an SMS alert from Alertmanager (my instance is having pod restarts in the production namespace), using Lambda to publish to SNS when the alert fires. The flow is: Alertmanager (webhook pointed at an API Gateway endpoint) > API Gateway > Lambda > SNS > my SMS.

My problem is how to pick out the individual firing alerts from the records in Lambda. The message body is pretty big, and I need a way to extract a specific alert so that only that one triggers the Lambda to send to SNS.

Should I simply stream-edit the body and trigger according to my own description in the Alertmanager receiver config?

Or is there a faster, better way to do it?

The relevant part of the Helm chart for Prometheus:

additionalPrometheusRules: 
  - name: custom-pod-restarts
    groups:
        - name: pod-restarts
          rules:
            - alert: HighFrontendRestarts
              expr: increase(kube_pod_container_status_restarts_total{namespace="production"}[1h]) > 2
              for: 10m
              labels:
                severity: critical
              annotations:
                summary: "High restart rate in frontend pod"
                description: "Pod {{ $labels.namespace }}/{{ $labels.pod }} in the frontend deployment has restarted more than 2 times in the last hour."
            - alert: HighBackendRestarts
              expr: increase(kube_pod_container_status_restarts_total{namespace="production"}[1h]) > 2
              for: 10m
              labels:
                severity: critical
              annotations:
                summary: "High restart rate in backend pod"
                description: "Pod {{ $labels.namespace }}/{{ $labels.pod }} in the backend deployment has restarted more than 2 times in the last hour."

The receiver/webhook part of the Alertmanager chart:

    - name: 'aws-lambda-webhook'
      webhook_configs:
      - url: 'https://XXXXXXX.execute-api.us-east-2.amazonaws.com/tested-production/alerts'
        send_resolved: true
    route:
      group_by: ['namespace']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: 'aws-lambda-webhook'
      routes:
      - receiver: 'aws-lambda-webhook'
        matchers:
          - alertname = "Watchdog"
    templates:
    - '/etc/alertmanager/config/*.tmpl'

The Lambda part that extracts the message from Alertmanager:

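import json

def lambda_handler(event, context):
    print("printing indent 2 of event :", json.dumps(event, indent=2))
    # Assumption: API Gateway uses Lambda proxy integration, so 'body' is a
    # JSON string (not a list, and not an SNS record with ['Sns']['Message']).
    payload = json.loads(event.get('body') or '{}')
    records = payload.get('alerts', [])
    if records:
        print(f"these are the records: {records[0]}")
    else:
        print("no records, or unprintable this way")
    # add condition here in lambda that detects/searches/extracts from records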
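
One way the missing condition could look, as a minimal sketch. It assumes API Gateway uses Lambda proxy integration (so `event['body']` arrives as a JSON string) and that the body is Alertmanager's webhook payload, with an `alerts` list whose items carry `status`, `labels`, and `annotations`. `WANTED_ALERTS`, `select_alerts`, and `format_sms` are hypothetical names made up for illustration, not part of any library.

```python
import base64
import json

# Hypothetical allow-list: only these alert names should reach SNS.
WANTED_ALERTS = {"HighFrontendRestarts", "HighBackendRestarts"}


def select_alerts(event, wanted=WANTED_ALERTS):
    """Return the firing alerts from the webhook body whose alertname is wanted."""
    body = event.get("body") or "{}"
    # Proxy events flag a base64-encoded body explicitly; decode it if so.
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    payload = json.loads(body)
    return [
        alert
        for alert in payload.get("alerts", [])
        if alert.get("status") == "firing"
        and alert.get("labels", {}).get("alertname") in wanted
    ]


def format_sms(alert):
    """Build a short SMS text from one alert's labels and annotations."""
    labels = alert.get("labels", {})
    annotations = alert.get("annotations", {})
    severity = labels.get("severity", "unknown")
    return f"[{severity}] {labels.get('alertname')}: {annotations.get('description', '')}"
```

Inside the handler, each alert returned by `select_alerts(event)` could then be turned into text with `format_sms` and passed as the `Message` argument of the boto3 SNS client's `publish` call; alerts such as `Watchdog` and resolved notifications are dropped before anything is sent.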

Where can I read how the message is formatted and processed in the Lambda? Again, since Alertmanager can fire lots of alerts, I just want specific ones to trigger SNS.
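
For context, the body Alertmanager posts to a webhook receiver is a JSON document with top-level fields such as `status`, `receiver`, `groupLabels`, `commonLabels`, and an `alerts` list. Below is a small sketch for building a fake event to exercise the Lambda locally, assuming API Gateway proxy integration. `make_test_event` is a hypothetical helper, and every value in it is made up for testing.

```python
import json


def make_test_event(alertname, status="firing", namespace="production"):
    """Build a fake API Gateway proxy event wrapping an Alertmanager-style webhook body."""
    labels = {"alertname": alertname, "namespace": namespace, "severity": "critical"}
    payload = {
        "version": "4",
        "status": status,
        "receiver": "aws-lambda-webhook",
        "groupLabels": {"namespace": namespace},
        "commonLabels": labels,
        "commonAnnotations": {},
        "alerts": [
            {
                "status": status,
                "labels": labels,
                "annotations": {"summary": "made-up summary for testing"},
                "startsAt": "2024-01-01T00:00:00Z",  # placeholder timestamp
            }
        ],
    }
    # Proxy integration hands the request body to Lambda as a string.
    return {"body": json.dumps(payload), "isBase64Encoded": False}


event = make_test_event("HighFrontendRestarts")
names = [a["labels"]["alertname"] for a in json.loads(event["body"])["alerts"]]
print(names)
```

Calling the handler with such an event makes it possible to check the filtering logic without waiting for a real alert to fire.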
