3

I'm trying to grep 3 fields for the strings a, b and c. I know that this can be done with

grep -E 'a|b|c'

However, I also want to grep for the strings x, y and z, including the following line. I know that this can be done with

grep -A1 'x'

So my question is: is it possible to combine these into a single command? For example, something like this (I know this command doesn't work, it's just an illustration):

grep -E 'a|b|c' -A1 'x|y|z'

If there is a better way without grep, or even with Python, that would be helpful. I only resorted to grep because I thought it would be faster than reading the file line by line in Python. Cheers!

EDIT: I have a big file with recurring sections that look something like this:

{
    "source_name": [
        "$name"
    ],
    "source_line": [
        52
    ],
    "source_column": [
        1161
    ],
    "source_file": [
        "/somerandomfile"
    ],
    "sink_name": "fwrite",
    "sink_line": 55,
    "sink_column": 1290,
    "sink_file": "/somerandomfile",
    "vuln_name": "vuln",
    "vuln_cwe": "CWE_862",
    "vuln_id": "17d99d109da8d533428f61c430d19054c745917d0300b8f83db4381b8d649d83",
    "vuln_type": "taint-style"
}                      

This section between the {} repeats throughout the file. What I'm trying to grep is the source_name, source_line and source_file lines together with the line below each, along with the vuln_name, sink_file and sink_line lines. So the sample output should be:

    "source_name": [
        "$name"
    "source_line": [
        52
    "source_file": [
        "/somerandomfile"
    "sink_line": 55,
    "sink_file": "/somerandomfile",
    "vuln_name": "vuln",
  • Why is there a need to combine these commands? Commented Nov 20, 2018 at 14:36
  • @JonahBishop makes my life a bit easier by having the output follow each other, instead of being split up. If that makes any sense Commented Nov 20, 2018 at 14:39
  • Try grep -Poz 'a|b|c|(x|y|z).*\R.*' file Commented Nov 20, 2018 at 14:59

2 Answers

1

This Python script should be able to do the job, and it allows for some ad-hoc customization that would be hard to fit into a dense grep command:

my_grep.py

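import re
import sys

# Usage: python my_grep.py PLAIN_PATTERN CONTEXT_PATTERN FILE
first = re.compile(sys.argv[1])   # lines printed on their own
second = re.compile(sys.argv[2])  # lines printed together with the following line
with open(sys.argv[3]) as f:
  content = f.readlines()

for idx in range(len(content)):
  line = content[idx]
  if second.search(line):
    # end="" because each line already carries its own newline.
    print(line, end="")
    # Print the following line too, if there is one.
    if idx + 1 < len(content):
      print(content[idx + 1], end="")
  elif first.search(line):
    print(line, end="")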

Assuming your input file is called input_file, you can generate your desired output like this:

 python my_grep.py 'sink_line|sink_file|vuln_name' 'source_name|source_line|source_file' input_file
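If the file is too big to hold in memory comfortably, the same idea can probably be done as a stream. What follows is only a sketch, not a drop-in replacement: the function name grep_with_next and the sample list are made up for illustration. It remembers whether the previous line matched the "context" pattern and yields each line at most once.

```python
import re
from typing import Iterable, Iterator


def grep_with_next(lines: Iterable[str], plain: str, with_next: str) -> Iterator[str]:
    """Yield lines matching `plain`, plus lines matching `with_next` and the line after each."""
    plain_re = re.compile(plain)
    next_re = re.compile(with_next)
    emit_next = False  # True when the previous line matched `with_next`
    for line in lines:
        hit_next = bool(next_re.search(line))
        if emit_next or hit_next or plain_re.search(line):
            yield line
        emit_next = hit_next


# A few lines in the shape of the question's sample (hypothetical test input).
sample = [
    '    "source_line": [\n',
    '        52\n',
    '    ],\n',
    '    "sink_name": "fwrite",\n',
    '    "sink_line": 55,\n',
]

for out in grep_with_next(sample, "sink_line", "source_line"):
    print(out, end="")
```

Because it accepts any iterable of lines, you could pass an open file object instead of the list, so the file is read line by line.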


1 Comment

This works fine and makes it easy for me to modify the output to my liking or assign output to variables. Thanks dude!
0

AWK

awk supports range patterns, which match every line from one that matches pattern1 through the next one that matches pattern2:

awk '/(aaa|bbb|ccc)/,/[xyz]/' data.txt
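To see which lines a range pattern selects, here is a rough Python emulation. It is a sketch of the semantics only; the function name awk_range and the sample lines are invented for illustration. Note that a range keeps going until the end pattern is seen, so it is not the same thing as grep -A1, which always prints exactly one following line.

```python
import re


def awk_range(lines, begin, end):
    """Return the lines an awk range pattern /begin/,/end/ would select."""
    begin_re, end_re = re.compile(begin), re.compile(end)
    in_range = False
    selected = []
    for line in lines:
        # A range opens on a line matching `begin`...
        if not in_range and begin_re.search(line):
            in_range = True
        if in_range:
            selected.append(line)
            # ...and closes on a line matching `end` (possibly the same line).
            if end_re.search(line):
                in_range = False
    return selected


lines = ["one", "aaa start", "middle", "x end", "after"]
print(awk_range(lines, "aaa|bbb|ccc", "[xyz]"))
```

Here the "middle" line is selected even though it matches neither pattern, because it sits inside the open range.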

PYTHON

Python lets you compile regular expressions once for speed, and you can run the script as a single command by putting it in a file.

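import re

pattern1 = re.compile("a|b|c")
pattern2 = re.compile("x|y|z")
saw_pattern1 = False  # did the previous line match pattern1?

# Text mode: str patterns cannot be applied to the bytes that "rb" yields.
with open("data.txt", "r") as fin:
    for line in fin:
        # search() matches anywhere in the line, like grep; match() only at the start.
        if saw_pattern1 and pattern2.search(line):
            print("do stuff")
        saw_pattern1 = bool(pattern1.search(line))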
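Since the sample in the question looks like JSON, another option might be to parse the records instead of matching lines. This is only a sketch under a big assumption: that the file really is a sequence of valid JSON objects written back to back. The helper name iter_records and the second record's values are hypothetical. It uses json.JSONDecoder.raw_decode, which parses one value and reports where it stopped.

```python
import json


def iter_records(text):
    """Yield each top-level JSON object from a string of back-to-back objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it here.
        if text[pos].isspace():
            pos += 1
            continue
        record, pos = decoder.raw_decode(text, pos)
        yield record


# Fields the question asks for.
WANTED = ("source_name", "source_line", "source_file",
          "sink_line", "sink_file", "vuln_name")

# First record follows the question's sample; the second is made up.
text = '''
{"source_name": ["$name"], "source_line": [52], "source_file": ["/somerandomfile"],
 "sink_name": "fwrite", "sink_line": 55, "sink_file": "/somerandomfile",
 "vuln_name": "vuln"}
{"source_name": ["$other"], "source_line": [7], "source_file": ["/another"],
 "sink_name": "echo", "sink_line": 9, "sink_file": "/another",
 "vuln_name": "vuln"}
'''

for record in iter_records(text):
    print({key: record[key] for key in WANTED})
```

This reads the whole text into memory, so for a very large file the line-based approaches may still be the better fit. The upside is that you get real values (lists, integers, strings) rather than raw lines.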
