0

I am very new to both Python and Stack Overflow.

I want to extract the destination port:

2629  >  0 [SYN] Seq=0 Win=512 Len=100
0  >  2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
0  >  2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0

I want to retrieve the destination port from every line ('0', '2629', '2633') using a Python regex and ignore the rest. The destination port is the number that appears after '>' and before '['.

So far I have:

re.findall(r"\d\d\d\d\d|\d\d\d\d|\d\d\d|\d\d|\d", str)

but this is too generic: it matches every number on the line, not only the destination port. What is the best regex for this scenario?

2
  • You're trying to parse the output of some program. Why not do the packet capture in Python directly? E.g. stackoverflow.com/questions/4948043/… Commented Nov 23, 2019 at 1:26
  • 1
    If you have the line as a string, split it on whitespace and take the third element: line.split()[2] Commented Nov 23, 2019 at 1:33

2 Answers

1

You could use the string split method for this specific case. A quick implementation would be:

dest_ports = []
lines = [
    "2629  >  0 [SYN] Seq=0 Win=512 Len=100", 
    "0  >  2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0", 
    "0  >  2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0"
]

for line in lines:
  dest_ports.append(line.split('>  ')[1].split(' [')[0])

Which would yield the answer:

dest_ports = ['0', '2629', '2633']
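
The split above depends on the exact spacing around '>' and before '['. As a rough sketch of a more tolerant variant, splitting on any run of whitespace and taking the token after '>' avoids that. The helper name dest_port is hypothetical, and the code assumes every line has the form 'src > dst [FLAGS] ...':

```python
def dest_port(line):
    """Return the destination port (the token after '>') as a string.

    Hypothetical helper; assumes the line looks like 'src > dst [FLAGS] ...'.
    """
    tokens = line.split()  # no argument: split on runs of whitespace
    return tokens[tokens.index('>') + 1]


lines = [
    "2629  >  0 [SYN] Seq=0 Win=512 Len=100",
    "0  >  2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0",
    "0  >  2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0",
]

dest_ports = [dest_port(line) for line in lines]
print(dest_ports)  # ['0', '2629', '2633']
```

A line without a '>' token would raise ValueError here, so input that may contain other kinds of lines would need a check first.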

0

You could use a regex like this:

import io
import re

# Sample capture lines wrapped in a file-like object
dff = io.StringIO("""2629  >  0 [SYN] Seq=0 Win=512 Len=100
0  >  2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
0  >  2622  [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
0  >  2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0""")

for line in dff:
    # Group 1 is the source port plus '>', group 2 is the destination port
    print(re.search(r'(^\d+\s+>\s+)(\d+)', line).groups()[1])
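
One thing to watch with the loop above: re.search returns None for a line that does not match, so calling .groups() on it would raise AttributeError. A minimal sketch that skips such lines, assuming the same 'src > dst' layout (the names PORT_RE and extract_dest_ports are made up for illustration):

```python
import re

# Source port, '>', then the destination port captured as a named group.
PORT_RE = re.compile(r'^\s*\d+\s*>\s*(?P<dst>\d+)')


def extract_dest_ports(text):
    """Collect destination ports as ints, skipping lines that do not match."""
    ports = []
    for line in text.splitlines():
        match = PORT_RE.search(line)
        if match:  # None when the line has no 'src > dst' prefix
            ports.append(int(match.group('dst')))
    return ports


sample = """2629  >  0 [SYN] Seq=0 Win=512 Len=100
0  >  2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
not a packet line
0  >  2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0"""

print(extract_dest_ports(sample))  # [0, 2629, 2633]
```

Converting to int is a choice made for this sketch; keep match.group('dst') as is if you want the ports as strings.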
