
I need to extract part of a command's output. I want to use a regex to get one string out of the whole output, specifically the text between two separators.

This is the output of a shell command in which I print the string __SEPARATOR__; the first occurrence has single quotes and the last has no quotes:

ls -l && echo '__SEPARATOR__'\r\ntotal 36\r\ndrwxr-xr-x 3 VMlinux2 VMlinux2 4096 Sep 27 09:26 \x1b[0m\x1b[01;34mDesktop\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mDocuments\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mDownloads\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mMusic\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mPictures\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mPublic\x1b[0m\r\ndrwxr-xr-x 3 VMlinux2 VMlinux2 4096 Sep 26 13:11 \x1b[01;34msnap\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mTemplates\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mVideos\x1b[0m\r\n__SEPARATOR__\r\nVMlinux2@ubuntu:~$ 

My regex:

'__SEPARATOR__'(.*)__SEPARATOR__

So I'm trying to get the string between those separators:

regex = r"'__SEPARATOR__'(.*)__SEPARATOR__"
text = re.search(regex, output).group(1)
print(text)

But I got an error:

AttributeError: 'NoneType' object has no attribute 'group'

I tried with simple texts like:

I'm want to get the '__SEPARATOR__' middle text __SEPARATOR__ from this text

And that works well. I also tried removing the line breaks and the other special characters, but I get the same error.

What am I doing wrong? Or which approach can I take for this issue?

  • That's the command output? It looks more like the command itself. Commented Oct 11, 2019 at 23:50
  • Have you tried verifying that output is exactly what you think it is? Can you show the code that sets it? Commented Oct 11, 2019 at 23:50
  • Your regexp has two underscores before and after SEPARATOR, but the output only has a single underscore. Commented Oct 11, 2019 at 23:51
  • Yes, sorry, the output and the regex both have two underscores; my bad editing. Commented Oct 11, 2019 at 23:57
  • But I still get the same error. Commented Oct 11, 2019 at 23:58

1 Answer

Basically, the problem is that by default the "." symbol does not match a line break ("\n"), so ".*" can never span the multi-line output and re.search returns None. What you need to do is OR it with the possible line breaks you could get. A (somewhat) complete solution would be something like:

>>> regex = r"'__SEPARATOR__'((.|\r|\n|\r\n|\n\r)*)__SEPARATOR__"
>>> mo = re.search(regex, output)
>>> mo.group(1)
'\r\ntotal 36\r\ndrwxr-xr-x 3 VMlinux2 VMlinux2 4096 Sep 27 09:26 \x1b[0m\x1b[01;34mDesktop\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mDocuments\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mDownloads\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mMusic\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mPictures\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mPublic\x1b[0m\r\ndrwxr-xr-x 3 VMlinux2 VMlinux2 4096 Sep 26 13:11 \x1b[01;34msnap\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mTemplates\x1b[0m\r\ndrwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 \x1b[01;34mVideos\x1b[0m\r\n'
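As an alternative to spelling out the line-break variants, the same idea can probably be expressed with the re.DOTALL flag, which makes "." match "\n" as well. The following is a minimal sketch, assuming output is a str; the sample text is a shortened, hypothetical stand-in for the real terminal output, and the name middle is only illustrative.

```python
import re

# Hypothetical, shortened stand-in for the captured terminal output.
output = (
    "ls -l && echo '__SEPARATOR__'\r\n"
    "total 36\r\n"
    "drwxr-xr-x 2 VMlinux2 VMlinux2 4096 Sep 26 13:10 "
    "\x1b[01;34mDocuments\x1b[0m\r\n"
    "__SEPARATOR__\r\n"
    "VMlinux2@ubuntu:~$ "
)

regex = r"'__SEPARATOR__'(.*)__SEPARATOR__"

# Without re.DOTALL, "." stops at "\n", so there is no match and
# calling .group(1) on the result would raise AttributeError.
no_flag = re.search(regex, output)

# With re.DOTALL, "." also matches "\n", so ".*" can span several lines.
match = re.search(regex, output, re.DOTALL)

# Guard against None instead of calling .group() unconditionally.
middle = match.group(1) if match else None
print(repr(middle))
```

Checking the match before calling .group(1) avoids the AttributeError from the question when the separators are missing from the output.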