
I'm using Python 2.7, BeautifulSoup4, regex, and requests on Windows 7.

I've scraped some code from a website and I am having problems parsing and extracting the bits I want and storing them in a dictionary. What I'm after is text that is presented as follows in the code:

@CAD_DTA\">I WANT THIS@G@H@CAD_LBL

There are about 50-60 short strings I want to extract and store, and they are all preceded by @CAD_DTA\"> and followed by @G@H@CAD_LBL in the code. These strings are all of variable length.

I've tried:

re.search('@CAD_DTA\">(.+?)@G@H@CAD_LBL',result.text)

where result is the output of s.post(url, data = cookie, headers = {'referer': my_referer})

I've also tried passing str(result.text), but re.search keeps returning None. It's odd because if I literally copy and paste the content of result.text into a string and pass that through re.search, it works fine.

I've also tried using re.search('@CAD_DTA">(.+?)@G@H@CAD_LBL',result.text) in case the \ is being treated as an escape character, but I'm not sure.

Can someone point me in the right direction?

  • Is there a literal backslash before the double quote? re.search(r'@CAD_DTA\\">(.+?)@G@H@CAD_LBL',result.text) should work then. Commented Jun 22, 2015 at 16:52
  • That works! Thanks. I had tried the double backslash but without the 'r'. Is there any way to get the location where the string was found, so I can then search again starting at that position? Commented Jun 22, 2015 at 17:10

1 Answer


To match a literal backslash in the text, escape it with a second backslash inside a raw string literal, so that the regex engine receives \\ and treats it as one backslash character, e.g.:

re.search(r'@CAD_DTA\\">(.+?)@G@H@CAD_LBL',result.text)
          ^          ^
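A minimal sketch of why the earlier attempts return None, assuming the response text really contains one literal backslash before the double quote (the `text` value here is made up for illustration):

```python
import re

# Illustrative input: the '\\' in this literal is ONE backslash in the actual string.
text = 'Some text here...@CAD_DTA\\">I WANT THIS@G@H@CAD_LBL'

# Non-raw '\"' is just '"', so this pattern contains no backslash at all.
no_backslash = re.search('@CAD_DTA\">(.+?)@G@H@CAD_LBL', text)

# Non-raw '\\' becomes a single backslash in the pattern; the regex engine
# then reads \" as an escaped quote, which again matches only '"'.
single = re.search('@CAD_DTA\\">(.+?)@G@H@CAD_LBL', text)

# Raw r'\\' reaches the regex engine as two backslashes, i.e. one literal backslash.
raw = re.search(r'@CAD_DTA\\">(.+?)@G@H@CAD_LBL', text)

print(no_backslash is None)
print(single is None)
print(raw.group(1))
```

This would also explain why pasting the text into a plain string literal appeared to work: in a non-raw literal, \" collapses to a bare quote, so the backslash is gone from both the pattern and the pasted text.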

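To get the index of the match, use start([group]) of the match object (re.MatchObject in Python 2.7). obj.start(1) is the offset at which group 1 begins, and obj.end() is the offset just past the whole match, which is where a follow-up search can resume.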

IDEONE demo:

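import re
# The sample string contains one literal backslash before the double quote.
obj = re.search(r'@CAD_DTA\\">(.+?)@G@H@CAD_LBL', 'Some text here...@CAD_DTA\\">I WANT THIS@G@H@CAD_LBL')
print obj.start(1)  # offset where the captured text begins
print obj.group(1)  # the captured text: I WANT THIS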
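For the follow-up about searching again from the position of the previous match, one option is to let re.finditer walk through all matches instead of restarting re.search by hand. The sketch below is only an illustration: `extract_fields` and the `sample` string are hypothetical names and data, and keying the dictionary by start offset is an assumption, since the question does not say what the keys should be.

```python
import re

# Raw string: \\ in the pattern matches one literal backslash in the text.
PATTERN = re.compile(r'@CAD_DTA\\">(.+?)@G@H@CAD_LBL')


def extract_fields(text):
    """Return a dict mapping each captured string's start offset to the string.

    Hypothetical helper: the choice of offsets as keys is an assumption.
    """
    found = {}
    for match in PATTERN.finditer(text):
        # start(1) is the offset of the captured group, group(1) is its text.
        found[match.start(1)] = match.group(1)
    return found


# Made-up sample with two fields, each preceded by one literal backslash + quote.
sample = 'x@CAD_DTA\\">FIRST@G@H@CAD_LBL y@CAD_DTA\\">SECOND@G@H@CAD_LBL'
fields = extract_fields(sample)
print(sorted(fields.values()))
```

If a manual loop is preferred, a compiled pattern's search method also accepts a start position, so PATTERN.search(text, previous_match.end()) resumes after the previous match.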

1 Comment

I am also happy to help someone using appropriate tools for concrete tasks :)
