Regex not finding HTTP header

Question

I have some ('stolen' :) Python code that uses a regex to parse all HTTP headers.

It is like this:

import re

parser = re.compile(r'\s*(?P<key>.+\S)\s*:\s+(?P<value>.+\S)\s*')
header_list = [(key, value) for key, value in parser.findall(http_headers)]

Normally this works great, but the following header is not found:

Access-Control-Allow-Origin: *

I think it might have something to do with the asterisk, but I'm not sure. As I understand it, this part of the regex:

(?P<value>.+\S)

matches and captures any character (.), one or more times (+), followed by one non-whitespace character (\S). Isn't the asterisk covered by that?

Any ideas?

Your regex does not work as you expect it to. Ideally, if you must use a regex, I would write it a different way. But you need to change the .+ to .* in your second group.
– hwnd, Commented Feb 20, 2015 at 0:50
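
The comment does not say what the "different way" would be. One possibility, offered only as a sketch and not as what hwnd had in mind, is to skip the regex and split each line on its first colon. The function name split_headers and the sample response below are made up for illustration.

```python
def split_headers(raw):
    """Return (key, value) pairs for every 'Key: value' line in raw.

    Hypothetical helper: lines without a colon (such as a response
    status line or the blank line ending the headers) are skipped.
    """
    headers = []
    for line in raw.splitlines():
        # partition splits on the FIRST colon only, so values that
        # themselves contain colons (URLs, dates) stay intact.
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue
        headers.append((key.strip(), value.strip()))
    return headers


# Made-up sample response, used only to exercise the helper.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "Access-Control-Allow-Origin: *\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
parsed = split_headers(sample)
print(parsed)
```

Because nothing here depends on the length of the value, one-character values such as * are handled the same as any other. A request line that contains a colon (for example one with a full URL) would be picked up as a header, so that case would need extra handling.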

brandonscript · Accepted Answer · 2015-02-20 00:52:08Z

2

The problem here is actually quite simple. In the value group, .+ requires at least one character, and the \S that follows requires one more non-whitespace character. tl;dr: the group only matches values that are 2 or more characters long, and * is a single character.

Use a * so the group allows 0 or more characters before the required \S:

\s*(?P<key>.+\S)\s*:\s+(?P<value>.*\S)\s*
#                                 ^ * instead of +
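
To check the difference, here is a small sketch that runs both patterns over the same input. The header block is made up for illustration; Content-Length: 0 is included because it is another one-character value that the original pattern skips.

```python
import re

# Made-up header block for illustration.
http_headers = (
    "Content-Type: text/html\r\n"
    "Access-Control-Allow-Origin: *\r\n"
    "Content-Length: 0\r\n"
)

# Pattern from the question: the value needs at least two characters.
original = re.compile(r'\s*(?P<key>.+\S)\s*:\s+(?P<value>.+\S)\s*')
# Pattern from this answer: the value needs at least one character.
fixed = re.compile(r'\s*(?P<key>.+\S)\s*:\s+(?P<value>.*\S)\s*')

original_headers = original.findall(http_headers)
fixed_headers = fixed.findall(http_headers)

print(original_headers)  # only the Content-Type header is found
print(fixed_headers)     # all three headers are found
```

Note that the key group still uses .+\S, so a one-character header name would be missed in the same way.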


1 Comment

Phiplex Over a year ago

Works great! This was my very first question and I'm very happy and a little surprised over the fast response I got from all who answered. Thank you so much!
