I'm trying to write a regex in Python to extract the fields F1 to F8 from a line that looks like this:
LineNumber(digits): F1, F2, F3, ..., F8;
F1 to F8 can have lowercase/uppercase letters and hyphens.
For example:
Header
Description
21: Yes, No, Yes, No, Ye-s, N-o, YES, NO;
Footer
What I've tried so far is
matched = re.match(r'\d+: ([a-zA-Z-]*, ){7}(.*);', line)
which matches lines with the above format. However, when I call matched.groups() to print the matched fields, I only get F7 and F8, while the expected output is a list containing F1 to F7 plus F8.
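Here is a minimal reproduction of what I see, using the sample line from the example above (the variable name groups is only for illustration):

```python
import re

# Sample data line taken from the example above.
line = "21: Yes, No, Yes, No, Ye-s, N-o, YES, NO;"

matched = re.match(r'\d+: ([a-zA-Z-]*, ){7}(.*);', line)

# Only two groups come back: the last repetition of the first group
# (F7, still carrying its trailing ", ") and the second group (F8).
groups = matched.groups()
print(groups)  # ('YES, ', 'NO')
```

So the first group appears to keep only its final repetition, and it still includes the comma and space.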
I have a few questions regarding this regex:
I guess the groups() method returns the fields that were grouped in the regex using (...). Why don't I get F1 to F6 in the output, even though they are grouped using (...) and have matched the regex?
What is a better regex I can write to exclude ", " from F1 to F7? (A short explanation of the suggested regex is much appreciated.)
line.split(': ')[1].