Regex match and split string in C#

Question

I have some ASCII documents in the following format:

[section heading]
paragraphs......

[section heading]
paragraphs......
...

Note: the heading text is always enclosed in a specific pattern (e.g. [ ] in the example above).

I want to split the file into separate sections (each with a heading and the content).

What would be the most efficient way to parse the above document?

Using Regex.Match() I can extract the headings, but not the subsequent text content.

Using Regex.Split() I can grab the content, but not the related headings.

Is it possible to combine these two Regex methods to parse the document? Are there better ways to achieve the same result?

vks · Accepted Answer · 2015-08-19 18:33:54Z


(\[[^\]]*\])\n([\s\S]*?)(?=\n\[|$)

You can try this. Grab group 1 (the heading) and group 2 (the content). See the demo:

https://regex101.com/r/gU4aG0/1
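For use from C#, the pattern above could be wrapped roughly as follows. This is a minimal sketch, not part of the original answer: the class name BracketSectionParser and the method Parse are made up for illustration, and the \r?\n in place of \n is an added assumption so that files with Windows line endings also match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class BracketSectionParser
{
    // The pattern from the answer as a verbatim string, so backslashes
    // reach the regex engine unchanged. \r?\n is an added assumption.
    private static readonly Regex SectionPattern = new Regex(
        @"(\[[^\]]*\])\r?\n([\s\S]*?)(?=\r?\n\[|$)");

    public static List<KeyValuePair<string, string>> Parse(string document)
    {
        var sections = new List<KeyValuePair<string, string>>();
        foreach (Match m in SectionPattern.Matches(document))
        {
            sections.Add(new KeyValuePair<string, string>(
                m.Groups[1].Value,           // heading, brackets included
                m.Groups[2].Value.Trim()));  // content, surrounding whitespace removed
        }
        return sections;
    }
}
```

Each pair holds the heading as Key and the section body as Value. Note that any line starting with [ is treated as the start of a new section, so this only suits documents where paragraphs never begin with a bracket.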

answered Aug 19, 2015 at 18:33


Eric Leibenguth · Accepted Answer · 2015-08-19 16:36:44Z


Try this:

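// Requires: using System.Text.RegularExpressions;
// Verbatim string (@"...") so the backslashes reach the regex engine unchanged.
string search = @"\[([\w ]+)\]([^\[]*)";
foreach (Match match in Regex.Matches(yourtext, search))
{
    // Groups[n] is a Group object; .Value gives the matched string.
    string heading = match.Groups[1].Value;  // text between [ and ]
    string text = match.Groups[2].Value;     // everything up to the next [
}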

The regular expression captures both the heading and the paragraph text. Thanks to the capturing groups (the parts in parentheses), you can extract both by iterating over the matches. Note that [\w ]+ only matches headings made of word characters and spaces.
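Capturing groups also bear on the Regex.Split() part of the question: in .NET, text captured by a group in the split pattern is included in the returned array, so the headings are not lost. The following is a minimal sketch of that idea, not code from either answer; the class name SectionSplitter and the method Split are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class SectionSplitter
{
    public static List<KeyValuePair<string, string>> Split(string document)
    {
        // With one capturing group, Regex.Split returns
        // [preamble, heading1, content1, heading2, content2, ...].
        string[] parts = Regex.Split(document, @"\[([^\]]*)\]");

        var sections = new List<KeyValuePair<string, string>>();
        // parts[0] is whatever precedes the first heading, so start at 1.
        for (int i = 1; i + 1 < parts.Length; i += 2)
        {
            sections.Add(new KeyValuePair<string, string>(
                parts[i],               // heading text without the brackets
                parts[i + 1].Trim()));  // content following that heading
        }
        return sections;
    }
}
```

For example, SectionSplitter.Split("[Intro]\nHello\n\n[Usage]\nRun it\n") yields the pairs ("Intro", "Hello") and ("Usage", "Run it"). Like the pattern above, this treats any bracketed text as a heading, including brackets inside a paragraph.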

edited Aug 19, 2015 at 16:36

answered Aug 19, 2015 at 16:31
