How do I access named capturing groups in a .NET Regex?

Question

I'm having a hard time finding a good resource that explains how to use Named Capturing Groups in C#. This is the code that I have so far:

string page = Encoding.ASCII.GetString(bytePage);
Regex qariRegex = new Regex("<td><a href=\"(?<link>.*?)\">(?<name>.*?)</a></td>");
MatchCollection mc = qariRegex.Matches(page);
CaptureCollection cc = mc[0].Captures;
MessageBox.Show(cc[0].ToString());

However this always just shows the full line:

<td><a href="/path/to/file">Name of File</a></td>

I have experimented with several other "methods" that I've found on various websites but I keep getting the same result.

How can I access the named capturing groups that are specified in my regex?

Backreference should be in the format (?<link>.*) and not (?<link>.*?)
– Rashmi Pandit, Commented May 25, 2009 at 14:05
FYI: If you are trying to store a named capture group inside an XML file then the <> will break it. You can use (?'link'.*) instead in this case. Not entirely relevant to this question but I landed here from a Google search of ".net named capture groups" so I'm sure other people are as well...
– rtpHarry, Commented Apr 13, 2011 at 11:45
StackOverflow link with nice example: stackoverflow.com/a/1381163/463206 Also, @rtpHarry, no, the <> will not break it. I was able to use the myRegex.GetGroupNames() collection as the XML element names.
– radarbob, Commented Jun 29, 2012 at 17:23

Paolo Tedesco · Accepted Answer · 2016-01-19 17:49:00Z

298

Use the group collection of the Match object, indexing it with the capturing group name, e.g.

foreach (Match m in mc){
    MessageBox.Show(m.Groups["link"].Value);
}
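For the regex in the question, a minimal sketch of reading both named groups might look like the following. The class name NamedGroupDemo, the helper ExtractLinks, and the sample HTML are made up for illustration; only Regex, Match.Groups, and Group.Value come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Same pattern as in the question, written as a verbatim string.
    private static readonly Regex QariRegex =
        new Regex(@"<td><a href=""(?<link>.*?)"">(?<name>.*?)</a></td>");

    // Hypothetical helper: returns one (link, name) pair per match.
    public static List<KeyValuePair<string, string>> ExtractLinks(string page)
    {
        var result = new List<KeyValuePair<string, string>>();
        // Declare the loop variable as Match; with var it would be typed as object.
        foreach (Match m in QariRegex.Matches(page))
        {
            result.Add(new KeyValuePair<string, string>(
                m.Groups["link"].Value,
                m.Groups["name"].Value));
        }
        return result;
    }

    public static void Main()
    {
        string page = "<td><a href=\"/path/to/file\">Name of File</a></td>";
        foreach (var pair in ExtractLinks(page))
        {
            Console.WriteLine(pair.Key + " -> " + pair.Value);
        }
    }
}
```

Each Match carries its own Groups collection, so the same two lookups work for every row in the page.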

edited Jan 19, 2016 at 17:49

user3638471

answered May 25, 2009 at 12:18

Paolo Tedesco


1 Comment

Thomas Weller Over a year ago

Don't use var m, since that would be an object.

BenSwayne · Accepted Answer · 2012-04-29 19:37:25Z

127

You specify the named capture group string by passing it to the indexer of the Groups property of a resulting Match object.

Here is a small example:

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        String sample = "hello-world-";
        Regex regex = new Regex("-(?<test>[^-]*)-");

        Match match = regex.Match(sample);

        if (match.Success)
        {
            Console.WriteLine(match.Groups["test"].Value);
        }
    }
}
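One detail worth knowing when using the indexer: asking Groups for a name that is not in the pattern, or for a group that did not take part in the match, does not throw. It returns a Group whose Success is false and whose Value is the empty string. A small sketch (the optional "suffix" group and the GetGroupOrNull helper are illustrative additions, not part of the example above):

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupSuccessDemo
{
    // Hypothetical helper: returns the group's text, or null if the group
    // is unknown or did not participate in the match.
    public static string GetGroupOrNull(Match match, string groupName)
    {
        Group g = match.Groups[groupName];
        return g.Success ? g.Value : null;
    }

    public static void Main()
    {
        // "suffix" is optional, so it may be left unmatched.
        Regex regex = new Regex(@"-(?<test>[^-]*)-(?<suffix>\d+)?");
        Match match = regex.Match("hello-world-");

        Console.WriteLine(GetGroupOrNull(match, "test") ?? "(none)");    // world
        Console.WriteLine(GetGroupOrNull(match, "suffix") ?? "(none)");  // (none)
        Console.WriteLine(GetGroupOrNull(match, "missing") ?? "(none)"); // (none)
    }
}
```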

edited Apr 29, 2012 at 19:37

BenSwayne


answered May 25, 2009 at 12:18

Andrew Hare


Rashmi Pandit · Accepted Answer · 2009-05-25 14:01:39Z

11

The following code sample will match the pattern even when there are space characters in between, i.e.:

<td><a href='/path/to/file'>Name of File</a></td>

as well as:

<td> <a      href='/path/to/file' >Name of File</a>  </td>

The method returns true or false, depending on whether the input htmlTd string matches the pattern. If it matches, the out parameters contain the link and name respectively.

/// <summary>
/// Assigns proper values to link and name, if the htmlTd matches the pattern
/// </summary>
/// <returns>true if success, false otherwise</returns>
public static bool TryGetHrefDetails(string htmlTd, out string link, out string name)
{
    link = null;
    name = null;

    string pattern = "<td>\\s*<a\\s*href\\s*=\\s*(?:\"(?<link>[^\"]*)\"|(?<link>\\S+))\\s*>(?<name>.*)\\s*</a>\\s*</td>";

    if (Regex.IsMatch(htmlTd, pattern))
    {
        Regex r = new Regex(pattern,  RegexOptions.IgnoreCase | RegexOptions.Compiled);
        link = r.Match(htmlTd).Result("${link}");
        name = r.Match(htmlTd).Result("${name}");
        return true;
    }
    else
        return false;
}

I have tested this and it works correctly.

answered May 25, 2009 at 14:01

Rashmi Pandit


2 Comments

Magnus Smith Over a year ago

Thanks for reminding me that curly braces can access the groups. I prefer to stick to ${1} to keep things even simpler.

Mariano Desanze Over a year ago

This completely answers the question, but has some problems that are too long to explain in here, but I explained and corrected those in my answer below

tinamou · Accepted Answer · 2017-07-28 10:24:26Z

3

Additionally, if you have a use case where you need the group names before executing a search on the Regex object, you can use:

var regex = new Regex(pattern); // initialized somewhere
// ...
var groupNames = regex.GetGroupNames();
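As an illustration, here is roughly what that call returns for the pattern from the question. Note that the array also contains "0", the implicit group for the whole match. The class and method names are placeholders.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupNamesDemo
{
    public static string[] NamesFor(string pattern)
    {
        // No input is matched here; the names come from the pattern alone.
        return new Regex(pattern).GetGroupNames();
    }

    public static void Main()
    {
        string pattern = @"<td><a href=""(?<link>.*?)"">(?<name>.*?)</a></td>";
        Console.WriteLine(string.Join(", ", NamesFor(pattern)));
        // prints: 0, link, name
    }
}
```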

answered Jul 28, 2017 at 10:24

tinamou


Mariano Desanze · Accepted Answer · 2019-10-29 18:26:31Z

This answer improves on Rashmi Pandit's answer, which is in a way better than the rest because it seems to completely resolve the exact problem detailed in the question.

The bad part is that it is inefficient and does not use the IgnoreCase option consistently.

It is inefficient because a regex can be expensive to construct and execute, and in that answer it could have been constructed just once (calling Regex.IsMatch was just constructing the regex again behind the scenes). The Match method could also have been called only once, with its result stored in a variable, and then link and name could call Result on that variable.

And the IgnoreCase option was only used in the Match part but not in the Regex.IsMatch part.

I also moved the Regex definition outside the method in order to construct it just once (I think that is the sensible approach, given that we are compiling it with the RegexOptions.Compiled option).

private static Regex hrefRegex = new Regex("<td>\\s*<a\\s*href\\s*=\\s*(?:\"(?<link>[^\"]*)\"|(?<link>\\S+))\\s*>(?<name>.*)\\s*</a>\\s*</td>",  RegexOptions.IgnoreCase | RegexOptions.Compiled);

public static bool TryGetHrefDetails(string htmlTd, out string link, out string name)
{
    var matches = hrefRegex.Match(htmlTd);
    if (matches.Success)
    {
        link = matches.Result("${link}");
        name = matches.Result("${name}");
        return true;
    }
    else
    {
        link = null;
        name = null;
        return false;
    }
}
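The same method can also read the values through Groups[...] rather than Result(...), which avoids parsing a replacement pattern on each call. This is only a sketch of that variant, not the code from either answer; the class name HrefParser is invented here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HrefParser
{
    // Same pattern as above, written as a verbatim string.
    private static readonly Regex HrefRegex = new Regex(
        @"<td>\s*<a\s*href\s*=\s*(?:""(?<link>[^""]*)""|(?<link>\S+))\s*>(?<name>.*)\s*</a>\s*</td>",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static bool TryGetHrefDetails(string htmlTd, out string link, out string name)
    {
        Match match = HrefRegex.Match(htmlTd);
        // Both out parameters stay null when the input does not match.
        link = match.Success ? match.Groups["link"].Value : null;
        name = match.Success ? match.Groups["name"].Value : null;
        return match.Success;
    }

    public static void Main()
    {
        string html = "<td> <a      href=\"/path/to/file\" >Name of File</a>  </td>";
        if (TryGetHrefDetails(html, out string link, out string name))
        {
            Console.WriteLine(link); // /path/to/file
            Console.WriteLine(name); // Name of File
        }
    }
}
```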

Tore Aurstad · Accepted Answer · 2024-07-27 21:04:33Z

A quick guide for regexes in .NET is available here:

https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference

Regex matches are accessed via groups and captures.

Below is an example of an extension method that returns all capture values inside the matches.

public static class MatchCollectionExtensions{
    
    public static IEnumerable<string> GetCapturedValues(this MatchCollection matches){
        foreach (Match match in matches){
            foreach (Group group in match.Groups){
                foreach (Capture capture in group.Captures){
                    yield return capture?.Value;
                }
            }
        }
    }
    
}

Also, LINQPad is a great resource for learning C#.

Its Dump method shows the structure of objects.

Below is an example based on the sample code from the question.

string page = """
<td><a href="/path/to/file">Name of File</a></td> 
""";
Regex qariRegex = new Regex("<td><a href=\"(?<link>.*?)\">(?<name>.*?)</a></td>");
MatchCollection mc = qariRegex.Matches(page);
CaptureCollection cc = mc[0].Captures;

mc.Dump();

//mc[0].Groups[1].Captures[0].Value.Dump();
//mc[0].Groups[2].Captures[0].Value.Dump();

foreach (var element in mc.GetCapturedValues())
{
    Console.WriteLine(element);
}

Running your regex through the extension method and writing each element with Console.WriteLine gave the following output:

<td><a href="/path/to/file">Name of File</a></td>
/path/to/file
Name of File

Adjusting the extension method to build a Dictionary instead, with the group name as key and the capture values as values, should be fairly straightforward: for example, create each key by concatenating the group name with the capture index, and use the capture value as the value of the dictionary entry.
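A rough sketch of that dictionary idea, for a single Match, could look like this. The helper name ToCaptureDictionary and the "name[index]" key format are assumptions made for the example; group names are taken from Regex.GetGroupNames().

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDictionaryDemo
{
    // Hypothetical helper: keys look like "link[0]", i.e. group name plus capture index.
    public static Dictionary<string, string> ToCaptureDictionary(Regex regex, Match match)
    {
        var result = new Dictionary<string, string>();
        foreach (string groupName in regex.GetGroupNames())
        {
            Group group = match.Groups[groupName];
            for (int i = 0; i < group.Captures.Count; i++)
            {
                result[groupName + "[" + i + "]"] = group.Captures[i].Value;
            }
        }
        return result;
    }

    public static void Main()
    {
        Regex regex = new Regex(@"<td><a href=""(?<link>.*?)"">(?<name>.*?)</a></td>");
        Match match = regex.Match("<td><a href=\"/path/to/file\">Name of File</a></td>");
        foreach (KeyValuePair<string, string> entry in ToCaptureDictionary(regex, match))
        {
            Console.WriteLine(entry.Key + " = " + entry.Value);
        }
    }
}
```

To cover a whole MatchCollection, the key would also need the match index, since every match reuses the same group names.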

No, I wrote the extension method myself inside LINQPad. Yes, it only outputs the captured values, if any, and ignores their names. But as my answer says, you could build a dictionary instead and use the group name as the key, concatenating the group name with the capture index inside the group.

Criselmof · Accepted Answer · 2025-05-12 12:07:54Z

I found this question when I wanted to iterate over only the explicit group names but not the numbers. I used the group names for replacement data.

For specific group names, Paolo Tedesco and Andrew Hare already gave the solution:

Match match = regexPattern.Match(page);
Group capturedLinkGroup = match.Groups["link"];
…

System.Text.RegularExpressions.Match.Groups

and Rashmi Pandit gave the solution

Match match = regexPattern.Match(page);
string capturedLink = match.Result("${link}");

However, for iterating over the names only (not the numbers), Regex.GetGroupNames() alone is not enough: it returns the names (and number strings) of all capturing groups.

This is the case even when using RegexOptions.ExplicitCapture, which keeps only the named groups plus the whole match as group "0".

Therefore, I used

// Requires: using System.Linq;
string[] groupNames = regexPattern.GetGroupNames()
        .Where(name => !int.TryParse(name, out _)) // skip "0", "1", ...
        .ToArray();

(for defensive programming)

but shorter code, which relies on RegexOptions.ExplicitCapture, can look like this as well:

Regex regexPattern = new Regex(…, RegexOptions.ExplicitCapture);
…
string[] groupNames = regexPattern.GetGroupNames()[1 ..];
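To make the difference concrete, here is a small sketch that runs both approaches on a made-up pattern with one unnamed and one named group. The pattern and the NamedGroupsOnly helper are assumptions for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ExplicitGroupNamesDemo
{
    // Drops every name that parses as an integer, such as "0" or "1".
    public static string[] NamedGroupsOnly(Regex regex)
    {
        return regex.GetGroupNames()
            .Where(name => !int.TryParse(name, out _))
            .ToArray();
    }

    public static void Main()
    {
        // Illustrative pattern with one unnamed and one named group.
        Regex mixed = new Regex(@"(\d+)-(?<word>\w+)");
        Console.WriteLine(string.Join(", ", mixed.GetGroupNames()));   // 0, 1, word
        Console.WriteLine(string.Join(", ", NamedGroupsOnly(mixed)));  // word

        // With ExplicitCapture the unnamed group no longer captures.
        Regex explicitOnly = new Regex(@"(\d+)-(?<word>\w+)", RegexOptions.ExplicitCapture);
        Console.WriteLine(string.Join(", ", explicitOnly.GetGroupNames())); // 0, word
    }
}
```

The filtering version works regardless of the options the Regex was built with, while the slicing version only gives the right result when ExplicitCapture is set.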
