2

I want to parse the namespaces in a C# .cs file. For example, for using System.Collections.Generic I want to capture the groups (System) (Collections) (Generic).

So far I have written this regular expression: "[ .]?(\w*?)(?=[.;])"

but it also matches every other word that fits this pattern.

So I have to add a condition that the line begins with "using".

I tried "using[ .]?(\w*?)(?=[.;])", but it captures only the first namespace.
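A minimal Python sketch of why this happens. Python's re module is an assumption here (the question does not name a regex flavour), and the names sample and first_only are illustrative. The literal using has to be matched again before every captured word, and it occurs only once per line, so only the word right after it is ever captured.

```python
import re

# A few lines from the input text in the question.
sample = """using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string someString;
Console.ReadLine();
"""

# The second pattern from the question: every match must begin with "using".
first_only = re.findall(r"using[ .]?(\w*?)(?=[.;])", sample)
print(first_only)  # one captured word per "using" line
```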

Here is the input text:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
using System.Text.RegularExpressions;

string someString;
Console.ReadLine();

Update:

I'm sorry I didn't mention this at first, but there is one more thing: the same problem occurs with methods. For example, Console.ReadLine() shouldn't return ReadLine. The same goes for every dot that is not in a using line.

8
  • Can't you use two regexes? Or not using regex at all? Commented Mar 10, 2019 at 8:38
  • Using regex is obligatory. Commented Mar 10, 2019 at 8:39
  • What about using 2 regexes? One for getting everything after using, then another for getting the subnamespaces. Commented Mar 10, 2019 at 8:40
  • No. I need to write it in one regex. Commented Mar 10, 2019 at 8:43
  • And which flavour of regex? Commented Mar 10, 2019 at 8:44

4 Answers

2

To match a repeating pattern starting from a specific point, you will find the \G token helpful:

(?m)(?:^using +|\G(?!^)\.)\K\w+

Regex breakdown:

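  • (?m) Enable multiline mode
  • (?: Start of non-capturing group
    • ^using + Match using at the start of a line, followed by one or more spaces
  • | Or
    • \G(?!^) Continue from where the previous match ended, but not at the start of a line (so \G cannot match at the very beginning of the input)
    • \. Match a period
  • ) End of non-capturing group
  • \K Reset the start of the reported match, discarding everything matched so far
  • \w+ Match a sequence of word characters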
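To illustrate what \G contributes, here is a rough Python sketch. Python's standard re module supports neither \G nor \K, so this is a hand-written emulation of the chaining and not the one-regex solution itself. Pattern.match(text, pos) only succeeds when the match starts exactly at pos, which plays the role of \G. The names HEAD, WORD, DOT and namespace_parts are illustrative.

```python
import re

HEAD = re.compile(r"^using +", re.MULTILINE)  # first alternative: ^using +
WORD = re.compile(r"\w+")                     # the \w+ that is reported
DOT = re.compile(r"\.")                       # second alternative: \. right after the previous match

def namespace_parts(text):
    """Collect every namespace part on lines that start with 'using'."""
    parts = []
    for head in HEAD.finditer(text):
        pos = head.end()
        while True:
            word = WORD.match(text, pos)        # must start exactly at pos, like \G
            if word is None:
                break
            parts.append(word.group())
            dot = DOT.match(text, word.end())   # chain only through an adjacent period
            if dot is None:
                break
            pos = dot.end()
    return parts

sample = """using System;
using System.Collections.Generic;

string someString;
Console.ReadLine();
"""

print(namespace_parts(sample))
```

Because the chain can only begin at a using line, the period in Console.ReadLine() never starts one, which is the behaviour the update to the question asks for.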

3 Comments

Super nice answer, I didn't know \G and \K. So powerful, I wish I could vote more than once. By the way, wouldn't (?m)(?:^using +|\G\.)\K\w+ work fine as well?
Thank you. No, omitting the lookahead will cause the engine to match .someString at the very beginning of the input string. See here: regex101.com/r/feOUrE/1
I see, I didn't take this possibility into account! Thanks
0

You can use this: (using |[.])(\w+)

1 Comment

It potentially matches much more than that. Look here: regex101.com/r/LHBjNo/5
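A small Python sketch of the over-matching described in this comment. Python's re module and the names sample and captured are assumptions made for illustration. Because the second alternative accepts any period, a method call is captured as well.

```python
import re

sample = """using System.Collections.Generic;

string someString;
Console.ReadLine();
"""

# Group 2 of each match holds the captured word.
captured = [word for _, word in re.findall(r"(using |[.])(\w+)", sample)]
print(captured)  # includes ReadLine, which is not a namespace part
```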
0

You can use the regex:

(?<=^using\s)((?:\w+)(?:[.](?:\w+))*)(?=;)

input:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Text.RegularExpressions;

string something;
abc.something;
Console.WriteLine(".test.");

matches:

System
System.Collections.Generic
System.Linq
System.Text
System.IO
System.Text.RegularExpressions

Then use the explode function on each match to extract the individual modules:

$submodules= explode(".", $match);

demo:

https://regex101.com/r/p0K3dN/4/

Code sample:

$input="

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Text.RegularExpressions;

string something;
abc.something;
Console.WriteLine('.test.');

";

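// The /m flag lets ^ match at the start of every line, so only lines beginning with "using" qualify.
preg_match_all('/(?<=^using\s)\w+(?:[.]\w+)*(?=;)/m', $input, $matches);

// $matches[0] holds the full matches; split each one on "." to get its parts.
foreach ($matches[0] as $module) {
    print_r(explode(".", $module));
}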

Result:

Array
(
    [0] => System
)
Array
(
    [0] => System
    [1] => Collections
    [2] => Generic
)
Array
(
    [0] => System
    [1] => Linq
)
Array
(
    [0] => System
    [1] => Text
)
Array
(
    [0] => System
    [1] => IO
)
Array
(
    [0] => System
    [1] => Text
    [2] => RegularExpressions
)
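The same two-step idea (match the whole namespace, then split it) can be sketched in Python. This is an illustration and not part of the original answer: it uses a capturing group in place of the lookarounds, [ \t]+ in place of \s so the match cannot run across a line break, and the names sample, full_names and parts are made up for the example.

```python
import re

sample = """using System;
using System.Collections.Generic;

string something;
abc.something;
Console.WriteLine(".test.");
"""

# re.MULTILINE lets ^ match at each line start, so only "using" lines qualify.
full_names = re.findall(r"^using[ \t]+(\w+(?:\.\w+)*);", sample, re.MULTILINE)

# Second step: split every dotted name into its parts.
parts = [name.split(".") for name in full_names]
print(parts)
```

As the comments below point out, this still takes a second step outside the regex, so it does not meet the single-regex requirement.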

4 Comments

This is the same as @MJNBelief's answer.
@revo: Nice catch, I answered this question way too quickly. I have edited my answer now.
It's not working: for using System.Collections.Generic; it doesn't capture Collections.
I'm sorry, but I need the namespaces to be captured by the regular expression itself; you just get all the namespaces in one group.
0

Updated: The following regex

(?<=using\s)(\w*(?=[.;]))|\G(\w*(?=[;.]))

will give you the desired result. Its parts are:

(?<=using\s) is a positive lookbehind for using followed by one whitespace character (\s).

(\w*(?=[.;])) matches any word characters before . or ;

\G asserts position at the end of the previous match.

(\w*(?=[;.])) again matches any word characters before . or ;

3 Comments

Yeah, I added this note to my question.
@A.Ryshkov Updated
@revo Thanks, Updated.
