Efficient way to process substrings from a string

Question

I have the following string (actually with more elements):

string content = @"  {
                  element1: one
                  element2: two
                  element3:three
                   }
                   {
                   element1: uno
                   element2: dos
                   element3:tres
                   }";

and I need to process this string element by element. An element is whatever sits between a { and a }; the example above has two, but there can be more.

I am thinking of using the usual IndexOf to find each { and } and then extracting the substrings one by one.
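
For reference, the IndexOf approach described above could look roughly like the following. This is a minimal sketch, not code from the original post: the `BlockScanner` class and `ExtractBlocks` method are hypothetical names, and it assumes blocks are never nested and braces never appear inside a value.

```csharp
using System;
using System.Collections.Generic;

public static class BlockScanner
{
    // Hypothetical helper: returns the text between each '{' and the next '}'.
    // Assumes no nesting and no braces inside values.
    public static List<string> ExtractBlocks(string content)
    {
        var blocks = new List<string>();
        int pos = 0;

        while (true)
        {
            int open = content.IndexOf('{', pos);
            if (open < 0) break;                    // no more blocks

            int close = content.IndexOf('}', open + 1);
            if (close < 0) break;                   // unterminated block: stop rather than guess

            // Copy only the text strictly between the two braces.
            blocks.Add(content.Substring(open + 1, close - open - 1));
            pos = close + 1;                        // continue scanning after this block
        }

        return blocks;
    }
}
```

Each returned string is one element's body, still containing its line breaks and padding, so it would need to be split and trimmed before use.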

My question is: is there a more efficient way of doing this?

Seems like you should use a JSON parser. Don't reinvent the wheel.
– MakePeaceGreatAgain, Commented Apr 9, 2019 at 7:47
Compared to string operations, JSON deserialization is far easier, particularly when you have nested elements as well.
– MakePeaceGreatAgain, Commented Apr 9, 2019 at 7:50
You want something more efficient, but when anyone suggests the best way you say: "No, I want string-replace"? So you either want an efficient way or you want to learn how to use string manipulation. For the latter, however, I wouldn't suggest using JSON input.
– MakePeaceGreatAgain, Commented Apr 9, 2019 at 7:54
You can always go through the string one char at a time and build your objects; I don't think there can be a more efficient way of doing it (if efficient means fewer resources). Your string also lacks a separator between two internal items, like element1 and element2. I suggest you think about it more, and I would go with JSON if you can reformat the source string.
– peeyush singh, Commented Apr 9, 2019 at 8:14
There are no commas. Is this a line-delimited format? Are the brackets guaranteed to appear on their own lines as well?
– John Wu, Commented Apr 9, 2019 at 8:26

John Wu · Accepted Answer · 2019-04-09 08:46:24Z

Assuming your format doesn't have any more quirks to it (e.g. delimiters or escape sequences), you can parse that string with a bit of LINQ.

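    // Needs: using System; using System.Linq;
    var data = content
        .Replace("}", "")           // closing braces add nothing once we split on '{'
        .Replace("\r\n", "\n")      // normalize Windows line endings
        .Split('{')                 // one chunk per block
        .Where(block => !string.IsNullOrWhiteSpace(block)) // drop the empty chunk before the first '{'
        .Select(block => block
            .Split('\n')
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line.Split(new char[] { ':' }, 2)) // split on the first ':' only
            .ToDictionary(
                fields => fields[0].Trim(),  // key, e.g. "element1"
                fields => fields[1].Trim())) // value, e.g. "one"
        .ToList();

    // Each item in data is a dictionary holding one element's key/value pairs.
    foreach (var element in data)
    {
        foreach (var entry in element)
        {
            Console.WriteLine("{0}={1}", entry.Key, entry.Value);
        }
    }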

Output:

element1=one
element2=two
element3=three
element1=uno
element2=dos
element3=tres

1 Comment

Panagiotis Kanavos Over a year ago

That would be very expensive unless the string was converted to a ReadOnlySpan<char> first. Each string operation in this code generates new temporary strings.
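
One possible reading of this comment, as a hedged sketch rather than the commenter's own code: slice the input as a `ReadOnlySpan<char>` and allocate strings only for the final keys and values. The `SpanParser` class and `Parse` method are hypothetical names. The sketch assumes the same format as the answer (one `key: value` pair per line, no nesting, no escape sequences) and a runtime where the span APIs are available (.NET Core 2.1 or later, or the System.Memory package).

```csharp
using System;
using System.Collections.Generic;

public static class SpanParser
{
    // Hypothetical helper: parses "{ key: value ... }" blocks without
    // creating intermediate strings for blocks or lines.
    public static List<Dictionary<string, string>> Parse(string content)
    {
        var result = new List<Dictionary<string, string>>();
        ReadOnlySpan<char> rest = content.AsSpan();

        while (true)
        {
            int open = rest.IndexOf('{');
            if (open < 0) break;                 // no more blocks
            rest = rest.Slice(open + 1);

            int close = rest.IndexOf('}');
            if (close < 0) break;                // unterminated block: stop

            ReadOnlySpan<char> block = rest.Slice(0, close);
            rest = rest.Slice(close + 1);

            var dict = new Dictionary<string, string>();
            while (!block.IsEmpty)
            {
                // Take one line off the front of the block.
                int newline = block.IndexOf('\n');
                ReadOnlySpan<char> line = newline < 0 ? block : block.Slice(0, newline);
                block = newline < 0 ? ReadOnlySpan<char>.Empty : block.Slice(newline + 1);

                int colon = line.IndexOf(':');
                if (colon < 0) continue;         // blank or malformed line: skip

                // The only allocations: one string per key and per value.
                // Trim() also removes a trailing '\r' from Windows line endings.
                string key = line.Slice(0, colon).Trim().ToString();
                string value = line.Slice(colon + 1).Trim().ToString();
                dict[key] = value;               // last value wins on duplicate keys
            }

            result.Add(dict);
        }

        return result;
    }
}
```

Unlike the `ToDictionary` call in the answer, which throws on a duplicate key, this sketch overwrites the earlier value. Whether the saved allocations matter depends on input size, so it is worth measuring before switching.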
