regexp parsing in matlab

Question

I have a cell array 3x1 like this:

name1 = text1
name2 = text2
name3 = text3

and I want to parse it into separate 1x2 cells, for example name1, text1. Later I want to treat text1 as a string to compare with other strings. How can I do this? I am trying regexp with tokens, but I cannot write a proper expression for it. If someone can help me with this, I will be grateful!

(\w+)\s=\s(\w+) <- Do you need something like this?

– oopbase, Commented Aug 20, 2012 at 11:30

Community · Accepted Answer · 2017-05-23 12:20:08Z

4

This code

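% Note: the variable name "input" shadows MATLAB's built-in input function.
input = {'name1 = text1';
         'name2 = text2';
         'name3 = text3'};

% One row per input string: column 1 holds the name, column 2 the text.
result = cell(size(input, 1), 2);
for row = 1 : size(input, 1)
    % Capture everything before and after the equal sign as two tokens.
    tokens = regexp(input{row}, '(.*)=(.*)', 'tokens');
    if ~isempty(tokens)
        % tokens{1} is a 1x2 cell holding the two captured strings.
        result(row, :) = tokens{1};
    end
end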

produces the outcome

result = 
    'name1 '    ' text1'
    'name2 '    ' text2'
    'name3 '    ' text3'

Note that the whitespace around the equal sign is preserved. You can change this behaviour by adjusting the regular expression. For example, '([^\s]+) *= *([^\s]+)' gives

result = 
    'name1'    'text1'
    'name2'    'text2'
    'name3'    'text3'
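The loop is not strictly required: regexp also accepts a cell array of strings. The following is a minimal sketch of that variant, not part of the original answer. The variable name rawLines is hypothetical, and the sketch assumes every line matches the pattern.

```matlab
% Hypothetical variable name; same data as in the answer above.
rawLines = {'name1 = text1';
            'name2 = text2';
            'name3 = text3'};

% With a cell array input and the 'once' option, regexp returns one
% 1x2 cell of tokens per input string.
tokens = regexp(rawLines, '([^\s]+) *= *([^\s]+)', 'tokens', 'once');

% Stack the 1x2 cells into an N-by-2 cell array.
% Assumption: every line matches. A non-matching line yields an empty
% result, so the rows would no longer line up with the input.
result = vertcat(tokens{:});
```

If some lines might not match, the explicit loop with the isempty check is the safer choice.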

Edit: Based on the comments by user1578163.

MATLAB also supports lazy (non-greedy) quantifiers. For example, the regexp '(.*?) *= *(.*)' (note the question mark after the asterisk) also works when the text itself contains spaces. It will transform

input = {'my name1 = any text1';
         'your name2 = more text2';
         'her name3 = another text3'};

into

result = 
    'my name1'      'any text1'    
    'your name2'    'more text2'   
    'her name3'     'another text3'
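Another way to drop the surrounding whitespace, while keeping the simple '(.*)=(.*)' expression, is to trim the tokens afterwards with strtrim and then compare them with strcmp. This is a sketch under that assumption, not the answer's original method. The names pairs, trimmed and isMatch are hypothetical.

```matlab
% Hypothetical variable; this is the untrimmed output of the
% '(.*)=(.*)' version shown earlier.
pairs = {'name1 ', ' text1';
         'name2 ', ' text2';
         'name3 ', ' text3'};

% strtrim accepts a cell array of strings and removes leading and
% trailing whitespace from every element.
trimmed = strtrim(pairs);

% Compare the second column against a reference string.
% strcmp returns one logical value per cell.
isMatch = strcmp(trimmed(:, 2), 'text2');   % [false; true; false]
```

Trimming after the match keeps the regular expression simple, at the cost of one extra pass over the tokens.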

answered Aug 20, 2012 at 11:31 by Mehrwolf, edited May 23, 2017 at 12:20 by Community Bot


4 Comments

berndh Over a year ago

Is there any way to remove the white space in your first approach? The second one is not working that well, but the first is ideal; I just need to remove the spaces to be able to compare strings. By the way, how do you recommend I learn regular expressions and tokens? The MATLAB help is not enough here; it has too few examples to understand how to combine all the 'signs'.

berndh Over a year ago

I know this expression will do the trick for a single string, but how do I get into the cell of these strings? regexprep(s,'[^\w'']','')

Mehrwolf Over a year ago

@user1578163: Try tokens{1}{1} and tokens{1}{2} to get directly to the tokens. Why isn't the second regexp working as expected? What strings are failing?

Mehrwolf Over a year ago

@user1578163: I learned regular expressions from the SED tutorial grymoire.com/Unix/Sed.html.
