I have a cell array 3x1 like this:

name1 = text1
name2 = text2
name3 = text3

and I want to split each entry into a separate 1x2 cell, for example name1, text1. Later I want to treat text1 as a string and compare it with other strings. How can I do this? I have been trying regexp with tokens, but I cannot come up with a working expression. Any help would be appreciated.

  • (\w+)\s=\s(\w+) <- Do you need something like this? Commented Aug 20, 2012 at 11:30

1 Answer

This code

input = {'name1 = text1';
         'name2 = text2';
         'name3 = text3'};

result = cell(size(input, 1), 2);
for row = 1 : size(input, 1)
    tokens = regexp(input{row}, '(.*)=(.*)', 'tokens');
    if ~isempty(tokens)
        result(row, :) = tokens{1};
    end
end

produces the outcome

result = 
    'name1 '    ' text1'
    'name2 '    ' text2'
    'name3 '    ' text3'

Note that the whitespace around the equals sign is preserved. You can change this behaviour by adjusting the regular expression, e.g. '([^\s]+) *= *([^\s]+)', which gives

result = 
    'name1'    'text1'
    'name2'    'text2'
    'name3'    'text3'
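
As a possible alternative to the loop, regexp also accepts a cell array of strings directly. The following is only a sketch under a few assumptions: the variable name entries is hypothetical (chosen to avoid shadowing the built-in input function), \S+ is used as a shorthand for [^\s]+, and every line is assumed to contain exactly one name = value pair.

```matlab
% Hypothetical variable name; the data is the same as in the example above.
entries = {'name1 = text1';
           'name2 = text2';
           'name3 = text3'};

% With 'once', regexp returns one 1x2 cell of tokens per input line.
tokens = regexp(entries, '(\S+)\s*=\s*(\S+)', 'tokens', 'once');

% Stack the 1x2 cells into an N-by-2 cell array.
% Assumption: every line matches. A non-matching line yields an empty cell,
% which vertcat drops, so the rows would no longer line up with the input.
result = vertcat(tokens{:});
```

The loop version in the answer is more forgiving when some lines do not match, because it leaves the corresponding row of result empty instead of dropping it.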

Edit: Based on the comments by user1578163.

MATLAB also supports lazy (non-greedy) quantifiers. For example, the regexp '(.*?) *= *(.*)' (note the question mark after the asterisk) also works when the text contains spaces. It transforms

input = {'my name1 = any text1';
         'your name2 = more text2';
         'her name3 = another text3'};

into

result = 
    'my name1'      'any text1'    
    'your name2'    'more text2'   
    'her name3'     'another text3'
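
If the output of the first regular expression is kept, the surrounding whitespace can be removed afterwards, and the values can then be compared with other strings. The sketch below assumes the result array has the shape shown for the first expression; the names isText1 and matches are hypothetical. Both strtrim and strcmp accept cell arrays of strings, so no loop is needed.

```matlab
% Output of the first regular expression, with the whitespace still attached.
result = {'name1 ', ' text1';
          'name2 ', ' text2';
          'name3 ', ' text3'};

% strtrim trims leading and trailing whitespace from every element.
result = strtrim(result);

% Compare a single value, or a whole column at once.
isText1 = strcmp(result{1, 2}, 'text1');   % logical scalar
matches = strcmp(result(:, 2), 'text2');   % 3x1 logical array
```

regexprep likewise accepts a cell array as its first argument, so an expression written for a single string can be applied to all entries in one call.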

4 Comments

Is there any way to remove the white space in your first approach? The second one does not work that well, but the first is ideal; I just need to remove the spaces to be able to compare strings. By the way, how do you recommend I learn regular expressions and tokens? The MATLAB help is not enough here: there are too few examples to understand how to combine all the 'signs'.
I know this expression will do the trick for a single string, but how do I apply it to the cell array of these strings? regexprep(s,'[^\w'']','')
@user1578163: Try tokens{1}{1} and tokens{1}{2} to get directly to the tokens. Why isn't the second regexp working as expected? What strings are failing?
@user1578163: I learned regular expressions from the SED tutorial grymoire.com/Unix/Sed.html.
