I have to create a function that takes two inputs: a cell array of strings (let's call it txt) and a single string (let's call it str). The function has to remove every element of the cell vector txt whose string is either identical to str or contains str as a substring. So far I have tried the following:

function c = censor( txt,str )
    c = txt;
    n = length(c);
    for i = 1:n
        a = c{ i };
        a( a == str ) = [];
        c{i} = a;
    end
end

But it doesn't work: it fails with the error "Matrix dimensions must agree". I understand that this is probably because str has more than one character, but I don't know how to check whether str is contained in any of the strings of the cell array txt.

  • Take a look at strfind: se.mathworks.com/help/matlab/ref/strfind.html Commented May 21, 2015 at 12:16
  • Nice, it does what I need, but the problem is when the match appears more than once. I don't know how to remove each element once I have found all the matches. Commented May 21, 2015 at 12:33

1 Answer

As Anders pointed out, you want to use strfind to look for strings inside other strings. Here is one way you could write your function. Basically, apply strfind to the whole txt cell array, and then remove the entries in which there was a match.

Code:

function txt = censor(txt,str)

%// If no inputs are supplied, run a demo
if nargin == 0

    str = 'hello';
    txt = {'hellothere' 'matlab' 'helloyou' 'who are you' 'hello world'};
end

IsItThere = strfind(txt,str)

Now IsItThere is a cell array that holds the start index of the match (1 in this example) wherever str was found, and an empty cell wherever it was not:

IsItThere = 

    [1]    []    [1]    []    [1]

Let's fill the empty cells with 0, so that cell2mat gives one value per cell and we can locate the matches later:

IsItThere(cellfun('isempty',IsItThere))={0}

Find the indices at which a match occurred:

IndicesToRemove = find(cell2mat(IsItThere))

IndicesToRemove =

     1     3     5

And remove cells:

txt(IndicesToRemove) = [];

txt now looks like this:

txt = 

    'matlab'    'who are you'

end

You can combine a few steps together if you like, but I hope that was clear enough :)
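As a rough illustration of combining the steps, the following minimal sketch builds a logical mask directly from the strfind result instead of filling the cells with 0 and calling cell2mat. The variable names matches, keep and c are only illustrative and are not part of the code above. Because the mask only tests whether each cell is empty, it should also cope with strings in which str appears more than once.

```matlab
txt = {'hellothere' 'matlab' 'helloyou' 'who are you' 'hello world'};
str = 'hello';

% strfind returns one cell per string: the start indices of str, or [] if absent
matches = strfind(txt, str);

% keep is true for the cells in which str was NOT found
keep = cellfun('isempty', matches);

% Logical indexing keeps only the strings without a match
c = txt(keep);
disp(c)
```

With the demo inputs, c should contain only 'matlab' and 'who are you', the same result as the step-by-step version.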

Here is the whole code that you can copy/paste in a .m file:

function txt = censor(txt,str)

%// If no inputs are supplied, run a demo
if nargin == 0

    str = 'hello';
    txt = {'hellothere' 'matlab' 'helloyou' 'who are you' 'hello world'};
end

IsItThere = strfind(txt,str)

IsItThere(cellfun('isempty',IsItThere))={0}

IndicesToRemove = find(cell2mat(IsItThere))

txt(IndicesToRemove) = [];
txt
end

2 Comments

Your method works with your example, but it does not work when each element of the cell array is a sentence and str appears in it more than once. Any idea how to improve it? It is really good, but I don't know how to make it work for sentences with more than one match. For example, using the first 8 lines of the US national anthem and str = 'the', I get: IsItThere = [24] [30] [43] [6] [1x2 double] [1x2 double] [0] [1x4 double]
@Hec46 When there are multiple occurrences, strfind returns every index at which str starts, so that cell holds a vector instead of a single number. You can use something like IsItThere(~cellfun('isempty', IsItThere)) = {1} before IsItThere(cellfun('isempty', IsItThere)) = {0}
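
The suggestion in the last comment can be sketched as follows. The three sentences are made up for illustration (they are not the anthem lines from the comment above), and 'the' occurs twice in the first one. Every non-empty cell is first collapsed to a single 1, so cell2mat returns exactly one value per sentence.

```matlab
% Made-up sentences for illustration; 'the' occurs twice in the first one
txt = {'the cat and the dog', 'no match here', 'over the hill'};
str = 'the';

% One cell per sentence: a vector of start indices, or [] if str is absent
IsItThere = strfind(txt, str);

% Collapse every non-empty cell to a single 1 first, then fill the empty cells with 0
IsItThere(~cellfun('isempty', IsItThere)) = {1};
IsItThere(cellfun('isempty', IsItThere)) = {0};

% Each cell is now a scalar, so cell2mat gives one value per sentence
IndicesToRemove = find(cell2mat(IsItThere));

txt(IndicesToRemove) = [];
disp(txt)
```

Here only 'no match here' should remain, since the first and third sentences both contain 'the'.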
