Most frequent element in a string array, MATLAB

Question

I have a string array, for instance:

arr = ['hello'; 'world'; 'hello'; 'again'; 'I----'; 'said-'; 'hello'; 'again']

How can I extract the most frequent string, which is 'hello' in this example?

Hugh Nolan · Accepted Answer · 2013-07-03 14:12:45Z

12

First, use a cell array rather than a string array:

arr = {'hello', 'world'; 'hello', 'again'; 'I----', 'said-'; 'hello', 'again'};

Second, use unique to get the unique strings (on a char array, plain unique returns the unique characters rather than the unique strings, which is why I suggest the cell):

[unique_strings, ~, string_map]=unique(arr);

Then use mode on the string_map variable to find the index of the most common value:

most_common_string=unique_strings(mode(string_map));
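Put together, and applied to the 8-by-5 char array from the question, the steps above might look like the sketch below. The conversion with cellstr, the variable name words, and the optional accumarray count are assumptions of this sketch, not part of the original steps.

```matlab
% Char array from the question: 8 rows, 5 characters each
arr = ['hello'; 'world'; 'hello'; 'again'; 'I----'; 'said-'; 'hello'; 'again'];

% cellstr turns each row into one cell (it also strips trailing whitespace)
words = cellstr(arr);

% Third output maps every element of words to its index in unique_strings
[unique_strings, ~, string_map] = unique(words);

% mode picks the most frequent index; braces return a char vector, not a cell
most_common_string = unique_strings{mode(string_map)};

% Optional: occurrence count for each unique string
counts = accumarray(string_map(:), 1);
```

If several strings are tied for most frequent, mode returns the smallest index, so the tied string that comes first in unique's sort order is the one reported.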

6 Comments

Eitan T Over a year ago

+1: but there's no need for cell arrays. You can use unique(arr, 'rows').
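A minimal sketch of this suggestion, assuming the char array from the question (the variable names unique_rows and row_map are illustrative):

```matlab
% Char array from the question: every row is one 5-character string
arr = ['hello'; 'world'; 'hello'; 'again'; 'I----'; 'said-'; 'hello'; 'again'];

% 'rows' treats each row of the char matrix as a single element
[unique_rows, ~, row_map] = unique(arr, 'rows');

% Most frequent row index, then the matching row as a char vector
most_common_string = unique_rows(mode(row_map), :);
```

This relies on all strings having the same length, which a char matrix requires anyway.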

Hugh Nolan Over a year ago

Oh great, thanks! I don't use them very often, didn't know this function.

Hugh Nolan Over a year ago

Just a note about string arrays and the above comment: in this instance, the array would need to be reformatted so that each string is on a separate row, rather than having two strings on a single line. Two strings per line only works as a cell; otherwise MATLAB treats the whole line as a single concatenated string, i.e. the initial arr in the question is equivalent to ['helloworld'; 'helloagain'; 'I----said-'; 'helloagain']

Eitan T Over a year ago

All strings in the question are concatenated vertically with a semicolon. I think you copy-pasted arr wrong.
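The difference between the two kinds of concatenation can be checked with a short sketch (the variable names are illustrative):

```matlab
% Semicolons stack rows: a 2-by-5 char matrix, one string per row
vertical = ['hello'; 'world'];

% Commas join characters: a single 1-by-10 string
horizontal = ['hello', 'world'];

size_vertical = size(vertical);
size_horizontal = size(horizontal);
```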

Hugh Nolan Over a year ago

Oh weird. Thanks. Don't know how that happened.


innoSPG · Answer · 2013-07-03 14:15 (edited by Rody Oldenhuis, 2013-07-03 15:08:24Z)

-1

It is better to use cell arrays and the regexp function; the behavior of string arrays may not be what you expect.

arr = {'hello', 'world'; 'hello', 'again'; 'I----', 'said-'; 'hello', 'again'};

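If you use

hellos = nnz(~cellfun('isempty', regexp(arr, 'hello')));

it will return the number of cells in arr that contain 'hello'. (nnz counts over the whole array; sum on a two-column cell array would return one count per column.)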

answered Jul 3, 2013 at 14:15 by innoSPG

edited Jul 3, 2013 at 15:08 by Rody Oldenhuis

2 Comments

Eitan T Over a year ago

-1: The question is about finding the most frequent string, not a specific predetermined string.

kelm Over a year ago

And even if you were looking for a specific string, regexp would be a bit overkill. strcmp can be used to identify equal strings in a cell array.
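A minimal sketch of the strcmp approach, assuming the cell array from the answer above (the names is_hello and hellos are illustrative):

```matlab
arr = {'hello', 'world'; 'hello', 'again'; 'I----', 'said-'; 'hello', 'again'};

% strcmp compares every cell with 'hello' and returns a logical array
is_hello = strcmp(arr, 'hello');

% nnz counts the true entries across the whole array
hellos = nnz(is_hello);
```

Unlike regexp(arr, 'hello'), which also matches cells that merely contain 'hello' as a substring, strcmp only counts exact matches.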
