I am struggling to convert character arrays containing asterisks ('*') into numeric doubles.

I have a cell array of character vectors based on data imported from a .dat file. For example, the cell array C contains a column of cells (e.g., C{1,1}, C{2,1}, ... C{n,1}), each of which contains a character vector, e.g., C{1,1} contains:

'23.000          *          *      1.000      1.000      1.000     34.000      5.065      6.719'

When I try to convert C{1,1} to a numeric double, MATLAB returns an empty double, e.g.,

new_double = str2num(C{1,1})

new_double =

     []

When I remove the asterisk manually, the code works:

 new_double = str2num(C{1,1})

 new_double =

   23.0000    1.0000    1.0000    1.0000   34.0000    5.0650    6.7190

All I want to do is read the data into a double array for further processing. I don't care if the command ignores the asterisks or replaces them with NaNs - the data with asterisks are not important to me. What is important is that I read data from the last two columns, e.g., 5.065 and 6.719. Unfortunately, I cannot index them since they are embedded within a character vector.

I have also tried using:

c2 = C{1,1};
new_double = sscanf(c2,'%f%'); 

But it stops reading at the asterisk, e.g.,

new_double =

    23

I have searched far and wide, the only useful post being: https://uk.mathworks.com/matlabcentral/answers/127847-how-to-read-csv-file-with-asterix However, I can't use this method because I am working from a character vector rather than delimited data.

2 Answers

Here's another way:

C{1,1} = '23.000          *          *      1.000      1.000      1.000     34.000      5.065      6.719';
result = str2double(strsplit(C{1}));

This gives

result =
   23.0000       NaN       NaN    1.0000    1.0000    1.0000   34.0000    5.0650    6.7190

This works as follows:

  1. strsplit splits the string at spaces. This gives a cell array of substrings formed by contiguous non-space characters;
  2. str2double converts each of those cells into a number, and gives a numeric vector as result, with NaN at entries that cannot be interpreted as numbers.

An advantage of using str2double over str2num is that the former doesn't internally use eval, so it cannot run potentially dangerous code.
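To get at the last two columns for every row, the same idea can be applied across the whole cell array. The following is a minimal sketch, not part of the original answer: the second row of C is made-up data for illustration, and rowToNum, M, and lastTwo are hypothetical names. It assumes every row has the same number of fields. The call to strtrim is a precaution, since leading whitespace would otherwise produce an empty first substring (and hence an extra NaN).

```matlab
% Hypothetical two-row cell array standing in for the imported data.
% The second row is invented for illustration only.
C = {'23.000          *          *      1.000      1.000      1.000     34.000      5.065      6.719';
     '24.000      2.000          *      1.000      1.000      1.000     35.000      5.100      6.800'};

% Convert one row of text to a numeric row vector; '*' becomes NaN.
rowToNum = @(s) str2double(strsplit(strtrim(s)));

% Apply the conversion to every cell, then stack the rows into a matrix.
% This assumes all rows contain the same number of fields.
rows = cellfun(rowToNum, C, 'UniformOutput', false);
M = vertcat(rows{:});

% Keep only the last two columns.
lastTwo = M(:, end-1:end);
```

Because the asterisks end up as NaN rather than being dropped, the column positions stay aligned across rows, so indexing with end-1:end picks the same two fields in every row.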

3 Comments

Clever. Certainly beats the regex approach. I didn't know str2double could do that.
@rayryeng Thanks! Using a regex is a nice approach too
Thank you both for your informative and speedy responses. I learned a lot from both. Time to spread the word about 'strsplit'!!

Let's do both. For the first case where you want to ignore the asterisks, you can remove them from the string and perform str2num as normal. Defining your data:

C{1,1} = '23.000          *          *      1.000      1.000      1.000     34.000      5.065      6.719';

... you can use regular expressions to potentially remove multiple asterisks that are in sequence (like if you had **, ***, etc.) and change them to the empty string with regexprep:

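out = regexprep(C, '\*+', '');

What this says is that for all strings in your cell array C, we replace any run of one or more * characters with the empty string. The backslash is needed because an unescaped * is a quantifier in a regular expression, not a literal character.

In this case, we get:

>> out = regexprep(C, '\*+', '')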

out =

  cell

    '23.000                          1.000      1.000      1.000     34.000      5.065      6.719'

You can go ahead and invoke str2num accordingly. Should you decide to replace the asterisks with NaN for example, just use regexprep again but specify NaN instead of the blank string:

out = regexprep(C, '\*+', 'NaN');

We get:

>> out = regexprep(C, '\*+', 'NaN')

out =

  cell

    '23.000          NaN          NaN      1.000      1.000      1.000     34.000      5.065      6.719'

The point is to replace the affected parts of your string with something else, and regexprep can certainly help.
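As a rough sketch of the full round trip, the replaced string can then be converted to numbers and indexed. The asterisk is escaped as \* here so it is matched literally. Note that this sketch uses sscanf with '%f' instead of the str2num call mentioned above; that is a substitution on my part, chosen because sscanf accepts NaN as a floating-point field and does not evaluate the text as code. The names vals and lastTwo are hypothetical.

```matlab
C{1,1} = '23.000          *          *      1.000      1.000      1.000     34.000      5.065      6.719';

% Replace each run of asterisks with the text 'NaN'.
out = regexprep(C, '\*+', 'NaN');

% Parse all whitespace-separated fields as doubles; sscanf returns a
% column vector, so transpose it into a row.
vals = sscanf(out{1}, '%f').';

% The last two entries correspond to the last two columns of the row.
lastTwo = vals(end-1:end);
```

With the NaN replacement the vector keeps all nine positions. If the asterisks are removed instead, the result has only seven entries, but end-1:end still selects the final two.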
