I am trying to read the following data into MATLAB:

'0.000000 1  18EFFA59x  Rx D 8  AD  09  02  00  00  00  00  30'  
'0.004245 1  14EFF01Cx  Rx D 6  DB  00  FF  FF  00  71'  
'0.004640 1  CEF801Cx   Rx D 3  3F  00  3B'  
'0.005130 1  14EF131Cx  Rx D 6  DB  00  FF  FF  00  71'  
'0.005630 1  CEF801Cx   Rx D 3  3F  00  C3'  
'0.010015 1  18EFFA59x  Rx D 8  AD  07  01  00  00  00  00  30'  
'0.014145 1  CF004F0x   Rx D 8  F0  FF  7D  00  00  FF  FF  FF'  
'0.015060 1  18EFFA59x  Rx D 8  AD  07  02  00  00  00  00  30'  
'0.018235 1  18EF1CF0x  Rx D 8  F2  1E  05  FF  FF  00  71  FF'  
'0.018845 1  18EA5941x  Rx D 3  09  FF  00'  

I can easily read each line in as a string, but to make post-processing more efficient I'd like to split each line on its delimiter, which is whitespace. In other words, the end result should be a non-singleton cell array. I can't seem to find an efficient way of doing this. Efficiency matters because these files are several million lines long, and processing strings and cells in MATLAB takes a long time.

Any help would be appreciated. Thanks.

  • What have you already tried? Is f1 = fopen('file.txt'); textscan(f1, '%s', 'delimiter', ' '); not efficient enough? What should your resulting cell array look like? Commented Jul 22, 2015 at 22:09
  • Or use the Import Data tool and have it export a script for the import. You can make it import the data into individual vectors or an array using that utility. It then generates a script or function that you can modify. Commented Jul 23, 2015 at 0:50
  • If you can read each line as a string, then just use strsplit to split it on spaces. Commented Jul 23, 2015 at 6:30
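
The strsplit suggestion in the last comment can be sketched as follows. This is only an illustration on a single sample line, and the variable names rawLine and tokens are hypothetical. With no delimiter argument, strsplit splits on whitespace and collapses consecutive delimiters, so the double and triple spaces in the log do not produce empty tokens. It handles one line at a time, so it may be slow on files with millions of lines.

```matlab
% One sample line from the question (hypothetical variable names).
rawLine = '0.004640 1  CEF801Cx   Rx D 3  3F  00  3B';

% Default strsplit: whitespace delimiter, consecutive delimiters collapsed.
% strtrim guards against empty tokens from leading/trailing blanks.
tokens = strsplit(strtrim(rawLine));

% tokens is a 1-by-9 cell array of character vectors.
timestamp = str2double(tokens{1});   % numeric time stamp
canId     = tokens{3};               % e.g. 'CEF801Cx'
```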

1 Answer

You appear to have fixed-width fields, so I would treat the data as such and let textscan do most of the pre-processing for you by turning off delimiters and whitespace and defining the field widths and types explicitly:

test = {...
    '0.000000 1  18EFFA59x  Rx D 8  AD  09  02  00  00  00  00  30'
    '0.004245 1  14EFF01Cx  Rx D 6  DB  00  FF  FF  00  71'
    '0.004640 1  CEF801Cx   Rx D 3  3F  00  3B'
    '0.005130 1  14EF131Cx  Rx D 6  DB  00  FF  FF  00  71'
    '0.005630 1  CEF801Cx   Rx D 3  3F  00  C3'
    '0.010015 1  18EFFA59x  Rx D 8  AD  07  01  00  00  00  00  30'
    '0.014145 1  CF004F0x   Rx D 8  F0  FF  7D  00  00  FF  FF  FF'
    '0.015060 1  18EFFA59x  Rx D 8  AD  07  02  00  00  00  00  30'
    '0.018235 1  18EF1CF0x  Rx D 8  F2  1E  05  FF  FF  00  71  FF'
    '0.018845 1  18EA5941x  Rx D 3  09  FF  00'};

test = strjoin(test', '\n');

C = textscan(test, '%8.6f %2u %11s %4s %2s %2u %33s', 'delimiter', '','whitespace','');

col1 = C{1};
col2 = C{2};
col3 = strtrim(C{3});
col3 = cellfun(@(x)hex2dec(x(1:end-1)), col3); % for instance.
col4 = strtrim(C{4});
col5 = strtrim(C{5});
col6 = C{6};
col7 = strtrim(C{7});

In practice, you would pass a file ID (from fopen) to textscan in place of the text string. For the last, variable-length field, just read the whole thing in, making sure you specify the maximum possible length. MATLAB reads a field until it reaches the specified width or a newline character (in fact, I made the last field width one larger than necessary, just to be safe). Each field is then aggregated into its own cell. I also took the liberty of converting the third field from hex to decimal to show how you might post-process the numbers further.
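
A minimal sketch of the file-based variant, assuming a plain-text log in the layout shown in the question. The variable names logFile, sample and fmt are hypothetical, and the sample lines are written to a temporary file only so that the sketch is self-contained. The format string and options are the ones used in the answer.

```matlab
% Write two sample lines to a temporary file (hypothetical name) so the
% sketch can run on its own; a real log file would already exist.
logFile = [tempname '.txt'];
sample = {
    '0.000000 1  18EFFA59x  Rx D 8  AD  09  02  00  00  00  00  30'
    '0.004640 1  CEF801Cx   Rx D 3  3F  00  3B'};
fid = fopen(logFile, 'w');
fprintf(fid, '%s\n', sample{:});
fclose(fid);

% Same format string and options as in the answer, but reading from a
% file ID instead of a character vector.
fmt = '%8.6f %2u %11s %4s %2s %2u %33s';
fid = fopen(logFile, 'r');
if fid == -1
    error('Could not open %s', logFile);
end
C = textscan(fid, fmt, 'delimiter', '', 'whitespace', '');
fclose(fid);

delete(logFile);   % remove the temporary file
```

textscan returns one cell per format specifier, so C can be unpacked into col1 to col7 exactly as in the answer's code.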

As a further note, if you really do have gigantic files and need maximum speed, you could skip the strtrim step on the character fields by adding %*ns specifiers, where n is the field width, for any known gaps such as the 2-character gap between columns 3 and 4. The star tells textscan to skip that field. However, I find the approach shown above a bit more readable and intuitive, and it leaves a small margin for error in case one of the fields, such as the 4th, occasionally has a 3-character entry.
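
Continuing the post-processing idea, the trailing data field (col7 in the code above) could be split into individual byte values. This is a rough sketch, assuming each entry has already been trimmed and contains only two-digit hex values separated by spaces. The variable names payload and bytes are hypothetical, and the two entries are copied from the sample data.

```matlab
% Two trimmed payload strings taken from the sample data (stand-in for col7).
payload = {
    'AD  09  02  00  00  00  00  30'
    '3F  00  3B'};

% Split each string on whitespace, convert the hex tokens to numbers and
% store them as a uint8 row vector. The rows have different lengths, so the
% result stays in a cell array ('UniformOutput', false).
bytes = cellfun(@(s) reshape(uint8(hex2dec(strsplit(s))), 1, []), ...
                payload, 'UniformOutput', false);
```

Because this uses cellfun with an anonymous function, it is likely to be slow on millions of rows. It is meant to show the conversion, not to be a tuned implementation.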
