3

Suppose I have a plaintext file test.dat:

foo bar baz
qux ham spam

I now want to load this into Octave (or Matlab if necessary) as a two-dimensional cell array, preserving the structure encoded in whitespace and newlines. According to my understanding of the documentation, the following should be the way to go:

format = '%s';
file = fopen('test.dat');
data = textscan(file,format);
fclose(file);
disp(data);

However, this only loads the data as a one-dimensional array:

{
  [1,1] = 
  {
    [1,1] = foo
    [2,1] = bar
    [3,1] = baz
    [4,1] = qux
    [5,1] = ham
    [6,1] = spam
  }
}

Explicitly specifying Delimiter, Whitespace, and EndOfLine does not help (what’s the point of the latter then?); neither does using other loading functions like textread or dlmread. What does work is using format = '%s%s%s' in the above, but this requires that I somehow identify the number of columns, which the function should be able to do itself.

Thus I ask: Is there any built-in function that does what I want? I am not interested in ways to write such a function myself – I am confident that I can do this, but that’s exactly what I want to avoid (as I need to use this for demonstrating good practice, and thus not re-inventing the wheel).

Related Q&As (that all work with knowing the number of columns):

2
  • If you use %s as a format, textscan will treat the whole line as one string, so yes you do need to know the number of columns. Your only other option is to scan each line at a time using fgetl and then parse the resulting line using whatever separator you have to split each line into separate strings. Commented Jan 26, 2018 at 14:27
  • @am304: If you use %s as a format, textscan will treat the whole line as one string – No, it doesn’t. It loads each of the six elements individually; just the arrangement gets lost. Commented Jan 26, 2018 at 14:29

4 Answers

5

You can use readtable

data = readtable('test.txt', 'ReadVariableNames', false, 'Delimiter', ' ')

Output:

Var1     Var2      Var3 
_____    _____    ______

'foo'    'bar'    'baz' 
'qux'    'ham'    'spam'

If you wanted a cell, not a table, you could use

data = table2cell( data );

>> data = {'foo'    'bar'    'baz' 
           'qux'    'ham'    'spam'}

I'm not sure whether readtable is available in Octave; there seems to be an implementation on GitHub, but I have no installation to check. It was introduced in Matlab R2013b.


Alternatively, you could use lower-level file I/O and read the lines one by one:

fid = fopen('test.txt','r');
data = {};
while ~feof(fid)
    line = fgetl(fid);       % Read line (fgetl drops the trailing newline, fgets keeps it)
    A = strsplit(line, ' '); % Split on spaces
    data(end+1, :) = A;      % Append to output
end
fclose(fid);

>> data = {'foo'    'bar'    'baz' 
           'qux'    'ham'    'spam'}

This method assumes that every row has the same number of elements (the same number of delimiters in each line). If you can't assume that, a safer way is to store each row in its own cell with data{end+1,1} = A and deal with the differing row lengths afterward.

The only function used in this method that isn't low-level file I/O is strsplit, which is built into both Octave and Matlab.
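As a rough sketch of the safer variant mentioned above, the loop can collect one cell of tokens per line and pad the short rows afterward. The sample file contents, the temporary file name, the variable names, and the choice to pad with empty strings are assumptions made for illustration; they are not part of the question.

```octave
% Hypothetical sample file with rows of different lengths (not from the question).
fname = [tempname() '.dat'];
fid = fopen(fname, 'w');
fprintf(fid, 'foo bar baz\nqux ham\n');
fclose(fid);

% Pass 1: read each line and keep its tokens in one cell per row.
rows = {};
fid = fopen(fname, 'r');
line = fgetl(fid);                 % fgetl drops the trailing newline
while ischar(line)                 % fgetl returns -1 at end of file
    rows{end+1, 1} = strsplit(strtrim(line), ' ');
    line = fgetl(fid);
end
fclose(fid);
delete(fname);

% Pass 2: pad short rows with empty strings to get a rectangular cell array.
ncols = max(cellfun(@numel, rows));
data = repmat({''}, numel(rows), ncols);
for k = 1:numel(rows)
    data(k, 1:numel(rows{k})) = rows{k};
end
disp(data);
```

Padding with empty strings is only one possible convention; depending on the data, raising an error on ragged rows may be the better choice.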



3

In Octave, you can use csv2cell from the io package:

pkg load io
result = csv2cell('test.dat',' ')


0

I would suggest having a look at the fgetl() or fgets() functions. Basically, you read the file line by line and then apply textscan() to each line to get the "columns".
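A minimal sketch of that idea, assuming every line has the same number of whitespace-separated tokens (otherwise the vertical concatenation fails). The temporary file and the variable names are made up for illustration.

```octave
% Recreate the question's sample data in a temporary file (hypothetical name).
fname = [tempname() '.dat'];
fid = fopen(fname, 'w');
fprintf(fid, 'foo bar baz\nqux ham spam\n');
fclose(fid);

data = {};
fid = fopen(fname, 'r');
line = fgetl(fid);                   % read one line without its newline
while ischar(line)                   % fgetl returns -1 at end of file
    tokens = textscan(line, '%s');   % whitespace-delimited tokens of this line
    data = [data; tokens{1}'];       % N-by-1 column, transposed into a row
    line = fgetl(fid);
end
fclose(fid);
delete(fname);
disp(data);
```

Here textscan is applied to a character string rather than a file handle, so each call only ever sees a single row and the row structure is kept by the loop.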

2 Comments

This does not address my question which was explicitly about not implementing this myself. @Wolfie: This is not a valid comment either (it does nothing comments are for).
@Wrzlprmft You're right, I've deleted the comment as I planned to, was just trying to help Ari learn how the site works. I've also updated my answer to use Octave built-in methods
0

I had the same problem. readtable.m was slow for me in Matlab, and the fgetl examples resize the array in a loop. But perhaps an acceptable solution is the one based on this forum post: https://de.mathworks.com/matlabcentral/answers/476483-how-to-use-textscan-on-a-cell-array-without-a-loop

So, at least in newer Matlab:

fid=fopen(file,'r');
data=textscan(fid,'%s','Delimiter','\r\n');
fclose(fid);
data=split(data{1},';',1); 

I haven't tested split.m for speed on large data, though.

1 Comment

Sorry, I forgot to add a cell array transpose to bring the data back to col/row shape.
