I have a file that contains repeating strings. The file is very large, so here is a simple example:
a b c
w a g
b v f
I want to extract a b into an array. How can I do this in MATLAB?
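Try using TEXTSCAN. You can split the file into lines with the '\n' delimiter and then stack the lines into a character matrix with cell2mat. Note that cell2mat only works here if every line has the same length.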
fid = fopen('your_string_file.ext');
filelines = textscan(fid, '%s', 'delimiter', '\n');   % one cell entry per line
fclose(fid);
cellmatrix = cell2mat(filelines{1});                  % char matrix, one row per line
cellmatrix =
a b c
w a g
b v f
Then, if there is a specific pattern you want, you can walk the cellmatrix character by character. Assuming you want the a b pattern within a single row, you could do the following:
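pattern = ['a', 'b'];        % characters to find in order, ignoring spaces
dims = size(cellmatrix);
for i=1:dims(1)
    patindex = 1;            % restart the match at the beginning of each row
    for j=1:dims(2)
        if strcmp(cellmatrix(i,j), ' ')
            continue         % skip the separators
        end
        if strcmp(cellmatrix(i,j), pattern(patindex))
            patindex = patindex+1;
            if patindex > length(pattern)
                % FOUND... store location (i,j) / do what you want
                patindex = 1;
            end
        elseif strcmp(cellmatrix(i,j), pattern(1))
            patindex = 2;    % the mismatched character may start a new match
        else
            patindex = 1;
        end
    end
end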
You can change your check to find whatever pattern you want from the matrix.
This assumes your file will fit into memory -- if it's too large to fit in half your memory you'll need to do something much trickier with incremental passes and file writing.
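Once you have the cellmatrix from the first answer, you can create a true/false matrix for your pattern. Because cellmatrix is a character matrix, compare it element by element with == (strcmp on a character matrix returns a single true/false value for the whole array; it only works element by element on a cell array of strings):
cellmatrix == 'a'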
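As a rough sketch of how such a true/false matrix can be turned into positions, the snippet below uses a small hard-coded character matrix as a stand-in for cellmatrix (an assumption for illustration, taken from the example in the question) and find to list the row and column of every 'a'. It uses ==, which compares a character matrix element by element.

```matlab
% Hypothetical stand-in for the character matrix built from the example file
cellmatrix = ['a b c'; 'w a g'; 'b v f'];

% Logical matrix: true wherever the character is 'a'
isA = (cellmatrix == 'a');

% Row and column subscripts of every match (column-major order)
[rows, cols] = find(isA);

% Show the positions as [row col] pairs
disp([rows cols])
```

The same idea works for any single character; for multi-character patterns a per-row search is probably easier.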
If your file is so large that it doesn't fit into your memory, try reading it line by line using fgets:
fid = fopen('VERYBIGFILE');
tline = fgets(fid);           % fgets keeps the newline and returns -1 at end of file
while ischar(tline)
    disp(tline)
    % DO SOME STUFF WITH THE LINE
    tline = fgets(fid);       % read the next line
end
fclose(fid);
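To make the "do some stuff" step concrete, here is a minimal sketch that records the numbers of the lines containing both a and b as whitespace-separated tokens. That reading of "extract a b" is an assumption, and the temporary file, the wanted list, and the hits variable are made up for illustration. It uses fgetl instead of fgets so the newline is dropped before splitting.

```matlab
% Write the example from the question to a temporary file (stand-in for the real file)
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'a b c\nw a g\nb v f\n');
fclose(fid);

wanted = {'a', 'b'};   % assumed meaning of "a b": both tokens on the same line
hits = [];             % line numbers where all wanted tokens occur
lineno = 0;

fid = fopen(fname, 'r');
tline = fgetl(fid);    % fgetl drops the trailing newline
while ischar(tline)
    lineno = lineno + 1;
    tokens = strsplit(strtrim(tline), ' ');   % split the line on spaces
    if all(ismember(wanted, tokens))
        hits(end+1) = lineno; %#ok<AGROW>
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);

disp(hits)
```

Only one line is held in memory at a time, so this should scale to files that are too big for textscan plus cell2mat.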
How does the desired output (a b) relate to the repeating strings in your document? Would strfind do that for you?
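For what it's worth, a small sketch of what strfind could look like here, assuming the lines are already in a cell array of strings and that "a b" means the literal text a b on one line (both are assumptions, and the variable names are made up):

```matlab
% Hypothetical cell array holding the lines from the example in the question
textLines = {'a b c'; 'w a g'; 'b v f'};

% strfind on a cell array returns one cell per line with the start indices
idx = strfind(textLines, 'a b');

% True for every line that contains the pattern at least once
hasPattern = ~cellfun(@isempty, idx);

% Keep only the matching lines
matches = textLines(hasPattern);
disp(matches)
```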