I am working in MATLAB with a structure array whose fields hold tables of numeric data of different sizes, with rows like these:

        SCD     |  HTD    |  EHD    | CSD
 T = [ 300*256  | 300*62  | 305*80  | 305*256 ...
       200*256  | 400*62  | 105*80  | 505*256 ...]

T has many elements (size(T) = [1,965]). For each field, I would like to compute the column-wise mean of the table (that is, the mean over its rows) for every element of T. I currently do it like this:

Tmean = [] ;
for i = 1 : size(T,2)
     A = T(i).SCD ;
     Tmean(i).SCD = mean(table2array(A));      
end

This has to be repeated for every field. Is it possible to do this without so many loops?

The output of T(1) and T(2) look like this:

  T(1)

  ans = 

    SCD: [305x256 table]
    HTD: [305x62 table]
    EHD: [305x80 table]
    DCD: [337x51 table]
    CSD: [305x256 table]
    CLD: [305x120 table]
    movieId: 89

   T(2)

   ans = 

    SCD: [263x256 table]
    HTD: [263x62 table]
    EHD: [263x80 table]
    DCD: [732x9 table]
    CSD: [263x256 table]
    CLD: [263x120 table]
    movieId: 93

I expect Tmean_SCD for T(1) to be a [1*256] array, and likewise for T(2) and all the others. Because every table in the first field has 256 columns, the means can be stacked into a single array with 965 rows and 256 columns.

  • Wait, I think I misunderstood the question. The T(i).SCD contains an array of values you want averaged? For each i a new array? And then do this for all columns in the struct? Commented Mar 26, 2016 at 19:59
  • Exactly, I answered below your reply. Thanks Commented Mar 26, 2016 at 20:01
  • Okay, deleted my answer, I'll see if I can figure it out for you and post a new answer if I do. Commented Mar 26, 2016 at 20:04
  • It would help if you post a short code on how to populate say the first two structs T(1) and T(2). Commented Mar 26, 2016 at 20:09
  • ah, now it's starting to make sense. So I assume you want to compute this mean for each field? Can you also show us what you expect Tmean to look like (again show the expected Tmean(1) and Tmean(2)). Commented Mar 26, 2016 at 20:19

2 Answers

Here is one solution:

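Tmean = struct();
fields = {'SCD', 'HTD', 'EHD', 'DCD', 'CSD', 'CLD'};
for i=1:numel(fields)
    % Column means of each table in this field (dim 1, so a one-row table
    % still yields a row vector), stacked into a numel(T)-by-nCols matrix
    Tmean.(fields{i}) = cell2mat(cellfun(@(t) mean(table2array(t), 1), ...
        {T.(fields{i})}, 'UniformOutput',false)');
end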

To test it, I generated this sample structure array resembling your data in shape (random values):

T = struct();
for i=1:10  % your data is 965
    T(i).SCD = array2table(rand(randi([2 20]), 256));
    T(i).HTD = array2table(rand(randi([2 20]), 62));
    T(i).EHD = array2table(rand(randi([2 20]), 80));
    T(i).DCD = array2table(rand(randi([2 20]), 51));
    T(i).CSD = array2table(rand(randi([2 20]), 256));
    T(i).CLD = array2table(rand(randi([2 20]), 120));
    T(i).movieId = i;
end

The actual result:

>> Tmean
Tmean = 
    SCD: [10x256 double]
    HTD: [10x62 double]
    EHD: [10x80 double]
    DCD: [10x51 double]
    CSD: [10x256 double]
    CLD: [10x120 double]

The result is a scalar struct; each field is a matrix of size numel(T)-by-(columnSize), so 10 rows in this test and 965 rows for your data.
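
The same computation can also be written with plain nested loops and preallocated outputs, which may be easier to read. The following is only a minimal sketch: the name TmeanLoop, the use of fieldnames with setdiff to skip movieId, and the small two-field sample input are assumptions made for illustration, and it assumes every table in a given field has the same number of columns.

```matlab
% Small sample input (random values), shaped like the data in the question.
T = struct();
for i = 1:3
    T(i).SCD = array2table(rand(randi([2 20]), 256));
    T(i).HTD = array2table(rand(randi([2 20]), 62));
    T(i).movieId = i;
end

% Assumption: every field except movieId holds a table.
fields = setdiff(fieldnames(T), {'movieId'});
N = numel(T);
TmeanLoop = struct();
for f = 1:numel(fields)
    name = fields{f};
    nCols = width(T(1).(name));            % assumed identical for every T(i)
    TmeanLoop.(name) = zeros(N, nCols);    % preallocate an N-by-nCols matrix
    for i = 1:N
        % Mean over rows (dim 1) gives one 1-by-nCols row per element of T
        TmeanLoop.(name)(i, :) = mean(table2array(T(i).(name)), 1);
    end
end
```

Each field of TmeanLoop then has one row per element of T, matching the layout of Tmean above.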

2 Comments

If you find it too complicated, you can just unroll those dynamic field accesses into hardcoded struct.field statements. Also, the cellfun is a loop in disguise.
Thank you very much Amro, interesting solution !
Since a vectorized solution was requested, here is an almost vectorized one. "Almost" because it still uses one arrayfun, which is essentially a wrapper around a loop, but here it only collects the row counts of the input tables, so it involves little or no computation. The implementation looks like this:

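% Stack the rows of every T(i).SCD into one numeric matrix
accum_data = table2array(vertcat(T(:).SCD));
% Running column sums over the stacked rows
csums = cumsum(accum_data,1);
% Rows contributed by each T(i), and the last stacked row index of each block
lens = arrayfun(@(n) size(T(n).SCD,1),1:size(T,2));
cut_idx = cumsum(lens);
% Per-block sums: differences between the cumulative sums at the block ends
sums = [csums(cut_idx(1),:) ; diff(csums(cut_idx,:),[],1)];
% Divide each block's sums by its row count to get the means
Tmean_SCDOut = bsxfun(@rdivide,sums,lens(:));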
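
To see why the cumulative-sum differences give per-block sums, here is a tiny worked example on a hand-made matrix. The values and the variable names data, cutIdx and means are made up for illustration only: two blocks are stacked, with 2 and 3 rows respectively.

```matlab
% Rows 1-2 belong to block 1, rows 3-5 to block 2.
data = [1 2; 3 4; 10 20; 30 40; 50 60];
lens = [2 3];                              % rows per block
csums = cumsum(data, 1);                   % running column sums
cutIdx = cumsum(lens);                     % last row of each block: [2 5]
% Sum of block k = csums at the end of block k minus csums at the end of block k-1
sums = [csums(cutIdx(1), :); diff(csums(cutIdx, :), 1, 1)];   % [4 6; 90 120]
means = bsxfun(@rdivide, sums, lens(:));   % [2 3; 30 40]
```

The first row of means is the column mean of [1 2; 3 4] and the second is the column mean of the remaining three rows.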

To average the other fields in the same way, repeat these steps for each field, for example in a loop over the field names.
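
As a rough sketch of what iterating over the fields could look like, the cumulative-sum steps can be wrapped in a loop over field names. The output name TmeanCum, the two-field list and the small random sample input are assumptions for illustration, not part of the original data, and the sketch assumes all tables in a field share the same variable names so that vertcat can stack them.

```matlab
% Small sample input (random values), shaped like the data in the question.
T = struct();
for i = 1:4
    T(i).SCD = array2table(rand(randi([2 20]), 256));
    T(i).HTD = array2table(rand(randi([2 20]), 62));
    T(i).movieId = i;
end

fields = {'SCD', 'HTD'};                   % fields to average (movieId excluded)
TmeanCum = struct();
for f = 1:numel(fields)
    name = fields{f};
    data = table2array(vertcat(T.(name)));         % stack all rows of this field
    lens = arrayfun(@(s) height(s.(name)), T);     % rows contributed by each T(i)
    csums = cumsum(data, 1);                       % running column sums
    cutIdx = cumsum(lens);                         % last stacked row of each block
    % Prepend a zero row so diff yields the sum of every block, including the first
    sums = diff([zeros(1, size(data, 2)); csums(cutIdx, :)], 1, 1);
    TmeanCum.(name) = bsxfun(@rdivide, sums, lens(:));
end
```

Because the block sums come from differences of running totals, the results can differ from a direct mean by floating-point rounding, so compare them with a tolerance rather than with exact equality.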

2 Comments

you would still need to do this for the other fields: HTD, EHD, DCD, CSD, and CLD; hence the other loop :)
@Amro Yeah I was going with the Tmean from the question that only involves .SCD. I am assuming OP would take care of the other fields likewise :) I gotta add that as a note maybe.
