
I would like to calculate a correlation matrix using LINQ, in a single statement. How can I do that (if it is possible)?

Assume I already have an array of size N called volatilities, and that Returns is a jagged array of N arrays, all of the same size.

I am also using:

using stats = MathNet.Numerics.Statistics.ArrayStatistics;

and this is the code that I want to express in LINQ:

double[,] correlation_matrix = new double[N, N];
for (int i = 0; i < N; i++) {
    for (int j = i + 1; j < N; j++) {
        // only the upper triangle (j > i) is filled; stored to check values
        correlation_matrix[i, j] = stats.Covariance(Returns[i], Returns[j]) / (volatilities[i] * volatilities[j]);
    }
}

thanks!

  • I don't believe there are any LINQ operators that create multi-dimensional arrays, so I don't think you can do this as a LINQ one-liner. Commented Jul 9, 2015 at 15:06
  • What's wrong with this simple approach (for loops)? Commented Jul 9, 2015 at 15:10
  • LINQ isn't always the panacea you think. If it works, and is readable, then leave it alone :) Commented Jul 9, 2015 at 15:14
  • @Chris you can use Aggregate to fill a 2D array and LINQ-ify that code into a single statement, but I doubt anyone would see it as an improvement. Commented Jul 9, 2015 at 15:18
  • I agree with the others. Your current approach makes it very clear what it does. I'm sure you could write something LINQ-y to do this, but it wouldn't be clearer than this. Commented Jul 9, 2015 at 15:22

2 Answers


If a jagged array (an array of arrays) is acceptable as the result, you can do

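var correlation_matrix = 
    Returns.Select((r_i, i) => 
        // Skip(i + 1) keeps only the rows after i; k restarts at 0,
        // so the original index of r_j is i + 1 + k
        Returns.Skip(i + 1).Select((r_j, k) =>
            stats.Covariance(r_i, r_j) / (volatilities[i] * volatilities[i + 1 + k])
        ).ToArray()
    ).ToArray();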

If you want to use ranges (per your comment), you can do

var N = Returns.Length;
var correlation_matrix = 
    Enumerable.Range(0, N).Select(i => 
        Enumerable.Range(i + 1, N - i - 1).Select(j =>
            stats.Covariance(Returns[i], Returns[j]) / (volatilities[i] * volatilities[j])
        ).ToArray()
    ).ToArray();    

That's not to say you should do this. The loop version is both more readable and more performant.
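
One thing to keep in mind with either query: the result is triangular, so row i only holds the entries for columns i + 1 to N - 1, and the pair (i, j) lives at index j - i - 1 of row i rather than at index j. A minimal sketch of a lookup helper follows. The class name TriangularLookup and the method At are hypothetical, treating the diagonal as 1.0 is my assumption (the original loop leaves it at 0), and the tuple swap assumes C# 7.0 or later.

```csharp
using System;

public static class TriangularLookup
{
    // 'upper' is assumed to have the shape produced by the queries above:
    // upper[i] holds the values for columns i+1 .. N-1.
    public static double At(double[][] upper, int i, int j)
    {
        if (i == j) return 1.0;        // assumption: a series correlates perfectly with itself
        if (i > j) (i, j) = (j, i);    // correlation is symmetric, so read the mirrored entry
        return upper[i][j - i - 1];    // shift the column index by the i + 1 skipped entries
    }
}
```

With this, TriangularLookup.At(correlation_matrix, 2, 0) and TriangularLookup.At(correlation_matrix, 0, 2) read the same stored value.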


4 Comments

I'm almost reluctant to +1 this since as you say the original is so much better. But you do say the original is better so +1. :)
Thanks @Jerry! How about a version that takes two ranges going from 0 to N-1, i.e. {0,1,2,3,... N-1}, and does it? Wouldn't it be more readable?
@Escachator note that this code has an off-by-one difference compared to your original sample. Not sure which one is correct, though.
Alexei is right; I've corrected the count of the inner Enumerable.Range.

Per the OP's request, an Enumerable.Aggregate version with a 2D array as the result:

var correlation_matrix = 
   Enumerable.Range(0, N).Select(i => 
       Enumerable.Range(i + 1, N - i - 1).Select(j => 
         new {
            i, j, // capture location of the result
            Data = i + j } // compute whatever you need here
       )
   )
   .SelectMany(r => r) // flatten into single list
   .Aggregate(
       new double[N,N], 
       (result, item) => { 
           result[item.i, item.j] = item.Data; // use pos captured earlier
           return result; // to match signature required by Aggregate
        });

Side note: this is essentially an exercise in using LINQ, not something you should use in real code.

  • The code has to capture each position in an anonymous object, which causes a lot of unnecessary allocations
  • I think this version is significantly harder to read than the regular for-loop version
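
If you still want to experiment with it, here is a rough sketch of the same Aggregate idea with the actual correlation plugged in. It uses value tuples (structs, available from C# 7.0, so newer than this answer) in place of anonymous objects, which should avoid the per-element object allocation. The class AggregateSketch and both of its methods are hypothetical names, and Covariance here is a hand-written sample covariance standing in for stats.Covariance so that the snippet compiles without Math.NET.

```csharp
using System;
using System.Linq;

public static class AggregateSketch
{
    // Hypothetical stand-in for stats.Covariance: sample covariance (n - 1 divisor).
    public static double Covariance(double[] x, double[] y)
    {
        double mx = x.Average(), my = y.Average();
        return x.Zip(y, (a, b) => (a - mx) * (b - my)).Sum() / (x.Length - 1);
    }

    // Fills only the upper triangle (j > i), like the original for loops.
    public static double[,] UpperCorrelation(double[][] returns, double[] vol)
    {
        int n = returns.Length;
        return Enumerable.Range(0, n)
            // pair every i with each j > i; the value tuple is a struct
            .SelectMany(i => Enumerable.Range(i + 1, n - i - 1),
                        (i, j) => (Row: i, Col: j))
            .Aggregate(new double[n, n], (result, p) =>
            {
                result[p.Row, p.Col] =
                    Covariance(returns[p.Row], returns[p.Col]) / (vol[p.Row] * vol[p.Col]);
                return result; // Aggregate needs the accumulator handed back
            });
    }
}
```

Calling AggregateSketch.UpperCorrelation(Returns, volatilities) should give the same upper-triangle values as the loop, provided volatilities holds the matching sample standard deviations. The for loops remain the clearer choice.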
