I am using MKL LAPACKE_sgelss to compute least square solution to $Ax = b_j$, where $j \in [0 , nrhs) $, A is $m \times n$ matrix and $n < m < nrhs$.
I somehow managed to extract the solution and even the singular values of A. What I do not know is how to extract residual sum of squares. There is cryptic sentence in the documentation:
If m≥n and rank = n, the residual sum-of-squares for the solution in the i-th column is given by the sum of squares of modulus of elements n+1:m in that column.
My code does not seem to extract them properly. To be honest, I don't understand very well what is the meaning of the word "modulus". And why they speak about the column when in the row major order it would make more sense to be stored in the row.
The following pseudocode seems to work:
int m = 100;
int n = 10;
int nrhs = 1000;
float* A = new float[m * n]();
float* b = new float[m * nrhs](); // Prealocated with zeros
float* results = new float[n * nrhs]();
uint32_t i = 0;
for(uint32_t i = 0; i != m; i++)
{
//Fill A to write corresponding row into the address [A + i * n, A + i*n + n)
//Fill b to write corresponding row into the address [b + i * nrhs, b + i*nrhs + nrhs)
//in this row are the values of i-th element of each b_j vector in order
}
int inf;
std::string err;
float* s = new float[n];
float rcond = 1e-6; // 1e-9 too small, 1e-5 causing problems maybe to big
lapack_int rank;
inf = LAPACKE_sgelss(LAPACK_ROW_MAJOR, m, n, nrhs, A, n, b, nrhs, s, rcond, &rank);
if(s[1] < 1e-7)
{
LOGE << io::xprintf("Highly rank deficient matrix!");
}
if(rank < n)
{
LOGE << io::xprintf("Rank deficient matrix!");
}
std::string msg
= io::xprintf("Reported numerical rank of A is %d, singular values %f", rank, s[0]);
for(uint32_t I = 1; I < n; I++)
{
msg = io::xprintf("%s, %f", msg.c_str(), s[I]);
}
LOGI << io::xprintf("%s.", msg.c_str());
delete[] s;
for(uint32_t i = 0; i != n; i++)
{
std::copy(b + i *nrhs, b+ (i+1)*nrhs, results+i*nrhs);
}
//results array [results+i*nrhs, results+i*nrhs+nrhs) shall now contain i-th element of solution x for all right hand sides
float* rss = new float[nrhs];
std::fill(rss, rss + nrhs, 0.0f);
double modulus, sum;
for(uint32_t i = 0; i != nrhs; i++)
{
sum = 0.0;
for(uint32_t j = n; j != m; j++)
{
modulus = b[j * nrhs + i];
sum += modulus*modulus;
}
rss[i] = std::sqrt(sum);
}
Can you help me to understand how these j-th moduli are actually computed?