Compress a sparse matrix using Compressed sparse row (CSR, CRS or Yale format).
These are all the same form of compression (ignore new Yale).
Input may be any 2d data structure (list of lists, etc): e.g
[[0 0 0 0],
[5 8 0 0],
[0 0 3 0],
[0 6 0 0]]
And the output should be three 1d data structures (list etc), that denote the outputs A, IA and JA, for example
[5, 8, 3, 6]
[0, 0, 2, 3, 4]
[0, 1, 2, 1,]
The process is described by wikipedia:
The array
Ais of lengthNNZand holds all the nonzero entries ofMin left-to-right top-to-bottom ("row-major") order.The array
IAis of lengthm + 1. It is defined by this recursive definition:
IA[0] = 0
IA[i] = IA[i − 1] + (number of nonzero elements on the (i − 1)-th row in the original matrix)Thus, the first
melements ofIAstore the index intoAof the first nonzero element in each row ofM, and the last elementIA[m]storesNNZ, the number of elements inA, which can be also thought of as the index inAof first element of a phantom row just beyond the end of the matrixM. The values of thei-th row of the original matrix is read from the elementsA[IA[i]]toA[IA[i + 1] − 1](inclusive on both ends), i.e. from the start of one row to the last index just before the start of the next.[5]The third array,
JA, contains the column index inMof each element ofAand hence is of lengthNNZas well.
If your language doesn't support actual data structures, input and output may be text.
Test cases
Input 1:
[[0 0 0 0],
[5 8 0 0],
[0 0 3 0],
[0 6 0 0]]
Output 1:
[ 5, 8, 3, 6 ]
[ 0, 0, 2, 3, 4 ]
[ 0, 1, 2, 1, ]
Input 2
[[10 20 0 0 0 0],
[0 30 0 40 0 0],
[0 0 50 60 70 0],
[0 0 0 0 0 80]]
Output 2:
[ 10 20 30 40 50 60 70 80 ]
[ 0 2 4 7 8 ]
[ 0 1 1 3 2 3 4 5 ]
Input 3:
[[0 0 0],
[0 0 0],
[0 0 0]]
Output 3:
[ ]
[ 0 0 0 0 ]
[ ]
Input 4:
[[1 1 1],
[1 1 1],
[1 1 1]]
Output 4:
[ 1 1 1 1 1 1 1 1 1 ]
[ 0 3 6 9 ]
[ 0 1 2 0 1 2 0 1 2 ]
Input 5:
[[0 0 0 0],
[5 -9 0 0],
[0 0 0.3 0],
[0 -400 0 0]]
Output 5:
[ 5, -9, 0.3, -400 ]
[ 0, 0, 2, 3, 4 ]
[ 0, 1, 2, 1, ]
Assume inputs may contain any real number, you need not consider mathematical symbols or exponential representation (e.g. 5,000 will never be entered as 5e3). You will not need to handle inf, -inf, NaN or any other 'pseudo-numbers'. You may output a different representation of the number (5,000 may be output as 5e3 if you so choose).
Scoring
This is a code-golf, fewest bytes wins.
IA[0] = 0completely unnecessary? It's only needed to defineIA[i] = IA[i − 1]..., yet we could simply state that ifi-1 < 0to use 0. That is, IA[0] is always equal to 0, therefor it can be compressed out (yes, I realize that this is a critique of the algorithm, not this challenge). \$\endgroup\$