0

I have a csv file generated from another program which looks like this:

45, 133, 148, 213,  65,  26,  22,  73
 84,  51,  41, 249,  25, 167, 102,  72
217, 198, 117, 123, 160,   9, 210, 211
230,  64,  37, 215,  91,  76, 240, 163
123, 169, 197,  16, 225, 160,  68,  65
 89, 247, 170,  88, 173, 206, 158, 235
144, 138, 188, 164,  84,  38,  67,  29
 98,  23, 106, 159,  96,   7,  77,  67
 
142, 140, 240,  56, 176,   0, 131, 160
241, 199,  96, 245, 213, 218,  51,  75
 22, 226,  81, 106,  94, 252, 252, 110
  0,  96, 132,  38, 189, 150, 162, 177
 95, 252, 107, 181,  72,   7,   0, 247
228, 207, 203, 128,  91, 158, 164, 116
 70, 124,  20,  37, 225, 169, 245, 103
103, 229, 186, 108, 151, 170,  18, 168

 52,  86, 244, 244, 150, 181,   9, 146
115,  60,  50, 162,  70, 253,  43,  94
201,  72, 132, 207, 181, 106, 136,  70
 92,   7,  97, 222, 149, 145, 155, 255
 55, 188,  90,  58, 124, 230, 215, 229
231,  60,  48, 150, 179, 247, 104, 162
 45, 241, 178, 122, 149, 243, 236,  92
186, 252, 165, 162, 176,  87, 238,  29

There is always a blank line following each 8x8 integer matrix.

I need to read each 8x8 matrix into a Python program, perform an operation on it, and then write the result in the same format. The result will be an 8x8 matrix of floats, again with a blank line following each matrix.

How do I do these two things in Python 3.x? I could read the file bit by bit, but perhaps Python has a robust way to do this using a small amount of code.

  • Python has a csv library: docs.python.org/3/library/csv.html. Did you try it? Commented Jul 5, 2022 at 10:25
  • Also, you say these are integer matrices, and then that they are matrices of floats. Which one do you want? Commented Jul 5, 2022 at 10:35
  • The input data is from a CSV file and contains 8x8 matrices of integers. The output data is 8x8 matrices of floats. Commented Jul 5, 2022 at 10:37
  • I have not used the csv package. Usually a CSV file holds a table or lists of values. Here I have 8x8 matrices, each followed by a blank line. The values within a row are still separated by commas, but I do not think this really matches what a CSV file would usually store. Commented Jul 5, 2022 at 10:39
  • Does my answer provide what you are looking for? Commented Jul 5, 2022 at 10:39
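The csv module suggested in the comments can cope with this layout, because csv.reader yields an empty row for a blank line. The following is only a minimal sketch: read_blocks is a hypothetical helper name, and the two 2x2 blocks held in an in-memory StringIO stand in for the real file.

```python
import csv
import io

def read_blocks(stream):
    """Group comma-separated rows into matrices, using blank lines as separators."""
    matrices, current = [], []
    for row in csv.reader(stream, skipinitialspace=True):
        cells = [c for c in row if c.strip()]  # empty for blank or whitespace-only lines
        if cells:
            current.append([int(c) for c in cells])
        elif current:
            # A separator line closes the matrix collected so far.
            matrices.append(current)
            current = []
    if current:  # the last matrix may not be followed by a separator
        matrices.append(current)
    return matrices

# Hypothetical sample: two 2x2 blocks separated by a whitespace-only line.
sample = io.StringIO("45, 133\n 84,  51\n \n142, 140\n241, 199\n")
blocks = read_blocks(sample)
print(len(blocks))   # 2
print(blocks[0][1])  # [84, 51]
```

For a real file, the documentation recommends opening it with newline="" before passing it to csv.reader.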

4 Answers

1

It's actually quite easy to do that with list and generator comprehensions. I've spread the code over multiple lines so it's more readable, but that's a personal preference.

def read_matrices(file):
    """Return a list of matrices, each a list of rows of floats."""
    with open(file) as f:
        return [
            [
                [
                    float(coeff)
                    for coeff in line.split(",")
                ]
                for line in matrix.split("\n")
                if line.strip()  # skip empty or whitespace-only lines
            ]
            # Matrices are separated by one blank line
            for matrix in f.read().split("\n\n")
        ]

def write_matrices(matrices, file):
    """Write the matrices in the same layout, with a blank line between them."""
    text = "\n\n".join(
        "\n".join(
            ",".join(str(coeff) for coeff in line)
            for line in matrix
        )
        for matrix in matrices
    )

    with open(file, "w") as f:
        f.write(text + "\n")  # If you want it to be newline-terminated

4 Comments

I see, the solution relies on generator and list comprehensions. What exactly does read_matrices return? Is it a list of 8x8 matrices? If so, how do I know how many matrices have been read from the file, and how do I access each of them?
@Quantum0xE7 Yes, it's a list of matrices. For instance, if you have a = read_matrices("path/to/file"), then the number of matrices is just len(a), and the matrices can be accessed as elements of a list: a[0] to get the first one, a[1] for the second one, and so on. You can also iterate over the list with for matrix in a: ....
I see, so all I need to do is write for matrix in matrices: and then access the elements of each matrix using [row][col] as the index.
@Quantum0xE7 Yes; and the second function expects a list of matrices in the same format.
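The access pattern discussed in these comments can be sketched end to end as follows. This is an illustration under assumptions, not the answer's exact code: parse_matrices, format_matrices and halve are hypothetical names, the functions work on strings rather than files so the example is self-contained, and the separator is matched with a regular expression so that a whitespace-only line also counts as a blank line (a plain split on "\n\n" would not split there).

```python
import re

def parse_matrices(text):
    """Split text into matrices on blank or whitespace-only lines."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return [
        [[float(x) for x in line.split(",")]
         for line in block.splitlines() if line.strip()]
        for block in blocks
    ]

def format_matrices(matrices):
    """Render matrices as comma-separated rows with one blank line between them."""
    return "\n\n".join(
        "\n".join(", ".join(str(x) for x in row) for row in matrix)
        for matrix in matrices
    ) + "\n"

def halve(matrix):
    """Example operation: divide every element by 2."""
    return [[x / 2 for x in row] for row in matrix]

# Hypothetical sample: two 2x2 matrices separated by a whitespace-only line.
sample = "1, 2\n3, 4\n \n5, 6\n7, 8\n"
matrices = parse_matrices(sample)
result = [halve(m) for m in matrices]

print(len(matrices))     # 2, the number of matrices read
print(result[1][0][1])   # 3.0, second matrix, row 0, column 1
print(format_matrices(result), end="")
```

With real files, the text would come from f.read() and the formatted string would go to f.write().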
1

If you already know that your matrices have 8 rows, you can use pandas.read_csv to load all the data into a NumPy array and reshape it afterwards.

If you don't know the number of rows per matrix beforehand, pandas.read_csv with skip_blank_lines=False produces an all-NaN row for each blank line, which allows you to infer the number of rows per matrix and then do the reshape:

import numpy as np
import pandas as pd

def read_csv(file, num_rows=None):
    if num_rows is not None:
        df = pd.read_csv(file, header=None, skip_blank_lines=True)
        arr = df.values
    else:
        df = pd.read_csv(file, header=None, skip_blank_lines=False)
        num_rows = extract_matrices_num_rows(df)
        valid_idxs = np.delete(
            np.arange(len(df)), np.arange(num_rows, len(df), num_rows + 1)
        )
        arr = df.iloc[valid_idxs].values

    return arr.reshape(-1, num_rows, arr.shape[-1])

def extract_matrices_num_rows(df):
    blank_lines_indices = all_nans_indices(df)
    blank_lines_indices = [-1, *blank_lines_indices, len(df)]
    num_rows = np.diff(blank_lines_indices) - 1
    num_rows = set(num_rows)
    if len(num_rows) > 1:
        raise ValueError(
            f"Matrices detected to have various number of rows: {num_rows}"
        )
    return num_rows.pop()

def all_nans_indices(df):
    return list(df[df.isnull().all(axis=1)].index)

Quick check that it works equally in both cases:

file = "data.csv"

assert np.array_equal(read_csv(file), read_csv(file, num_rows=8))
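The functions above cover only the reading half of the question. For the writing half, one possible sketch calls np.savetxt once per matrix and writes a newline in between. The name write_matrices, the fmt default and the in-memory StringIO target are assumptions made for illustration; a file opened with open(path, "w") could be passed instead.

```python
import io

import numpy as np

def write_matrices(arr, stream, fmt="%.2f"):
    """Write an array of shape (n, rows, cols) as comma-separated blocks.

    Consecutive matrices are separated by one blank line.
    """
    for k, matrix in enumerate(arr):
        if k:
            stream.write("\n")  # blank line before every matrix except the first
        np.savetxt(stream, matrix, fmt=fmt, delimiter=", ")

# Hypothetical data: two 2x2 matrices, divided by 4 as an example operation.
arr = np.arange(8, dtype=float).reshape(2, 2, 2)
buf = io.StringIO()
write_matrices(arr / 4, buf)
print(buf.getvalue(), end="")
```

The fmt string controls how the floats are rendered, so the column width or precision can be adjusted there.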


1

perhaps Python has a robust way to do this using small amount of code

Actually, it does. One option is the pandas module. Here is an example:

import pandas as pd

df = pd.read_csv('mtrx.csv', header=None, chunksize=9)
for i, matrix in enumerate(df):
    matrix.mul(10**i).fillna('').to_csv('mtrx1.csv', index=False, header=False, mode='a')

This code multiplies each matrix by 10 to the power of i, where i is the index of the matrix starting at 0, and the resulting file looks like this:

45,133.0,148.0,213.0,65.0,26.0,22.0,73.0
84,51.0,41.0,249.0,25.0,167.0,102.0,72.0
217,198.0,117.0,123.0,160.0,9.0,210.0,211.0
230,64.0,37.0,215.0,91.0,76.0,240.0,163.0
123,169.0,197.0,16.0,225.0,160.0,68.0,65.0
89,247.0,170.0,88.0,173.0,206.0,158.0,235.0
144,138.0,188.0,164.0,84.0,38.0,67.0,29.0
98,23.0,106.0,159.0,96.0,7.0,77.0,67.0
 ,,,,,,,
1420.0,1400.0,2400.0,560.0,1760.0,0.0,1310.0,1600.0
2410.0,1990.0,960.0,2450.0,2130.0,2180.0,510.0,750.0
220.0,2260.0,810.0,1060.0,940.0,2520.0,2520.0,1100.0
0.0,960.0,1320.0,380.0,1890.0,1500.0,1620.0,1770.0
950.0,2520.0,1070.0,1810.0,720.0,70.0,0.0,2470.0
2280.0,2070.0,2030.0,1280.0,910.0,1580.0,1640.0,1160.0
700.0,1240.0,200.0,370.0,2250.0,1690.0,2450.0,1030.0
1030.0,2290.0,1860.0,1080.0,1510.0,1700.0,180.0,1680.0
,,,,,,,
5200,8600,24400,24400,15000,18100,900,14600
11500,6000,5000,16200,7000,25300,4300,9400
20100,7200,13200,20700,18100,10600,13600,7000
9200,700,9700,22200,14900,14500,15500,25500
5500,18800,9000,5800,12400,23000,21500,22900
23100,6000,4800,15000,17900,24700,10400,16200
4500,24100,17800,12200,14900,24300,23600,9200
18600,25200,16500,16200,17600,8700,23800,2900

Update

The lines that contain only commas correspond to rows of the CSV file that have no data, i.e. empty rows.

4 Comments

This doesn't quite look like what the OP wanted...
@BlackBeans It reads the matrices separately from a CSV file, performs an operation on each matrix (multiplication, in this example), and writes the resulting matrices to a new CSV file. I believe that is what the OP wanted, or is it not?
It's fine for the reading part and for the processing part, but the output produced is not quite what they want: there are commas between the matrices.
@BlackBeans Yeah, but that's not the final solution; it's one possible way to approach 'How do I do these 2 things in Python 3.x?'
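If the comma-only separator rows are unwanted, one way to get real blank lines is sketched below. It assumes every matrix has the same known number of rows, lets read_csv skip the blank lines (its default skip_blank_lines=True), reads one matrix per chunk, and writes a newline between chunks. The in-memory StringIO objects and the 2x2 sample are stand-ins for the real files, and multiplying by 0.5 is just an example operation.

```python
import io

import pandas as pd

ROWS_PER_MATRIX = 2  # would be 8 for the file in the question

# Hypothetical sample: two 2x2 matrices separated by a blank line.
src = io.StringIO("1,2\n3,4\n\n5,6\n7,8\n")
out = io.StringIO()

# Blank lines are skipped by default, so each chunk holds exactly one matrix.
reader = pd.read_csv(src, header=None, chunksize=ROWS_PER_MATRIX)
for i, chunk in enumerate(reader):
    if i:
        out.write("\n")  # real blank line between matrices
    (chunk * 0.5).to_csv(out, index=False, header=False)

print(out.getvalue(), end="")
```

Because all chunks go to the same open handle, there is no need for mode='a', and rerunning the script does not append to a previous result.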
1

The solution below uses pandas and NumPy. As an example operation, it adds 2 to each value of every matrix (the df.values[i:i+8]+2 expression). The output has the same CSV layout as the input, including the separator rows between matrices.

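import numpy as np
import pandas as pd

# Keep blank lines so every matrix occupies 9 rows: 8 data rows plus 1 separator row.
df = pd.read_csv('Book2.csv', skip_blank_lines=False, header=None)

# Add 2 to each 8x8 block and append an all-NaN separator row after it.
updated_matrices = [
    np.vstack([df.values[i:i+8] + 2, np.repeat(np.nan, df.shape[1])])
    for i in range(0, df.shape[0], 9)
]

# Drop the final separator row before writing.
pd.DataFrame(np.vstack(updated_matrices)[:-1]).to_csv('Book4.csv', index=False, header=False)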
