I have a CSV file with logs. I need to analyze it and extract specific information. The problem is that it contains many tables, each with its own header row, and the tables have no names. Tables are separated by empty rows, and a single table may also be split into several parts. For example, I need to select all values from the %idle column where CPU = all.

Structure:

09:20:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
09:21:06,all,4.98,0.00,5.10,0.00,0.00,0.00,0.06,0.00,89.86
09:21:06,0,12.88,0.00,5.62,0.03,0.00,0.02,1.27,0.00,80.18

12:08:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
12:09:06,all,5.48,0.00,5.24,0.00,0.00,0.00,0.12,0.00,89.15
12:09:06,0,18.57,0.00,5.35,0.02,0.00,0.00,3.00,0.00,73.06

09:20:06,runq-sz,plist-sz,ldavg-1,ldavg-5,ldavg-15
09:21:06,3,1444,2.01,2.12,2.15
09:22:06,4,1444,2.15,2.14,2.15

2 Answers

You can use the program below to parse this CSV.

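result = {}
with open("log.csv", "r") as f:
    # Tables are separated by blank lines
    for table in f.read().split("\n\n"):
        rows = table.split("\n")
        # The first row is the header; skip its leading timestamp
        header = rows[0].split(",")[1:]
        for row in rows[1:]:
            # Pair each column name with the value in the same position
            for name, value in zip(header, row.split(",")[1:]):
                result.setdefault(name, []).append(value)
print(result["%idle"])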

Output (values of %idle)

['89.86', '80.18', '89.15', '73.06']

This assumes that the column names and row values appear in the same order within each table and that no two different tables share a column name.
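
The output above contains %idle for every row, including the CPU 0 rows (80.18 and 73.06). To keep only the rows where CPU is all, as the question asks, a filter on the CPU column could be added. This is a minimal sketch rather than the answerer's code; the function name `idle_for_all` and the inline `SAMPLE` string (copied from the question) are assumptions made for illustration.

```python
import csv
import io

SAMPLE = """\
09:20:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
09:21:06,all,4.98,0.00,5.10,0.00,0.00,0.00,0.06,0.00,89.86
09:21:06,0,12.88,0.00,5.62,0.03,0.00,0.02,1.27,0.00,80.18

12:08:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
12:09:06,all,5.48,0.00,5.24,0.00,0.00,0.00,0.12,0.00,89.15
12:09:06,0,18.57,0.00,5.35,0.02,0.00,0.00,3.00,0.00,73.06

09:20:06,runq-sz,plist-sz,ldavg-1,ldavg-5,ldavg-15
09:21:06,3,1444,2.01,2.12,2.15
09:22:06,4,1444,2.15,2.14,2.15
"""


def idle_for_all(text):
    """Return %idle values from rows whose CPU column equals 'all'."""
    values = []
    # Tables are separated by blank lines.
    for table in text.strip().split("\n\n"):
        rows = list(csv.reader(io.StringIO(table)))
        header = rows[0]
        # Skip tables that lack either column (e.g. the runq-sz table).
        if "CPU" not in header or "%idle" not in header:
            continue
        cpu_col = header.index("CPU")
        idle_col = header.index("%idle")
        for row in rows[1:]:
            if row[cpu_col] == "all":
                values.append(row[idle_col])
    return values


print(idle_for_all(SAMPLE))  # prints ['89.86', '89.15']
```

For a real file, the same function could be called with the file's contents, e.g. `idle_for_all(open("log.csv").read())`.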

1 Comment

Thanks for the approach, this is what I need.

One rather dumb solution would be to use an "ordinary" file reader for the original CSV. You can read everything up to the next blank line as a single CSV and then parse the text you just read in memory.

Every time you "see" a blank line, you know that what follows is an entirely new CSV, so you can repeat the above procedure for it.

For example, you would have one string that contained:

09:20:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
09:21:06,all,4.98,0.00,5.10,0.00,0.00,0.00,0.06,0.00,89.86
09:21:06,0,12.88,0.00,5.62,0.03,0.00,0.02,1.27,0.00,80.18

and then parse it in memory. Once you reach the blank line after that, you would know that you need a new string containing the following:

12:08:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
12:09:06,all,5.48,0.00,5.24,0.00,0.00,0.00,0.12,0.00,89.15
12:09:06,0,18.57,0.00,5.35,0.02,0.00,0.00,3.00,0.00,73.06

etc. - you can just keep going like this for as many tables as you have.
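
The procedure described above could be sketched roughly like this: read the file line by line, collect lines into a block until a blank line appears, then parse that block in memory with the standard `csv` module. The generator name `iter_tables` is hypothetical, and the sample uses shortened rows (fewer columns than the real log) purely to keep the example small.

```python
import csv
import io


def iter_tables(lines):
    """Yield each blank-line-delimited block as a list of parsed CSV rows."""
    block = []
    for line in lines:
        if line.strip():
            block.append(line)
        elif block:
            # Blank line: the current block is complete, parse it in memory.
            yield list(csv.reader(block))
            block = []
    if block:
        # The last block may not be followed by a blank line.
        yield list(csv.reader(block))


# Shortened sample standing in for the real file (assumption for illustration).
sample = io.StringIO(
    "09:20:06,CPU,%idle\n"
    "09:21:06,all,89.86\n"
    "\n"
    "12:08:06,CPU,%idle\n"
    "12:09:06,all,89.15\n"
)
tables = list(iter_tables(sample))
print(len(tables))   # 2
print(tables[0][1])  # ['09:21:06', 'all', '89.86']
```

Because `iter_tables` accepts any iterable of lines, an open file object (`with open("log.csv", newline="") as f: ...`) could be passed in place of the `StringIO` sample, so the whole file never has to be held in memory at once.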

7 Comments

I thought about it, but there are several tables and each one is split up. As in my example, table 1 is divided into two parts and table 2 has one part. I need to somehow combine the parts. But there may be a lot of such tables in the file, and I need to compare the headers of different tables with each other.
@DmitryDobberen Ah, I understand now. So, in your example, the first two tables should actually be parsed as a single table? Would some kind of dictionary based on the header row work? (I assume you'd have to strip out the date first if you did that, because that'll change - you could always use a regex for that, though).
Yes, these are two parts of the same table. Do you mean that I should create a dictionary of headers and check against it? The thing is, there can be quite a lot of headers, and I don't want to find them manually.
@DmitryDobberen My understanding is that the key would be the header row and the value would be the table content. Any time you encountered a new table, if its key already exists, you would append to the existing value - otherwise you'd add a new entry.
@DmitryDobberen Glad I could help. (It did just occur to me, though, that one other way to strip off the date would be to find the index of the first comma and then take a substring - that would work too I think, as would a regex).
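
The dictionary idea discussed in these comments might look something like the sketch below: the key is the header row with its leading timestamp dropped, and the value is the list of data rows, so parts of the same table end up merged without listing any headers by hand. The function name `merge_tables` and the shortened sample rows are assumptions for illustration; the timestamp is dropped here by slicing the parsed row rather than with a regex or substring.

```python
import csv
import io
from collections import defaultdict


def merge_tables(text):
    """Group data rows by header, ignoring the header's leading timestamp."""
    merged = defaultdict(list)
    for table in text.strip().split("\n\n"):
        rows = list(csv.reader(io.StringIO(table)))
        # Key on the column names only; the first field is a timestamp
        # that differs between parts of the same table.
        key = tuple(rows[0][1:])
        merged[key].extend(rows[1:])
    return dict(merged)


# Shortened sample standing in for the real file (assumption for illustration).
sample = (
    "09:20:06,CPU,%idle\n"
    "09:21:06,all,89.86\n"
    "\n"
    "12:08:06,CPU,%idle\n"
    "12:09:06,all,89.15\n"
    "\n"
    "09:20:06,runq-sz,plist-sz\n"
    "09:21:06,3,1444\n"
)
merged = merge_tables(sample)
print(len(merged))               # 2
print(merged[("CPU", "%idle")])  # both parts of the CPU table, in file order
```

Headers are discovered as the file is read, which avoids having to enumerate them in advance.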