Getting data arrays from CSV with loops

Question

I have a CSV that looks like this:

0.500187550,CPU1,7.93
0.500187550,CPU2,1.62
0.500187550,CPU3,7.93
0.500187550,CPU4,1.62
1.000445359,CPU1,9.96
1.000445359,CPU2,1.61
1.000445359,CPU3,9.96
1.000445359,CPU4,1.61
1.500674877,CPU1,9.94
1.500674877,CPU2,1.61
1.500674877,CPU3,9.94
1.500674877,CPU4,1.61

The first column is time, the second the CPU used and the third is energy.

As a final result I would like to have these arrays:

Time:

[0.500187550, 1.000445359, 1.500674877]

Energy (per CPU): e.g. CPU1

[7.93, 9.96, 9.94]

For parsing the CSV I'm using:

query = csv.reader(csvfile, delimiter=',', skipinitialspace=True)
#Arrays global time and power:
for row in query:
    x = row[0]
    x = float(x)
    x_array.append(x) #column 0 to array
    y = row[2]
    y = float(y)
    y_array.append(y) #column 2 to array
print x_array
print y_array

This way I get all the time and energy data into two arrays: x_array and y_array.

Then I order the arrays:

energy_core_ord_array = []
time_ord_array = []
#Dividing array into energy and time per core:
for i in range(number_cores[0]):
    e =  0 + i
    for j in range(len(x_array)/(int(number_cores[0]))):
        time_ord = x_array[e]
        time_ord_array.append(time_ord)
        energy_core_ord = y_array[e]
        energy_core_ord_array.append(energy_core_ord)
        e = e + int(number_cores[0])

And lastly, I cut the time array down to the length it should have:

final_time_ord_array = []
for i in range(len(x_array)/(int(number_cores[0]))):
    final_time_ord = time_ord_array[i]
    final_time_ord_array.append(final_time_ord)

Up to here, although the code is not elegant, it works. The problem comes when I try to get the array for each core.

I get it for the first core, but I don't know how to iterate over the remaining cores, or how to store each core's array in a variable with its own name, for example.

final_energy_core_ord_array = []
#Trunk energy core array:
for i in range(len(x_array)/(int(number_cores[0]))):
    final_energy_core_ord = energy_core_ord_array[i]
    final_energy_core_ord_array.append(final_energy_core_ord)

Will you allow the use of Pandas for this, or only manual processing of the CSV file as you stated?
– Simon, Commented Jan 19, 2016 at 11:26
Hi, I suppose I can use Pandas, since it is a personal project. I don't really know what Pandas is, but I will take a look. Originally I wanted to continue as I started, but I have no problem doing it another way.
– anexo, Commented Jan 19, 2016 at 11:33

Simon · Accepted Answer · 2016-01-19 11:44:12Z

3

Using Pandas (a Python library for working with dataframes), you can do something like this, which is much quicker than processing the CSV manually as you are doing:

import pandas as pd

csvfile = "C:/Users/Simon/Desktop/test.csv"

data = pd.read_csv(csvfile, header=None, names=['time','cpu','energy'])

times = list(data.time.unique())

print times

cpuList = data.groupby('cpu')

cpuEnergy = {}

# Iterate over the groups directly so the CPU labels come from the data
for curCPU, group in cpuList:
    cpuEnergy[curCPU] = list(group['energy'])

for k, v in cpuEnergy.items():
    print k, v

That will give the following output:

[0.50018755000000004, 1.000445359, 1.5006748769999998]
CPU4 [1.6200000000000001, 1.6100000000000001, 1.6100000000000001]
CPU2 [1.6200000000000001, 1.6100000000000001, 1.6100000000000001]
CPU3 [7.9299999999999997, 9.9600000000000009, 9.9399999999999995]
CPU1 [7.9299999999999997, 9.9600000000000009, 9.9399999999999995]
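
For readers who would rather stay with the csv module, as the question originally did, the same grouping can be sketched with the standard library only. This is a minimal illustration and not part of the answer above: the `read_energy` helper and the `SAMPLE` string are hypothetical names, `io.StringIO` stands in for the real file, and the code uses Python 3 syntax (`print()` as a function) unlike the Python 2 snippets in this thread. It assumes that all rows sharing a timestamp are adjacent in the file.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the question; io.StringIO stands in for the real file.
SAMPLE = """\
0.500187550,CPU1,7.93
0.500187550,CPU2,1.62
0.500187550,CPU3,7.93
0.500187550,CPU4,1.62
1.000445359,CPU1,9.96
1.000445359,CPU2,1.61
1.000445359,CPU3,9.96
1.000445359,CPU4,1.61
1.500674877,CPU1,9.94
1.500674877,CPU2,1.61
1.500674877,CPU3,9.94
1.500674877,CPU4,1.61
"""


def read_energy(csvfile):
    """Return (times, energy_by_cpu) from rows shaped like time,cpu,energy."""
    times = []
    energy_by_cpu = defaultdict(list)
    reader = csv.reader(csvfile, delimiter=',', skipinitialspace=True)
    for time_str, cpu, energy_str in reader:
        t = float(time_str)
        # Each timestamp repeats once per CPU; record it only the first time.
        # Assumption: rows with the same timestamp are adjacent in the file.
        if not times or times[-1] != t:
            times.append(t)
        # The CPU label from the file is the dictionary key.
        energy_by_cpu[cpu].append(float(energy_str))
    return times, dict(energy_by_cpu)


times, energy_by_cpu = read_energy(io.StringIO(SAMPLE))
print(times)
for cpu in sorted(energy_by_cpu):
    print(cpu, energy_by_cpu[cpu])
```

Keying the dictionary on the label read from the file means the code does not need to know the number of cores in advance, and each core's list is reachable by name, for example `energy_by_cpu['CPU1']`.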

edited Jan 19, 2016 at 11:44

answered Jan 19, 2016 at 11:42

Simon


3 Comments

anexo Over a year ago

Will try and let you know! Thank you very much!

Simon Over a year ago

I changed the answer a bit to output the answer as a dictionary which is nicer than just using unnamed lists

anexo Over a year ago

It is so much cleaner I think I will be using this. Thanks!

anexo · Accepted Answer · 2016-01-19 11:55:21Z

1

I finally found an answer using globals(). It is not a great idea, but it works; I'm leaving it here in case someone finds it useful.

    final_energy_core_ord_array = []
    #Trunk energy core array:
    a = 0
    for j in range(number_cores[0]):
        for i in range(len(x_array)/(int(number_cores[0]))):
            final_energy_core_ord = energy_core_ord_array[a + i]
            final_energy_core_ord_array.append(final_energy_core_ord)
        globals()['core%s' % j] = final_energy_core_ord_array
        final_energy_core_ord_array = []
        a = a + len(x_array)/(int(number_cores[0])) #advance by the samples per core, not a fixed 12

    print 'Final time and cores:'
    print final_time_ord_array
    for j in range(number_cores[0]):
        print globals()['core%s' % j]
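
Since the answer itself notes that globals() is not a great idea, here is a hedged sketch of the same result using a plain dictionary and strided slices. It is not the code from this answer: `n_cores` is a hypothetical stand-in for `int(number_cores[0])`, the lists hold the sample values from the question in file order, and the syntax is Python 3. It assumes every timestamp has exactly one row per core and that the cores always appear in the same order.

```python
# Sample values copied from the question's CSV, in file order.
x_array = [0.500187550, 0.500187550, 0.500187550, 0.500187550,
           1.000445359, 1.000445359, 1.000445359, 1.000445359,
           1.500674877, 1.500674877, 1.500674877, 1.500674877]
y_array = [7.93, 1.62, 7.93, 1.62,
           9.96, 1.61, 9.96, 1.61,
           9.94, 1.61, 9.94, 1.61]
n_cores = 4  # hypothetical stand-in for int(number_cores[0])

# Every n_cores-th row belongs to the same core, so a strided slice
# picks out one core's samples without manual index arithmetic.
times = x_array[0::n_cores]

# One dictionary entry per core replaces the dynamically named globals.
cores = {'core%d' % j: y_array[j::n_cores] for j in range(n_cores)}

print('Final time and cores:')
print(times)
for name in sorted(cores):
    print(name, cores[name])
```

A dictionary keeps all the per-core lists in one place, so they can be looped over or looked up by name (`cores['core0']`) without touching the module namespace.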

answered Jan 19, 2016 at 11:55

anexo
