
I have a CSV that looks like this:

0.500187550,CPU1,7.93
0.500187550,CPU2,1.62
0.500187550,CPU3,7.93
0.500187550,CPU4,1.62
1.000445359,CPU1,9.96
1.000445359,CPU2,1.61
1.000445359,CPU3,9.96
1.000445359,CPU4,1.61
1.500674877,CPU1,9.94
1.500674877,CPU2,1.61
1.500674877,CPU3,9.94
1.500674877,CPU4,1.61

The first column is time, the second the CPU used and the third is energy.

As a final result I would like to have these arrays:

Time:

[0.500187550, 1.000445359, 1.500674877]

Energy (per CPU): e.g. CPU1

[7.93, 9.96, 9.94]

For parsing the CSV I'm using:

query = csv.reader(csvfile, delimiter=',', skipinitialspace=True)
#Arrays global time and power:
for row in query:
    x = row[0]
    x = float(x)
    x_array.append(x) #column 0 to array
    y = row[2]
    y = float(y)
    y_array.append(y) #column 2 to array
print x_array
print y_array

This way I get all the time and energy data into two arrays: x_array and y_array.

Then I order the arrays:

energy_core_ord_array = []
time_ord_array = []
#Dividing array into energy and time per core:
for i in range(number_cores[0]):
    e =  0 + i
    for j in range(len(x_array)/(int(number_cores[0]))):
        time_ord = x_array[e]
        time_ord_array.append(time_ord)
        energy_core_ord = y_array[e]
        energy_core_ord_array.append(energy_core_ord)
        e = e + int(number_cores[0])

And lastly, I cut the time array down to the length it should have:

final_time_ord_array = []
for i in range(len(x_array)/(int(number_cores[0]))):
    final_time_ord = time_ord_array[i]
    final_time_ord_array.append(final_time_ord)

Up to here, although the code is not elegant, it works. The problem comes when I try to get the array for each core.

I get it for the first core, but I don't know how to iterate to get the next ones, or how to store each array in a variable with its own name.

final_energy_core_ord_array = []
#Truncate energy core array:
for i in range(len(x_array)/(int(number_cores[0]))):
    final_energy_core_ord = energy_core_ord_array[i]
    final_energy_core_ord_array.append(final_energy_core_ord)
  • Will you allow the use of Pandas for this? Or only manual processing of the csv file as you stated? Commented Jan 19, 2016 at 11:26
  • Hi, I suppose I can use Pandas, because it's a personal project. I really don't know what Pandas is, but I will take a look. Originally I wanted to continue as I started, but I have no problem doing it another way. Commented Jan 19, 2016 at 11:33

2 Answers


Using Pandas (a Python library for working with dataframes), you can do something like this, which is much quicker than processing the CSV manually as you are doing:

import pandas as pd

csvfile = "C:/Users/Simon/Desktop/test.csv"

data = pd.read_csv(csvfile, header=None, names=['time','cpu','energy'])

times = list(data.time.unique())

print times

# Group by a single column name so each group key is the plain CPU label
cpuList = data.groupby('cpu')

cpuEnergy = {}

# Iterate over the groups directly instead of rebuilding the 'CPUn' labels
for curCPU, group in cpuList:
    cpuEnergy[curCPU] = list(group['energy'])

for k, v in cpuEnergy.items():
    print k, v

That will give the following as output:

[0.50018755000000004, 1.000445359, 1.5006748769999998]
CPU4 [1.6200000000000001, 1.6100000000000001, 1.6100000000000001]
CPU2 [1.6200000000000001, 1.6100000000000001, 1.6100000000000001]
CPU3 [7.9299999999999997, 9.9600000000000009, 9.9399999999999995]
CPU1 [7.9299999999999997, 9.9600000000000009, 9.9399999999999995]
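If Pandas is not an option, roughly the same grouping can be sketched with only the standard library. This is a minimal sketch, not part of the original answer: it assumes the three-column time,cpu,energy layout from the question, that rows sharing a timestamp are adjacent, and Python 3.7+ (so the dictionary keeps insertion order). The function name `parse_energy_csv` and the inline `SAMPLE` string are illustrative only.

```python
import csv

# Sample rows copied from the question; in practice, pass an open file object.
SAMPLE = """0.500187550,CPU1,7.93
0.500187550,CPU2,1.62
0.500187550,CPU3,7.93
0.500187550,CPU4,1.62
1.000445359,CPU1,9.96
1.000445359,CPU2,1.61
1.000445359,CPU3,9.96
1.000445359,CPU4,1.61
1.500674877,CPU1,9.94
1.500674877,CPU2,1.61
1.500674877,CPU3,9.94
1.500674877,CPU4,1.61
"""


def parse_energy_csv(lines):
    """Return (times, energy_by_cpu) from rows of time,cpu,energy."""
    times = []
    energy_by_cpu = {}
    for row in csv.reader(lines, delimiter=',', skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        t, cpu, energy = float(row[0]), row[1], float(row[2])
        # Assumption: rows with the same timestamp are adjacent, so a new
        # timestamp only needs to be compared with the last one stored.
        if not times or times[-1] != t:
            times.append(t)
        # One list of energy readings per CPU label.
        energy_by_cpu.setdefault(cpu, []).append(energy)
    return times, energy_by_cpu


times, energy_by_cpu = parse_energy_csv(SAMPLE.splitlines())
print(times)
for cpu, values in energy_by_cpu.items():
    print("%s %s" % (cpu, values))
```

Keying the dictionary on the CPU label read from the file means the number of cores does not have to be known in advance.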

3 Comments

Will try and let you know! Thank you very much!
I changed the answer a bit to output the answer as a dictionary which is nicer than just using unnamed lists
It is so much cleaner I think I will be using this. Thanks!

I finally got it working using globals(). It is not a great idea, but it works; I am leaving it here in case someone finds it useful.

final_energy_core_ord_array = []
# Number of samples recorded for each core
samples_per_core = len(x_array)/(int(number_cores[0]))
#Truncate energy core array:
a = 0
for j in range(number_cores[0]):
    for i in range(samples_per_core):
        final_energy_core_ord = energy_core_ord_array[a + i]
        final_energy_core_ord_array.append(final_energy_core_ord)
    # Store this core's list under a global name: core0, core1, ...
    globals()['core%s' % j] = final_energy_core_ord_array
    final_energy_core_ord_array = []
    # Move the offset to the start of the next core's block
    a = a + samples_per_core

print 'Final time and cores:'
print final_time_ord_array
for j in range(number_cores[0]):
    print globals()['core%s' % j]
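Writing into globals() can be avoided by keeping the per-core lists in a dictionary keyed by core name. The following is a minimal sketch under stated assumptions rather than the code used above: it assumes the flat `x_array` / `y_array` layout from the question, with rows interleaved by core and the same number of samples for every core. `split_by_core` is a hypothetical helper name, the data is the question's sample, and the `print()` calls are written for Python 3.

```python
def split_by_core(x_array, y_array, number_of_cores):
    """Split interleaved samples into a time list and per-core energy lists."""
    # Every number_of_cores-th row belongs to the same core, so a strided
    # slice picks out one core's samples without index arithmetic.
    times = x_array[::number_of_cores]
    cores = {}
    for j in range(number_of_cores):
        cores['core%s' % j] = y_array[j::number_of_cores]
    return times, cores


# Flat arrays as produced by the parsing loop in the question (sample data).
x_array = [0.500187550] * 4 + [1.000445359] * 4 + [1.500674877] * 4
y_array = [7.93, 1.62, 7.93, 1.62,
           9.96, 1.61, 9.96, 1.61,
           9.94, 1.61, 9.94, 1.61]

times, cores = split_by_core(x_array, y_array, 4)
print(times)
for name in sorted(cores):
    print("%s %s" % (name, cores[name]))
```

Each list is then available as `cores['core0']`, `cores['core1']`, and so on, which gives the single-name access the question asks for without touching the module namespace.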
