Creating array from CSV in Python while limiting columns used

Question

I am working with a CSV file with the following format,

ST  1   2   3  4
WA  10  10  5  2
OR  0   7   3  9
CA  11  5   4  12
AZ  -999    0   0 11

The first row gives the day numbers, 1-4. For each state, for example WA, 10, 10, 5, 2, I want to build a sorted array containing just the numbers in that row. If I omit the first element (WA), I can do this using:

sorted(list, key=int)

Doing so would give me a list, [2,5,10,10].

What I want to do is

Read each line of the CSV.
Create an array of numbers using the numerical data.
Run some calculations using array(Percent rank)
Combine the calculated values with the correct state fields. For instance if I want to add a value of 3 to the array for WA.
```
b.insert(list[4]), 3)
```
to get
```
[2,3,5,10,10] 
```
so I can calculate rank. (Note: I am unable to use scipy so I must calculate rank using a function which I've already figured out.)
End by writing State and rank value to new csv, something like.
```
ST  Rank
WA  30
CA  26
OR  55
```
where Rank is the rank of the given value in the array.

I am pretty new to Python, so any help or pointers would be greatly appreciated. I am also limited to basic Python modules (numpy, csv, etc.).

UPDATE CODE:

    with open(outputDir+"needy.csv", 'rb') as f:
        first = {row[0]: sorted(row[1:], key=int) for row in list(csv.reader(f))}

    for key, value in first.items():
        if addn in first:
            g= "yes"
            print key, addn, g
            #print d
        else:
            g= "no"
            print key, addn, g

        value.append(300)
        value.append(22)
        value = sorted(value, key=int)

        print "State:", key, value

When I do this, the values I append are added properly and the list is sorted properly, but when I define addn as a value, it is not found. Example below.

{'WA': ['1', '1', '1', '2', '2', '2', '3', '4', '4', '4', '5', '5', '5', '5', '6', '6', '7', '7', '8', '8', '8', '8', '9', '10', '10', '10', '10', '11', '11']}

The above line is what I get if I simply print first. If I run the for loop with addn set to 11 as a global variable, I get:

WA 11 no
State: WA ['1', '1', '1', '2', '2', '2', '3', '4', '4', '4', '5', '5', '5',    '5', '6', '6', '7', '7', '8', '8', '8', '8', '9', '10', '10', '10', '10', '11', '11',..]

Since 11 is in the list for that key, it should print yes.

armatita · Accepted Answer · 2016-03-21 17:58:08Z

1

You can use simple commands and a dictionary to organize your data:

d = {}  # maps each row label (e.g. 'WA') to its list of integers
with open('out.txt') as fid:  # out.txt holds the sample data copied from your question
    for line in fid:
        s = line.split()  # split the line on whitespace (the default)
        if not s:
            continue  # skip blank lines
        d[s[0]] = [int(x) for x in s[1:]]  # convert the remaining fields from str to int

print(d)

The result is:

{'CA': [11, 5, 4, 12], 'ST': [1, 2, 3, 4], 'OR': [0, 7, 3, 9], 'WA': [10, 10, 5, 2], 'AZ': [-999, 0, 0, 11]}

From this point you can do whatever you wish to your entries in the dictionary including append.

 d['CA'].append(3)
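If the list should stay sorted as values are added, the standard-library `bisect.insort` inserts into an already sorted list at the right position. The sketch below uses the WA row from the question. Note that `percent_rank` is a hypothetical helper (the question does not show the asker's own rank function), defined here, as an assumption, as the percentage of entries strictly below the given value.

```python
import bisect

def percent_rank(sorted_values, x):
    """Hypothetical helper: percentage of entries strictly less than x."""
    below = bisect.bisect_left(sorted_values, x)  # number of entries < x
    return 100.0 * below / len(sorted_values)

wa = sorted([10, 10, 5, 2])   # [2, 5, 10, 10]
bisect.insort(wa, 3)          # insert 3 while keeping the list sorted
print(wa)                     # [2, 3, 5, 10, 10]
print(percent_rank(wa, 3))    # 20.0
```

Swap in the real rank calculation where `percent_rank` is called; the insertion step does not depend on it.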

EDIT: @J.R.W. building the dictionary the way I recommended, followed by your code (plus the correction I gave):

first = {}  # maps each row label to its list of integers
with open('out.txt') as fid:  # out.txt holds the sample data copied from your question
    for line in fid:
        s = line.split()  # split the line on whitespace (the default)
        first[s[0]] = [int(x) for x in s[1:]]  # convert the remaining fields from str to int

print(first)
addn = 11
for key, value in first.items():
    # Test membership in the row's list (value), not in the dict's keys (first).
    if addn in value:
        g = "yes"
    else:
        g = "no"
    print(key, addn, g)

    value.append(300)
    value.append(22)
    value.sort()  # sort in place so the list stored in the dict is updated too

    print("State:", key, value)

This results in:

{'ST': [1, 2, 3, 4], 'CA': [11, 5, 4, 12], 'OR': [0, 7, 3, 9], 'AZ': [-999, 0, 0, 11], 'WA': [10, 10, 5, 2]}
ST 11 no
State: ST [1, 2, 3, 4, 22, 300]
CA 11 yes
State: CA [4, 5, 11, 12, 22, 300]
OR 11 no
State: OR [0, 3, 7, 9, 22, 300]
AZ 11 yes
State: AZ [-999, 0, 0, 11, 22, 300]
WA 11 no
State: WA [2, 5, 10, 10, 22, 300]

The check prints yes when 11 is in a row's list (your own test) and no when it isn't.
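The two details behind the original "no" can be reproduced in isolation. This is a small sketch with made-up data, not the asker's file: `in` applied to a dictionary tests its keys, and `csv.reader` yields strings, so the integer 11 never equals the string '11'.

```python
rows = {'WA': ['10', '11', '5']}   # strings, as csv.reader would return them

print(11 in rows)                  # False: `in` on a dict checks the keys
print(11 in rows['WA'])            # False: 11 (int) != '11' (str)
print('11' in rows['WA'])          # True

# Convert once, then integer membership tests behave as expected.
numbers = {state: [int(v) for v in values] for state, values in rows.items()}
print(11 in numbers['WA'])         # True
```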

edited Mar 21, 2016 at 17:58

answered Mar 21, 2016 at 15:33

armatita


10 Comments

J.R.W Over a year ago

So I've already done this much. What I don't understand is how to take that dict, pull out just the numbers, run calculations, and insert the results back into a dict, array, or list along with the state names. I hope that makes sense.

J.R.W Over a year ago

Does the state value, example 'CA', become like a dict key?

armatita Over a year ago

It is a dict key. You look up a value (here, a list) in a dict by its key: d['CA'] (try print(d['CA'])). Just pass those lookups to whatever function you use to process the numbers, and add the return value to a new list or whatever object is convenient for you.
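As a rough sketch of that workflow: loop over the dictionary, pass each list to a function, and write the state with the result using `csv.writer`. The names `summarize` and `write_results` are placeholders rather than anything from the question, and `io.StringIO` stands in for a real output file so the example runs anywhere.

```python
import csv
import io

def summarize(values):
    """Placeholder calculation: replace with the real rank function."""
    return max(values)

def write_results(data, out):
    """Write one 'state, result' row per dictionary entry to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(['ST', 'Rank'])              # header row
    for state, values in data.items():
        writer.writerow([state, summarize(values)])

d = {'WA': [10, 10, 5, 2], 'OR': [0, 7, 3, 9]}

buffer = io.StringIO()   # stand-in for open('ranks.csv', 'w', newline='')
write_results(d, buffer)
print(buffer.getvalue())
```

Passing the output as a file-like object keeps the calculation separate from where the rows end up.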

J.R.W Over a year ago

So, using your example, if I print d I get {'FL,24,65,85,58,6,11,34,27,39,19,56,48 ': [], 'KY,1,0,0,10,0,14,9,12,15 ': []}
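The empty lists suggest the real file is comma-separated: `str.split()` with no argument splits on whitespace only, so each whole line ends up as the key. One option, sketched here with sample lines shortened from the comment above, is to let `csv.reader` split on the commas. `io.StringIO` is only a stand-in for the opened file.

```python
import csv
import io

# Sample lines standing in for the real file contents.
sample = io.StringIO("FL,24,65,85\nKY,1,0,0,10\n")

d = {}
for row in csv.reader(sample):
    if not row:
        continue                           # skip blank lines
    d[row[0]] = [int(v) for v in row[1:]]  # first field is the key, the rest are numbers

print(d)   # {'FL': [24, 65, 85], 'KY': [1, 0, 0, 10]}
```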

J.R.W Over a year ago

I used my code to create a dictionary and can use the keys to enter values. When I try to search for values under the keys, though, it does not find them as it should. I will update my code in the question with an example.
