I am working with a CSV file in the following format:

ST  1   2   3  4
WA  10  10  5  2
OR  0   7   3  9
CA  11  5   4  12
AZ  -999    0   0 11 

The first row holds the day numbers 1-4. For each state I want to take the data, for example WA, 10, 10, 5, 2, and create a sorted array containing just the numbers in that row. If I omit the first element, which is WA, I can do this using

sorted(list, key=int) 

Doing so would give me a list, [2,5,10,10].

What I want to do is

  1. Read each line of the CSV.
  2. Create an array of numbers using the numerical data.
  3. Run some calculations on the array (percent rank).
  4. Combine the calculated values with the correct state fields. For instance, I might want to add a value of 3 to the array for WA:

    b.insert(1, 3)
    

    to get

    [2,3,5,10,10] 
    

    so I can calculate rank. (Note: I am unable to use scipy so I must calculate rank using a function which I've already figured out.)

  5. End by writing the state and rank value to a new CSV, something like:

    ST  Rank
    WA  30
    CA  26
    OR  55
    

    where Rank is the rank of the given value in the array.

I am pretty new to Python, so any help or pointers would be greatly appreciated. I am also limited to using basic Python modules (numpy, csv, etc.).

UPDATE CODE:

   with open(outputDir+"needy.csv", 'rb') as f:
       first = {row[0]: sorted(row[1:], key=int) for row in list(csv.reader(f))}

   for key, value in first.items():
        if addn in first:
            g= "yes"
            print key, addn, g
            #print d
        else:
            g= "no"
            print key, addn, g

        value.append(300)
        value.append(22)
        value = sorted(value, key=int)

        print "State:", key, value

When I do this, the values I append are properly added and the lists are properly sorted, but when I define addn as a value, it is not found. Example below.

{'WA': ['1', '1', '1', '2', '2', '2', '3', '4', '4', '4', '5', '5', '5', '5', '6', '6', '7', '7', '8', '8', '8', '8', '9', '10', '10', '10', '10', '11', '11'}

The line above is what I get if I simply print first. If I run the for loop with addn defined as 11 at global scope, I get:

WA 11 no
State: WA ['1', '1', '1', '2', '2', '2', '3', '4', '4', '4', '5', '5', '5',    '5', '6', '6', '7', '7', '8', '8', '8', '8', '9', '10', '10', '10', '10', '11', '11',..]

Since 11 is among the values for that key, it should print yes.

1 Answer

You can use simple commands and a dictionary to organize your data:

with open('out.txt') as fid:  # out.txt holds the sample data copied from your question
    lines = fid.readlines()  # read the whole file into a list of lines
d = {}  # maps each state code to its list of numbers
for line in lines:
    s = line.split()  # split the line on whitespace (the default)
    d[s[0]] = [int(x) for x in s[1:]]  # convert the numeric strings to ints

print(d)

The result is:

{'CA': [11, 5, 4, 12], 'ST': [1, 2, 3, 4], 'OR': [0, 7, 3, 9], 'WA': [10, 10, 5, 2], 'AZ': [-999, 0, 0, 11]}

From this point you can do whatever you wish with the entries in the dictionary, including appending:

 d['CA'].append(3)
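
The remaining steps from your list (insert a value, rank it, write the result) could be sketched roughly as follows, starting from a dictionary like d. The names percent_rank, rank_states and write_ranks are made up for illustration, and the rank definition used here (percentage of entries strictly below the value) is only an assumption; swap in the rank function you already have.

```python
import bisect
import csv


def percent_rank(sorted_values, x):
    """Percentage of entries strictly below x (an assumed definition)."""
    below = bisect.bisect_left(sorted_values, x)
    return 100.0 * below / len(sorted_values)


def rank_states(data, new_value):
    """Insert new_value into each state's sorted list and rank it there."""
    ranks = {}
    for state, numbers in data.items():
        values = sorted(numbers)          # sorted copy; the original list is untouched
        bisect.insort(values, new_value)  # insert while keeping the list sorted
        ranks[state] = percent_rank(values, new_value)
    return ranks


def write_ranks(ranks, path):
    """Write a header row and one 'ST,Rank' row per state to a CSV file."""
    with open(path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["ST", "Rank"])
        for state, rank in sorted(ranks.items()):
            writer.writerow([state, rank])


if __name__ == "__main__":
    d = {"WA": [10, 10, 5, 2], "OR": [0, 7, 3, 9], "CA": [11, 5, 4, 12]}
    print(rank_states(d, 3))
```

Calling write_ranks(rank_states(d, 3), "ranks.csv") would then produce the output file; the file name is again just a placeholder. Skip the ST header row before ranking if it ends up in your dictionary.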

EDIT: @J.R.W. Building the dictionary the way I recommended, followed by your code (plus the correction I gave, testing addn against value rather than first):

with open('out.txt') as fid:  # out.txt holds the sample data copied from your question
    lines = fid.readlines()  # read the whole file into a list of lines
first = {}  # maps each state code to its list of numbers
for line in lines:
    s = line.split()  # split the line on whitespace (the default)
    first[s[0]] = [int(x) for x in s[1:]]  # convert the numeric strings to ints

print(first)
addn = 11
for key, value in first.items():
    if addn in value:
        g= "yes"
        print(key, addn, g)
        #print d
    else:
        g= "no"
        print(key, addn, g)

    value.append(300)
    value.append(22)
    value = sorted(value, key=int)

    print("State:", key, value)

This results in:

{'ST': [1, 2, 3, 4], 'CA': [11, 5, 4, 12], 'OR': [0, 7, 3, 9], 'AZ': [-999, 0, 0, 11], 'WA': [10, 10, 5, 2]}
ST 11 no
State: ST [1, 2, 3, 4, 22, 300]
CA 11 yes
State: CA [4, 5, 11, 12, 22, 300]
OR 11 no
State: OR [0, 3, 7, 9, 22, 300]
AZ 11 yes
State: AZ [-999, 0, 0, 11, 22, 300]
WA 11 no
State: WA [2, 5, 10, 10, 22, 300]

This prints yes when 11 is in a state's list (your own test) and no when it isn't.
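
As for why your original test always printed no: as far as I can tell there are two separate causes. Using in on a dictionary checks its keys (the state codes), not the lists stored under them, and csv.reader hands back strings, so the integer 11 would not match '11' even inside the right list. A small illustration with made-up data:

```python
first = {"WA": ["10", "11", "5"]}  # values as csv.reader returns them: strings
addn = 11

print(addn in first)             # False: 'in' on a dict checks the keys ('WA')
print(addn in first["WA"])       # False: the int 11 is not equal to the string '11'
print(str(addn) in first["WA"])  # True: string compared with string

numbers = [int(v) for v in first["WA"]]  # or convert once to ints
print(addn in numbers)           # True
```

Converting to int once, when the dictionary is built, avoids having to remember the str() call at every lookup.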

10 Comments

So I've already done this much. What I don't understand is how to take that dict and pull just the numbers, run calculations, and insert the results back into either a dict, array or list with the state name values. I hope that makes sense.
Does the state value, example 'CA', become like a dict key?
It is a dict key. You look up a value (a list) in a dict by using its key, so: d['CA'] (try print(d['CA'])). Just pass those lookups to whatever function you use to process the numbers and add the return value to a new list or whatever object is convenient for you.
So, using your example, if I print d I get {'FL,24,65,85,58,6,11,34,27,39,19,56,48 ': [], 'KY,1,0,0,10,0,14,9,12,15 ': []}
I used my code to create a dictionary and can use the keys to enter values. When I try to search for values under the keys, though, it does not find them as it should. I will update my code in the question with an example.