
I am trying to group certain rows from an array called "frog", which looks something like this:

285,1944,10,12,579
286,1944,11,13,540
287,1944,12,14,550
285,1945,10,12,536
286,1945,11,13,504
287,1945,12,14,508
285,1946,10,12,522
286,1946,11,13,490
287,1946,12,14,486

The columns are, in order, "Day of the Year", Year, Month, "Day of the Month", and money. I want to group together all rows that share the same "Day of the Year", Month, and "Day of the Month", so there are three constraints (Day of the Year, Month, Day of the Month). An example output array would be something like:

285,1944,10,12,579 
285,1945,10,12,536
285,1946,10,12,522

I am unsure how to go about this. Is there possibly a faster way than using a while loop or for loop in this situation? Please let me know if you would like me to explain more.

Thanks

1
  • Is your frog array 1D or 2D, and is your output just an ordered grouping based on the "Day of the Year" and "Day of the Month" indexes? Commented Feb 25, 2015 at 1:21

2 Answers

1

Python's built-in sorted() function and the list.sort() method both accept a key function, which can be defined arbitrarily. In this case we can define a simple function, or even a lambda, to do what we want.

However, as @Vasif mentions, there will be issues with leap years: day 285 is October 12 in a common year but October 11 in a leap year, which makes it trickier to require that 3-tuple as a constraint.
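The leap-year shift can be checked directly with the standard library. This is only a small sketch: the helper name day_of_year_to_date is made up for illustration, and it compares day 285 in 1944 (a leap year) with day 285 in 1945 (a common year).

```python
import datetime as dt

def day_of_year_to_date(year, day_of_year):
    # Hypothetical helper: January 1 plus (day_of_year - 1) days.
    return dt.date(year, 1, 1) + dt.timedelta(days=day_of_year - 1)

leap = day_of_year_to_date(1944, 285)    # 1944 is a leap year
common = day_of_year_to_date(1945, 285)  # 1945 is a common year

print(leap.month, leap.day)      # 10 11
print(common.month, common.day)  # 10 12
```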

In any event:

# let's assume you've read in your file with something like csvreader
# so you've got a list of lists, similar to what @Vasif shows
sorted_a = sorted(a, key=lambda row: (row[0], row[2], row[3]))

This will create a new list, where everything is ordered first by "Day of Year" (so all the 285s will be together), then by "Month", then by "Day".

For completeness, we can operate on the array in place:

a.sort(key=lambda row: (row[0], row[2], row[3]))

And for more complex things (not necessary here, but may be nice to see):

def keyfunc(row):
    # could do anything you want with more complex data:
    # maybe row[0] is an index into a database that you query, or 
    # a URL that you request the page of, parse, and process somehow, etc...
    return (row[0], row[2], row[3])

sorted_a = sorted(a, key=keyfunc)
## or again:
a.sort(key=keyfunc)
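Sorting only places matching rows next to each other. If separate groups are wanted, one option is itertools.groupby with the same key applied to the sorted list. A minimal sketch, assuming a is a list of lists like the sample in the question (the shortened sample data and the name groups are only for illustration):

```python
from itertools import groupby

a = [[285, 1944, 10, 12, 579],
     [286, 1944, 11, 13, 540],
     [285, 1945, 10, 12, 536],
     [286, 1945, 11, 13, 504],
     [285, 1946, 10, 12, 522]]

def keyfunc(row):
    # (Day of Year, Month, Day of Month)
    return (row[0], row[2], row[3])

# groupby only merges adjacent rows, so sort by the same key first
groups = {key: list(rows)
          for key, rows in groupby(sorted(a, key=keyfunc), key=keyfunc)}

for row in groups[(285, 10, 12)]:
    print(row)
```

Because sorted() is stable, the rows inside each group keep their original relative order (here, by year).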

0

I am giving you a solution below. However, I'm not sure about your expected output: 1944 is a leap year, so day 285 of that year falls on October 11, not October 12.

import datetime as dt

a = [[285,1944,10,12,579],
[286,1944,11,13,540],
[287,1944,12,14,550],
[285,1945,10,12,536],
[286,1945,11,13,504],
[287,1945,12,14,508],
[285,1946,10,12,522],
[286,1946,11,13,490],
[287,1946,12,14,486]]

def solution(frog):
    # Keep only the rows whose month and day agree with their day of the year.
    goodlist = []
    for row in frog:
        if isGood(row):
            print(row)
            goodlist.append(row)
        else:
            print(row, 'rejected')
    return goodlist


def isGood(row):
    days, year, month, day, money = row
    # Convert (year, day of year) to a calendar date; see
    # http://stackoverflow.com/questions/2427555/python-question-year-and-day-of-year-to-date
    date = dt.datetime(year, 1, 1) + dt.timedelta(days - 1)
    # print(date.month, date.day)
    return date.month == month and date.day == day

# print(isGood([285, 1944, 10, 12, 579]))
print(solution(a))

1 Comment

I accounted for leap years by deleting every 2-29 (February 29) entry from my data set. My data set is a large 2D array with the headers Index (Day of the Year), Year, Month, Day, and Money. What I need to do with the sorted data is average each day's money output across the years. So the money output for every October 5th in my 100-plus years of data needs to be averaged. In the end I am looking for 365 averages of money output, one for each day, over the course of 100 years. Sorry if I made things more confusing.
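For the averaging described in this comment, a sort may not be needed at all: the money values can be accumulated per (month, day) in a dictionary in a single pass. A rough sketch, assuming each row is laid out as in the question (day of year, year, month, day, money); daily_averages is a hypothetical helper name, and the three sample rows are taken from the question:

```python
from collections import defaultdict

def daily_averages(rows):
    # Map (month, day) -> [running total, count]
    totals = defaultdict(lambda: [0.0, 0])
    for _day_of_year, _year, month, day, money in rows:
        totals[(month, day)][0] += money
        totals[(month, day)][1] += 1
    # Turn each total/count pair into an average
    return {key: total / count for key, (total, count) in totals.items()}

rows = [[285, 1944, 10, 12, 579],
        [285, 1945, 10, 12, 536],
        [285, 1946, 10, 12, 522]]

averages = daily_averages(rows)
print(averages[(10, 12)])
```

Keying on (month, day) rather than on the day of the year keeps the same calendar day together even when leap years shift the day-of-year index.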
