Background:
Some time ago I asked a question here regarding the most efficient method for finding the neighbours of every point in a set of data by comparing each point against every other. I've attempted to optimise the process by breaking the coordinate list into separate sets based on their x coordinates, and grouping them in overlapping sets. The neighbour finder 'pairs_3' is then used on each set. This then gives me a full set of connections between vertices. The code below is a snippet of what I have so far:
def pairs_3(coordinates, vertex_spacing): #Function from previous question
i=-1
results_2 = []
results_2 [:]= []
for (xc, yc, zc) in enumerate(coordinates):
for (xb, yb, zb) in enumerate(coordinates[ic:]):
dx = xb - xc
if sqrt(dx**2) > a:
break
dy = yb - yc
dz = zb - zc
if (sqrt(dx**2+dy**2+dz**2)) == vertex_spacing:
i+=1
results_2.append([(xc, yc, zc), (xb, yb, zb), width, dx, dy, dz, edge])
return results_2
Coordinates_xyz=sorted(set(zip(Coord_x,Coord_y,Coord_z))) #Full 3D coordinate list
Coord_x_set=sorted(set(Coord_x))
numerical_width = Coord_x.count(Coord_x_set[1]) #gets number of columns
N=numerical_width
Col_subList = [Coordinates_xyz[n:n+2*N] for n in range(0, len(Coordinates_xyz), N)] #Col_subList is a list of grouped data according to x coordinate. The data is always discrete, so this is possible.
new_p_2=[]
new_p_2[:]=[]
new_p_3=[]
new_p_3[:]=[]
#where a is vertex separation, a known value used to generate the data
for x in Col_subList:
new_p_2.append(pairs_3(x, a))
for x in new_p_2:
for y in x:
new_p_3.append(y)
p_2=new_p_3
start=column(p_2,0) #Start Vertex
end=column(p_2, 1) #End Vertex
s_i=[] #Start Vertex Index
s_i[:]=[]
c_i=[] #Connection Index
c_i[:]=[]
for i in start:
indices = [ind for ind, x in enumerate(Coordinates_xyz_final) if x == i]
for j in indices:
s_i.append(j)
for i in xrange(len(start)):
c_i.append(i)
p_2 = zip(start, end, s_i, c_i) #Final form necessary for later calculations
Question:
Is there any way to further/better optimise this code? And are there any ways to further break down the problem that aren't more computationally expensive than the function itself?
Bonus Question:
Is there a way to efficiently integrate a 'divide and conquer' into the function itself? I.e break the list into sets based on columns inside the function?