2

I'm dealing with two sets of three large lists of the same size containing longitude, latitude and altitude coordinates in UTM format (see lists below). The arrays contain overlapping coordinates (i.e. longitude and latitude values are equal). If the values in Lon are equal to Lon2 and the values in Lat are equal to Lat2 then I want to calculate the mean altitude at those indexes. However, if they're not equal then the longitude, latitude and altitude values will remain. I only want to replace the overlapping data to one set of longitude and latitude coordinates and calculate the mean at those coordinates.

This is my attempt so far

 import numpy as np

 Lon = [450000.50,459000.50,460000,470000]
 Lat = [5800000.50,459000.50,500000,470000]
 Alt = [-1,-9,-2,1]
 Lon2 = [450000.50,459000.50,460000,470000]
 Lat2 = [5800000.50,459000.50,800000,470000]
 Alt2= [-3,-1,-20,2]

 MeanAlt = []
 appendAlt = MeanAlt.append
 LonOverlap = []
 appendLon = LonOverlap.append
 LatOverlap = []
 appendLat = LatOverlap.append

 for i, a in enumerate(Lon and Lat and Alt):
     for j, b in enumerate(Lon2 and Lat2 and Alt2):
         if Lon[i]==Lon2[j] and Lat[i]==Lat2[j]:
             MeanAltData = (Alt[i]+Alt2[j])/2
             appendAlt(MeanAltData)
             LonOverlapData = Lon[i]
             appendLat(LonOverlapData)
             LatOverlapData = Lat[i]
             appendLon(LatOverlapData)

 print(MeanAlt) # correct ans should be MeanAlt = [-2.0,-5,1.5]
 print(LonOverlap)
 print(LatOverlap)

I'm working in a jupyter notebook and my laptop is rather slow so I need to make this code much more efficient. I would appreciate any help on this. Thank you :)

3
  • Why third MeanAlt is -5? Lat[2]!=Lat2[2] and so, according to your problem formulation "...if they're not equal then the longitude, latitude and altitude values will remain". What does "altitude values will remain" mean and how did this affect MeanAlt[2]? Commented Jul 21, 2017 at 15:59
  • Ah, it seems that you are dropping values corresponding to unequal Lon or Lat. Please confirm. Commented Jul 21, 2017 at 16:01
  • Why do you have import numpy as np at the beginning and you never use np anywhere in your code? Commented Jul 21, 2017 at 18:08

3 Answers 3

1

I believe your code can be improved in 2 ways:

  • Firstly, the usage of tuples instead of lists, as iterating over a tuple is generally faster than iterating over a list.
  • Secondly, your for loops can be reduced to only one loop that iterates over the indices of the tuples you are going to read. Of course, this assumption holds if and only if all your tuples contain the same amount of items (i.e.: len(Lat) == len(Lon) == len(Alt) == len(Lat2) == len(Lon2) == len(Alt2)).

Here is the improved code (I took the liberty of removing the import numpy statement as it was not being used in the piece of code you provided):

# use of tuples
Lon = (450000.50, 459000.50, 460000, 470000)
Lat = (5800000.50, 459000.50, 500000, 470000)
Alt = (-1, -9, -2, 1)
Lon2 = (40000.50, 459000.50, 460000, 470000)
Lat2 = (5800000.50, 459000.50, 800000, 470000)
Alt2 = (-3, -1, -20, 2)

MeanAlt = []
appendAlt = MeanAlt.append
LonOverlap = []
appendLon = LonOverlap.append
LatOverlap = []
appendLat = LatOverlap.append

# only one loop
for i in range(len(Lon)):
    if (Lon[i] == Lon2[i]) and (Lat[i] == Lat2[i]):
        MeanAltData = (Alt[i] + Alt2[i]) / 2
        appendAlt(MeanAltData)
        LonOverlapData = Lon[i]
        appendLat(LonOverlapData)
        LatOverlapData = Lat[i]
        appendLon(LatOverlapData)

print(MeanAlt)  # correct ans should be MeanAlt = [-2.0,-5,1.5]
print(LonOverlap)
print(LatOverlap)

I executed this program 1 million times on my laptop. Following my code, the amount of time required for all executions is: 1.41 seconds. On the other hand, with your approach the amount of time it takes is: 4.01 seconds.

Sign up to request clarification or add additional context in comments.

4 Comments

Also, I believe there is a mistake in your expected result of MeanAlt, since it should contain only 2 elements as the only items whose coordinates match are the second and the fourth.
Your code only works under the (unlikely) assumption that the overlapping lat/lon are guaranteed to be at the same position in both lists.
Hi P.Shark, thanks for answering. You're right, it was my mistake in the first value I entered for Lon2. You're also right in assuming all lat/lon are in the same position. Thank you!
@TimeExplorer glad I was of help. While it is true that my solution only works in the scenario described by knipknap, from the way your code was written I assumed that you were only interested comparing elements in the same position. In any case, for even greater performance (and if your hardware supports it) you could consider taking advantage of the multiprocessing package (link)
1

This is not 100% functionally equivalent, but I am guessing it is closer to what you actually want:

Lon = [450000.50,459000.50,460000,470000]
Lat = [5800000.50,459000.50,500000,470000]
Alt = [-1,-9,-2,1]
Lon2 = [40000.50,459000.50,460000,470000]
Lat2 = [5800000.50,459000.50,800000,470000]
Alt2= [-3,-1,-20,2]

MeanAlt = []
appendAlt = MeanAlt.append
LonOverlap = []
appendLon = LonOverlap.append
LatOverlap = []
appendLat = LatOverlap.append

ll = dict((str(la)+'/'+str(lo), al) for (la, lo, al) in zip(Lat, Lon, Alt))

for la, lo, al in zip(Lon2, Lat2, Alt2):
    al2 = ll.get(str(la)+'/'+str(lo))
    if al2:
        MeanAltData = (al+al2)/2
        appendAlt(MeanAltData)
        LonOverlapData = lo
        appendLat(LonOverlapData)
        LatOverlapData = la
        appendLon(LatOverlapData)

print(MeanAlt) # correct ans should be MeanAlt = [-2.0,-5,1.5]
print(LonOverlap)
print(LatOverlap)

Or simpler:

Lon = [450000.50,459000.50,460000,470000]
Lat = [5800000.50,459000.50,500000,470000]
Alt = [-1,-9,-2,1]

Lon2 = [40000.50,459000.50,460000,470000]
Lat2 = [5800000.50,459000.50,800000,470000]
Alt2= [-3,-1,-20,2]

ll = dict((str(la)+'/'+str(lo), al) for (la, lo, al) in zip(Lat, Lon, Alt))

result = []
for la, lo, al in zip(Lon2, Lat2, Alt2):
    al2 = ll.get(str(la)+'/'+str(lo))
    if al2:
        result.append((la, lo, (al+al2)/2))

print(result)

In practice, I would try to start with better structured input data to begin with, making the conversion to dict, or at the very least the "zip()" unnecessary.

1 Comment

Hi, thanks for answering. I like how you arranged the data at the end. However it only extracts the coordinates when both Lon and Lat are equal. I need the mean to be calculated when Lon and Lon2 are equal but can be different numbers to Lat and Lat2 which should be equal. For example: Lon = [450000.50] Lon2 = [450000.50] Lat= [5800000.50] Lat2=[5800000.50]. I'm not sure how to implement this in the code you've given hehe. Also, if you have suggestions better way of arranging the input data, please let me know. I only set it up in lists because I'm most familiar with them.
0

Use numpy to vectorize computations. For 1,000,000 long arrays execution time should be on the order of 15-25ms of microseconds if inputs are already numpy.ndarrays and ~140ms if inputs are Python lists.

import numpy as np
def mean_alt(lon, lon2, lat, lat2, alt, alt2):
    lon = np.asarray(lon)
    lon2 = np.asarray(lon2)
    lat = np.asarray(lat)
    lat2 = np.asarray(lat2)
    alt = np.asarray(alt)
    alt2 = np.asarray(alt2)
    ind = np.where((lon == lon2) & (lat == lat2))
    mean_alt = (0.5 * (alt[ind] + alt2[ind])).tolist()
    return (lon[ind].tolist(), lat[ind].tolist(), mean_alt)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.