Code optimization: How to optimize a code using scipy minimize?

Question

I am trying to optimize a code that I wrote to calculate the least square for a system of equations and returns me the optimal value for the unknowns: a1,a2,a3,z1,z2 (pottemp and zlevels are know).

The system of equation is the following (https://i.sstatic.net/qlKg2.jpg):

System of equations

I wrote the following code which works like a charm, but it isn't very efficient and you can see due to the number of for loops and if I increase the number of hstep and astep, my array gets really big and it takes a lot of time to compute the least square from leastsquared.

def leastsquared(zlevels,pottemp,z1,z2,a1,a2,a3):
    dtot=0.0
    for zi in range(len(zlevels)): #zvalue of obs points 
        if zlevels[zi]<=z1:
            F=np.tan(a1)*zlevels[zi]+pottemp[0]
        elif zlevels[zi]<=z2:
            F=(np.tan(a1)*z1+pottemp[0]) + np.tan(a2)*(zlevels[zi]-z1)
        else: #zlevels[zi]<=H
            F=((np.tan(a1)*z1+pottemp[0]) + np.tan(a2)*(z2-z1) ) + np.tan(a3)*(zlevels[zi]-z2)
        d=(pottemp[zi]-F)**2.0
        dtot+=d
    dtot=dtot/len(zlevels)
    #print("dtot is" + str(dtot))
    return dtot

def optm(hstep,astep,zlevels,pottemp):

    sqdist=np.inf
    for z1 in np.linspace(0,zlevels[-1],hstep):
        for z2 in np.linspace(0,zlevels[-1],hstep):
            for a1 in np.linspace(0,0.1,astep): # max angle
                for a2 in np.linspace(0,0.01,astep):
                    for a3 in np.linspace(0,0.01,astep):
                        sqdist_new=leastsquared(zlevels,pottemp,z1,z2,a1,a2,a3)
                        if sqdist_new<sqdist:
                            sqdist=sqdist_new
                            optimalsetting=[z1,z2,a1,a2,a3]

    return optimalsetting

The code itself is very simple but inefficient. I was trying to implement this code using scipy.optimize.minimize, but I haven't been able to run it. Does anyone have any idea how to optimize it? Perhaps, scipy.optimize.minimize would be the easiest way, but I'm not sure how to adapt my code to it.

user2653663 · Accepted Answer · 2019-09-04 10:18:35Z

1

There is definitely a more efficient way of dealing with this, since piecewise-linear problems are well studied, but the following is done with scipy.optimize as requested. Also scipy can't handle dynamic constraints, such as the requirement that z1<z2, so I swapped the order of the variables in the solver if this wasn't enforced. The optimization might be more/less stable by removing these lines of code and let the data speak for itself.

import numpy as np
import scipy.optimize

def f(zlevels, t0, a1, a2, a3, z1, z2, H):
    """
    Function to evaluate f, given a set of variables
    """
    # Check that no samples are out of bounds
    if any(zlevels < 0) or any(zlevels > H):
        quit("Values of zlevels out of bounds")

    pottemp = np.zeros(zlevels.shape)
    mask1 = zlevels <= z1
    mask2 = (z1 < zlevels) & (zlevels <= z2)
    mask3 = z2 < zlevels

    z1_arr1 = np.ones(sum(mask2))*z1
    z1_arr2 = np.ones(sum(mask3))*z1
    z2_arr = np.ones(sum(mask3))*z2

    pottemp[mask1] = zlevels[mask1] * a1 + t0
    pottemp[mask2] = z1_arr1 * a1 + t0 + a2 * (zlevels[mask2] - z1)
    pottemp[mask3] = z1_arr2 * a1 + t0 + a2 * (z2_arr - z1) + a3 * (zlevels[mask3] - z2)

    return pottemp

def obj_fun(x, zlevels, pottemp, t0, H):
    """
    Least-squares objective function for the problem
    """
    a1, a2, a3, z1, z2 = x
    # Swap order if z1 is larger than z2
    if z1 > z2:
        z1, z2 = z2, z1

    pottemp_pred = f(zlevels, t0, a1, a2, a3, z1, z2, H)
    return sum((pottemp_pred-pottemp)**2) 

def optimize(zlevels, pottemp, t0, H):
    """
    Optimize a1, a2, a3, z1, z2
    """

    starting_guess = [0,0,0,0.5,2.5]
    res = scipy.optimize.minimize(obj_fun, starting_guess, args=(zlevels, pottemp, t0, H), \
            bounds=[(None,None),(None,None),(None,None),(0,H),(0,H)])
    x = res.x
    # Swap order if z1 is larger than z2
    if x[-2] > x[-1]:
        x[-2], x[-1] = x[-1], x[-2]
    return x

# Make the true model for testing
t0 = 0.5
a1 = -1
a2 = +1
a3 = -2
z1 = 1
z2 = 2
H = 3

# Generate some variables
n = 1000
zlevels = np.random.random(n)*3
# Get values and add some random noise
pottemp = f(zlevels, t0, a1, a2, a3, z1, z2, H) + np.random.normal(0,0.1, size=n)

print("Correct values:", a1, a2, a3, z1, z2)
a1, a2, a3, z1, z2 = optimize(zlevels, pottemp, t0, H)
print("Fitted values:", a1, a2, a3, z1, z2)

edited Sep 4, 2019 at 10:18

answered Sep 3, 2019 at 11:02

user2653663

2,9481 gold badge20 silver badges23 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Michell Over a year ago

That's great! But I didn't understand the part where we have fvalues in optmize. Why should I give this information to the function if that's what I'm looking for? Would it be the initial guesses? Also, t0 is know, but I changed it in the code I tried

user2653663 Over a year ago

z and fvalues are the data points that you already have observed and that you're performing the fit against. So given that your form for f is correct, fvalues corresponds to f(z) plus some noise. This corresponds to zlevels and pottemp in your code, but I was slightly confused about this at the time, as you didn't specify this in the text. I have updated the naming to match your question.

Michell Over a year ago

Oh great! That's what I needed. Thanks

Ankita Mehta · Accepted Answer · 2019-09-02 11:59:10Z

0

I am not very sure of the actual variables and their values. But below is something that I tried:

import numpy as np
from scipy.optimize import minimize

def leastsquared(zlevels,pottemp,z1,z2,a1,a2,a3):
    dtot=0.0
    for zi in range(len(zlevels)): #zvalue of obs points 
        if zlevels[zi]<=z1:
            F = np.tan(a1)*zlevels[zi]+pottemp[0]
        elif zlevels[zi]<=z2:
            F = (np.tan(a1)*z1+pottemp[0]) + np.tan(a2)*(zlevels[zi]-z1)
        else: #zlevels[zi]<=H
            F=((np.tan(a1)*z1+pottemp[0]) + np.tan(a2)*(z2-z1) ) + np.tan(a3)*(zlevels[zi]-z2)
        d=(pottemp[zi]-F)**2.0
        dtot+=d
    dtot=dtot/len(zlevels)
    #print("dtot is" + str(dtot))
    return dtot

def optm(hstep,astep,zlevels,pottemp):
    steps = np.linspace(0,0.01,astep)
    x0 = np.array([steps, steps, steps]) #a1, a2, a3

    def objective(x):
        return leastsquared(zlevels,pottemp, z1, z2, x[0], x[1], x[2])

    sqdist=np.inf
    for z1 in np.linspace(0,zlevels[-1],hstep):
        for z2 in np.linspace(0,zlevels[-1],hstep):
            sol = minimize(objective, x0, method='SLSQP', options={'disp':True})
            optimalsetting=[z1,z2,sol.x[0],sol.x[1],sol.x[2]]
    return optimalsetting


# test
# some random initialization
astep = 2
steps = 3


optm(5, 2, [1, 2], [1, 2])

You might want to update the objective function and variable types here. Hope this gets you started.

edited Sep 2, 2019 at 11:59

answered Sep 2, 2019 at 11:53

Ankita Mehta

6105 silver badges19 bronze badges

1 Comment

Michell Over a year ago

It works well, but in my case, I need to find the minimum for the least square (optimal value for z1,z2,a1,a2,a3 so it fits my data). This is done in the last part of opt (if sqdist_new<sqdist:). Avoiding this part of the code makes it only loop through z1 and z2 until the end.

Collectives™ on Stack Overflow

Code optimization: How to optimize a code using scipy minimize?

2 Answers 2

3 Comments

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

3 Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related