2

I am trying to solve a least squares problem subject to a linear system of inequality constraints in Python. I have been able to solve this problem in MatLab, but for the project I am working in all of our code-base should be in Python, so I am looking for an equivalent way to solve it, but have been unable to.

Some background on the problem:

I have images with pixel values in their raw digital number (DN) form, and I want to come up with a regression line that models the linear relationship between the DNs and the true reflectance values of the surfaces in my image.

I have 6 known reflectances and corresponding DNs, and so I make a linear system of equations:

import numpy as np
A = np.array([[1, 19039],[1, 47792], [1, 9672], [1, 32521], [1, 11409], [1, 58843]])
b = np.array([[0.05938044], [0.27213514], [0.00252875], [0.18535543], [0.01959069], [0.52605937]])
b = np.squeeze(b)

I set up the objective function to be the 2-norm of (Ax-b)*0.5

def f(, x):
    # function to minimize
    y = np.dot(A, x) - b
    return (np.dot(y, y))*.5

Then I want to add my constraints. Since surface reflectance values are bounded between 0-1, I want to ensure that the minimum DN value within my image has a reflectance value greater than 0 when converted from DN to reflectance through the estimates slope and intercept coefficients, and that the maximum DN value of the image is mapped to a reflectance of below or equal to 1.

According to the paper that I am implementing, I can split the requirement of 0 <= slope*DN + intercept <= 1 into two parts:

slope*DN_max + intercept <= 1

-slope*DN_min - intercept <= 0

Thus, I create two matrices, one containing the minimum DN value and the intercept coefficient (1), and one containing the maximum DN value and the intercept coefficient (1). I make these into matrices, because in practice I would have more than one image being calibrated and thus I would have more than two columns and two rows (I would have n by n*2 matrices, where n = number of images), but for this simplified example I will just work with one image.

img_min = 0
img_max = 65536
C_min = np.array([[-1, img_min]])
C_max = np.array([[1, img_max]])

def cons_min(x):
    # constraint function for the minimum pixel values
    return np.array((C_min @ x))

def cons_max(x):
    # constraint function for the maximum pixel values
    return np.array((C_max @ x))-1

Then I tried to use optimize.minimize to solve for the coefficients.

con1 = [{'type': 'ineq', 'fun': cons_min},
        {'type': 'ineq', 'fun': cons_max}
       ]
initial = [0, 0]
result = optimize.minimize(f, 
                           initial, 
                           method='SLSQP', 
                           constraints=con1
                           )

In MatLab using the lsqlin(A,B, C, c) function the result I get is

intercept = 0.0000000043711483450798073327817072630808
slope = 0.0000069505505573854644717126521902273

Result with MatLab

But using the optimize.minimize function I get

[-2.80380803e-17,  1.52590219e-05]

where the first value of the list is the intercept and the second is the slope. Result with Python

I think perhaps it is a problem with the constraints I am setting, but I have tried to play around with them and it hasn't improved the result in any way.

Is there a better way of solving this problem in Python? Or is there something I can improve in my current approach?

Thanks.

3
  • 1
    I would not be surprised to see that this is a scaling-problem. SLSQP is just too general (NLP) and too unrobust (compared to more specialized approaches; ah and you are doing numerical diff too) which might completely fail when the decision variables are expected to be in range 10e-5 - 10e-8. This might actually even be the range of it's first-order stopping criterions. Solver-status might indicate such problems. Apart from that, scipy might not be my first pick here as it's limited in this regard. A more potent hack would be pyomo + ipopt (but even this is less specialized than possible). Commented Aug 24, 2020 at 15:15
  • 1
    Imho you want a convex qp-solver or socp-solver (both based on interior-point methods). Sadly, there is not much good software available (when restricted by licenses or wrappers). Except you are doing (license-compatible) academic work: then just use CPLEX, Gurobi or Mosek. Edit you could also describe this with few lines of code in cvxpy and try ECOS (most accurate), OSQP or SCS (both less accurate). Scaling might still be needed. Also: if this approach fails (with ECOS), chances grow that you got some other problems (then solvers) as the solver is quite good Commented Aug 24, 2020 at 15:19
  • 1
    Thanks for the tips! It ended up working by using cvxpy and the ECOS solver! The set-up for cvxpy with the constraints was also much more straightforward than it was using scipy, so I am quite happy with this approach! If you want to make your comment into an answer then I will gladly mark it as a solution. Commented Aug 26, 2020 at 11:38

1 Answer 1

3

Thanks to the advice from sascha in the comments, I have managed to arrive at a solution:

import cvxpy as cp
import numpy as np

A = np.array([[1, 19039],[1, 47792], [1, 9672], [1, 32521], [1, 11409], [1, 58843]])
b = np.array([[0.05938044], [0.27213514], [0.00252875], [0.18535543], [0.01959069], [0.52605937]])
b = np.squeeze(b)
C_min = np.array([[-1, 0]])
C_max = np.array([[1, 65535]])

x = cp.Variable(A.shape[1])

objective = cp.Minimize(0.5 * cp.sum_squares(A@x-b))
constraints = [C_min@x <= 0, C_max@x <= 1]

prob = cp.Problem(objective, constraints)
result = prob.solve(solver=cp.ECOS)

intercept = x.value[0]
slope = x.value[1]

import matplotlib.pyplot as plt
%matplotlib inline

plt.figure()
plt.scatter(A[:, 1], b)
plt.plot(A[:, 1], np.multiply(A[:, 1], slope) + intercept)

Which gives me the best fit line based on my constraints

enter image description here

And if I check and compare the residuals between the original MatLab solution and the cvxpy solution, I see that the cvxpy solution is ever so slightly better in this example (although very minimal).

# MatLab estimated values for the slope and intercept
ML_inter = 0.0000000043711483450798073327817072630808
ML_slope = 0.0000069505505573854644717126521902273

# get the residuals for each data point
c_res = []
ml_res = []
for i in range(A.shape[0]):
    residual = (np.multiply(A[i, 1], x.value[1]) + x.value[0]) - b[i]
    c_res.append(residual)
    residual = (np.multiply(A[i, 1], ML_slope) + ML_inter) - b[i]
    ml_res.append(residual)

# calculate the sum of squares
ss_cvx = np.sum(np.array(c_res)**2)
ss_ml = np.sum(np.array(ml_res)**2)

print("Sum of squares for cvx:    ", ss_cvx)
print("Sum of squares for matlab: ", ss_ml)
print("Sum of squares is lower for CVX solution? ", ss_cvx < ss_ml)

# Sum of squares for cvx:     0.03203817995131549
# Sum of squares for matlab:  0.032038181467959566
# Sum of squares is lower for CVX solution?  True
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.