
I am using NumPy's np.linalg.lstsq(A, b, rcond=-1) to find an approximate solution to a system of equations via the least-squares method.

However, I realised the approximate solution contains negative values for some of the coefficients. I would like to add bounds so that all the dimensions are non-negative, which I think means adding inequalities to the system of equations. It seems SciPy supports this via from scipy.optimize import linprog, but that requires me to specify a function to minimise, which in my case is the least-squares norm, and I don't know how to define it. As far as I understand, there is no default function. How do I use SciPy's linprog with some sort of default least-squares norm as the minimisation function?

This is the code with NumPy:

import numpy as np

# This is a matrix representing fertilisers, 
# i.e. products with various chemical elements in them
A = np.array([[1.04e+01, 0.00e+00, 0.00e+00, 1.00e+00, 0.00e+00, 0.00e+00,
        3.00e+00, 1.20e+01, 9.00e+00],
       [7.00e+00, 0.00e+00, 0.00e+00, 5.00e+01, 0.00e+00, 0.00e+00,
        3.00e+01, 6.00e+00, 5.00e+00],
       [2.70e+01, 0.00e+00, 0.00e+00, 9.00e+00, 0.00e+00, 0.00e+00,
        3.00e+01, 2.90e+01, 3.90e+01],
       [0.00e+00, 0.00e+00, 1.50e+01, 2.00e+00, 1.60e+01, 0.00e+00,
        0.00e+00, 0.00e+00, 2.00e+00],
       [7.00e+00, 1.50e+01, 0.00e+00, 0.00e+00, 0.00e+00, 0.00e+00,
        0.00e+00, 7.00e+00, 0.00e+00],
       [0.00e+00, 0.00e+00, 3.00e+01, 0.00e+00, 3.00e+01, 0.00e+00,
        0.00e+00, 0.00e+00, 1.12e+01],
       [0.00e+00, 0.00e+00, 5.00e-01, 1.00e-02, 0.00e+00, 6.00e-01,
        1.00e-01, 1.00e-02, 1.00e-02],
       [0.00e+00, 0.00e+00, 0.00e+00, 2.00e-02, 0.00e+00, 5.00e-01,
        0.00e+00, 1.60e-01, 4.00e-02],
       [0.00e+00, 5.00e-01, 5.00e-01, 2.00e-03, 1.00e+00, 1.50e+00,
        1.00e-01, 4.00e-02, 2.00e-03],
       [0.00e+00, 0.00e+00, 0.00e+00, 2.00e-03, 5.00e-01, 2.00e-01,
        0.00e+00, 1.00e-02, 2.00e-03],
       [0.00e+00, 0.00e+00, 0.00e+00, 1.00e-03, 0.00e+00, 2.00e-01,
        1.60e+00, 6.00e-03, 2.00e-03],
       [0.00e+00, 5.00e-01, 5.00e-01, 1.00e-02, 1.50e+00, 2.10e+00,
        1.00e-01, 8.00e-02, 1.00e-02],
       [4.10e+00, 0.00e+00, 0.00e+00, 8.00e+00, 0.00e+00, 0.00e+00,
        3.00e+00, 0.00e+00, 1.00e+00]])

# this is a vector representing a given "recipe" 
# for desired quantities of each chemical element
b = np.array([[8.0e+01],
       [4.5e+01],
       [1.0e+02],
       [5.0e+01],
       [2.0e+02],
       [1.8e+02],
       [5.0e-01],
       [3.0e+00],
       [5.0e-01],
       [5.0e-02],
       [5.0e-02],
       [5.0e-01],
       [0.0e+00]])

# the optimisation solution
x_solution = np.linalg.lstsq(A, b, rcond=-1)
x_approx_sol = x_solution[0]
x_residuals = x_solution[1]

# the "new b" according to the optimised "x solution"
b_approx = A.dot(x_approx_sol)

As you can see, x_approx_sol contains negative numbers, which would mean negative quantities of some fertilisers, but I would like to have only zero or positive quantities:

array([[ -7.06529347],
       [ 12.62140023],
       [ 16.79436939],
       [  6.67404177],
       [-13.95259075],
       [  2.26982817],
       [-11.05480032],
       [  8.58752025],
       [  8.08066117]])

So I think I should pass an array of 9 bounds of the form (0, float("inf")) to SciPy's linprog. How do I define the distance norm to minimise? It should be the Euclidean 2-norm ||b - Ax||.

3 Answers

2

You're looking for scipy.optimize.nnls (https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.nnls.html), which is an implementation of non-negative least squares (https://en.wikipedia.org/wiki/Non-negative_least_squares).

In this case replace lstsq with nnls (b also needs to have only one dimension):

from scipy.optimize import nnls

# the optimisation solution (nnls returns the solution vector and the residual norm)
x_approx_sol, x_residuals = nnls(A, b.squeeze())

# the "new b" according to the optimised "x solution"
b_approx = A.dot(x_approx_sol)

assert (x_approx_sol >= 0).all()

print(x_approx_sol.reshape(-1, 1))

Gives

[[ 0.        ]
 [11.50294958]
 [ 5.44866408]
 [ 0.36370257]
 [ 0.        ]
 [ 0.        ]
 [ 0.        ]
 [ 3.84407344]
 [ 0.        ]]
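As a quick sanity check, the second value returned by nnls is the Euclidean norm of the residual, so it should match the 2-norm mentioned in the question (reusing A, b and the variables from the snippet above):

import numpy as np

# the residual reported by nnls equals || b - A x ||_2
assert np.isclose(x_residuals, np.linalg.norm(b.squeeze() - A.dot(x_approx_sol)))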

2 Comments

Thanks for this. I am not looking for a non-negative b vector (b_approx above, although non-negativity is implicit when speaking of quantities in chemistry); I am looking for a non-negative x vector (x_approx_sol above, the list of quantities of fertilisers). But I checked that output and indeed it's non-negative :)
@TPPZ I've updated the answer to reflect your actual question 😳
1

Linear programming won't help you, because your objective is non-linear. For general-purpose constrained optimisation you would target convex quadratic programming or second-order cone programming, but there is no need to use those very general tools (more general = less robust).

Either go for:

  • David's approach, which is a very good candidate for small/medium-sized dense problems
    • probably the easiest approach
  • or use scipy.optimize.least_squares (see the sketch after this list)
    • a bit more work due to its support for non-linear functions
    • sparse algorithms available
  • or build your own minimizer based on L-BFGS-B (also sketched below)
    • even more work
    • has been much more efficient than NNLS for me when instances became large!
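For the scipy.optimize.least_squares option, here is a minimal sketch assuming the A and b from the question (b flattened to one dimension); least_squares minimises the sum of squared residuals of the function you pass it, subject to the variable bounds:

import numpy as np
from scipy.optimize import least_squares

# residual vector A x - b; its squared 2-norm is what gets minimised
def residuals(x):
    return A.dot(x) - b.ravel()

# bounds=(0, np.inf) keeps every coefficient non-negative;
# the starting point must lie within the bounds
result = least_squares(residuals, x0=np.ones(A.shape[1]), bounds=(0, np.inf))
x_approx_sol = result.x
b_approx = A.dot(x_approx_sol)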

I would like to add bounds so that all the dimensions are non-negative, which I think means adding inequalities to the system of equations.

Keep in mind that what you want to introduce are variable bounds. They could be expressed as linear inequalities, but such constraints are usually enforced explicitly instead (e.g. by a modification of the simplex method, or by projection onto the bounds). The 2nd and 3rd approaches both support variable bounds but not general (or even linear) constraints, which somewhat indicates that the two kinds of constraints are not equally hard to work with.
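A minimal sketch of the L-BFGS-B route, again assuming the A and b from the question: minimise 0.5 * ||A x - b||^2 with its gradient A^T (A x - b), and enforce non-negativity through per-variable bounds:

import numpy as np
from scipy.optimize import minimize

b_flat = b.ravel()

# objective 0.5 * ||A x - b||^2 and its gradient A^T (A x - b)
def objective_and_grad(x):
    r = A.dot(x) - b_flat
    return 0.5 * r.dot(r), A.T.dot(r)

# jac=True tells minimize that the objective also returns the gradient;
# bounds of (0, None) per variable enforce non-negativity explicitly
result = minimize(objective_and_grad, x0=np.ones(A.shape[1]), jac=True,
                  method='L-BFGS-B', bounds=[(0, None)] * A.shape[1])
x_approx_sol = result.x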


0

scipy.optimize provides nnls for non-negative least squares. For general box constraints, use least_squares for non-linear problems or lsq_linear for linear ones.
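A minimal sketch with lsq_linear, assuming the A and b from the question (b flattened to one dimension); the bounds argument restricts every coefficient to be non-negative:

import numpy as np
from scipy.optimize import lsq_linear

# solve min || A x - b ||_2 subject to 0 <= x < inf
result = lsq_linear(A, b.ravel(), bounds=(0, np.inf))
x_approx_sol = result.x
b_approx = A.dot(x_approx_sol)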

