2

I'm hoping to solve a matrix equation of the form Ax = b where

A = [[a,0,0],
     [0,a,0],
     [0,0,a],
     [1,1,0],
     [0,0,1]]

x = [b,c,d]

b = [0.4,0.4,0.2,0.1,0.05]

If I knew the value of a, I would do this using the numpy.linalg.lstsq module. Similarly, if I thought the solution was exact, I would use the sympy solver. However, the solution will need to be a least square fit to some data, and the sympy solver generally tells me there is no solution.

Basically what I'm looking for is a version of numpy.linalg.lstsq that allows me to declare one or more variables in A. Is there a prebuilt module which deals with these types of problems?

4
  • 1
    Why isn't the solution exact? Commented Apr 9, 2020 at 22:21
  • Thanks! Sorry, my original example was oversimplified. I want to solve in the general case where the problem may be overconstrained. See the edited version. Commented Apr 9, 2020 at 22:38
  • It's still not entirely clear to me the full extend of the problem you're trying to solve. Is it the full problem or still another simplification? Commented Apr 10, 2020 at 3:58
  • The problem is to solve a general matrix equation of the form Ax = b, where there are some number n variables within the matrix A. The matrices A and b will always have at least n additional rows, such that the problem is constrained; however, it may be overconstrained. This is meant to be the simplest example of such a problem, but in practice I want to apply it to matrices with 20, 30, etc. rows. Commented Apr 10, 2020 at 19:38

5 Answers 5

1

As you mentioned in the comments, the problem is overconstrained, therefore there is no exact solution.

On the other hand the problem with using np.linalg.lstsq directly is that you have a variable in the matrix A.

One possible workaround is to decide a range for the variable a, and find a parametric solution depending upon it, what I mean is:

import numpy as np
import pandas as pd

# Define the list of results
R = []

b = np.array([0.4, 0.4, 0.2, 0.1, 0.5])

# Here I picked a (-10,10) range for a with a step of 0.01 
for a in np.arange(-10, 10, 0.01):
    # Define A given a
    A = np.array([[a,0,0], [0,a,0], [0,0,a], [1,1,0], [0,0,1]])
    # Solve with least-squares and store both a and the result
    R.append((a, *np.linalg.lstsq(A,b)[0]))

# Convert solutions to a dataframe
results = pd.DataFrame(R, columns=list('abcd'))

results
          a         b         c         d
0    -10.00 -0.038235 -0.038235 -0.014851
1     -9.99 -0.038271 -0.038271 -0.014861
2     -9.98 -0.038307 -0.038307 -0.014871
3     -9.97 -0.038343 -0.038343 -0.014880
4     -9.96 -0.038379 -0.038379 -0.014890
    ...       ...       ...       ...
1995   9.95  0.040395  0.040395  0.024899
1996   9.96  0.040355  0.040355  0.024870
1997   9.97  0.040315  0.040315  0.024840
1998   9.98  0.040275  0.040275  0.024811
1999   9.99  0.040236  0.040236  0.024782

Once in this form, you can see how variables b, c and d vary with a by plotting the results:

results.set_index('a').plot(figsize=(16,8))

results

Remark that b==c for any a, and therefore you don't see one trace on the plot.

Sign up to request clarification or add additional context in comments.

1 Comment

This is basically what I'm looking for; I wasn't sure if there was a function integrated into numpy or sympy that could achieve this. Based on the responses, it seems there is not. Thanks for your help!
1

These are the equations you want to solve:

In [39]: a, b, c, d = symbols('a, b, c, d', real=True)                                                                            

In [40]: A = Matrix([[a,0,0], 
    ...:      [0,a,0], 
    ...:      [0,0,a], 
    ...:      [1,1,0]]) 
    ...: x = Matrix([[b],[c],[d]]) 
    ...: b = Matrix([[0.4],[0.4],[0.2],[0.1]])                                                                                    

In [41]: A*x - b                                                                                                                  
Out[41]: 
⎡ a⋅b - 0.4 ⎤
⎢           ⎥
⎢ a⋅c - 0.4 ⎥
⎢           ⎥
⎢ a⋅d - 0.2 ⎥
⎢           ⎥
⎣b + c - 0.1⎦

We can get the least squares solution using the pseudoinverse:

In [53]: A.pinv() * b                                                                                                             
Out[53]: 
⎡      ⎛ 2    ⎞                   ⎛ 2    ⎞            ⎤
⎢0.4⋅a⋅⎝a  + 1⎠     0.4⋅a     0.1⋅⎝a  + 1⎠      0.1   ⎥
⎢────────────── - ───────── + ──────────── - ─────────⎥
⎢   4      2       4      2     4      2      4      2⎥
⎢  a  + 2⋅a       a  + 2⋅a     a  + 2⋅a      a  + 2⋅a ⎥
⎢                                                     ⎥
⎢      ⎛ 2    ⎞                   ⎛ 2    ⎞            ⎥
⎢0.4⋅a⋅⎝a  + 1⎠     0.4⋅a     0.1⋅⎝a  + 1⎠      0.1   ⎥
⎢────────────── - ───────── + ──────────── - ─────────⎥
⎢   4      2       4      2     4      2      4      2⎥
⎢  a  + 2⋅a       a  + 2⋅a     a  + 2⋅a      a  + 2⋅a ⎥
⎢                                                     ⎥
⎢                         0.2                         ⎥
⎢                         ───                         ⎥
⎣                          a                          ⎦

1 Comment

This returns a matrix that I could then use to fit for a; I'm wondering if there is an all in one routine like np.linalg.lstsq which will simultaneously solve for a, b, c, and d.
1

It seems like the problem you've posted has an exact analytical solution:

ab = 0.4,
ac = 0.4,
ad = 0.2,
b + c = 0.1

Summing the first 2 equations and dividing it by the 4th results in:

(1) + (2) => a(b+c) = 0.8
(4) b+c = 0.1

=> a=8

So you can if you know the vector b you can always solve for ab, ac ad and than use the 4th to solve for a.

1 Comment

Thanks! I tried to write the simplest version of the problem and oversimplified it. I've edited the original with something more in line with what I want.
1

I came up with something you can use for a general case solution. Also you dont have to specify bounds or ranges.

https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.least_squares.html

"Solve a nonlinear least-squares problem with bounds on the variables."

#You can rewrite matrix A as

[[1, 0, 0],
 [0, 1, 0],
 [0, 0, 1],   * a + 
 [0, 0, 0],
 [0, 0, 0]]

[[0, 0, 0],
 [0, 0, 0],
 [0, 0, 0],
 [1, 1, 0],
 [0, 0, 1]]


#Define your loss function
def lossf(M):
    Y = np.array([0.4,0.4,0.2,0.1,0.05])
    a,b,c,d = M
    A = np.array([[1,0,0],[0,1,0],[0,0,1],[0,0,0],[0,0,0]])*a
    A = A + np.array([[0,0,0],[0,0,0],[0,0,0],[1,1,0],[0,0,1]])
    y = A.dot(np.array([b,c,d]).T)
    return np.sum((Y - y)**2)

#An extra function to calculate the matrix in the end
def mat(M):
    Y = np.array([0.4,0.4,0.2,0.1,0.05])
    a,b,c,d = M
    A = np.array([[1,0,0],[0,1,0],[0,0,1],[0,0,0],[0,0,0]])*a
    A = A + np.array([[0,0,0],[0,0,0],[0,0,0],[1,1,0],[0,0,1]])
    y = A.dot(np.array([b,c,d]).T)
    return y

import numpy as np
from scipy.optimize import least_squares

#this takes some time ** does 100000 rounds which can be changed
#[0,0,0,0] = random initialization of parameters
result = least_squares(lossf,[0,0,0,0], verbose=1, max_nfev = 100000)


result.x #result for parameter estimate
[6.97838023, 0.05702932, 0.05702932, 0.02908934] # run code and 

mat(result.x)
#The non-linear fit
[0.39797228, 0.39797228, 0.20299648, 0.11405864, 0.02908934]

#Orignal 
[0.4,0.4,0.2,0.1,0.05]


#Also results for other matrix
#This requires changing the loss function
#For a permanent solution you can write a flexible loss function
Y = np.array([0.4,0.4,0.2,0.1])
result = least_squares(lossf,[0,0,0,0], verbose=1, max_nfev = 10000)


result.x
[7.14833526, 0.0557289 , 0.0557289 , 0.02797854]

mat(result.x)
[0.39836889, 0.39836889, 0.2       , 0.11145781]
#The results are very close to analytical solution




Comments

0

Since the problem is non linear (include products like ab ac and ad) you'll need some sort of non-linear method. Here is one iterative method. in each iteration we'll evaluate b,c,d using linear least square, and then use the 3 remaining equation to estimate a. and we'll do it until convergence.

I'll denote variable we're trying to estimate as x1, x2 and x3 since b is the vector on the right hand side. using this notation we have

a * x1 = b[0]
a * x2 = b[1]
a * x3 = b[2]

which implies:

a = sum(b[0:3])/sum(x)  # x = [x1, x2, x3]

and the remaining equations are are a simple linear system:

A_d * x = b[2:]

The matrix A_d is the same as the matrix A without the first 3 rows, i.e. A_d = A[2:, :].

Now we can use an iterative solution:

# choose any initial values

prev = np.ones(4)  ## this includes `a` as the last element

rel_tolerance = 1e-4  # choose a number
while True:
    x_sol = np.linalg.lstsq(A_d, b[2:])
    a_sol = sum(b[0:3])/sum(x_sol)
    sol = np.append(x_sol, a_sol)  

    diff_x = np.abs(sol - prev)
    # equivalent to max relative difference but avoids 0 division
    if np.max(diff_x) < rel_tolerance * prev:  
        break   

You can choose one of many other convergence criteria.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.