Python: Solving Matrix Equation Ax = b where A contains variables

Question

I'm hoping to solve a matrix equation of the form Ax = b where

A = [[a,0,0],
     [0,a,0],
     [0,0,a],
     [1,1,0],
     [0,0,1]]

x = [b,c,d]

b = [0.4,0.4,0.2,0.1,0.05]

If I knew the value of a, I would do this using the numpy.linalg.lstsq module. Similarly, if I thought the solution was exact, I would use the sympy solver. However, the solution will need to be a least square fit to some data, and the sympy solver generally tells me there is no solution.

Basically what I'm looking for is a version of numpy.linalg.lstsq that allows me to declare one or more variables in A. Is there a prebuilt module which deals with these types of problems?

Thanks! Sorry, my original example was oversimplified. I want to solve in the general case where the problem may be overconstrained. See the edited version. — Timothy
– Timothy, Commented Apr 9, 2020 at 22:38
It's still not entirely clear to me the full extend of the problem you're trying to solve. Is it the full problem or still another simplification? — Omri Bahat Treidel
– Omri Bahat Treidel, Commented Apr 10, 2020 at 3:58
The problem is to solve a general matrix equation of the form Ax = b, where there are some number n variables within the matrix A. The matrices A and b will always have at least n additional rows, such that the problem is constrained; however, it may be overconstrained. This is meant to be the simplest example of such a problem, but in practice I want to apply it to matrices with 20, 30, etc. rows. — Timothy
– Timothy, Commented Apr 10, 2020 at 19:38

FBruzzesi · Accepted Answer · 2020-04-10 10:26:44Z

1

As you mentioned in the comments, the problem is overconstrained, therefore there is no exact solution.

On the other hand the problem with using np.linalg.lstsq directly is that you have a variable in the matrix A.

One possible workaround is to decide a range for the variable a, and find a parametric solution depending upon it, what I mean is:

import numpy as np
import pandas as pd

# Define the list of results
R = []

b = np.array([0.4, 0.4, 0.2, 0.1, 0.5])

# Here I picked a (-10,10) range for a with a step of 0.01 
for a in np.arange(-10, 10, 0.01):
    # Define A given a
    A = np.array([[a,0,0], [0,a,0], [0,0,a], [1,1,0], [0,0,1]])
    # Solve with least-squares and store both a and the result
    R.append((a, *np.linalg.lstsq(A,b)[0]))

# Convert solutions to a dataframe
results = pd.DataFrame(R, columns=list('abcd'))

results
          a         b         c         d
0    -10.00 -0.038235 -0.038235 -0.014851
1     -9.99 -0.038271 -0.038271 -0.014861
2     -9.98 -0.038307 -0.038307 -0.014871
3     -9.97 -0.038343 -0.038343 -0.014880
4     -9.96 -0.038379 -0.038379 -0.014890
    ...       ...       ...       ...
1995   9.95  0.040395  0.040395  0.024899
1996   9.96  0.040355  0.040355  0.024870
1997   9.97  0.040315  0.040315  0.024840
1998   9.98  0.040275  0.040275  0.024811
1999   9.99  0.040236  0.040236  0.024782

Once in this form, you can see how variables b, c and d vary with a by plotting the results:

results.set_index('a').plot(figsize=(16,8))

Remark that b==c for any a, and therefore you don't see one trace on the plot.

answered Apr 10, 2020 at 10:26

FBruzzesi

6,6143 gold badges19 silver badges42 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

Timothy Over a year ago

This is basically what I'm looking for; I wasn't sure if there was a function integrated into numpy or sympy that could achieve this. Based on the responses, it seems there is not. Thanks for your help!

Oscar Benjamin · Accepted Answer · 2020-04-09 22:34:41Z

These are the equations you want to solve:

In [39]: a, b, c, d = symbols('a, b, c, d', real=True)                                                                            

In [40]: A = Matrix([[a,0,0], 
    ...:      [0,a,0], 
    ...:      [0,0,a], 
    ...:      [1,1,0]]) 
    ...: x = Matrix([[b],[c],[d]]) 
    ...: b = Matrix([[0.4],[0.4],[0.2],[0.1]])                                                                                    

In [41]: A*x - b                                                                                                                  
Out[41]: 
⎡ a⋅b - 0.4 ⎤
⎢           ⎥
⎢ a⋅c - 0.4 ⎥
⎢           ⎥
⎢ a⋅d - 0.2 ⎥
⎢           ⎥
⎣b + c - 0.1⎦

We can get the least squares solution using the pseudoinverse:

In [53]: A.pinv() * b                                                                                                             
Out[53]: 
⎡      ⎛ 2    ⎞                   ⎛ 2    ⎞            ⎤
⎢0.4⋅a⋅⎝a  + 1⎠     0.4⋅a     0.1⋅⎝a  + 1⎠      0.1   ⎥
⎢────────────── - ───────── + ──────────── - ─────────⎥
⎢   4      2       4      2     4      2      4      2⎥
⎢  a  + 2⋅a       a  + 2⋅a     a  + 2⋅a      a  + 2⋅a ⎥
⎢                                                     ⎥
⎢      ⎛ 2    ⎞                   ⎛ 2    ⎞            ⎥
⎢0.4⋅a⋅⎝a  + 1⎠     0.4⋅a     0.1⋅⎝a  + 1⎠      0.1   ⎥
⎢────────────── - ───────── + ──────────── - ─────────⎥
⎢   4      2       4      2     4      2      4      2⎥
⎢  a  + 2⋅a       a  + 2⋅a     a  + 2⋅a      a  + 2⋅a ⎥
⎢                                                     ⎥
⎢                         0.2                         ⎥
⎢                         ───                         ⎥
⎣                          a                          ⎦

This returns a matrix that I could then use to fit for a; I'm wondering if there is an all in one routine like np.linalg.lstsq which will simultaneously solve for a, b, c, and d.

Omri Bahat Treidel · Accepted Answer · 2020-04-09 22:37:48Z

1

It seems like the problem you've posted has an exact analytical solution:

ab = 0.4,
ac = 0.4,
ad = 0.2,
b + c = 0.1

Summing the first 2 equations and dividing it by the 4th results in:

(1) + (2) => a(b+c) = 0.8
(4) b+c = 0.1

=> a=8

So you can if you know the vector b you can always solve for ab, ac ad and than use the 4th to solve for a.

answered Apr 9, 2020 at 22:37

Omri Bahat Treidel

5834 silver badges11 bronze badges

1 Comment

Timothy Over a year ago

Thanks! I tried to write the simplest version of the problem and oversimplified it. I've edited the original with something more in line with what I want.

hxalchemy · Accepted Answer · 2020-04-10 23:33:34Z

I came up with something you can use for a general case solution. Also you dont have to specify bounds or ranges.

https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.least_squares.html

"Solve a nonlinear least-squares problem with bounds on the variables."

#You can rewrite matrix A as

[[1, 0, 0],
 [0, 1, 0],
 [0, 0, 1],   * a + 
 [0, 0, 0],
 [0, 0, 0]]

[[0, 0, 0],
 [0, 0, 0],
 [0, 0, 0],
 [1, 1, 0],
 [0, 0, 1]]


#Define your loss function
def lossf(M):
    Y = np.array([0.4,0.4,0.2,0.1,0.05])
    a,b,c,d = M
    A = np.array([[1,0,0],[0,1,0],[0,0,1],[0,0,0],[0,0,0]])*a
    A = A + np.array([[0,0,0],[0,0,0],[0,0,0],[1,1,0],[0,0,1]])
    y = A.dot(np.array([b,c,d]).T)
    return np.sum((Y - y)**2)

#An extra function to calculate the matrix in the end
def mat(M):
    Y = np.array([0.4,0.4,0.2,0.1,0.05])
    a,b,c,d = M
    A = np.array([[1,0,0],[0,1,0],[0,0,1],[0,0,0],[0,0,0]])*a
    A = A + np.array([[0,0,0],[0,0,0],[0,0,0],[1,1,0],[0,0,1]])
    y = A.dot(np.array([b,c,d]).T)
    return y

import numpy as np
from scipy.optimize import least_squares

#this takes some time ** does 100000 rounds which can be changed
#[0,0,0,0] = random initialization of parameters
result = least_squares(lossf,[0,0,0,0], verbose=1, max_nfev = 100000)


result.x #result for parameter estimate
[6.97838023, 0.05702932, 0.05702932, 0.02908934] # run code and 

mat(result.x)
#The non-linear fit
[0.39797228, 0.39797228, 0.20299648, 0.11405864, 0.02908934]

#Orignal 
[0.4,0.4,0.2,0.1,0.05]


#Also results for other matrix
#This requires changing the loss function
#For a permanent solution you can write a flexible loss function
Y = np.array([0.4,0.4,0.2,0.1])
result = least_squares(lossf,[0,0,0,0], verbose=1, max_nfev = 10000)


result.x
[7.14833526, 0.0557289 , 0.0557289 , 0.02797854]

mat(result.x)
[0.39836889, 0.39836889, 0.2       , 0.11145781]
#The results are very close to analytical solution

Omri Bahat Treidel · Accepted Answer · 2020-04-11 00:32:22Z

Since the problem is non linear (include products like ab ac and ad) you'll need some sort of non-linear method. Here is one iterative method. in each iteration we'll evaluate b,c,d using linear least square, and then use the 3 remaining equation to estimate a. and we'll do it until convergence.

I'll denote variable we're trying to estimate as x1, x2 and x3 since b is the vector on the right hand side. using this notation we have

a * x1 = b[0]
a * x2 = b[1]
a * x3 = b[2]

which implies:

a = sum(b[0:3])/sum(x)  # x = [x1, x2, x3]

and the remaining equations are are a simple linear system:

A_d * x = b[2:]

The matrix A_d is the same as the matrix A without the first 3 rows, i.e. A_d = A[2:, :].

Now we can use an iterative solution:

# choose any initial values

prev = np.ones(4)  ## this includes `a` as the last element

rel_tolerance = 1e-4  # choose a number
while True:
    x_sol = np.linalg.lstsq(A_d, b[2:])
    a_sol = sum(b[0:3])/sum(x_sol)
    sol = np.append(x_sol, a_sol)  

    diff_x = np.abs(sol - prev)
    # equivalent to max relative difference but avoids 0 division
    if np.max(diff_x) < rel_tolerance * prev:  
        break

You can choose one of many other convergence criteria.

Collectives™ on Stack Overflow

Python: Solving Matrix Equation Ax = b where A contains variables

5 Answers 5

1 Comment

1 Comment

1 Comment

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

5 Answers 5

1 Comment

1 Comment

1 Comment

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related