
EDIT 1: This question has been completely modified for improved clarity and to better state my intent. The original post is at the end.

I think a lot of people were confused by my example code. It was meant to be a purely hypothetical example to teach me what I need to do to set up my own constraints in a SciPy minimize procedure when the constraints are not on x directly, but rather on a set of variables inside the objective function (which depend on x). User @Reinderien has given me a lot of insight with their (first) answer, so now I know a bit more about the problem at hand. Here is that problem in the most general sense.

I have a set of k vectors: v1, v2, ... vh, ... vk. Each has m parameters: vh[0], vh[1], ... vh[i], ... vh[m-1]. The i-th parameter of each vector corresponds to the same physical quantity. For example, if these vectors were describing weather in different cities, v1[0] could describe humidity (parameter 0) in Houston (vector 1), v2[0] would then describe humidity (parameter 0) in Austin (vector 2), and so on.

I have an input vector x of length n: x[0], x[1], ... x[j], ... x[n-1]. Note that n does not necessarily equal m. I don't think it is necessary to show my specific objective function (I am trying to keep things general), but just in case more specificity is needed: each element i of every vh is a function of the entire vector x, where each element j of x is modified by an element ij of some "change" matrix C. (Quick Edit 2: For a better example of what is happening here, vh[i] = vh[i] + x[j] + C[i, j] for all h, i, and j. This runs in a nested for-loop, and the initial vh[i] is retrieved from data, i.e. vh = np.copy(dh) for all h prior to the for-loop.)

Here is the crux of the optimization problem: I have vectors t, s_lb, and s_ub, all of length m, which correspond to the target value, minimum value, and maximum value for all i in all vh. I want to minimize the sum of the sum of mean squared errors (or just squared errors, I don't think there would be much difference) between t and all vh. This paragraph, written out mathematically, is as follows:

minimize over x:   sum_{h=1..k} sum_{i=0..m-1} ( t[i] - vh[i] )^2

... such that s_lb[i] < vh[i] < s_ub[i] for all i, for all vh.

(Quick Edit 1: The inner summation runs from i = 0 to m-1, not 1 to m, because indexing starts at 0.)

I think that about covers it. My only question at this point is how to define the matrix A in scipy.optimize's LinearConstraint; I already have lb and ub, as you can see. I've been trying to use the accepted answer here as a reference for what to do, but the nested summation is confusing me.
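
For concreteness, here is a minimal sketch of the setup described above. Every name and shape in it is a hypothetical placeholder (k, m, n, D, C, t, s_lb, s_ub), and the constraint on vh is exactly the part I don't know how to express:

import numpy as np

# Hypothetical sizes and data: k vectors of length m stacked as rows of D
# (their initial values), a change matrix C of shape (m, n), an input x of length n.
k, m, n = 3, 4, 5
rng = np.random.default_rng(0)
D = rng.random((k, m))           # initial vh for h = 1..k
C = rng.random((m, n))           # "change" matrix
t = rng.random(m)                # target values
s_lb, s_ub = t - 0.5, t + 0.5    # limits that every vh[i] must stay within

def build_v(x):
    # vh[i] = dh[i] + sum_j (x[j] + C[i, j]), i.e. the same shift applied to every vector h
    shift = x.sum() + C.sum(axis=1)
    return D + shift                           # shape (k, m)

def objective(x):
    V = build_v(x)
    return np.sum((V - t) ** 2)                # double sum of squared errors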

ORIGINAL POST:

I am restricted to a cell phone at the minute but I could use some help. Below is my best attempt at a minimal reproducible example, but I'm hoping it's not even necessary.

import numpy as np
import scipy as sp

data1 = [1.0, 2.0]
D1_0 = np.asarray(data1)
data2 = [1.5, 2.5]
D2_0 = np.asarray(data2)
specs = [1.25, 2.25]
S = np.asarray(specs)
changes = [[0.1, 0.2], [0.4, 0.5], [0.6, 0.7]]
C = np.asarray(changes)
rows_C, cols_C = C.shape

def get_ssd(D):
    ssd = np.sum((np.subtract(D, S))**2)
    return ssd

def objective(x):
    D1 = np.copy(D1_0)
    D2 = np.copy(D2_0)
    for i in range(cols_C):
        for j in range(rows_C):
            D1[i] = D1[i] + x[j] + C[j][i]
            D2[i] = D2[i] + x[j] + C[j][i]
    minim = get_ssd(D1) + get_ssd(D2)
    return minim

x0 = np.zeros(rows_C)
bnds = sp.optimize.Bounds(-1.0, 1.0)
res = sp.optimize.minimize(objective, x0, method='Powell', bounds=bnds)
print(res.x)

If you actually run this code, it will execute fine, but it isn't really informative. My question is this: I want to incorporate a condition where D1 and D2 must be within some range for all i. I know that other optimization methods can put expression constraints on x, but I need something like that for D1 and D2, not x.

My only remotely good thought for achieving this is to put all of this code inside another optimizer (or even just a for-loop) that changes the bounds on x until the criterion is met. This seems extremely inelegant to me.

I feel like there must be a way to put constraints on variables internal to the objective function and not x. I just don't know how to do that.

  • Long answer short, there are two techniques, and the simpler technique is preferable for this situation since D is linear in x. I'll show both. Commented Aug 28, 2024 at 12:19
  • By the way, this problem has redundant parameters. For example, if you have a solution, you can subtract 0.1 from one parameter, and add 0.1 to any other parameter and get an equivalent solution. Commented Aug 28, 2024 at 15:59
  • Nick is correct. Further: you're trying to have x.sum() converge to S - C.sum() - D1_0, but also have x.sum() converge to S - C.sum() - D2_0. This doesn't make sense, and really seems like the problem is ill-posed. If you file a new question, be sure to include more background on what you're "actually doing", because as it stands this is x/y. Commented Aug 29, 2024 at 0:37
  • "each element i in all vh is a function of the entire vector x, where each element j in x is modified by an element ij in some change matrix C." "I think that about covers it. My only question at this point is how I define matrix A in SciPy's optimize's LinearConstraint; I already have lb and ub as you can see." This depends on how vh is defined. If vh = C @ x, then it is as simple as passing C as the A matrix. But "modified" might mean something else - if you want more specific advice about how to set up the A matrix, you need to be more specific on how vh is derived. Commented Aug 29, 2024 at 19:04
  • Alternatively, instead of finding the A matrix, if you can write a Python function which calculates vh from x, then you can return vh from that function, and use NonlinearConstraint to place a constraint on vh. Commented Aug 29, 2024 at 19:06

1 Answer


The first and most preferred method is to derive D as a linear function of x, solve for x, and constrain the result with a LinearConstraint:

import numpy as np
import scipy as sp

D1_0 = np.array((1.0, 2.0))  # data
D2_0 = np.array((1.5, 2.5))
S = np.array((1.25, 2.25))  # specs
C = np.array((
    (0.1, 0.2),
    (0.4, 0.5),
    (0.6, 0.7),
))  # changes
rows_C, cols_C = C.shape


def get_ssd(D: np.ndarray) -> float:
    """Sum of squared differences"""
    diff = D - S
    return diff.dot(diff)


def objective(x: np.ndarray) -> float:
    addend = x.sum() + C.sum(axis=0)
    # For the optimal input, these are equal to (0.85, 2.15) and (1.35, 2.65)
    # respectively (before constraining them)
    D1 = D1_0 + addend
    D2 = D2_0 + addend
    return get_ssd(D1) + get_ssd(D2)


# Regression test
assert np.isclose(72.59, objective(np.ones(3)))

"""
D1_0 + x0+x1+x2 + C0+C1       = D1
1.0               0.1+0.4+0.6
2.0               0.2+0.5+0.7
"""
D1_lower = np.array((0.5, 2.0))
D1_upper = np.array((0.84, 2.14))
d1_constraint = sp.optimize.LinearConstraint(
    A=np.ones((cols_C, rows_C)),
    lb=D1_lower - D1_0 - C.sum(axis=0),
    ub=D1_upper - D1_0 - C.sum(axis=0),
)

res = sp.optimize.minimize(
    fun=objective,
    x0=np.zeros(rows_C),
    bounds=sp.optimize.Bounds(lb=-1, ub=1),
    constraints=d1_constraint,
)
print(res)  # Cost of 0.34 in 1 iteration regardless of method

# For unconstrained case
# assert np.allclose(res.x, (-0.41666666, -0.41666666, -0.41666666))

Given your example, that's pretty much the only method that should be used. The spirit of the question may imply that D can be non-linear, and in that case you need to write a NonlinearConstraint whose function calculates and returns D.
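
As a rough sketch of what that would look like for this example (continuing from the snippet above and reusing D1_0, C, D1_lower, D1_upper, objective and rows_C; calc_D1 is just an illustrative name, and since D1 here is actually linear in x this is only for demonstration):

def calc_D1(x: np.ndarray) -> np.ndarray:
    # Recompute D1 from x exactly as the objective does
    return D1_0 + x.sum() + C.sum(axis=0)

d1_nonlinear = sp.optimize.NonlinearConstraint(
    fun=calc_D1,
    lb=D1_lower,
    ub=D1_upper,
)

res_nl = sp.optimize.minimize(
    fun=objective,
    x0=np.zeros(rows_C),
    bounds=sp.optimize.Bounds(lb=-1, ub=1),
    constraints=d1_nonlinear,
)
print(res_nl.x)  # should land on the same constrained optimum as the LinearConstraint version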

Depending on context, you can also express the problem with a third strategy where D becomes a degree of freedom in the problem decision variables, the constraints are expressed as decision variable bounds, and you write a NonlinearConstraint relating x and D with an equality. The advantage is that you only need to write D's expression once. However, the impact on optimiser performance is difficult to predict.
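
A rough sketch of that third strategy for the same example (again reusing the arrays above; packing x and D1 into a single decision vector z, and the helper names, are my own choices, and because D1 is linear in x the coupling could just as well be expressed as a LinearConstraint):

def unpack(z: np.ndarray):
    # First rows_C entries are x, the remaining entries are D1
    return z[:rows_C], z[rows_C:]

def objective_z(z: np.ndarray) -> float:
    x, D1 = unpack(z)
    D2 = D2_0 + x.sum() + C.sum(axis=0)
    return get_ssd(D1) + get_ssd(D2)

def coupling(z: np.ndarray) -> np.ndarray:
    # Zero exactly when the D1 part of z is consistent with the x part
    x, D1 = unpack(z)
    return D1 - (D1_0 + x.sum() + C.sum(axis=0))

res_aug = sp.optimize.minimize(
    fun=objective_z,
    x0=np.concatenate((np.zeros(rows_C), (D1_lower + D1_upper) / 2)),
    bounds=sp.optimize.Bounds(
        lb=np.concatenate((-np.ones(rows_C), D1_lower)),   # x bounds, then D1 bounds
        ub=np.concatenate((np.ones(rows_C), D1_upper)),
    ),
    constraints=sp.optimize.NonlinearConstraint(coupling, 0, 0),  # lb == ub makes it an equality
)
print(res_aug.x)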

More broadly, the entire premise of the example - which is to make D converge on S - is wrong; minimize() might not be well-applied here. Disregarding the bounds and constraints, this produces the same answer:

import numpy as np

d1_0 = np.array((1.0, 2.0))  # data
d2_0 = np.array((1.5, 2.5))
s = np.array((1.25, 2.25))  # specs
C = np.array((  # changes
    (0.1, 0.2),
    (0.4, 0.5),
    (0.6, 0.7),
))
Csum = C.sum(axis=0)
m, n = C.shape

A = np.ones((2*n, m))
b = np.array((
    s[0] - d1_0[0] - Csum[0],
    s[1] - d1_0[1] - Csum[1],
    s[0] - d2_0[0] - Csum[0],
    s[1] - d2_0[1] - Csum[1],
))
x, *_ = np.linalg.lstsq(a=A, b=b, rcond=None)
print(x)

In other words, the problem is linear (with least-squared costs) and tries to solve for the sum of x pulled in four different directions.


3 Comments

This is super helpful. I understand the problem way more in depth now, and I am tempted to completely rewrite the question, but it might be more appropriate to write another question instead. I have broken down my problem entirely into mathematical formalism, and I think asking how to convert that to a set of LinearConstraints may be more beneficial than my pseudo-not-pseudo-code.
@kriegersan Certainly rewrite it if you want; I'll drop one more comment on this question because when I reduced the math further it doesn't make much sense to me.
The rewrite is now completed. I had some things come up last night so I could only finish it up this morning.
