
Creating the model is too slow. Solving it is not the problem here.

I have looked at a few similar questions before, but they don't deal with the problem at the size I am working with.

The constraint that is very slow to add to the model looks something like this:

x_i ∈ {0, 1} (binary)
y_0 = 0
y_i = y_(i-1) + x_i   for 0 < i < N

in other words (indexing x from 1, so x_1 is the first entry) a variable vector x = [1, 0, 1, 1] would give me

y_1 = 1
y_2 = 1
y_3 = 2

It's actually more complex, but I simplified it here to this minimal reproducible example. The meaning is: I don't want the value of y to exceed a certain threshold for any i.
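As a plain-Python sanity check (not LP code; `prefix_sums_ok` is just a name made up for illustration), the condition says the running sum of x must stay at or below the threshold at every prefix:

```python
# Pure-Python illustration of the intended constraint: the running
# sum of x must never exceed max_y_val at any prefix.
def prefix_sums_ok(x, max_y_val):
    y = 0
    for xi in x:
        y += xi
        if y > max_y_val:
            return False
    return True

print(prefix_sums_ok([1, 0, 1, 1], 2))  # running sums are 1, 1, 2, 3 -> False
print(prefix_sums_ok([1, 0, 1, 1], 3))  # never exceeds 3 -> True
```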

I used pulp. Here is an example Python script:

from pulp import LpProblem, LpVariable, LpMaximize, LpBinary, PULP_CBC_CMD, lpSum

N = 5000
max_y_val = 5
model = LpProblem(sense=LpMaximize)

print("defining X")
x = [LpVariable(name=f"x_{i}", cat=LpBinary) for i in range(N)]

print("defining Y")
y = [x[0]]
for i in range(1, N):
    print(i)
    y.append(y[i-1] + x[i])  # y[i] is an affine expression with i+1 terms

print("add y to model") 
for i in range(N):
    print(i)
    model += y[i] <= max_y_val

print("objective")
model += lpSum(x) # ...

print("start optimizer")
status = model.solve(PULP_CBC_CMD(msg=False, timeLimit=10))

print("score:", model.objective.value())
print("x:", *[x[i].value() for i in range(N)])
print()

With N=1000 it is still quite fast. At N=5000, the "add y to model" step becomes very slow.

I rewrote the same code in Rust (using the good_lp library) to see if the issue was with Python, but the problem persists. So most likely the issue is the way I am formulating my problem; it is not a language-specific question.

My suspicion is that the model is built as a matrix M with M * x = y. As N grows, this matrix becomes N columns wide. It looks like a triangular matrix that only has nonzero elements toward the bottom-left corner.

matrix      x   y

1 0 0 0 0   0   0
1 1 0 0 0   1   1
1 1 1 0 0 * 0 = 1
1 1 1 1 0   1   2
1 1 1 1 1   1   3
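That triangular-matrix view can be checked numerically. A sketch assuming NumPy: M @ x reproduces the running sums, but the dense lower-triangular M stores ~N²/2 nonzeros, whereas np.cumsum gets the same values without materializing any matrix.

```python
import numpy as np

N = 5
# Dense lower-triangular matrix of ones, as in the example above
M = np.tril(np.ones((N, N), dtype=int))
x = np.array([0, 1, 0, 1, 1])

print(M @ x)         # [0 1 1 2 3] -- the running sums y
print(np.cumsum(x))  # same values, no N x N matrix needed
```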

Do you know how I can do this efficiently with LP solvers in general? I feel like I'm missing some knowledge here.

  • PuLP and its PULP_CBC_CMD interface are slow for large models. There are faster tools around. Commented Sep 3, 2024 at 23:55

2 Answers


Not a full answer, but too long for a comment...

Can you edit your post and more clearly explain what you want this constraint to do or provide some additional examples? Or add context that would help someone suggest some alternatives?

There is no y in your math program right now. Everything is expressed in terms of x, so it is confusing what you are trying to do other than add up a large number of binary variables.

If you print the problem in pulp, you'll see what is being generated. I changed to N = 10 and you get...

print(model)

yields:

NoName:
MAXIMIZE
1*x_0 + 1*x_1 + 1*x_2 + 1*x_3 + 1*x_4 + 1*x_5 + 1*x_6 + 1*x_7 + 1*x_8 + 1*x_9 + 0
SUBJECT TO
_C1: x_0 <= 5

_C2: x_0 + x_1 <= 5

_C3: x_0 + x_1 + x_2 <= 5

_C4: x_0 + x_1 + x_2 + x_3 <= 5

_C5: x_0 + x_1 + x_2 + x_3 + x_4 <= 5

_C6: x_0 + x_1 + x_2 + x_3 + x_4 + x_5 <= 5

_C7: x_0 + x_1 + x_2 + x_3 + x_4 + x_5 + x_6 <= 5

_C8: x_0 + x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 <= 5

_C9: x_0 + x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 <= 5

_C10: x_0 + x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 + x_9 <= 5

VARIABLES
0 <= x_0 <= 1 Integer
0 <= x_1 <= 1 Integer
0 <= x_2 <= 1 Integer
0 <= x_3 <= 1 Integer
0 <= x_4 <= 1 Integer
0 <= x_5 <= 1 Integer
0 <= x_6 <= 1 Integer
0 <= x_7 <= 1 Integer
0 <= x_8 <= 1 Integer
0 <= x_9 <= 1 Integer

Which has the trivial optimal value of 5.

Further: If this really is your intent, then all you need is the last constraint. The other N-1 of them are superfluous.
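For completeness, a sketch of that simplification against the question's script (same variable names; a smaller N is used here only to keep the example quick). When the last prefix constraint dominates all the others, one aggregate constraint makes model construction linear in N:

```python
from pulp import LpProblem, LpVariable, LpMaximize, LpBinary, PULP_CBC_CMD, lpSum

N = 1000
max_y_val = 5
model = LpProblem(sense=LpMaximize)
x = [LpVariable(name=f"x_{i}", cat=LpBinary) for i in range(N)]

# One constraint on the full sum replaces all N prefix constraints
model += lpSum(x) <= max_y_val
model += lpSum(x)  # objective

status = model.solve(PULP_CBC_CMD(msg=False))
print(model.objective.value())  # 5.0
```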


1 Comment

+1 for showing me that I can actually print the model 😅 My actual model is far more complex, but I wanted to evaluate where it would break, and this chaining is one such point. The triangle in your output shows very well that it's going to be extremely large. I also didn't want to blow up the scope of this question.

This other post gave me the idea, that the space and time needed might just be extremely large.

The model does not store zero values in a matrix, but it still needs to store all the constraint coefficients. The number of entries is ~N(N+1)/2, so at a scale of N=10000 this quickly becomes about 50 million entries. Given however many bytes each entry stored in the model takes, this can easily amount to a gigabyte of memory, plus the overhead of memory copies and potentially writing the model to a file for the solver to read.
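A quick back-of-envelope computation of that count (the bytes-per-entry figure is a rough assumption, not a measured value):

```python
# Nonzero coefficients in the prefix-sum formulation: row i has i+1 entries
N = 10_000
entries = N * (N + 1) // 2
print(entries)             # 50005000, i.e. ~50 million

# At a rough guess of 16 bytes per stored coefficient:
print(entries * 16 / 1e9)  # ~0.8 (GB)
```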

I probably need to rethink my problem and optimize.
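If the prefix constraints really are needed, the usual remedy is to make y explicit model variables with bounds and link neighbours with a three-term equality each, so the model holds O(N) nonzeros instead of O(N²). A sketch with pulp, under the question's simplified objective:

```python
from pulp import LpProblem, LpVariable, LpMaximize, LpBinary, PULP_CBC_CMD, lpSum

N = 5000
max_y_val = 5
model = LpProblem(sense=LpMaximize)

x = [LpVariable(f"x_{i}", cat=LpBinary) for i in range(N)]
# y as real variables; the threshold becomes a simple variable bound
y = [LpVariable(f"y_{i}", lowBound=0, upBound=max_y_val) for i in range(N)]

model += y[0] == x[0]
for i in range(1, N):
    model += y[i] == y[i - 1] + x[i]  # 3 nonzeros per constraint -> O(N) total

model += lpSum(x)  # objective
status = model.solve(PULP_CBC_CMD(msg=False, timeLimit=60))
print(model.objective.value())
```

Each constraint now touches at most three variables, so both building and storing the model scale linearly with N.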

3 Comments

Memory is your smallest issue. Simplex-style algorithms are in trouble with many constraints (less so: variables); running simplex-style solvers on more than 100k constraints is a very, very bad idea. For those instances, either go for interior-point solvers (where memory becomes relevant again) or, if nothing else works, first-order solvers (e.g. PDLP @ ortools). Not sure, though, if there is any chance to switch away from simplex using pulp: compare with this table; not sure if Clp's IPM, for example, is accessible through pulp.
On further look: you say LP but you mean MILP (cat=LpBinary). Well then... there is not much you can do, as neither IPMs nor first-order LP solvers will help you there due to their non-competitive warm-start behaviour (needed in branch-and-bound-based MILP solvers). MILPs are NP-hard anyway.
Thank you Sascha for the hints. It's a bit above my understanding, but your suggestions give me leads on what else to try. I am not actually bound to pulp; pulp is just a way for me to gauge the capabilities of the approach. While a large number of constraints might be problematic for the solver, in this particular case just creating the model already fails.
