
Creating the model is too slow. Solving it is not the problem here.

I have looked at a few similar questions before, but they don't deal with the problem at the size I am working with.

The constraint that is very slow to add to the model looks something like this:

x_i ∈ {0, 1} (binary)
y_0 = 0
y_i = y_(i-1) + x_i   for 0 < i < N

in other words (indexing x from 1, so x_1 is the first entry) a variable vector x = [1, 0, 1, 1] would give me

y_1 = 1
y_2 = 1
y_3 = 2

It's actually more complex, but I simplified it here to this minimal reproducible example. The meaning is: I don't want the value of y to exceed a certain threshold for any i.
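As a plain-Python sanity check (not LP code; `prefix_sums_ok` is just a name made up for illustration), the condition says the running sum of x must stay at or below the threshold at every prefix:

```python
# Pure-Python illustration of the intended constraint: the running
# sum of x must never exceed max_y_val at any prefix.
def prefix_sums_ok(x, max_y_val):
    y = 0
    for xi in x:
        y += xi
        if y > max_y_val:
            return False
    return True

print(prefix_sums_ok([1, 0, 1, 1], 2))  # running sums are 1, 1, 2, 3 -> False
print(prefix_sums_ok([1, 0, 1, 1], 3))  # never exceeds 3 -> True
```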

I used pulp. Here is an example Python script:

from pulp import LpProblem, LpVariable, LpMaximize, LpBinary, PULP_CBC_CMD, lpSum

N = 5000
max_y_val = 5
model = LpProblem(sense=LpMaximize)

print("defining X")
x = [LpVariable(name=f"x_{i}", cat=LpBinary) for i in range(N)]

print("defining Y")
y = [x[0]]
for i in range(1, N):
    print(i)
    y.append(y[i-1] + x[i])  # y[i] is an affine expression with i+1 terms

print("add y to model") 
for i in range(N):
    print(i)
    model += y[i] <= max_y_val

print("objective")
model += lpSum(x) # ...

print("start optimizer")
status = model.solve(PULP_CBC_CMD(msg=False, timeLimit=10))

print("score:", model.objective.value())
print("x:", *[x[i].value() for i in range(N)])
print()

With N=1000 it is still quite fast. At N=5000, the "add y to model" step becomes very slow.

I rewrote the same code in Rust (using the good_lp library) to see if the issue was with Python, but the problem persists. So most likely the issue is the way I am formulating my problem; it is not a language-specific question.

My suspicion is that the model is built as a matrix M with M * x = y. As N grows, this matrix becomes N columns wide. It looks like a triangular matrix that only has nonzero elements toward the bottom-left corner.

matrix      x   y

1 0 0 0 0   0   0
1 1 0 0 0   1   1
1 1 1 0 0 * 0 = 1
1 1 1 1 0   1   2
1 1 1 1 1   1   3
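That triangular-matrix view can be checked numerically. A sketch assuming NumPy: M @ x reproduces the running sums, but the dense lower-triangular M stores ~N²/2 nonzeros, whereas np.cumsum gets the same values without materializing any matrix.

```python
import numpy as np

N = 5
# Dense lower-triangular matrix of ones, as in the example above
M = np.tril(np.ones((N, N), dtype=int))
x = np.array([0, 1, 0, 1, 1])

print(M @ x)         # [0 1 1 2 3] -- the running sums y
print(np.cumsum(x))  # same values, no N x N matrix needed
```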

Do you know how I can do this efficiently with LP solvers in general? I feel like I'm missing some knowledge here.

  • PuLP and its PULP_CBC_CMD interface are slow for large models. There are faster tools around. Commented Sep 3, 2024 at 23:55

2 Answers


Not a full answer, but too long for a comment...

Can you edit your post and more clearly explain what you want this constraint to do or provide some additional examples? Or add context that would help someone suggest some alternatives?

There is no y in your math program right now. Everything is expressed in terms of x, so it is confusing what you are trying to do other than add up a large number of binary variables.

If you print the problem in pulp, you'll see what is being generated. I changed to N = 10 and you get...

print(model)

yields:

NoName:
MAXIMIZE
1*x_0 + 1*x_1 + 1*x_2 + 1*x_3 + 1*x_4 + 1*x_5 + 1*x_6 + 1*x_7 + 1*x_8 + 1*x_9 + 0
SUBJECT TO
_C1: x_0 <= 5

_C2: x_0 + x_1 <= 5

_C3: x_0 + x_1 + x_2 <= 5

_C4: x_0 + x_1 + x_2 + x_3 <= 5

_C5: x_0 + x_1 + x_2 + x_3 + x_4 <= 5

_C6: x_0 + x_1 + x_2 + x_3 + x_4 + x_5 <= 5

_C7: x_0 + x_1 + x_2 + x_3 + x_4 + x_5 + x_6 <= 5

_C8: x_0 + x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 <= 5

_C9: x_0 + x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 <= 5

_C10: x_0 + x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 + x_9 <= 5

VARIABLES
0 <= x_0 <= 1 Integer
0 <= x_1 <= 1 Integer
0 <= x_2 <= 1 Integer
0 <= x_3 <= 1 Integer
0 <= x_4 <= 1 Integer
0 <= x_5 <= 1 Integer
0 <= x_6 <= 1 Integer
0 <= x_7 <= 1 Integer
0 <= x_8 <= 1 Integer
0 <= x_9 <= 1 Integer

Which has the trivial optimal value of 5.

Further: If this really is your intent, then all you need is the last constraint. The other N-1 of them are superfluous.
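For completeness, a sketch of that simplification against the question's script (same variable names; a smaller N is used here only to keep the example quick). When the last prefix constraint dominates all the others, one aggregate constraint makes model construction linear in N:

```python
from pulp import LpProblem, LpVariable, LpMaximize, LpBinary, PULP_CBC_CMD, lpSum

N = 1000
max_y_val = 5
model = LpProblem(sense=LpMaximize)
x = [LpVariable(name=f"x_{i}", cat=LpBinary) for i in range(N)]

# One constraint on the full sum replaces all N prefix constraints
model += lpSum(x) <= max_y_val
model += lpSum(x)  # objective

status = model.solve(PULP_CBC_CMD(msg=False))
print(model.objective.value())  # 5.0
```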


1 Comment

+1 for showing me that I can actually print the model 😅 My actual model is far more complex, but I wanted to evaluate where it would break, and this chaining is one such point. The triangle in your output shows very well that it's going to be extremely large. I also didn't want to blow up the scope of this question.

This other post gave me the idea, that the space and time needed might just be extremely large.

The model does not store zero values in a matrix, but it still needs to store all the constraint coefficients. The number of entries is ~N(N+1)/2, so at a scale of N=10000 this quickly becomes about 50 million entries. Given however many bytes each entry stored in the model takes, this can easily amount to a gigabyte of memory, plus the overhead of memory copies and potentially writing the model to a file for the solver to read.
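A quick back-of-envelope computation of that count (the bytes-per-entry figure is a rough assumption, not a measured value):

```python
# Nonzero coefficients in the prefix-sum formulation: row i has i+1 entries
N = 10_000
entries = N * (N + 1) // 2
print(entries)             # 50005000, i.e. ~50 million

# At a rough guess of 16 bytes per stored coefficient:
print(entries * 16 / 1e9)  # ~0.8 (GB)
```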

I probably need to rethink my problem and optimize.
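If the prefix constraints really are needed, the usual remedy is to make y explicit model variables with bounds and link neighbours with a three-term equality each, so the model holds O(N) nonzeros instead of O(N²). A sketch with pulp, under the question's simplified objective:

```python
from pulp import LpProblem, LpVariable, LpMaximize, LpBinary, PULP_CBC_CMD, lpSum

N = 5000
max_y_val = 5
model = LpProblem(sense=LpMaximize)

x = [LpVariable(f"x_{i}", cat=LpBinary) for i in range(N)]
# y as real variables; the threshold becomes a simple variable bound
y = [LpVariable(f"y_{i}", lowBound=0, upBound=max_y_val) for i in range(N)]

model += y[0] == x[0]
for i in range(1, N):
    model += y[i] == y[i - 1] + x[i]  # 3 nonzeros per constraint -> O(N) total

model += lpSum(x)  # objective
status = model.solve(PULP_CBC_CMD(msg=False, timeLimit=60))
print(model.objective.value())
```

Each constraint now touches at most three variables, so both building and storing the model scale linearly with N.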

3 Comments

Memory is your smallest issue. Simplex-style algorithms are in trouble with many constraints (less so: variables); running simplex-style solvers on more than 100k constraints is a very, very bad idea. For those instances, either go for interior-point solvers (where memory becomes relevant again) or, if nothing else works, first-order solvers (e.g. PDLP @ ortools). Not sure, though, if there is any chance to switch away from simplex using pulp: compare with this table; not sure if Clp's IPM, for example, is accessible through pulp.
On further look: you say LP but you mean MILP (cat=LpBinary). Well then... there is not much you can do, as neither IPMs nor first-order LP solvers will help you there due to their non-competitive warm-start behaviour (needed in branch-and-bound-based MILP solvers). MILPs are NP-hard anyway.
Thank you Sascha for the hints. It's a bit above my understanding, but your suggestions give me leads on what else to try. I am not actually bound to pulp; pulp is just a way for me to gauge the capabilities of the approach. While a large number of constraints might be problematic for the solver, in this particular case just creating the model already fails.
