1

I am working on optimizing the decision threshold of a trained decision model for a binary case (targets 0 and 1). I want to optimize the decision threshold of the model, i.e. the point at which the model's probability prediction is interpreted as positive. Example:

prediction is 0.7, default threshold is 0.5 -> positive prediction because 0.7 > 0.5

I know that I can use techniques such as looking at the false-positive ratio to determine a good threshold value. However, I want to do it using constraints.

Specifically, I want to optimize the following equation and corresponding constraints:

Equations showing cost minimization target over the number of false positives and false negatives

Here, c_fp are the costs for a false positive count, I is an indicator function, and tau is the decision threshold. threshold_fp is the total allowed costs for false positive predictions. Likewise for the false negatives (fn). f(x_i) is my model's f prediction, and y_i is the ground truth for input sample x_i.

I am struggling to express this as an optimization problem in cvxpy.

The main challenge for me is: how can I express the total count of false positives (likewise for the false negatives)? This essentially boils down to how to encode a false positive? Here's what I came up with so far:

import numpy as np
import cvxpy as cp

predicted_probabilities = np.array([0.3, 0.4, 0.7, 0.2, 0.8, 0.8, 0.9])
y_test = np.array([0, 0, 1, 0, 1, 0, 1])

c_fp = 100
c_fn = 1

false_positives = np.sum((predicted_probabilities >= 0.5) & (y_test == 0))
false_negatives = np.sum((predicted_probabilities < 0.5) & (y_test == 1))

current_cost = c_fp * false_positives + c_fn * false_negatives
print(f"Current cost: {current_cost}")

threshold = cp.Variable()

false_positives = cp.sum(cp.multiply((predicted_probabilities >= threshold), (y_test == 0)))

false_negatives = cp.sum(cp.multiply((predicted_probabilities < threshold), (y_test == 1)))

cost = c_fp * false_positives + c_fn * false_negatives

problem = cp.Problem(cp.Minimize(cost))

problem.solve()

print(f"Optimal threshold: {threshold.value}")

But I am always coming back to this error: TypeError: float() argument must be a string or a real number, not 'Inequality'

Searching SO, I found a question about this TypeError in CVXPY: float() argument must be a string or a number, not 'Inequality'

This however deals with a different problem.

After doing some experimenting with this, I appreciate input on how I can model this as an optimization problem. I am aware of the rules of expressing AND, OR as constraints (https://cs.stackexchange.com/questions/12102/express-boolean-logic-operations-in-zero-one-integer-linear-programming-ilp/43884#43884), but I can't get my head around how to connect that to use in my optimization case.

Your support here is appreciated.

==============

Edit: Thanks to Alex Kemper's answer, I was able to solve this optimization problem using z3py:

#  https://ericpony.github.io/z3py-tutorial/guide-examples.htm
from copy import copy
from z3 import z3
import numpy as np

predicted_probabilities = np.array([0.3, 0.4, 0.7, 0.2, 0.8, 0.8, 0.9])
y_test = np.array([0, 0, 1, 0, 1, 0, 1])

c_fp = 100
c_fn = 10

false_positives = np.sum((predicted_probabilities >= 0.5) & (y_test == 0))
false_negatives = np.sum((predicted_probabilities < 0.5) & (y_test == 1))

current_cost = c_fp * false_positives + c_fn * false_negatives
print(f"Current cost: {current_cost}")
print(f"False positives: {false_positives}")
print(f"False negatives: {false_negatives}")

n = len(y_test)
iRange = range(n)

threshold = z3.Real("threshold")

false_positives = z3.Sum([(predicted_probabilities[i] >= threshold) for i in iRange if (y_test[i] == 0)])
false_negatives = z3.Sum([(predicted_probabilities[i] < threshold) for i in iRange if y_test[i] == 1])
cost = c_fp * false_positives + c_fn * false_negatives

opt = z3.Optimize()
opt.minimize(cost)
if opt.check() == z3.sat:
    m = opt.model()
    
    t = m[threshold].as_decimal(10)
    t = float(t)
    print(f"Optimal threshold: {t}")

    false_positives = np.sum((predicted_probabilities >= t) & (y_test == 0))
    false_negatives = np.sum((predicted_probabilities < t) & (y_test == 1)) 
    cost = c_fp * false_positives + c_fn * false_negatives
    print(f"Optimal cost: {cost}")
    print(f"False positives: {false_positives}")
    print(f"False negatives: {false_negatives}")

1 Answer 1

2

Switching variables map inequalities to Booleans

My solution uses a Boolean array above[] as auxiliary variable to tell, which probabilities are above the threshold. This allows to count the false positives and false negatives.

import numpy as np
import cvxpy as cp

predicted_probabilities = np.array([0.3, 0.4, 0.7, 0.2, 0.8, 0.8, 0.9])
y_test = np.array([0, 0, 1, 0, 1, 0, 1])

c_fp = 100
c_fn = 1

false_positives = np.sum((predicted_probabilities >= 0.5) & (y_test == 0))
false_negatives = np.sum((predicted_probabilities < 0.5) & (y_test == 1))

current_cost = c_fp * false_positives + c_fn * false_negatives
print(f"Current cost: {current_cost}")

threshold = cp.Variable()

n = len(y_test)
iRange = range(n)

#  store the differences between probabilities and threshold as array
comparisons = [ (predicted_probabilities[i] - threshold) for i in iRange ]
#  see https://stackoverflow.com/a/79083660/1911064
M = 10000 # big M - a number that is larger than any possible value of comparisons[]
above = cp.Variable(n, boolean=True)

# above[i] can only be zero if comparisons[i] is non-positive
constraints = [cp.pos(comparisons[i]) <= M * above[i] for i in iRange]
# above[i] can only be one if comparisons[i] is non_negative
constraints += [cp.neg(comparisons[i]) <= M * (1-above[i]) for i in iRange] 

false_positives = cp.sum([above[i] for i in iRange if (y_test[i] == 0)])
false_negatives = cp.sum([1 - above[i] for i in iRange if (y_test[i] == 1)])
cost = c_fp * false_positives + c_fn * false_negatives

objective = cp.Minimize(cost)                         
problem = cp.Problem(objective, constraints)

problem.solve()

print(f"Optimal threshold: {threshold.value}")


I wonder if the constraints can be written more elegantly by using element-wise functions.


z3py as alternative to cvxpy

Using z3py, the following short model would be sufficient:

#  https://ericpony.github.io/z3py-tutorial/guide-examples.htm
from z3 import z3

predicted_probabilities = [0.3, 0.4, 0.7, 0.2, 0.8, 0.8, 0.9]
y_test =                  [  0,   0,   1,   0,   1,   0,   1]

c_fp = 100
c_fn = 1

n = len(y_test)
iRange = range(n)

threshold = z3.Real("threshold")

false_positives = z3.Sum([(predicted_probabilities[i] > threshold) for i in iRange if (y_test[i] == 0)])
false_negatives = z3.Sum([(predicted_probabilities[i] <= threshold) for i in iRange if y_test[i] == 1])
cost = c_fp * false_positives + c_fn * false_negatives

opt = z3.Optimize()
opt.minimize(cost)
if opt.check() == z3.sat:
    m = opt.model()
    print(f"Optimal threshold: {m[threshold]}")

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.