I am using Pyomo to build a model that depends on input data from a pandas data frame. In the model, I need a binary variable for every (column, row) pair in the data frame whose entry is greater than zero.

So far I am doing the following. It works, but it creates a variable for every (column, row) pair, which is far more than I need.

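import pandas as pd
from pyomo.environ import Binary, ConcreteModel, Var

df = pd.read_csv(...)
cols = df.columns.tolist()
rows = df.index.tolist()

model = ConcreteModel()
# One binary variable per (column, row) pair, including the zero entries
model.myVars = Var(cols, rows, within=Binary, name="myVars")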

What's the easiest and most elegant way to only generate a variable in myVars for c in cols and r in rows if df[c][r] > 0?

1 Answer

If I understood correctly, you need the indexes of the elements that are greater than zero. I usually solve such tasks with the numpy.where() function.

Consider the following dataset:

import numpy as np
import pandas as pd

d = np.zeros((10,5))
rows = np.array([1,5,8,5])
cols = np.array([0,1,4,3])
d[rows,cols] = 0.5  # set four entries to a positive value
df = pd.DataFrame(d,columns=["a","b","c","d","e"])

Here df is filled with zeros except for four entries, which are set to 0.5.

Finding the positions of the elements greater than zero:

M = df.to_numpy()
rows, cols = np.where(M>0)
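
To make the return value concrete, here is a minimal sketch on the same toy array. With a single condition argument, np.where returns one array of integer positions per axis, in row-major order. The name positions is only for illustration.

```python
import numpy as np

# Same toy data as above: zeros except four entries set to 0.5
M = np.zeros((10, 5))
M[[1, 5, 8, 5], [0, 1, 4, 3]] = 0.5

# One array of integer positions per axis
rows, cols = np.where(M > 0)

# Pair them up as (row position, column position)
positions = list(zip(rows.tolist(), cols.tolist()))
print(positions)  # [(1, 0), (5, 1), (5, 3), (8, 4)]
```

Note that these are integer positions, not labels, which is why the next step maps them through df.index and df.columns.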

Then extracting the dataframe row/column names:

col_names = df.columns[cols].to_list()
row_names = df.index[rows].to_list()

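If you need unique pairs (for example, when index or column labels repeat), you can put col_names and row_names into a new dataframe and drop duplicates. Note that keep=False would discard every pair that occurs more than once, so the default, which keeps the first occurrence, is used here:

pairs = pd.DataFrame({"r":row_names,"c":col_names})
df_unique = pairs.drop_duplicates(subset = ['r','c'])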