
I want to generate a 2x6 dataframe that represents a rack. Half of this dataframe is filled with storage items and the other half with retrieval items. I want to randomly choose half of these 12 items and label them as storage, and the rest as retrieval. How can I choose them randomly?

I tried random.sample, but it chooses random columns. I actually want to choose random items individually.

  • Providing text instead of images helps to get faster recommendations from the community Commented Mar 29, 2022 at 19:46

1 Answer

Assuming this input:

   0  1  2  3   4   5
0  0  1  2  3   4   5
1  6  7  8  9  10  11

You can craft a random numpy array to select/mask half of the values:

import numpy as np

# half True, half False, shuffled in place, then reshaped to match df
a = np.repeat([True, False], df.size//2)
np.random.shuffle(a)
a = a.reshape(df.shape)

Then select your two groups:

df.mask(a)
     0   1    2    3   4     5
0  NaN NaN  NaN  3.0   4   NaN
1  6.0 NaN  8.0  NaN  10  11.0

df.where(a)
     0  1    2    3   4    5
0  0.0  1  2.0  NaN NaN  5.0
1  NaN  7  NaN  9.0 NaN  NaN
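
For reference, the steps above can be put together as one self-contained sketch. The seeded generator (`np.random.default_rng`) and the names `storage` and `retrieval` are assumptions added here for illustration; which half counts as storage is arbitrary.

```python
import numpy as np
import pandas as pd

# The 2x6 rack from the example input, holding items 0..11.
df = pd.DataFrame(np.arange(12).reshape(2, 6))

# Seeded generator so the split is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(42)

# Exactly half True and half False, shuffled, then shaped like the rack.
a = np.repeat([True, False], df.size // 2)
rng.shuffle(a)
a = a.reshape(df.shape)

storage = df.where(a)    # keeps values where a is True, NaN elsewhere
retrieval = df.mask(a)   # keeps values where a is False, NaN elsewhere

print(storage)
print(retrieval)
```

Every cell ends up in exactly one of the two frames, so the halves never overlap.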

If you simply want 6 random elements, use numpy.random.choice:

np.random.choice(df.to_numpy().ravel(), 6, replace=False)

Example:

array([ 4,  5, 11,  7,  8,  3])
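
If the other six items are needed as well, a minimal sketch could look like the following. It assumes all items in the rack are unique (as in the example input), since `np.setdiff1d` works on unique values; the names `picked` and `rest` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(2, 6))
rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

values = df.to_numpy().ravel()

# Draw half of the items without replacement.
picked = rng.choice(values, size=values.size // 2, replace=False)

# The remaining items (sorted); assumes the items are unique.
rest = np.setdiff1d(values, picked)

print(picked)
print(rest)
```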

6 Comments

df.mask and df.where are the exact solutions I've been looking for, thanks a lot. But for the dataframe operations how can I get rid of NaN values?
@GTek what would you want as output? You can pick any fill value in mask/where, for example df.mask(a, -999)
I want to calculate the index and column distances for each item in the df.mask(a) and df.where(a) dataframes. For example, 4 is at (0,4) in df.mask(a) and 1 is at (0,1) in df.where(a), so the index distance is 0-0=0 and the column distance is 4-1=3. I want to calculate this for each pair.
Ok, then you can stack, this will get rid of the NaNs and you'll get the row/col as MultiIndex
When I stack, I get an error like 'Series' object has no attribute 'DataFrame' when I try an operation like this:
for r in df.mask(a).index:
    for c in df.mask(a).columns:
        g = dfmask.at[r,c]
        #dict3 = { i : [r,c] }
        for r2 in df.where(a).index:
            for c2 in df.where(a).columns:
                t = dfwhere.at[r2,c2]
                #dict2 = { j : [int(r2),int(c2)] }
                if r>=r2:
                    VD.at[g,t]= int(r)-int(r2)
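
The stacking idea from the comments can be sketched without nested loops, roughly as below. This is only one possible approach, not necessarily what the answerer had in mind: the helper `positions` and the column names (`row_dist`, `col_dist`, and so on) are made up for illustration, and the cross join (`how="cross"`) needs pandas 1.2 or later. `dropna()` is called explicitly so the result does not depend on how a given pandas version's `stack` treats NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(2, 6))
rng = np.random.default_rng(1)  # arbitrary seed
a = np.repeat([True, False], df.size // 2)
rng.shuffle(a)
a = a.reshape(df.shape)


def positions(frame):
    """Return one row per non-NaN cell with columns row, col, item."""
    s = frame.stack().dropna()          # index becomes (row, col)
    out = s.rename("item").reset_index()
    out.columns = ["row", "col", "item"]
    return out.astype(int)


ret = positions(df.mask(a))    # items kept by df.mask(a)
sto = positions(df.where(a))   # items kept by df.where(a)

# Every (mask item, where item) pair: 6 x 6 = 36 rows.
pairs = ret.merge(sto, how="cross", suffixes=("_ret", "_sto"))

# Signed distances, mask position minus where position.
pairs["row_dist"] = pairs["row_ret"] - pairs["row_sto"]
pairs["col_dist"] = pairs["col_ret"] - pairs["col_sto"]

print(pairs[["item_ret", "item_sto", "row_dist", "col_dist"]])
```

The distances here are signed; call `.abs()` on the two columns if only the magnitude matters.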
