
I have a binary array. I want to shuffle it so that n percent of its elements, chosen at random, stay in the same place and the rest get shuffled. Put another way: I have a binary array and I want to create a second array of the same length that is n percent similar to the first when the two are compared using difflib.

I'm using random.shuffle to shuffle the array but can't find info on the percent part of my question.

import random

array = [1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0]
random.shuffle(array)
print(array)

2 Answers


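If you are open to using numpy, you can create a boolean mask that selects approximately a fraction n of arr, shuffle the selected elements, then write the shuffled result back into the masked locations. Note that n here is the fraction that gets shuffled, so the remaining 1 - n of the elements stay in place.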

import numpy as np

n = 0.2
arr = np.array([1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0])

# create boolean mask of about fraction n
ix = np.random.choice([True, False], size=arr.size, replace=True, p=[n, 1-n])

# pull the masked portion and shuffle
arr_shuff = arr[ix]
np.random.shuffle(arr_shuff)

# reinsert
arr[ix] = arr_shuff

arr
# returns:
array([1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0])

4 Comments

I get the error "AttributeError: module 'numpy' has no attribute 'shuffle'" when testing this code, but np.random.shuffle seems to work.
Oops, it is supposed to be np.random.shuffle. Fixed now.
This may be a separate question, but if I want to use this code to return several shuffled arrays (all different, but based on the same initial array), how might I do it? When I use a for loop such as "for i in range(10): print(arr)", I get the same shuffled array several times.
Wrap the code in a function that takes an array as input. The first line of the function should create a copy of the array using arr.copy(); then perform the operations on the copy so that the original array is not changed.
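The suggestion in the last comment could be sketched roughly as follows. The function name partial_shuffle and the optional rng argument are illustrative choices, not part of the original answer, and the sketch uses numpy's Generator API (np.random.default_rng) in place of the np.random.* functions used above. As in the answer, n is the approximate fraction of positions that get shuffled.

```python
import numpy as np

def partial_shuffle(arr, n, rng=None):
    """Return a copy of arr with roughly a fraction n of its positions shuffled.

    Hypothetical helper: the name and the rng parameter are assumptions
    made for this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(arr).copy()          # work on a copy, leave the input untouched
    mask = rng.random(out.size) < n     # each position is selected with probability n
    picked = out[mask]                  # boolean indexing returns a copy of the selection
    rng.shuffle(picked)                 # shuffle only the selected values
    out[mask] = picked                  # write them back into the same positions
    return out

original = np.array([1] * 10 + [0] * 10)
for _ in range(3):
    # each call draws a new mask and a new shuffle from the same original
    print(partial_shuffle(original, 0.2))
```

Because every call starts from a fresh copy of the original, the loop prints independently shuffled arrays instead of repeating one result.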

I don't think you will find a standard function for this. You can swap n * len(array) of the items with other randomly chosen items in the array:

import random

array = [1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0]
n = 0.2

for i in range(int(n*len(array))):
    ri = random.randrange(len(array))
    array[i], array[ri] = array[ri], array[i]

print(array)

However, if the array contains only zeros or only ones, the swaps will produce no difference.
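If an exact number of untouched positions matters, one possible variant is to sample the positions that are allowed to move and shuffle only the values at those positions. This is a sketch, not a standard recipe: the name shuffle_keeping and its parameters are hypothetical, and keep_fraction follows the question's wording (the fraction that stays put) rather than the n used above.

```python
import difflib
import random

def shuffle_keeping(array, keep_fraction, rng=random):
    """Return a new list in which at least keep_fraction of the positions are unchanged.

    Hypothetical helper: the name and parameters are assumptions for this sketch.
    """
    result = list(array)                              # copy, so the input is not modified
    n_move = len(result) - round(keep_fraction * len(result))
    positions = rng.sample(range(len(result)), n_move)  # distinct positions allowed to move
    values = [result[p] for p in positions]
    rng.shuffle(values)                               # shuffle only the selected values
    for p, v in zip(positions, values):
        result[p] = v
    return result

array = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
result = shuffle_keeping(array, 0.8)
print(result)

# Fraction of positions that hold the same value in both lists
print(sum(a == b for a, b in zip(array, result)) / len(array))

# difflib's ratio is based on matching subsequences, not on positions,
# so it will generally differ from the positional fraction above
print(difflib.SequenceMatcher(None, array, result).ratio())
```

In a binary array a moved value often lands on an equal value, so the positional match is usually higher than keep_fraction; the guarantee is only a lower bound.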
