I want to evaluate a binary function at M different locations. The function takes D binary inputs, so its input space has 2^D points (i.e. all "strings" in {0,1}^D). I want to evaluate it at M unique points that are uniformly/randomly distributed over that input space. I know how to generate all 2^D inputs if I wanted to (essentially, tack a 0 or a 1 onto every previously generated array; (pseudo)code in the appendix). However, that's exponential, and I only need a fraction of the space, M points, not all 2^D. Is there a quick way to do it? I thought of generating the binary arrays randomly and, on a collision, trying another value until it was unique. As in:
i = 0
while i < M:
    b_D = randomly_generate_bit_array_length(D)  # selects each bit independently with 1/2 prob of 0 or 1
    if b_D not in HashTable:
        HashTable[b_D] = b_D
        i += 1
However, according to the answer I got here, that algorithm seems exponential. Is there a better way to do this, where the runtime depends only on M instead of exponentially on D?
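For reference, here is a minimal runnable sketch of that rejection-sampling idea using only the standard library (the function name and the integer encoding of bit arrays are my own choices, not from the question):

```python
import random

def sample_unique_bitstrings(D, M, seed=None):
    """Draw M distinct points from {0,1}^D by rejection sampling.

    Expected work is O(M) draws as long as M is a small fraction
    of 2^D, since the collision probability stays low until the
    space is nearly exhausted.
    """
    if M > 2 ** D:
        raise ValueError("cannot draw more than 2^D distinct points")
    rng = random.Random(seed)
    seen = set()
    while len(seen) < M:
        # getrandbits(D) is a uniform D-bit integer; re-adding a
        # duplicate to the set is a no-op, which is the "rejection"
        seen.add(rng.getrandbits(D))
    # unpack each integer into a list of D bits
    return [[(x >> k) & 1 for k in range(D)] for x in seen]
```

Encoding each array as one integer keeps the hash-set membership test O(1) per draw.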
Notice that it's not an acceptable answer to generate permutations in a fixed order until we have M points, because I need the points to be randomly distributed among the binary arrays. I.e., starting from [0,0,...,0], then [1,0,...,0], and so on until we have M is not what I want, because most of my arrays would end in 0's, and I want an unbiased distribution over them.
To generate all permutations:
def _generate_input_data_full(D):
    '''
    full means generate all 2^D (bits encoded as -1/1 here).
    '''
    prev_all_permutations = [[-1], [1]]
    current_all_permutations = prev_all_permutations
    for k in range(D - 1):  # each pass extends every array by one more bit
        current_all_permutations = []
        for prev_perm in prev_all_permutations:
            for b in [-1, 1]:
                new_perm = prev_perm + [b]
                current_all_permutations.append(new_perm)
        prev_all_permutations = current_all_permutations
    return current_all_permutations
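As an aside, the full enumeration above is equivalent to a one-liner over the Cartesian product; a sketch (keeping the same -1/1 bit encoding as the appendix code):

```python
from itertools import product

def generate_input_data_full(D):
    # all 2^D length-D arrays over {-1, 1}, same output as the
    # explicit prefix-extension loop, in lexicographic order
    return [list(p) for p in product([-1, 1], repeat=D)]
```

This is still exponential, of course; it just replaces the hand-rolled loop.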
Comments:

- D=8, M=60,000, but I might increase it to D=256 and M=120,000.
- D=8, M=60,000 doesn't even work (you can't find 60,000 different points when there are only 256). And with D=256 and M=120,000 you're practically guaranteed to not have any collisions, so it's O(MD). It only degenerates to being really bad if you're somewhat close to exhausting the 2^D space.
- I tend to forget that small D actually leads to reasonable numbers (I was thinking too much about the rate of growth).
- See the random.sample source code. It picks one of two different algorithms, depending on how many elements are wanted and on how large the population is.
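Following the random.sample pointer above, a sketch of sampling without manual collision handling (the function name is mine; note that random.sample calls len() on its population, which on CPython limits this approach to roughly D <= 62, so it is an assumption that your D stays in that range; for larger D, rejection sampling still applies):

```python
import random

def sample_unique_bitstrings_small_D(D, M, seed=None):
    """Uniformly sample M distinct points of {0,1}^D via
    random.sample over a lazy range (no materialized 2^D list)."""
    rng = random.Random(seed)
    ints = rng.sample(range(2 ** D), M)  # M distinct integers, uniform
    # unpack each integer into a list of D bits
    return [[(x >> k) & 1 for k in range(D)] for x in ints]
```

random.sample internally switches between a partial-shuffle pool and a selection set depending on how M compares to 2^D, which is exactly the behavior the comment above describes.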