I've got what I think is a somewhat interesting problem, even just from a programming exercise point of view.
I have a long list of binary patterns that I want to reduce into a more compact form to present to users. The notation to be followed is that a '-' can represent either a '1' or a '0', so ['1011','1010'] could be represented by ['101-'] and
['1100', '1000', '0100', '0000', '1111', '1011', '0111', '0011']
could be represented by ['--00', '--11']. Note all patterns are always the same length (though quite possibly longer than 4 bits).
Expanding the patterns is fairly trivial, reducing them is a bit trickier.
I've come up with some code that accomplishes this, but it is long, slow, and kind of hard to read.
def reducePatterns(patterns):
'''Reduce patterns into compact dash notation'''
newPatterns = [] #reduced patterns
matched = [] #indexes with a string that was already matched
for x,p1 in enumerate(patterns): #pattern1
if x in matched: continue #skip if this pattern has already been matched
for y,p2 in enumerate(patterns[x+1:],1):
if x+y in matched: continue #skip if this pattern has already been matched
diffs=0 # number of differences found
for idx,bit in enumerate(zip(p1,p2)):
if bit[0] != bit [1]: #count the number of bits that a different
diffs += 1
dbit = idx
if diffs >1:break
if diffs ==1: #if exactly 1 bit is different between the two, they can be compressed together
newPatterns.append(p1[:dbit]+'-'+p1[dbit+1:])
matched+=[x,x+y]
break
if x not in matched: newPatterns.append(p1) #if the pattern wasn't matched, just append it as is.
if matched: #if reductions occured on this run, then call again to check if more are possible.
newPatterns = reducePatterns(newPatterns)
return newPatterns
Does anyone out there have suggestions for a better/more efficient way to do this? More effective looping/use of iterators? Regex magic? Some bitwise manipulation package I've been missing? something a little bit more readable at least?
----:P'----'would also include things like'1110'that aren't in the original set.reducePatterns(['1100', '1000', '0100', '0000', '1111', '1011', '0111, '0011'])give me the output:['10--', '1100']. But it should be['10--', '1_00']