Reducing binary patterns in Python

Question

I've got what I think is a somewhat interesting problem, even just from a programming exercise point of view.

I have a long list of binary patterns that I want to reduce into a more compact form to present to users. The notation to be followed is that a '-' can represent either a '1' or a '0', so ['1011','1010'] could be represented by ['101-'] and

['1100', '1000', '0100', '0000', '1111', '1011', '0111', '0011']

could be represented by ['--00', '--11']. Note all patterns are always the same length (though quite possibly longer than 4 bits).

Expanding the patterns is fairly trivial, reducing them is a bit trickier.

I've come up with some code that accomplishes this, but it is long, slow, and kind of hard to read.

def reducePatterns(patterns):
    '''Reduce patterns into compact dash notation'''
    newPatterns = []  #reduced patterns
    matched = []      #indexes with a string that was already matched
    for x,p1 in enumerate(patterns):    #pattern1
        if x in matched: continue       #skip if this pattern has already been matched
        for y,p2 in enumerate(patterns[x+1:],1):
            if x+y in matched: continue #skip if this pattern has already been matched
            diffs=0     # number of differences found
            for idx,bit in enumerate(zip(p1,p2)):
                if bit[0] != bit [1]:     #count the number of bits that a different
                    diffs += 1
                    dbit  = idx
                if diffs >1:break
            if diffs ==1:   #if exactly 1 bit is different between the two, they can be compressed together
                newPatterns.append(p1[:dbit]+'-'+p1[dbit+1:])
                matched+=[x,x+y]
                break
        if x not in matched: newPatterns.append(p1) #if the pattern wasn't matched, just append it as is.

    if matched:         #if reductions occured on this run, then call again to check if more are possible.
        newPatterns = reducePatterns(newPatterns)

    return newPatterns

Does anyone out there have suggestions for a better/more efficient way to do this? More effective looping/use of iterators? Regex magic? Some bitwise manipulation package I've been missing? something a little bit more readable at least?

@Doorknob, that would match all 16 possible 4-bit strings, but there are only eight in the list given. — senderle
– senderle, Commented Feb 13, 2013 at 19:10
for the second example? not quite, as '----' would also include things like '1110' that aren't in the original set. — TheONP
– TheONP, Commented Feb 13, 2013 at 19:11
I had the same problem. I tried your code and it doesn't do the job 100% correctly. Like for example: reducePatterns(['1100', '1000', '0100', '0000', '1111', '1011', '0111, '0011']) give me the output: ['10--', '1100']. But it should be ['10--', '1_00'] — HighwayJohn
– HighwayJohn, Commented Sep 22, 2016 at 8:50
I wanted to write: reducePatterns(['1000', '1010', '1100', '1001', '1011']) — HighwayJohn
– HighwayJohn, Commented Sep 22, 2016 at 11:19

Community · Accepted Answer · 2017-05-23 12:08:23Z

5

What you are looking for is Quine–McCluskey algorithm implementation in Python.

A quick google took me to this SO Page Quine-McCluskey algorithm in Python

edited May 23, 2017 at 12:08

CommunityBot

11 silver badge

answered Feb 13, 2013 at 19:17

Abhijit

64k20 gold badges143 silver badges209 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

TheONP Over a year ago

Neat. I guess I should have figured there was a formal name for this process, there's no way I would have stumbled across it on my own though.

Zero Piraeus · Accepted Answer · 2013-02-13 20:28:31Z

1

I haven't tested this thoroughly (and it uses recursion in a probably-not-very-efficient way), but it seems to work, and at least meets your "more readable" criterion:

from itertools import combinations

def collapse(binaries):
    result = set(binaries)
    for b1, b2 in combinations(result, 2): 
        for i in range(len(b1)):
            if b1[:i] + b1[i+1:] == b2[:i] + b2[i+1:]:
                merged = "-".join([b1[:i], b1[i+1:]])
                return sorted(collapse(result ^ {b1, b2, merged}))
    return sorted(result)

Examples:

>>> collapse(['1100', '1000', '0100', '0000', '1111', '1011', '0111', '0011'])
['--00', '--11']

>>> collapse(["00", "01", "10", "11"])
['--']

>>> collapse(["011", "101", "111"])
['-11', '101']

edited Feb 13, 2013 at 20:28

answered Feb 13, 2013 at 20:22

Zero Piraeus

59.7k28 gold badges158 silver badges164 bronze badges

7 Comments

TheONP Over a year ago

Definitely nicer to look at, I'll definitely be borrowing a few of your ideas. I have to agree with you about the way you've done the recursion though...there's a lot of wasted work in there.

HighwayJohn Over a year ago

I had the same problem. I tried your code and it doesn't do the job 100% correctly. Like for example: collapse(['1100', '1000', '0100', '0000', '1111', '1011', '0111, '0011']) give me the output: ['10--', '1100']. But it should be ['10--', '1_00'] Otherwise your code is very short and looks nice. I dont get so far how your code is doing it, but perhaps you have an Idea what the mistake could be?

Zero Piraeus Over a year ago

@HighwayJohn Did you paste the wrong input into your comment? What you've shown is one of the examples from my answer, and (once the missing quote is fixed) continues to correctly return ['--00', '--11'] for me.

HighwayJohn Over a year ago

Oh, you are right. I wanted to write: collapse(['1000', '1010', '1100', '1001', '1011'])

Zero Piraeus Over a year ago

Well, ['10--', '1100'] is a correct representation of that input. ['10--', '1-00'] results in '1000' being represented twice.

|

Collectives™ on Stack Overflow

Reducing binary patterns in Python

2 Answers 2

1 Comment

7 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

1 Comment

7 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related