I have a waveform object, defined as follows:
class wfm:
    """Class defining a waveform characterized by:
        - A name
        - An electrode configuration
        - An amplitude (mA)
        - A pulse width (microseconds)"""

    def __init__(self, name, config, amp, width=300):
        self.name = name
        self.config = config
        self.amp = amp
        self.width = width

    def __eq__(self, other):
        return type(other) is self.__class__ and other.name == self.name and other.config == self.config and other.amp == self.amp and other.width == self.width

    def __ne__(self, other):
        return not self.__eq__(other)
Through parsing, I get a list called waveforms with 770 instances of wfm in it. There are a lot of duplicates, and I need to delete them.
My idea was to get the index (ID) of each equivalent object, store the larger ID of each matching pair in a list, and then loop over the waveforms from the end, popping each duplicate.
Code:
duplicate_ID = []
for i in range(len(waveforms)):
    for j in range(i+1, len(waveforms)):
        if waveforms[i] == waveforms[j]:
            duplicate_ID.append(waveforms.index(waveforms[j]))
            print('{} eq {}'.format(i, j))
duplicate_ID = list(set(duplicate_ID))  # If I don't do that: 17k IDs
It turns out (thanks to the print) that some duplicates don't appear in the ID list. For instance, 750 is a duplicate of 763 (the print says so, and a manual test confirms it), yet neither of these two IDs appears in my duplicate list.
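One plausible explanation for the missing IDs, shown as a small sketch rather than a confirmed diagnosis: list.index compares with == and returns the position of the first equal element, so waveforms.index(waveforms[j]) reports the earliest equal waveform, not j. The Wfm class and the four-element list below are hypothetical stand-ins, trimmed down for illustration.

```python
class Wfm:
    """Trimmed, hypothetical stand-in for the wfm class (two fields only)."""

    def __init__(self, name, amp):
        self.name = name
        self.amp = amp

    def __eq__(self, other):
        return (type(other) is self.__class__
                and (other.name, other.amp) == (self.name, self.amp))


# Positions 0, 2 and 3 hold equal waveforms.
waveforms = [Wfm('a', 1), Wfm('b', 2), Wfm('a', 1), Wfm('a', 1)]

# list.index uses == and stops at the first match, so asking for the
# element stored at position 3 gives back position 0.
first_match = waveforms.index(waveforms[3])

# Recording j directly keeps the position that was actually compared.
duplicate_ids = set()
for i in range(len(waveforms)):
    for j in range(i + 1, len(waveforms)):
        if waveforms[i] == waveforms[j]:
            duplicate_ids.add(j)
duplicate_ids = sorted(duplicate_ids)
```

With the positions collected this way, popping from the end (for idx in reversed(duplicate_ids)) would leave the first occurrence of each waveform in place.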
I'm quite sure there is a better solution than this method (which doesn't work yet), and I would be glad to hear it. Thanks for the help!
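As a minimal sketch of one alternative, the duplicates can be dropped by building a new list that keeps only the first occurrence of each waveform. Two variants are shown: one that relies only on ==, and one that uses a set and therefore needs a __hash__ consistent with __eq__. The _key helper, the __hash__ method, the function names and the sample values such as 'E1-E2' are assumptions added for illustration; the hashed variant also assumes config is hashable (a string or tuple, say).

```python
class wfm:
    """Same fields as the class above, plus a __hash__ (an addition, not in the original)."""

    def __init__(self, name, config, amp, width=300):
        self.name = name
        self.config = config
        self.amp = amp
        self.width = width

    def _key(self):
        # Hypothetical helper: the tuple of fields that defines equality.
        return (self.name, self.config, self.amp, self.width)

    def __eq__(self, other):
        return type(other) is self.__class__ and other._key() == self._key()

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        # Assumes every field, config included, is hashable.
        return hash(self._key())


def dedupe_eq_only(items):
    """Keep the first occurrence of each item using only ==. Quadratic time."""
    unique = []
    for item in items:
        if item not in unique:
            unique.append(item)
    return unique


def dedupe_hashed(items):
    """Keep the first occurrence of each item using a set of seen items."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


# Made-up sample data: the third entry repeats the first one.
waveforms = [
    wfm('w1', 'E1-E2', 1.0),
    wfm('w2', 'E1-E2', 2.0),
    wfm('w1', 'E1-E2', 1.0),
    wfm('w1', 'E1-E2', 1.0, width=200),
]
unique_slow = dedupe_eq_only(waveforms)
unique_fast = dedupe_hashed(waveforms)
```

For a list of 770 items the quadratic variant is likely fast enough, and it works with the class exactly as posted, since it never hashes anything.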
EDIT: More complicated scenario
I've got a more complicated scenario. I have two classes, wfm (see above) and stim:
class stim:
    """Class defining the waveform used for a stimulation by:
        - Duration (milliseconds)
        - Frequency (Hz)
        - Pattern of waveforms"""

    def __init__(self, dur, f, pattern):
        self.duration = dur
        self.fq = f
        self.pattern = pattern

    def __eq__(self, other):
        return type(other) is self.__class__ and other.duration == self.duration and other.fq == self.fq and other.pattern == self.pattern

    def __ne__(self, other):
        return not self.__eq__(other)
I parse my files to fill a dict, paradigm. It looks like this:
paradigm[file name STR] = (list of waveforms, list of stimulations)
# example:
paradigm['myfile.xml'] = ([wfm1, ..., wfm10], [stim1, ..., stim5])
Once again, I want to delete the duplicates. A stimulation counts as a duplicate of one in another file when:
- The waveforms are the same
- And the stim is the same
Example:
file1 has 10 waveforms and file2 has the same 10 waveforms.
file1 has stim1 and stim2; file2 has stim3, stim4 and stim5.
stim1 and stim3 are the same, so since the waveforms are also the same, I want to keep:
file1: 10 waveforms and stim1 and stim2
file2: 10 waveforms and stim4 and stim5
The relationships are kind of messy in my head, so I'm having difficulty finding the right storage solution for waveforms and stimulations in order to compare them easily. If you have any ideas, I'd be glad to hear them. Thanks!
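One possible way to organize this, as a rough sketch under stated assumptions: group files by their waveform list, remember which stimulations have already been kept for each group, and drop any later stimulation that equals one of them. The function name drop_repeated_stims, the sample file names and all sample values are hypothetical. The sketch assumes that two waveform lists count as "the same" only when they hold equal waveforms in the same order, and that files are processed in sorted name order so the earlier file keeps the shared stimulation. The two classes are condensed stand-ins that compare all attributes, like the originals.

```python
class wfm:
    """Condensed stand-in for the wfm class above."""

    def __init__(self, name, config, amp, width=300):
        self.name, self.config, self.amp, self.width = name, config, amp, width

    def __eq__(self, other):
        return type(other) is self.__class__ and vars(other) == vars(self)


class stim:
    """Condensed stand-in for the stim class above."""

    def __init__(self, dur, f, pattern):
        self.duration, self.fq, self.pattern = dur, f, pattern

    def __eq__(self, other):
        return type(other) is self.__class__ and vars(other) == vars(self)


def drop_repeated_stims(paradigm):
    """Return a new dict where a stim already kept for an equal waveform list is dropped."""
    groups = []   # pairs of (waveform list, stims already kept for that list)
    cleaned = {}
    for fname in sorted(paradigm):
        wfms, stims = paradigm[fname]
        # Look for an earlier file with an equal waveform list (uses == only).
        for known_wfms, seen in groups:
            if known_wfms == wfms:
                break
        else:
            seen = []
            groups.append((wfms, seen))
        kept = []
        for s in stims:
            if s not in seen:
                seen.append(s)
                kept.append(s)
        cleaned[fname] = (wfms, kept)
    return cleaned


def make_waveforms():
    # Made-up data: ten waveforms that compare equal across calls.
    return [wfm('w{}'.format(i), 'E1-E2', 1.0) for i in range(10)]


wf1, wf2 = make_waveforms(), make_waveforms()
paradigm = {
    'file1.xml': (wf1, [stim(100, 50, wf1[:2]), stim(200, 50, wf1[:2])]),
    'file2.xml': (wf2, [stim(100, 50, wf2[:2]), stim(300, 20, wf2[:3]),
                        stim(400, 10, wf2[:1])]),
}
cleaned = drop_repeated_stims(paradigm)
```

Because only == is used, neither class needs a __hash__. A side effect of this sketch is that a stimulation repeated inside a single file is dropped as well.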
width a default argument but placed the value 300 in a wrong place, right?
class. WFM = namedtuple('WFM', ['name', 'config', 'amp', 'width']). Then calling set() on a list of instances of WFM will remove all the duplicates.
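The namedtuple suggestion from the comment can be sketched as follows. The sample values are made up, and the defaults=[300] argument (available from Python 3.7) is an addition here to mirror the width=300 default of the original class. It assumes every field, config included, is hashable.

```python
from collections import namedtuple

# defaults applies to the rightmost fields, so only width gets a default.
WFM = namedtuple('WFM', ['name', 'config', 'amp', 'width'], defaults=[300])

# Made-up sample data: the third entry repeats the first one.
waveforms = [
    WFM('w1', 'E1-E2', 1.0),
    WFM('w2', 'E1-E2', 2.0),
    WFM('w1', 'E1-E2', 1.0, 300),
]

# set() removes duplicates but does not preserve the original order.
unique_unordered = set(waveforms)

# dict.fromkeys keeps the first occurrence of each key in insertion order.
unique_ordered = list(dict.fromkeys(waveforms))
```

One difference from the hand-written __eq__: a namedtuple also compares equal to a plain tuple holding the same values, so the type check of the original class is lost.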