Simplify the use of multiple hashings in Python

Question

I have a CSV file with about 700 rows and 3 columns, containing label, rgb and string information, e.g.:

str;      rgb;                   label;         color
bones;    "['255','255','255']"; 2;             (241,214,145)
Aorta;    "['255','0','0']";     17;            (216,101,79)
VenaCava; "['0','0','255']";     16;            (0,151,206)

I'd like to create a simple method to convert one unique input to one unique output.

One solution would be to hash all ROIDisplayColor entries to their corresponding label entries in a dictionary, e.g. rgb2label:

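import csv

with open(r"c:\my_file.csv", newline="") as csv_file:
    rgb2label, label2rgb = {}, {} # rgb2str, label2str, str2label...
    # the sample above is ';'-separated; skipinitialspace tolerates padding after a separator
    for row in csv.reader(csv_file, delimiter=";", skipinitialspace=True):
        rgb2label[row[1]] = row[2]
        label2rgb[row[2]] = row[1]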

This could simply be used as follows:

>>> rgb2label[ "['255','255','255']"]
'2'
>>> label2rgb['2']
"['255','255','255']"

The application is simple but requires a unique dictionary for every relation (rgb2label, rgb2str, str2rgb, str2label, etc.).

Does a more compact solution with the same ease of use exist?

A hash does not guarantee that a != b -> hash[a] != hash[b], only that a == b -> hash[a] == hash[b].
– chepner, Commented Feb 14, 2019 at 14:33
@amanb I've added a sample of three rows with headers to the question.
– M.G.Poirot, Commented Feb 14, 2019 at 14:58
@chepner thanks for your response. I know, but I happen to know my data is presented this way.
– M.G.Poirot, Commented Feb 14, 2019 at 14:59

djoffe · Accepted Answer · 2019-02-14 15:46:23Z

Here you're limiting yourself to one-to-one dictionaries, so you end up with loads of them (one per ordered pair of columns, so 4 × 3 = 12 for the four columns here).

You could instead use one-to-many dictionaries, so you'll have only 4:

rgb, label = {}, {}  # one dictionary per key column
for row in csv.reader(csv_file):
    rgb[row[1]] = row    # the whole row is the value
    label[row[2]] = row

You would use them like this:

>>> rgb[ "['255','255','255']"][2]
'2'
>>> label['2'][1]
"['255','255','255']"

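Both lookups assume that every rgb and label value occurs in exactly one row; with plain dictionary assignment, a later row with the same key silently replaces the earlier one. A minimal sketch of a check for that, assuming the semicolon-separated layout from the question (the inline sample and the helper name `find_duplicates` are made up for illustration):

```python
import csv
import io
from collections import Counter

# Inline stand-in for the real file; layout copied from the question's sample.
SAMPLE = """\
str;rgb;label;color
bones;"['255','255','255']";2;(241,214,145)
Aorta;"['255','0','0']";17;(216,101,79)
VenaCava;"['0','0','255']";16;(0,151,206)
"""

def find_duplicates(rows, column):
    """Return the values that occur more than once at the given column index."""
    counts = Counter(row[column] for row in rows)
    return [value for value, n in counts.items() if n > 1]

reader = csv.reader(io.StringIO(SAMPLE), delimiter=";")
next(reader)            # skip the header line
rows = list(reader)

# An empty list means the column is safe to use as a dictionary key.
duplicate_rgb = find_duplicates(rows, 1)
duplicate_label = find_duplicates(rows, 2)
```

If either list comes back non-empty, the corresponding column cannot act as a unique key without losing rows.
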
You could make this clearer by turning your row into a dict as well:

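rgb, label = {}, {}
for row in csv.reader(csv_file):
    # distinct names, so the rgb and label dictionaries are not overwritten
    name, rgb_value, label_value, color = row
    d = {"rgb": rgb_value, "label": label_value}
    rgb[rgb_value] = d
    label[label_value] = d

You would use them like this: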

>>> rgb[ "['255','255','255']"]["label"]
'2'
>>> label['2']["rgb"]
"['255','255','255']"

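The same idea extends to all four columns without writing one assignment per column. A sketch, assuming the header names from the question's sample and unique values in every column; `build_indexes` and the inline sample are hypothetical, and `csv.DictReader` is swapped in so that each row already arrives as a dict keyed by header:

```python
import csv
import io

# Inline stand-in for the real file; layout copied from the question's sample.
SAMPLE = """\
str;rgb;label;color
bones;"['255','255','255']";2;(241,214,145)
Aorta;"['255','0','0']";17;(216,101,79)
VenaCava;"['0','0','255']";16;(0,151,206)
"""

def build_indexes(rows):
    """Map each column name to a dict from that column's value to the full row."""
    indexes = {}
    for row in rows:
        for column, value in row.items():
            indexes.setdefault(column, {})[value] = row
    return indexes

reader = csv.DictReader(io.StringIO(SAMPLE), delimiter=";")
by = build_indexes(reader)

label_of_white = by["rgb"]["['255','255','255']"]["label"]   # '2'
name_of_16 = by["label"]["16"]["str"]                        # 'VenaCava'
```

Any column can then serve as either the key or the result, for example `by["str"]["Aorta"]["color"]`, at the cost of one nested dictionary instead of one dictionary per relation.
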
M.G.Poirot Over a year ago

Thanks for your response. I had thought about each dict entry being a list, but not about it being another dict. Your proposed "dict-of-dicts" would be clearer in use than a "dict-of-lists", thanks, I'll consider using it.
