The idea is to create an array where the values in the first row correspond to the IDs of the administrative units. The first column holds the tags of the images that lie within these administrative units. Every image has several tags. So the idea is to check whether any of the tags are already in the array; if they appear for the first time, I append them. If the tag has appeared before, then the element at the intersection of this tag and the ID of the administrative unit should increase by 1 (this also applies when the tag appears for the first time). I am already stuck on this part. So let's say the IDs of the administrative units are 1, 36, 15, 20, 16, 3. And I know that I am now analysing the image with the tags 'lion,cow,cat,panda' in the administrative unit with the ID = 36. And somewhere before I had the tag 'door', which appeared several times in different administrative units. So I would like to have an array which looks like:
[0, 1, 36, 15, 20, 16, 3],
['door', 5, 0, 0, 4, 0, 1],
['lion', 0, 1, 0, 0, 0, 0],
['cow', 0, 1, 0, 0, 0, 0],
['cat', 0, 1, 0, 0, 0, 0],
['panda', 0, 1, 0, 0, 0, 0]
So far I have the spatial part and that:
import numpy as np
tags_array = []
np.asarray(tags_array)
tags_array[0:] = [1, 36, 15, 20, 16, 3]
tags = 'lion,cow,cat,panda'
tags_sep = tags.split(',')
my_id = 36
for tag in tags_sep:
    #if tag is not yet in the array
    tags_array.append(tag) #to the first column
    tags_array.append(1) #to the column with the first row equal to 36
    #else add +1 to the element in the column 36 and row of the tag
Any hints are really appreciated!
Is numpy required? There are simpler solutions with Python dictionaries, if you just want a count per tag and ID.

Alternatively: I would start with Python dictionaries, and format it as lists at the end, if necessary.

tags_array = [1, 36, 15, 20, 16, 3]
tags = 'lion,cow,cat,panda'
tags_sep = tags.split(',')
my_id = 36
tagsDict = dict()
for tag in tags_sep:
    if tag not in tagsDict:
        tagsDict[tag] = {}
    if my_id not in tagsDict[tag]:
        tagsDict[tag][my_id] = 1
    else:
        tagsDict[tag][my_id] = tagsDict[tag][my_id] + 1

This will produce:
{'lion': {36: 1}, 'cow': {36: 1}, 'cat': {36: 1}, 'panda': {36: 1}}
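The "format it as lists at the end" step could look roughly like the sketch below. It is only one possible way to do it: the helper names `add_image` and `to_table` and the variable `unit_ids` are made up for illustration, and `collections.defaultdict` is swapped in for the explicit `if`/`else` checks. It also assumes Python 3.7+, so that tags come out in the order they were first seen.

```python
from collections import defaultdict

# IDs of the administrative units, in the column order wanted for the table
unit_ids = [1, 36, 15, 20, 16, 3]

# counts[tag][unit_id] -> how many times the tag was seen in that unit
# (missing entries default to 0, so no "first time" check is needed)
counts = defaultdict(lambda: defaultdict(int))

def add_image(tags, unit_id):
    """Count every tag of one image for the given administrative unit."""
    for tag in tags.split(','):
        counts[tag][unit_id] += 1

def to_table(counts, unit_ids):
    """Build the list-of-lists layout from the question:
    header row of IDs, then one row per tag."""
    table = [[0] + list(unit_ids)]
    for tag, per_unit in counts.items():
        table.append([tag] + [per_unit.get(uid, 0) for uid in unit_ids])
    return table

add_image('lion,cow,cat,panda', 36)
table = to_table(counts, unit_ids)
for row in table:
    print(row)
```

Calling `add_image` once per image keeps the counting simple, and the table is only built when it is actually needed. If a numpy array is wanted afterwards, the numeric part of each row (everything after the tag) can be passed to `np.array`.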