First list: contains, for each category, the list of output indexes that belong to that category

Second list : contains the category names as string

Intervals=[[Indexes_Cat1],[Indexes_Cat2],[Indexes_Cat3], ...]

Category_Names=["cat1","cat2","cat3",...]

Desired Output:

list=["cat1", "cat1","cat2","cat3","cat3"]

where the position of each element in the output list is given by the Intervals list.

Ex1:

Intervals=[[0,4], [2,3] , [1,5]]
Category_Names=["a","b","c"]

Ex: Output1

["a","c","b","b","a","c"]

Edit: More Run Cases

Ex2:

Intervals=[[0,1], [2,3] , [4,5]]
Category_Names=["a","b","c"]

Ex: Output2

["a","a","b","b","c","c"]

Ex3:

Intervals=[[3,4], [1,5] , [0,2]]
Category_Names=["a","b","c"]

Ex: Output3

["c","b","c","a","a","b"]

My solution:

Create an empty array of size n.

Run a for loop over the categories and write each category's name at its indexes.

# n is the known length of the output list
output=[""]*n
for i in range(len(Category_Names)):
    for index in Intervals[i]:
        output[index]=Category_Names[i]

Is there a better solution, or a more pythonic way? Thanks

2
  • Do you have an example that others can actually run please? Commented Mar 10, 2019 at 11:24
  • @Paddy3118 I added 2 more cases Commented Mar 10, 2019 at 11:29

3 Answers

2
def categorise(Intervals=[[0,4], [2,3] , [1,5]],
               Category_Names=["a","b","c"]):
    flattened = sum(Intervals, [])
    answer = [None] * (max(flattened) + 1)
    for indices, name in zip(Intervals, Category_Names):
        for i in indices:
            answer[i] = name
    return answer

assert categorise() == ['a', 'c', 'b', 'b', 'a', 'c']
assert categorise([[3,4], [1,5] , [0,2]], 
                  ["a","b","c"]) == ['c', 'b', 'c', 'a', 'a', 'b']

Note that with this code you will get None values in the answer if the "intervals" don't cover every integer from zero to the largest index. It is assumed that the input is compatible.
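
If silent None values are a concern, one option is to check for them before returning. The following is only a sketch under that assumption; the name categorise_checked and the choice of ValueError are illustrative and not part of the answer above.

```python
def categorise_checked(intervals, category_names):
    """Build the output list and fail loudly if any index is never assigned.

    Hypothetical helper: the name and the ValueError are illustrative choices.
    """
    # Largest index across all categories decides the output length.
    size = max(i for indices in intervals for i in indices) + 1
    answer = [None] * size
    for indices, name in zip(intervals, category_names):
        for i in indices:
            answer[i] = name
    # Any slot still holding None was not covered by any category.
    missing = [i for i, name in enumerate(answer) if name is None]
    if missing:
        raise ValueError(f"indices not covered by any interval: {missing}")
    return answer
```

Note that this check assumes None is never used as a category name, and that max() itself raises ValueError when there are no indexes at all.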


2

I am not sure if there is a way to avoid the nested loop (I can't think of any right now) so it seems your solution is good.

A way you could do it a bit better is to construct the output array with one of the categories:

output = [Category_Names[0]]*n

and then start the iteration skipping that category:

for i in range(1, len(Category_Names)):

If you know there is a category that appears more than the others then you should use that as the one initializing the array.
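
Put together, the idea might look like the sketch below. The function name fill_with_default and the default_pos parameter are made up for illustration, and n is assumed to be known in advance, as in the question.

```python
def fill_with_default(intervals, category_names, n, default_pos=0):
    """Pre-fill the output with one category, then write only the others.

    Hypothetical helper: default_pos selects which category is the pre-fill.
    """
    output = [category_names[default_pos]] * n
    for pos, (indices, name) in enumerate(zip(intervals, category_names)):
        if pos == default_pos:
            continue  # these slots already hold the right name
        for index in indices:
            output[index] = name
    return output


print(fill_with_default([[0, 4], [2, 3], [1, 5]], ["a", "b", "c"], 6))
```

One caveat of this variant: an index that no category covers silently ends up with the default category instead of staying visibly empty.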

I hope this helps!

4 Comments

I actually thought of your solution, but I wasn't sure whether creating an empty array or one with default values runs faster. Thanks, appreciated.
Well, in your solution you were already creating the n-size array, but with empty strings (output=[""]*n), so using one of the categories saves some time because you don't need to rewrite those entries. I am glad that my answer was helpful.
BTW does the n come as an input or do you need to calculate it?
@jpmontoya n is known (an O(1) lookup). Thanks
1

You can reduce the amount of strings created and use enumerate to avoid range(len(..)) for indexing.

Intervals=[[0,4], [2,3] , [1,5]]
Category_Names=["a","b","c"]

n = max(x for a in Intervals for x in a) + 1

# do not construct strings that get replaced anyhow    
output=[None] * n

for i,name in enumerate(Category_Names):
    for index in Intervals[i]:
       output[index]=name

print(output)

Output:

['a', 'c', 'b', 'b', 'a', 'c']
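
As a possible alternative in the same spirit, the index-to-name pairs can be collected in a dict first and then read back in order. This is only a sketch: categorise_via_mapping is a made-up name, and it assumes the indexes are exactly 0 to n-1 with no gaps (a gap raises KeyError).

```python
def categorise_via_mapping(intervals, category_names):
    """Map every index to its category name, then read the names back in order.

    Hypothetical helper; assumes the indexes are exactly 0..n-1 with no gaps.
    """
    index_to_name = {
        index: name
        for indices, name in zip(intervals, category_names)
        for index in indices
    }
    return [index_to_name[i] for i in range(len(index_to_name))]


print(categorise_via_mapping([[0, 4], [2, 3], [1, 5]], ["a", "b", "c"]))
```

The comprehension still visits every index once, so this does not remove the nested iteration; it only changes how it is written.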
