Looking for alternative to nested loops in Python

Question

I have developed the following code to check whether groups of three people are connected at the same time:

import pandas as pd
from itertools import combinations

data = {
    'User': ['Esther','Jonh', 'Ann', 'Alex', 'Jonh', 'Alex', 'Ann', 'Beatrix'],
    'InitialTime': ['01/01/2023  00:00:00','01/01/2023  00:00:00', '01/01/2023  00:00:05', '01/01/2023  00:00:07', '01/01/2023  00:00:12', '01/01/2023  00:00:14', '01/01/2023  00:00:15', '01/01/2023  00:00:16'],
    'FinalTime': ['01/01/2023  00:10:00','01/01/2023  00:00:10', '01/01/2023  00:00:12', '01/01/2023  00:00:12','01/01/2023  00:00:16', '01/01/2023  00:00:16', '01/01/2023  00:00:17', '01/01/2023  00:00:17']
}
df=pd.DataFrame(data)

def calculate_overlapped_time(df):
    df['InitialTime'] = pd.to_datetime(df['InitialTime'], format='%d/%m/%Y %H:%M:%S')
    df['FinalTime'] = pd.to_datetime(df['FinalTime'], format='%d/%m/%Y %H:%M:%S')

    overlapped_time = {}

    for i, row_i in df.iterrows():
        for j, row_j in df.iterrows():
            for k, row_k in df.iterrows():
                if i != j and i != k and j != k:
                    initial_time = max(row_i['InitialTime'], row_j['InitialTime'], row_k['InitialTime'])
                    final_time = min(row_i['FinalTime'], row_j['FinalTime'], row_k['FinalTime'])
                    superposicion = max(0, (final_time - initial_time).total_seconds())

                    clave = f"{row_i['User']}-{row_j['User']}-{row_k['User']}"
                    if clave not in overlapped_time:
                        overlapped_time[clave] = 0
                    overlapped_time[clave] += superposicion

    results = pd.DataFrame(list(overlapped_time.items()), columns=['Group', 'OverlappingTime'])
    results['OverlappingTime'] = results['OverlappingTime'].astype(int)

    return results

results_df = calculate_overlapped_time(df)

I want to calculate the overlapping time for groups of roughly 10 people, so code with this many nested loops becomes impractical.

Can somebody please tell me whether there is an alternative that makes this code more scalable, so that I can find larger groups without nested for loops?

You'd want an index (not necessarily Pandas) over the time periods, and then check their overlaps.
– AKX, Commented Dec 11, 2023 at 15:22
An approach would be to sort the list of intervals (which have the user as an associated property) and then iterate over the list. It'll be a bit tricky, but doable. It should have a complexity of O(n log(n)) (for the sort) and O(n) (which vanishes because of the sort) to find the triples.
– Ronald, Commented Dec 11, 2023 at 15:32
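The sort-and-iterate idea from the comments could look roughly like the sketch below. It is one possible reading of that suggestion, not code posted by either commenter. The function name `group_overlap_seconds` and the plain `(user, start, end)` tuples with times in seconds are assumptions made for illustration. Unlike the code in the question, it never pairs a user with themselves and it reports each group once, with the names sorted.

```python
from collections import defaultdict
from itertools import combinations


def group_overlap_seconds(intervals, group_size):
    """Sum, per group of users, the time during which all of them are connected.

    intervals: iterable of (user, start, end), with start/end as numbers (seconds).
    """
    events = []
    for user, start, end in intervals:
        events.append((start, +1, user))  # connection opens
        events.append((end, -1, user))    # connection closes
    events.sort(key=lambda event: event[0])  # sweep in time order

    active = defaultdict(int)  # user -> number of currently open intervals
    totals = defaultdict(int)  # sorted tuple of users -> overlapping seconds
    previous_time = None
    for time, delta, user in events:
        if previous_time is not None and time > previous_time:
            elapsed = time - previous_time
            connected = sorted(u for u, count in active.items() if count > 0)
            # Only users connected during this slice can form a group.
            for group in combinations(connected, group_size):
                totals[group] += elapsed
        active[user] += delta
        previous_time = time
    return dict(totals)


# The sessions from the question, expressed as seconds after 00:00:00.
sessions = [
    ("Esther", 0, 600), ("Jonh", 0, 10), ("Ann", 5, 12), ("Alex", 7, 12),
    ("Jonh", 12, 16), ("Alex", 14, 16), ("Ann", 15, 17), ("Beatrix", 16, 17),
]
triples = group_overlap_seconds(sessions, 3)
```

The sort costs O(n log n). The remaining work depends on how many users are connected at once, because every group that actually overlaps still has to be listed, so the gain comes from skipping groups that never overlap.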

Michael Cao · Accepted Answer · 2023-12-11 15:18:11Z

Looks like you're just pulling up combinations of rows from the same DataFrame. In that case, you can use itertools.combinations and need only one loop:

import itertools as it
for [i, row_i], [j, row_j], [k, row_k] in it.combinations(df.iterrows(), 3):
    ...  # loop body goes here
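A filled-in version of that single loop, generalized to any group size, might look like the sketch below. It assumes plain `(user, start, end)` tuples with `datetime` values (with a DataFrame, these could come from `df.itertuples(index=False, name=None)` after the `pd.to_datetime` conversion). The function name `overlap_by_combinations` and the choice to sort the names inside each key are illustrative, not part of the answer.

```python
from datetime import datetime
from itertools import combinations


def overlap_by_combinations(rows, group_size=3):
    """rows: list of (user, start, end) tuples, start/end being datetime objects."""
    totals = {}
    for group in combinations(rows, group_size):
        latest_start = max(start for _, start, _ in group)
        earliest_end = min(end for _, _, end in group)
        # A negative difference means the intervals do not all overlap.
        seconds = max(0.0, (earliest_end - latest_start).total_seconds())
        # Sorting the names gives one key per group regardless of row order.
        key = "-".join(sorted(user for user, _, _ in group))
        totals[key] = totals.get(key, 0.0) + seconds
    return totals
```

Each unordered group of rows is visited exactly once, so there is no need for the `i != j` checks, and changing `group_size` replaces adding another nested loop.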

This still does the same thing internally; it's not any more efficient.
– AKX, Commented over a year ago
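To put rough numbers on the scaling concern, the sketch below counts how many groups each approach has to visit. The helper name `loop_counts` is made up for this illustration. Iterating over combinations avoids revisiting the same group in every order, but the number of groups still grows quickly with the group size.

```python
import math


def loop_counts(n_rows, group_size):
    """Return (ordered groups, unordered groups) of distinct rows."""
    ordered = math.perm(n_rows, group_size)    # what the nested loops keep after the i != j checks
    unordered = math.comb(n_rows, group_size)  # what itertools.combinations yields
    return ordered, unordered


print(loop_counts(8, 3))    # the 8 rows from the question, groups of 3
print(loop_counts(30, 10))  # a hypothetical 30 rows, groups of 10
```

For the 8 rows in the question this is 336 ordered triples against 56 combinations. With 30 rows and groups of 10, the combinations alone already number about 30 million.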
