Construct a new DataFrame with np.repeat and np.arange, then compare each column position against the per-row bounds:
n = df.sum(1).max()
df_out = pd.DataFrame(np.repeat([np.arange(1,n+1)], len(df), axis=0), columns=np.arange(1,n+1))
df_out = (df_out.ge(df.a, axis=0) & df_out.le(df.sum(1), axis=0)).astype(int)
Out[233]:
   1  2  3  4  5
0  1  1  1  0  0
1  0  0  1  1  1
2  1  1  1  1  0
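For context, a self-contained sketch of the approach. The sample `df` below is an assumption reconstructed from the output shown (two columns `a` and `b`, with ones running from `a` through `a + b` in each row):

```python
import numpy as np
import pandas as pd

# Assumed sample input, reconstructed from the Out[233] table above.
df = pd.DataFrame({'a': [1, 3, 1], 'b': [2, 2, 3]})

# The widest row (largest a + b) determines how many indicator columns are needed.
n = df.sum(1).max()

# Repeat the row 1..n once per row of df, then mark position k with 1
# when a <= k <= a + b, via the axis-aware ge/le comparisons.
df_out = pd.DataFrame(np.repeat([np.arange(1, n + 1)], len(df), axis=0),
                      columns=np.arange(1, n + 1))
df_out = (df_out.ge(df.a, axis=0) & df_out.le(df.sum(1), axis=0)).astype(int)
print(df_out)
```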
Timing:
Surprisingly, it is faster than get_dummies on a DataFrame with a large number of rows.
Sample:
df = pd.concat([df]*10000, ignore_index=True)
In [190]: %timeit df.apply(lambda x: '|'.join(map(str, range(x['a'], x['a'] + x['b'] + 1))), axis=1).str.get_dummies()
845 ms ± 3.02 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [244]: %%timeit
...: n = df.sum(1).max()
...: df_out = pd.DataFrame(np.repeat([np.arange(1,n+1)], len(df), axis=0), columns=np.arange(1,n+1))
...: (df_out.ge(df.a, axis=0) & df_out.le(df.sum(1), axis=0)).astype(int)
...:
...:
3.35 ms ± 5.95 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
@Shubham's solution:
In [240]: %%timeit
...: m = np.arange(1, df.max().sum())
...: a = np.tile(m, (len(df), 1))
...: pd.DataFrame((df.to_numpy()[:, 0, None] <= a) &
...: (a <= df.sum(1).to_numpy()[:, None]), dtype='int', columns=m)
...:
1.79 ms ± 1.72 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)