I've got some data that looks like this:
>>> print totals.sample(4)
                                                 start            end  \
time                    region_type
2016-01-24 02:17:10.238 STACK GUARD        79940452352    79940665344
2016-01-23 20:14:17.043 MALLOC metadata    64688259072    64688996352
2016-01-22 23:20:53.752 IOKit              47857778688    47861174272
2016-01-23 08:17:06.561 __DATA           3711964667904  3711979212800

                                            vsize    rsdnt   dirty     swap
time                    region_type
2016-01-24 02:17:10.238 STACK GUARD        212992        0       0        0
2016-01-23 20:14:17.043 MALLOC metadata    737280    81920   81920     8192
2016-01-22 23:20:53.752 IOKit             3395584    24576   24576  3371008
2016-01-23 08:17:06.561 __DATA           14544896  4907008  618496  4780032
I want to know the region_type of every row where dirty + swap is greater than 1e7.
This works, but it seems pretty verbose:
>>> print totals[(totals.dirty + totals.swap) > 1e7].groupby(level='region_type').\
apply(lambda x: 'lol').index.tolist()
['MALLOC_NANO', 'MALLOC_SMALL']
Is there a better way?
I would have thought this would work, but it gives all the region_types in the data set, not just the ones I selected:
totals[(totals.dirty + totals.swap) > 1e7].index.levels[1].tolist()
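For anyone who wants to reproduce this, here is a minimal sketch with made-up numbers (the frame below is a hypothetical stand-in, not my real data). It assumes a reasonably recent pandas and also tries `get_level_values`, which seems to return only the labels of the rows that survive the filter:

```python
import pandas as pd

# Hypothetical stand-in for `totals`: made-up values, same index layout
# (time, region_type) as the real frame.
index = pd.MultiIndex.from_tuples(
    [
        (pd.Timestamp("2016-01-22 23:20:53.752"), "IOKit"),
        (pd.Timestamp("2016-01-23 08:17:06.561"), "MALLOC_NANO"),
        (pd.Timestamp("2016-01-23 20:14:17.043"), "MALLOC_SMALL"),
        (pd.Timestamp("2016-01-24 02:17:10.238"), "MALLOC_NANO"),
    ],
    names=["time", "region_type"],
)
totals = pd.DataFrame(
    {
        "dirty": [24576, 9000000, 6000000, 8000000],
        "swap": [3371008, 2000000, 5000000, 4000000],
    },
    index=index,
)

selected = totals[(totals.dirty + totals.swap) > 1e7]

# .levels still holds every label of the original index,
# including labels that no longer appear in any selected row.
all_labels = selected.index.levels[1].tolist()

# get_level_values yields one label per remaining row, so taking the
# unique values gives only the region_types that were actually selected.
hit_labels = sorted(selected.index.get_level_values("region_type").unique())

print(all_labels)
print(hit_labels)
```

With this toy frame, `all_labels` still includes `IOKit` even though no `IOKit` row passes the filter, while `hit_labels` contains only `MALLOC_NANO` and `MALLOC_SMALL`.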