How to make Slices from a Dataframe where Column Equals a Value

Question

I have two sets of CSV data. One contains two columns (time and a boolean flag); the other contains vehicle information that I'd like to display with some graphing functions I have. The data is sampled at different frequencies, so the number of rows may not match between the datasets. How do I plot an individual graph for each range of data where the boolean is true?

Here is what the contact data looks like:

INDEX | TIME | CONTACT
0 | 240:18:59:31.750 | 0
1 | 240:18:59:32.000 | 0
2 | 240:18:59:32.250 | 0
........
1421 | 240:19:05:27.000 | 1
1422 | 240:19:05:27.250 | 1

The other (Vehicle) data isn't really important, but it contains values like Weight, Speed (MPH), Pedal Position, etc.

I have many separate large Excel files, and because the shapes do not match I am unsure how to slice the data using the time flags. I wrote the function below to create the ranges, but I think this can be done in an easier way.

Here is the working code (with output below). In short, is there an easier way to do this?

import pandas as pd

def determineContactSlices(data):
    contactStart = None
    pieces = []  # one DataFrame per contact run
    for index, row in data.iterrows():
        if row['CONTACT'] == 1:
            # begin a slice at the first row of a contact run
            if contactStart is None:
                contactStart = index
        elif row['CONTACT'] == 0 and contactStart is not None:
            contactEnd = index - 1
            # positional slice: the end is exclusive, so row contactEnd is left out
            piece = data[contactStart:contactEnd]
            print(piece)
            pieces.append(piece)
            # then reset everything for the next run
            contactStart = None
    # DataFrame.append was removed in pandas 2.0, so concatenate once at the end
    return pd.concat(pieces) if pieces else pd.DataFrame()

Output: ([15542 rows x 2 columns])

Index Time  CONTACT
1421   240:19:05:27.000        1
1422   240:19:05:27.250        1
1423   240:19:05:27.500        1
1424   240:19:05:27.750        1
1425   240:19:05:28.000        1
1426   240:19:05:28.250        1

            ...      ...
56815  240:22:56:15.500        1
56816  240:22:56:15.750        1
56817  240:22:56:16.000        1
56818  240:22:56:16.250        1
56819  240:22:56:16.500        1

With this output I intend to loop through each time slice and display the Vehicle Data in subplots.

Any help or guidance would be much appreciated (:

UPDATE:

I believe I can just do filteredData = vehicleData[contactData['CONTACT'] == 1], but then I am faced with how to graph each stretch individually when there is a disconnect. For example, if there are 7 connections at various times and of various lengths, I would like to have 7 individual plots.

If I understand you correctly, do you want new dfs, one with all contact=1 and one with all contact=0?
– d_kennetz, Commented Dec 19, 2018 at 0:10
Sorry for not being clear. I am interested in the separate index ranges for when contact is 1. I do not want one large dataframe where contact == 1. For example, suppose the first contact was at 12 seconds and disconnected at 15 seconds, then another ran from 30 seconds to 45 seconds, then it was disconnected until a third from 50 seconds to 55 seconds. The output would be something like slices = [data[12:15], data[30:45], data[50:55]], so I can iterate through the time slices and output 3 graphs of the other data within those timeframes.
– Minutia, Commented Dec 19, 2018 at 14:57
But my question is, do these correlate to contact == 1 or not? I do not know what a disconnect looks like in your data. If a disconnect occurs where contact changes from 1 to 0 and lasts until contact == 1 again, this is a fairly easy solution that can be done in a line.
– d_kennetz, Commented Dec 19, 2018 at 15:03
The other (vehicle) data does not directly correlate with the contact. Your description of the contact data is absolutely correct.
– Minutia, Commented Dec 19, 2018 at 15:09

Allen P. · Accepted Answer · 2018-12-19 00:21:37Z

1

I think what you are trying to do is relatively simple, although I am not sure if I understand the output that you want or what you want to do with it after you have it. For example:

contact_df = data[data['CONTACT'] == 1]
non_contact_df = data[data['CONTACT'] == 0]

If this isn't helpful, please provide some additional details as to what the output should look like and what you plan to do with it after it is created.
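If separate frames per stretch of contact are wanted rather than one large frame, the boolean mask can be combined with a run label. This is only a rough sketch, assuming the flag column is named CONTACT as in the question; the helper name contact_runs and the toy data are made up for illustration.

```python
import pandas as pd

def contact_runs(data, flag_col="CONTACT"):
    """Return a list of DataFrames, one per unbroken run of flag_col == 1.

    contact_runs is a hypothetical helper, not part of pandas.
    """
    is_contact = data[flag_col].eq(1)
    # A new run starts wherever the flag differs from the previous row,
    # so the cumulative sum of those changes labels each run.
    run_id = is_contact.ne(is_contact.shift()).cumsum()
    # Keep only the contact rows, then split them by run label.
    return [run for _, run in data[is_contact].groupby(run_id[is_contact])]

# Toy data standing in for the contact file (values are made up).
data = pd.DataFrame({"CONTACT": [0, 1, 1, 0, 0, 1, 1, 1, 0, 1]})
runs = contact_runs(data)
for run in runs:
    # first and last index label of each run
    print(run.index[0], run.index[-1])
```

Each element of runs keeps its original index labels, so the first and last label of a run mark where that stretch of contact starts and ends. A run that reaches the last row of the data is still returned.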


This is similar to what I want to do. But instead of one large dataframe, I'd like individual time slices. My other comment may have a better explanation of the expected output.
– Minutia, Commented over a year ago

Leo Ma · Accepted Answer · 2022-10-12 18:30:08Z

0

Old question but why not:

sliceStart_index = df[ df["date"]=="2012-12-28" ].index.tolist()[0]
sliceEnd_index = df[ df["date"]=="2013-01-10" ].index.tolist()[0]

this_is_your_slice = df.iloc[sliceStart_index  : sliceEnd_index]

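The first two lines each get you a list of the index labels where the condition is met; I just took the first match from each as an example. Because those labels are passed to iloc, this assumes the default integer index, where labels and positions coincide.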
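The same start/end idea might carry over to the question's two files by slicing on time instead of on row position, since the sampling rates differ. Below is a minimal sketch, assuming the TIME strings are laid out as day:hour:minute:second as the sample rows suggest. The helpers parse_time and window, the SPEED column, and the vehicle values are all invented for illustration.

```python
import pandas as pd

def parse_time(stamp):
    """Convert 'DDD:HH:MM:SS.sss' into a Timedelta (layout is an assumption)."""
    days, hours, minutes, seconds = stamp.split(":")
    return pd.Timedelta(days=int(days), hours=int(hours),
                        minutes=int(minutes), seconds=float(seconds))

def window(df, start, end, time_col="TIME"):
    """Rows of df whose parsed time lies between start and end, inclusive."""
    t = df[time_col].map(parse_time)
    return df[(t >= start) & (t <= end)]

# Made-up vehicle samples at 1 Hz; the contact data in the question is at 4 Hz.
vehicle = pd.DataFrame({
    "TIME": ["240:19:05:26.000", "240:19:05:27.000",
             "240:19:05:28.000", "240:19:05:29.000"],
    "SPEED": [10.0, 12.0, 13.0, 11.0],
})

# Start and end of one contact run, taken from the sample rows in the question.
start = parse_time("240:19:05:27.000")
end = parse_time("240:19:05:28.250")
piece = window(vehicle, start, end)
print(piece)
```

Repeating this for the first and last TIME of every contact run would give one vehicle frame per run, and each frame could then be handed to the existing graphing functions.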
