Here is my dataframe:

structure(list(a = c(1, 1, -1, 1, 1, 1, -1, 1, 1, 1, 1)), .Names = "a", row.names = c(NA, 
-11L), class = c("tbl_df", "tbl", "data.frame"))

Now I want to add an identification column that acts like an index:

The column should start at id = 1 and increase by one each time a is -1 (so id = 2 from the first -1 onward, and so on). Expected:

structure(list(a = c(1, 1, -1, 1, 1, 1, -1, 1, 1, 1, 1), b = c(1, 
1, 2, 2, 2, 2, 3, 3, 3, 3, 3)), .Names = c("a", "b"), row.names = c(NA, 
-11L), class = c("tbl_df", "tbl", "data.frame"))

Using the solution from "R add index column to data frame based on row values" didn't work for my needs.

2 Answers

You can also do it like this: apply cumsum to the logical vector created by a == -1 and add one to the result:

library(dplyr)

df1 %>%
  mutate(b = cumsum(a == -1) + 1)

Or with base R:

df1$b = cumsum(df1$a == -1) + 1

Result:

# A tibble: 11 x 2
       a     b
   <dbl> <dbl>
 1     1     1
 2     1     1
 3    -1     2
 4     1     2
 5     1     2
 6     1     2
 7    -1     3
 8     1     3
 9     1     3
10     1     3
11     1     3

Data:

df1 = structure(list(a = c(1, 1, -1, 1, 1, 1, -1, 1, 1, 1, 1)), .Names = "a", row.names = c(NA, 
-11L), class = c("tbl_df", "tbl", "data.frame"))

2 Comments

Please correct me if I am wrong: cumsum will sum the -1 occurrences and keep the value the same until another -1 comes in, making idx greater by 1? @useR
@steves Correct. a == -1 creates a logical vector that is TRUE where a == -1 and FALSE where a != -1. When you apply cumsum to it, it starts at 0 because the first row is not -1, and it only adds one each time it encounters a TRUE. Since you wanted b to start at 1, I added 1 to the vector so all values are increased by 1.
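
To make the intermediate steps concrete, here is a minimal base R sketch on the same values as column a. The names is_marker and running_count are only illustrative and do not appear in the answer:

```r
# Same values as column a in the question
a <- c(1, 1, -1, 1, 1, 1, -1, 1, 1, 1, 1)

# TRUE wherever a is -1 (a new group starts), FALSE elsewhere
is_marker <- a == -1

# cumsum() treats TRUE as 1 and FALSE as 0,
# so this is a running count of the -1 values seen so far
running_count <- cumsum(is_marker)

# Shift by one so the ids start at 1 instead of 0
b <- running_count + 1

print(running_count)  # 0 0 1 1 1 1 2 2 2 2 2
print(b)              # 1 1 2 2 2 2 3 3 3 3 3
```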

You can do it like this:

  1. Create a helper column that has the value 1 in the first row and in every row where a is -1, and 0 elsewhere.

  2. Create the index column by applying the cumsum function to the helper column, then delete the helper column.

    library(dplyr)
    
    df %>%
      mutate(helper = ifelse(row_number()==1, 1, 
        ifelse(a == -1, 1, 0))) %>% 
      mutate(index = cumsum(helper)) %>%
      select(-helper)
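
The two steps can also be sketched in base R, without dplyr, to compare the result against the shorter cumsum(a == -1) + 1 form. This is only an illustration: the vector a2 is a made-up input, not data from the question, and it shows that the two approaches differ when the very first value is -1.

```r
# Same values as column a in the question
a <- c(1, 1, -1, 1, 1, 1, -1, 1, 1, 1, 1)

# Step 1: helper is 1 in the first row and wherever a is -1, otherwise 0
helper <- ifelse(seq_along(a) == 1 | a == -1, 1, 0)

# Step 2: the running sum of the helper is the index
index <- cumsum(helper)

# Shorter form, for comparison
b <- cumsum(a == -1) + 1

print(index)            # 1 1 2 2 2 2 3 3 3 3 3
print(all(index == b))  # TRUE

# Hypothetical input that starts with -1: here the two forms disagree
a2 <- c(-1, 1, -1, 1)
index2 <- cumsum(ifelse(seq_along(a2) == 1 | a2 == -1, 1, 0))
b2 <- cumsum(a2 == -1) + 1

print(index2)  # 1 1 2 2
print(b2)      # 2 2 3 3
```

With the helper column the index always starts at 1, whereas cumsum(a == -1) + 1 starts at 2 if the first row is already -1. For the data in the question the two give the same result.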
    

1 Comment

BRAVO! SIMPLE, GENIUS STRAIGHT 2 THE POINT!
