
I have a dataset "mapping_grouped" with a column TrendIdentifier. For every row of this dataset I need to filter a second dataset, RawDataSplit. Only a few columns of RawDataSplit should be checked; their column numbers are stored in the object code_match. I am trying the code below, but I am doing something wrong when nesting one loop inside the other, and I cannot figure out what the problem is.

for (r in 1:nrow(mapping_grouped))
{
  current<-list()

  L1<-mapping_grouped[["TrendIdentifier"]][r]

  L1<-unlist(L1, use.names = FALSE)

  #code_match <- match(names(mastercodes), names(RawDataSplit))
  mcols<-code_match
  #mcols<-c(code_match[1]:code_match[ncol(mastercodes)])

  results_filter<-list()

  for (i in mcols) 
  { 
    filterdata<- RawDataSplit%>% filter(RawDataSplit[[i]]%in% L1)

    name_data<- paste("filterdata",i, sep = "_")
    results_filter[[name_data]] <- filterdata
  }

  filter_data<-Reduce(rbind,results_filter)

  filter_data$new_mastercode<- mapping_grouped[["Identifier"]][r]
}

The datasets are:

> dput(mapping_grouped)
structure(list(Identifier = c("1000000", "1000076", "1000078", 
"1000079", "1000080", "1000081", "1000082", "1000083", "1000084", 
"1000085"), TrendIdentifier = list("1000000", "1000000", c("1001329", 
"1001340"), "1001340", "1000003", "1001126", "1001241", "1001348", 
    "1000310", "1000013")), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -10L))

> dput(code_match)
8:10

> dput(RawDataSplit)
structure(list(identifier = c(9012286L, 9012294L, 9012296L, 9012297L, 
9012298L, 9012299L, 9012300L, 9012301L, 9012302L, 9012303L), 
    QID_1 = c(4L, 4L, 3L, 5L, 4L, 3L, 4L, 4L, 4L, 4L), QID_2 = c(4L, 
    2L, 1L, 2L, 4L, 1L, 4L, 4L, 2L, 1L), QID_3 = c(4L, 5L, 4L, 
    4L, 5L, 4L, 4L, 2L, 5L, 4L), QID_4 = c(4L, 4L, 4L, 4L, 4L, 
    4L, 4L, 4L, 4L, 1L), unitlevel = c(7, 5, 6, 5, 6, 7, 7, 6, 
    7, 5), mastercode_1 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA), mastercode_2 = c(1000000L, 1000000L, 1000000L, 1000000L, 
    1000000L, 1000000L, 1000000L, 1000000L, 1000000L, 1000000L
    ), mastercode_3 = c(1001414L, 1000013L, 1001126L, 1001126L, 
    1000435L, 1000435L, 1000435L, 1000435L, 1000435L, 1000435L
    ), mastercode_4 = c(1001473L, 1000035L, 1001209L, 1001128L, 
    1000739L, 1000739L, 1000799L, 1000799L, 1000799L, 1000715L
    )), row.names = c(NA, -10L), class = c("data.table", "data.frame"
), .internal.selfref = <pointer: 0x0000000000101ef0>)
  • Can you show your expected output? Commented Aug 20, 2019 at 12:33

1 Answer


Running your code I get:

Error in .subset2(x, i, exact = exact) : subscript out of bounds

Your inner loop iterates over i in mcols, where mcols is assigned as mcols<-code_match. In the data you provided, code_match is 77:84, which means mcols is 77:84.

In the line

    filterdata<- RawDataSplit%>% filter(RawDataSplit[[i]]%in% L1)

you are then subsetting RawDataSplit with the elements of mcols, but the data frame has only 10 columns. The error comes from trying to get a column that doesn't exist.
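One way to catch this early is to compare the column positions against ncol() before looping. This is only a minimal sketch: the three-column data frame and the 2:5 range are made-up stand-ins, not the data from the question, and the names valid and dropped are hypothetical.

```r
# Hypothetical stand-in with three columns (not the real RawDataSplit).
RawDataSplit <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# Deliberately runs past the last column, like 77:84 on a 10-column table.
code_match <- 2:5

# Keep only the positions that actually exist in the data frame.
valid <- code_match[code_match >= 1 & code_match <= ncol(RawDataSplit)]
dropped <- setdiff(code_match, valid)

if (length(dropped) > 0) {
  message("Ignoring column positions that do not exist: ",
          paste(dropped, collapse = ", "))
}
```

Looping over valid instead of code_match avoids the "subscript out of bounds" error, although a non-empty dropped usually means code_match was built for a different table and should be recomputed.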


1 Comment

Sorry, that's my fault. I shared only a part of RawDataSplit and did not change code_match. RawDataSplit is a very big file, so I shared a part of it. We can now change code_match to c(8,9,10). The idea is that we need to filter the mastercode columns of RawDataSplit; their column numbers are in code_match.
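With code_match pointing at existing columns, one remaining issue in the posted loop is that filter_data is overwritten on every pass of the outer loop, so only the last row of mapping_grouped survives. A possible fix, sketched below in base R, is to store each pass in a list and bind the pieces once at the end. The data here are trimmed stand-ins built as plain data frames (the real RawDataSplit is a data.table), code_match is adjusted to the stand-in's layout, and the names all_results, per_col and final are hypothetical.

```r
# Trimmed stand-ins for the data in the question (assumption: plain data frames).
mapping_grouped <- data.frame(Identifier = c("1000000", "1000078", "1000085"),
                              stringsAsFactors = FALSE)
mapping_grouped$TrendIdentifier <- list("1000000",
                                        c("1001329", "1001340"),
                                        "1000013")

RawDataSplit <- data.frame(
  identifier   = c(9012286L, 9012294L, 9012296L),
  mastercode_2 = c(1000000L, 1000000L, 1000000L),
  mastercode_3 = c(1001414L, 1000013L, 1001126L),
  mastercode_4 = c(1001473L, 1000035L, 1001209L)
)
code_match <- 2:4  # positions of the mastercode columns in this stand-in

# One slot per row of mapping_grouped, so no pass overwrites another.
all_results <- vector("list", nrow(mapping_grouped))

for (r in seq_len(nrow(mapping_grouped))) {
  L1 <- unlist(mapping_grouped[["TrendIdentifier"]][r], use.names = FALSE)

  # Filter once per selected column, as in the original inner loop.
  per_col <- lapply(code_match, function(i) {
    RawDataSplit[RawDataSplit[[i]] %in% L1, , drop = FALSE]
  })
  filter_data <- do.call(rbind, per_col)

  # Tag matches with the identifier of the current mapping row.
  if (nrow(filter_data) > 0) {
    filter_data$new_mastercode <- mapping_grouped[["Identifier"]][r]
  }
  all_results[[r]] <- filter_data
}

# rbind on data frames drops zero-row pieces before combining.
final <- do.call(rbind, all_results)
```

As in the original loop, a row that matches in more than one mastercode column appears once per matching column. If that is not wanted, the per-column tests could be combined into a single logical mask before subsetting.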
