I have the following list of data frames (called ern in the code further down).

list(structure(list(SrlNo = c(248L, 273L, 282L, 284L), VendorCode = c("V001889", 
"V000590", "V001578", "V001836"), SiteDetails = c("value add sites as per annexure", 
"milan et mega bus shelter", "backdrop", "black drop with black masking"
), City = c("", "", "", "")), .Names = c("SrlNo", "VendorCode", 
"SiteDetails", "City"), row.names = c("248", "273", "282", "284"
), class = "data.frame"), structure(list(SrlNo = 59135:59136, 
    VendorCode = c("V000072", "V000072"), SiteDetails = c("transportation charges-gst- rece cost for 24 paragana(south)", 
    "transportation charges-gst- rece  cost for 24 paragana (south)"
    ), City = c("24 Paragana(South)", "24 Paragana(South)")), .Names = c("SrlNo", 
"VendorCode", "SiteDetails", "City"), row.names = c("59127", 
"59128"), class = "data.frame"), structure(list(SrlNo = c(34595L, 
34609L, 34661L, 34678L), VendorCode = c("V002446", "V000931", 
"V000094", "V002240"), SiteDetails = c("taki road", "barasat flyover", 
"madhyamgram flyover fcg chowrasta", "madhyamgram bt college"
), City = c("24 Pargana North", "24 Pargana North", "24 Pargana North", 
"24 Pargana North")), .Names = c("SrlNo", "VendorCode", "SiteDetails", 
"City"), row.names = c("34587", "34601", "34653", "34670"), class = "data.frame"))

Using agrep, I am trying to group similar site details together. I am using the following code:

for (e in ern) {
  x <- e$SiteDetails
  x <- x[x!=""]
  groups <- list()
  i <- 1
  while(length(x) > 0)
  {
    id <- agrep(x[1], x, ignore.case = TRUE, max.distance = 0.001)
    groups[[i]] <- x[id]
    x <- x[-id]
    i <- i + 1
  }
  Indx <- 1:length(groups)
  aa <- with(e, rep(Indx, vapply(groups, length, 1L)))
  bb <- unlist(groups)
  cc <- data.frame(aa,bb)
  #cbind(e, group=cc$aa[match(e$SiteDetails, cc$bb)])
  e$group <- cc$aa[match(e$SiteDetails, cc$bb)]
  #print(cc$aa[match(e$SiteDetails, cc$bb)])
  #print(e$VendorCode)
}

With the code above, I iterate over each data frame in the list, group the values of the SiteDetails column, and can print the resulting group numbers. However, when I try to attach the group back to the data frame, nothing happens and no error is raised. I am not able to create a new column named group from inside the for loop:

e$group <- cc$aa[match(e$SiteDetails, cc$bb)]

I have tried various combinations of the line above, such as ern[[e]] and cbind, but none of them work.

1 Answer

The e in your for loop is a copy of each list element and has no connection to the original ern list, so any column you add to it is discarded and the list is never updated. Iterate over the indices of the list instead and assign through ern[[e]].
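As a minimal illustration of that point, here is a sketch on a hypothetical toy list (dfs, with made-up columns a and b, not the ern data from the question). It only shows that changes to the loop variable stay local, while assignment through the index is written back.

```r
# Hypothetical toy list, standing in for ern
dfs <- list(data.frame(a = 1:2), data.frame(a = 3:4))

# Modifying the loop variable only changes a local copy
for (d in dfs) {
  d$b <- d$a * 2
}
before <- names(dfs[[1]])   # column names after the value-based loop

# Assigning through the index writes the change back into the list
for (i in seq_along(dfs)) {
  dfs[[i]]$b <- dfs[[i]]$a * 2
}
after <- names(dfs[[1]])    # column names after the index-based loop

print(before)
print(after)
```

Applied to the loop from the question, the same index-based pattern looks like this: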

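for (e in seq_along(ern)) {
  # Work on the non-empty site details of the e-th data frame
  x <- ern[[e]]$SiteDetails
  x <- x[x != ""]
  groups <- list()
  i <- 1
  # Repeatedly take the first remaining value and collect its approximate matches
  while (length(x) > 0) {
    id <- agrep(x[1], x, ignore.case = TRUE, max.distance = 0.001)
    groups[[i]] <- x[id]
    x <- x[-id]
    i <- i + 1
  }
  # One group number per collected value, in the same order as unlist(groups)
  aa <- rep(seq_along(groups), vapply(groups, length, 1L))
  bb <- unlist(groups)
  cc <- data.frame(aa, bb)
  # Assign through the index so the column is stored in the list itself
  ern[[e]]$group <- cc$aa[match(ern[[e]]$SiteDetails, cc$bb)]
}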

This adds a new column called group to each data frame in ern.

ern
#[[1]]
#    SrlNo VendorCode                     SiteDetails City group
#248   248    V001889 value add sites as per annexure          1
#273   273    V000590       milan et mega bus shelter          2
#282   282    V001578                        backdrop          3
#284   284    V001836   black drop with black masking          4

#[[2]]
#      SrlNo VendorCode                                                    SiteDetails               City group
#59127 59135    V000072   transportation charges-gst- rece cost for 24 paragana(south) 24 Paragana(South)     1
#59128 59136    V000072 transportation charges-gst- rece  cost for 24 paragana (south) 24 Paragana(South)     2

#[[3]]
#      SrlNo VendorCode                       SiteDetails             City group
#34587 34595    V002446                         taki road 24 Pargana North     1
#34601 34609    V000931                   barasat flyover 24 Pargana North     2
#34653 34661    V000094 madhyamgram flyover fcg chowrasta 24 Pargana North     3
#34670 34678    V002240            madhyamgram bt college 24 Pargana North     4
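
An alternative to the index-based loop is to wrap the grouping in a function and rebuild the list with lapply. The following is only a sketch under stated assumptions: add_group and its max_dist argument are hypothetical names introduced here, the toy list is made up for illustration, and the grouping logic simply mirrors the agrep loop above rather than improving on it.

```r
# Hypothetical helper: returns df with a group column based on SiteDetails
add_group <- function(df, max_dist = 0.001) {
  sd <- as.character(df$SiteDetails)
  remaining <- unique(sd[!is.na(sd) & sd != ""])
  lookup <- integer(0)   # named vector: SiteDetails value -> group number
  g <- 0L
  while (length(remaining) > 0) {
    g <- g + 1L
    hit <- agrep(remaining[1], remaining, ignore.case = TRUE,
                 max.distance = max_dist)
    hit <- union(1L, hit)  # always consume the first value so the loop ends
    lookup[remaining[hit]] <- g
    remaining <- remaining[-hit]
  }
  # Empty or missing site details get NA as their group
  df$group <- unname(lookup[sd])
  df
}

# Made-up toy data, standing in for ern
toy <- list(
  data.frame(SiteDetails = c("taki road", "barasat flyover", "taki road"),
             stringsAsFactors = FALSE),
  data.frame(SiteDetails = c("backdrop", ""), stringsAsFactors = FALSE)
)

# lapply returns a new list, so the result has to be assigned back
toy <- lapply(toy, add_group)
print(toy)
```

With the real data this would be called as ern <- lapply(ern, add_group). The design point is the same as in the loop: the modified data frames only persist because the result is assigned back to the list.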