-1

I need to compare two columns which are in resulting data frame and those two columns are coming from a separate sources.

Now, I would like to compare them and have a resulting (tag) column based on their match.(YES or NO)

There are e_phone and Phones.

(ex. e_phone="881234567" and Phones="881234567; 665553331;")

They both can have multiple phone numbers separated by ";". Sometimes e_phone can have numbers like e_phone="0992767312; +550661234567". (which are more then 9 digit length)

  • I need to compare last nine digits of every of available phone number from e_phone in the Phones.

I already cleared Phones column because it comes from the sql, but I am new at the R so I really appreciate anyone's help.

At the end I would like to have add to the resulting data frame just a column called e_phone_match with values "YES" (if any number from e_phone exists in Phones) and "NO" if there isn't any match.

something like

Phones e_phone e_phone_match
881234567; 665553331; 881234567 YES

2 Answers 2

0

Using separate_rows():

    library(tidyr)
    library(dplyr)
    library(stringr)
    df <- tibble(
      Phones = c("728125443; 967720806;"),
      e_phone = c("524545312; +882728125443")
    )
    
    df |> 
      separate_rows(c(Phones), sep = ";", convert = TRUE) |> 
      filter(!is.na(Phones)) |> 
      separate_rows(c(e_phone), sep = ";", convert = TRUE) |> 
      mutate(across(c(Phones, e_phone), ~str_sub(as.character(.),-9)),
             match = Phones == e_phone)
      

Output:

# A tibble: 4 × 3
  Phones    e_phone   match
  <chr>     <chr>     <lgl>
1 728125443 524545312 FALSE
2 728125443 728125443 TRUE 
3 967720806 524545312 FALSE
4 967720806 728125443 FALSE
Sign up to request clarification or add additional context in comments.

4 Comments

And in case that ' Phones: "728125443; 967720806;" e_phone: "524545312; +882728125443" '
I adjusted the code according to your new example data
since I have the data frame that should have distinct rows, how can I incorporate it so I just add match column in it, since it already has phones and e_phone columns?
maybe add another filter? (filter(match == "TRUE"))
0

Thank you Julian for you help it gave me idea on how to proceed.

here is how I managed it

# Split the e_phone column into a list of vectors
e_phone_split <- strsplit(as.character(res$e_phone), ";")

# Clean and process phone numbers
clean_phone_number <- function(phone) {
  if (!is.na(phone) && phone != "") {
    cleaned <- gsub("[^0-9]", "", phone)  # Remove non-numeric characters
    if (nchar(cleaned) >= 9 && nchar(cleaned) <= 12) {
      cleaned <- substr(cleaned, nchar(cleaned) - 8, nchar(cleaned))  # Take the last nine figures
    } else {
      cleaned <- ""  # Empty string for invalid numbers
    }
    return(cleaned)
  } else {
    return("")
  }
}

# Apply the cleaning function to each element of the list
e_phone_cleaned <- lapply(e_phone_split, function(phone_list) sapply(phone_list, clean_phone_number))

# Create a new column 'e_phone_cleaned' in the original data frame
res$e_phone_cleaned <- sapply(e_phone_cleaned, function(phone_list) paste(phone_list, collapse = ";"))

# Remove ";" from the Phones column for easier comparison
res$Phones <- gsub(";", "", as.character(res$Phones))

# Initialize an empty vector to store match results
match_results <- numeric(nrow(res))

# Loop through each row and compare cleaned phone numbers
for (i in 1:nrow(res)) {
  if (is.na(res$e_phone[i]) || is.na(res$e_phone_cleaned[i]) || (res$Phones[i] == "" && res$e_phone_cleaned[i] == "")) {
    match_results[i] <- 0  # Don't match for NA or both empty
  } else {
    match_results[i] <- grepl(res$e_phone_cleaned[i], res$Phones[i], ignore.case = TRUE)
  }
}

# Add the match results to the data frame
res$enq_match_nr <- as.numeric(match_results)

# Add a column e_phoneone_match based on enq_match_nr
res$e_phoneone_match <- ifelse(res$enq_match_nr == 1, "YES", "NO")

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.